Visit our new documentation site! This documentation page is no longer updated.

Writing Job Results into Salesforce.com (SFDC)

This article explains how to write job results to your Salesforce.com organization.

Prerequisites

  • Basic knowledge of Treasure Data, including the toolbelt
  • Salesforce.com organization and username, password, and security token for API integration
  • User has “API Enabled” permission
  • Target Salesforce.com Object should exist with read/write permissions for the User

Architecture

A front-end application streams data into Treasure Data through a log/data collector daemon or the Mobile SDKs. You can also bulk import your data using Bulk Import from the CLI. A scheduled query is set up in Treasure Data to run periodically on the data and write the result of each query execution into your Salesforce.com Object.



The above is a fairly common architecture; below are a few examples.

Example 1: Ranking: What are the “Top N of X?”

Every social/mobile application calculates the “top N of X” (for example, the top 5 movies watched today). Treasure Data already handles the raw data warehousing; the “write-to-Salesforce.com” feature lets Treasure Data compute the “top N” data and deliver it to Salesforce.com as well.

Example 2: Dashboard Application

If you’re a data scientist, you need to keep track of a range of metrics every hour/day/month and make them accessible via visualizations. Using this “write-to-Salesforce.com” feature, you can streamline the process and focus on building visualizations of your query results via Reports and Dashboards on the Salesforce.com organization.

Result Output URL Format

Format

The result output target is represented by a URL with the following format:

sfdc://<username>:<password><security_token>@<hostname>/<object_name>

where:

  • sfdc is the identifier for result output to Salesforce.com;
  • username and password are the credentials for your Salesforce.com organization;
  • security_token is the additional credential for API access, appended directly to the password;
  • hostname is the host name of the Salesforce.com organization. Usually this is ‘login.salesforce.com’ for production environments and ‘test.salesforce.com’ for sandbox environments. If you have configured a custom domain for your organization, specify the hostname you use to log in;
  • object_name is the API name of the target Salesforce.com Object (e.g. ResultOutput__c). Note that the Object and the columns used for data integration must be defined beforehand.

For example with:

  • username: user@treasure-data.com
  • password: PASSWORD
  • security_token: 7SMvicR9ojdPz0XLtlWi3Rtw

The URL will look like:

sfdc://user%40treasure-data.com:PASSWORD7SMvicR9ojdPz0XLtlWi3Rtw@login.salesforce.com/Account
Make sure that you escape the '@' in the username with '%40'.
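
The URL format above can be assembled with a small shell helper. This is a minimal sketch, not part of the td toolbelt: the function name sfdc_url and its argument order are hypothetical. It escapes only the '@' in the username, as the note above requires; whether other special characters in the credentials need escaping is not covered by this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of the td toolbelt): assemble an sfdc:// result output URL.
# Arguments: username password security_token hostname object_name
sfdc_url() {
  username=$1 password=$2 token=$3 hostname=$4 object=$5
  # Escape the '@' in the username as '%40'.
  escaped_user=$(printf '%s' "$username" | sed 's/@/%40/g')
  # The security token follows the password directly, with no separator.
  printf 'sfdc://%s:%s%s@%s/%s\n' "$escaped_user" "$password" "$token" "$hostname" "$object"
}

# Reproduces the example URL shown above.
sfdc_url 'user@treasure-data.com' 'PASSWORD' '7SMvicR9ojdPz0XLtlWi3Rtw' 'login.salesforce.com' 'Account'
```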

Options

Result output to Salesforce.com supports various options that can be specified as optional URL parameters. The options are compatible with each other and can be combined.
Where applicable, the default behavior is indicated.

Update mode option

Controls the various ways of modifying the database data.

  • Append
  • Truncate
  • Update

mode=append (default)

The append mode is the default which is used when no mode option is provided in the URL. In this mode, the query results are appended to the object.

Because mode=append is the default behavior, these two URLs are equivalent:

sfdc://.../Contact
sfdc://.../Contact?mode=append

mode=truncate

With the truncate mode, the system first removes the existing records from the Salesforce.com Object, moving them into the Recycle Bin, and then inserts the query results.

Example:

sfdc://.../CustomObject__c?mode=truncate
You can specify the hard_delete=true option with mode=truncate to delete records permanently instead of moving them to the Recycle Bin. To use this option, the user must have the 'Bulk API Hard Delete' permission.
sfdc://.../CustomObject__c?mode=truncate&hard_delete=true

mode=update

With the update mode, a row is inserted unless it would cause a duplicate value in the external key column specified in the “unique” parameter, in which case the existing record is updated instead. The “unique” parameter is required with this mode, and the column it names must be defined as an external key on the Salesforce.com Object.

Example:

sfdc://.../Contact?mode=update&unique=CustomerId__c

The default behavior of the ‘update’ mode is actually ‘upsert’. If you want to only update and not upsert, add the upsert=false option. Existing records that match on the “unique” parameter are then updated, and no new records are inserted.

sfdc://.../Contact?mode=update&unique=CustomerId__c&upsert=false
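
Because the options are ordinary URL parameters, combining them is a matter of joining key=value pairs with '?' and '&'. The sketch below illustrates this with a hypothetical helper named with_options (not part of the td toolbelt); "sfdc://.../Contact" is the abbreviated base URL used in the examples above, not a complete URL.

```shell
#!/bin/sh
# Hypothetical helper: append "key=value" options to a result output URL.
# The first option is joined with '?', every later one with '&'.
with_options() {
  url=$1
  shift
  sep='?'
  for opt in "$@"; do
    url="${url}${sep}${opt}"
    sep='&'
  done
  printf '%s\n' "$url"
}

# Update-only (no insert) on the CustomerId__c external key:
with_options 'sfdc://.../Contact' mode=update unique=CustomerId__c upsert=false
# → sfdc://.../Contact?mode=update&unique=CustomerId__c&upsert=false

# Truncate with permanent deletion:
with_options 'sfdc://.../CustomObject__c' mode=truncate hard_delete=true
# → sfdc://.../CustomObject__c?mode=truncate&hard_delete=true
```

Quote the resulting URL when passing it to a shell command, since '&' and '?' are special characters in most shells.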

Upload concurrency_mode option

The concurrency_mode option controls how the data is uploaded to the Salesforce.com organization. The default mode is parallel; it is the recommended method for most situations.

concurrency_mode=parallel (default)

With the parallel method, data is uploaded in parallel. This is the most reliable and effective method and it is recommended for most situations.

Because concurrency_mode=parallel is the default behavior, these two URLs are equivalent:

sfdc://.../CustomObject__c
sfdc://.../CustomObject__c?concurrency_mode=parallel

concurrency_mode=serial

Uploading records in parallel is recommended. However, if you see “UNABLE_TO_LOCK_ROW” in an error message, try concurrency_mode=serial instead.

sfdc://.../CustomObject__c?concurrency_mode=serial

Updating a Salesforce.com Object acquires a lock on that Object and on any parent Object referenced by its columns. If you upload objects in parallel and several of them reference the same parent object, Salesforce.com cannot acquire the lock for the insert/update and returns an ‘UNABLE_TO_LOCK_ROW’ error. In such cases, specify the concurrency_mode=serial option.
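
The fallback rule above (stay on parallel unless a lock error was seen) can be expressed as a small decision helper. This is only a sketch: the function name choose_concurrency_mode is hypothetical, and the error text passed to it is a made-up sample rather than real job output.

```shell
#!/bin/sh
# Hypothetical helper: choose a concurrency_mode from the text of a previous error message.
choose_concurrency_mode() {
  case $1 in
    *UNABLE_TO_LOCK_ROW*) echo serial ;;   # lock contention: upload serially
    *) echo parallel ;;                    # otherwise keep the default
  esac
}

# Sample error text (illustrative only).
mode=$(choose_concurrency_mode 'UNABLE_TO_LOCK_ROW (sample error text)')
printf 'sfdc://.../CustomObject__c?concurrency_mode=%s\n' "$mode"
# → sfdc://.../CustomObject__c?concurrency_mode=serial
```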

Authentication session_id option

If you have a Salesforce.com Session ID, you can authenticate with the session_id option instead of username, password, and security token (that is, username, password, and security token can be omitted from the URL).

sfdc://login.salesforce.com/Contact?session_id=3deT2aQjYQbIRN0M...jB1tHBb7UW0K!M

Retry option

This option sets the number of attempts the Treasure Data export worker makes to write the result to the configured Salesforce.com destination when errors occur. If the export fails more than the set number of retries, the query fails.
The default is retry=2, but it can be set to virtually any number. Note that the number of retries affects the overall duration of a query.

sfdc://.../CustomObject__c?retry=5

Split records option

The Treasure Data result export splits the records in a query result into chunks of 10000 records by default and bulk uploads one chunk at a time. The split_records option configures the chunk size, if required.

sfdc://.../CustomObject__c?split_records=100
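
To get a feel for how split_records affects the number of bulk uploads, the chunk count can be estimated as the record count divided by the chunk size, rounded up. The sketch below assumes the final chunk simply carries the remainder; chunk_count is a hypothetical helper, and the record count of 25000 is an arbitrary example.

```shell
#!/bin/sh
# Hypothetical helper: estimate the number of chunks as ceil(total_records / chunk_size).
chunk_count() {
  total=$1
  size=${2:-10000}   # 10000 is the documented default chunk size
  echo $(( (total + size - 1) / size ))
}

chunk_count 25000        # default chunk size → 3
chunk_count 25000 100    # split_records=100  → 250
```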

Usage

CLI

To output the result of a single query to your Salesforce.com organization, add the -r / --result option to the td query command. After the job finishes, the results are written into your Salesforce.com Object:

$ td query -w -d testdb \
  --result 'sfdc://login.salesforce.com/CustomObject__c?session_id=.....' \
  "SELECT code as Code__c, COUNT(1) as Count__c FROM www_access GROUP BY code"

To create a scheduled query whose output is written to your Salesforce.com organization on every run, add the -r / --result option when creating the schedule with the td sched:create command:

$ td sched:create hourly_count_example "0 * * * *" -d testdb \
  --result 'sfdc://user%40treasure-data.com:PASSWORDsecuritytoken@login.salesforce.com/CustomObject__c' \
  "SELECT COUNT(*) as Count__c FROM www_access"

Console

Using the new query page, create a query whose result set you would like to write to Salesforce.com.

To avoid any issues with result export, define column aliases in your query such that resulting column names from the query match the Salesforce field names for default fields and API names (usually ending with __c) for custom fields.

[Screenshot: Console, export to Salesforce query]

To export the result of a query or schedule to Salesforce.com, specify Result Export information for a new query or an existing job/query:

[Screenshot: Console, export to Salesforce.com]

Troubleshooting

If a Result Output to SFDC job fails with an error like the following, you can inspect the details of the failure in Salesforce.com. Look up the job ID reported in the log (“XXXXXXXXXXX” in the example below) on Salesforce.com to see why the records failed.

17/05/01 03:35:05 INFO sfdc.BulkAPIJob: Job XXXXXXXXXXX finished: Total 1, Completed 0, Failed 1
17/05/01 03:35:05 INFO sfdc.BulkAPIClient: Batch jobs failed (1/1)

Ref. View Bulk Data Load Job Details.
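
When a job log is long, the failed job IDs can be pulled out with standard text tools. The sketch below assumes log lines follow exactly the format of the sample above; failed_bulk_jobs is a hypothetical helper, and the log format may differ between versions.

```shell
#!/bin/sh
# Hypothetical helper: read job log text on standard input and print
# "<job id> <failed count>" for every BulkAPIJob line that reports failures.
failed_bulk_jobs() {
  sed -n 's/.*sfdc\.BulkAPIJob: Job \([^ ]*\) finished: .*Failed \([0-9][0-9]*\).*/\1 \2/p' |
    awk '$2 > 0'
}

# Feed the sample log lines from above through the helper.
printf '%s\n' \
  '17/05/01 03:35:05 INFO sfdc.BulkAPIJob: Job XXXXXXXXXXX finished: Total 1, Completed 0, Failed 1' \
  '17/05/01 03:35:05 INFO sfdc.BulkAPIClient: Batch jobs failed (1/1)' |
  failed_bulk_jobs
# → XXXXXXXXXXX 1
```

The printed job ID is the value to look up on Salesforce.com.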


Last modified: Oct 23 2017 17:52:32 UTC
