
You can import Media Campaign, Paid Search Campaign, Site Campaign, User Audience Segment Map, Segment Mapping File, or Dissent Lists data from Salesforce DMP (Krux) into Treasure Data.


Prerequisites

  • Basic knowledge of Treasure Data, including the Toolbelt and JavaScript SDK

  • S3 credential with access key id and secret access key.

  • Client name from Salesforce DMP

...

Code Block
(function(window, document, td){

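// Collect every localStorage entry whose key starts with your Krux prefix.
var kruxProperties = {};
for ( var k in window.localStorage ) {
    if ( k.startsWith('<YOUR KRUX PREFIX HERE>') ) {
        kruxProperties[k] = window.localStorage.getItem(k);
    }
}
// Send the collected Krux values to the table that maps Krux IDs to TD IDs.
td.trackEvent('<TD TABLE NAME FOR TRACKING KRUX ID/TD ID map>', kruxProperties);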
  
// On success, load a hidden 1x1 Krux user-match pixel that carries the TD global ID.
var successCb = function(tdGlobalId) {
  // This is createImage in TDWrapper
  var el = document.createElement('img');
  el.src = '//beacon.krxd.net/usermatch.gif?partner=treasuredata&partner_uid=' + encodeURIComponent(tdGlobalId);
  el.width=1;
  el.height=1;
  el.style.display='none';
  document.body.appendChild(el);
};
  
function isSafari() {
  var ua = window.navigator.userAgent.toLowerCase();
  return ua.indexOf('safari') !== -1 && ua.indexOf('chrome') === -1 && ua.indexOf('edge') === -1;
}

if ( isSafari() ) {
  // TODO: Safari-specific handling due to ITP 2.1
} else {
  td.fetchGlobalID(successCb, function(err) { console.log(err) });
}

})(window, document, td);


The preceding code sample does not include cookie syncing for Safari browsers. Safari's Intelligent Tracking Prevention (ITP) feature makes visitor identification based on third-party domain cookies less reliable. We are actively planning a solution for this.


Use the TD Console to Create Your Connection

...

Edit the client name, access key id, and secret access key that you retrieved from Salesforce DMP.

...

For the Source, choose Media Campaign, Paid Search Campaign, Site Campaign, or Dissent Lists.

Parameters:

  • Start Date: Import data that has been created since this date.

  • End Date: Import data that has been created up to this date.

  • Incremental Loading: When importing data based on a schedule, the time window of the fetched data automatically shifts forward on each run. For example, if you specify the initial start date as January 1 and end date as January 10, the first run fetches data from January 1 to January 10, the second run fetches from January 11 to January 20, and so on.
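
The window arithmetic in the Incremental Loading example can be sketched in JavaScript as follows. This only illustrates the date math described above and is not the connector's implementation. The function name nextWindow, the {start, end} object shape, and the year 2024 are assumptions made for the example.

```javascript
// Illustrative sketch only: nextWindow and the {start, end} shape are
// hypothetical and are not part of the Salesforce DMP connector.
var MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parse a 'YYYY-MM-DD' string as midnight UTC, in milliseconds.
function toUtcMs(isoDate) {
  return Date.parse(isoDate + 'T00:00:00Z');
}

// Format a UTC millisecond timestamp back to 'YYYY-MM-DD'.
function toIsoDate(ms) {
  return new Date(ms).toISOString().slice(0, 10);
}

// Given the window of the previous run (both dates inclusive), return the
// window of the next run: same length, starting the day after the old end.
function nextWindow(previous) {
  var start = toUtcMs(previous.start);
  var end = toUtcMs(previous.end);
  var lengthDays = (end - start) / MS_PER_DAY + 1;
  return {
    start: toIsoDate(end + MS_PER_DAY),
    end: toIsoDate(end + lengthDays * MS_PER_DAY)
  };
}

var firstRun = { start: '2024-01-01', end: '2024-01-10' };
var secondRun = nextWindow(firstRun);
console.log(secondRun); // { start: '2024-01-11', end: '2024-01-20' }
```

Calling nextWindow again on the second window yields January 21 to January 30, which matches the "and so on" in the description.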

Preview


You’ll see a preview of your data. To make changes, select Advanced Settings; otherwise, select Next.



Advanced Settings

You can specify the following parameters:

...

In the When tab, you can specify a one-time transfer, or schedule an automated recurring transfer.

...

The configuration file includes an in: section where you specify what comes into the connector from Salesforce DMP and an out: section where you specify what the connector puts out to the database in Treasure Data. For more details on available out modes, see the Appendix.

The following example shows how to import Media Campaign data without incremental scheduling.

...

We recommend specifying the --time-column option because Treasure Data’s storage is partitioned by time. If the option is not given, the data connector selects the first long or timestamp column as the partitioning time. The column specified by --time-column must be of either long or timestamp type. Use the Preview results to check the available column names and types. Generally, most data types have a last_modified_date column.

If your data doesn’t have a time column, you can add the column by using the add_time filter option. See details at add_time filter plugin.

The td connector:issue command assumes you have already created a database (sample_db) and a table (sample_table). If the database or the table does not exist in TD, td connector:issue will fail. Therefore, you must create the database and table manually or use --auto-create-table with td connector:issue to automatically create the database and table.

...

From the command line, submit the load job. Processing might take a couple of hours depending on the data size.

Scheduled

...

Running of the Integration

You can schedule periodic data connector execution to import Media Campaign, Paid Search Campaign, or Site Campaign data on a recurring basis. We configure our scheduler carefully to ensure high availability. By using this feature, you no longer need a cron daemon in your local data center.

...

  • incremental: This configuration is used to control the load mode, which governs how the data connector fetches data from Salesforce DMP based on one of the native timestamp fields associated with each object.

  • columns: This configuration is used to define a custom schema for data to be imported into Treasure Data. You can define only the columns that you are interested in, but make sure they exist in the object that you are fetching. Otherwise, these columns aren’t available in the result.

  • last_record: This configuration is used to control the last record from the previous load job. It requires the object to include a key for the column name and a value for the column’s value. The key needs to match the Salesforce DMP Data column name.

See How Incremental Loading works for details and examples.

Create the

...

Schedule

A new schedule can be created using the td connector:create command. The name of the schedule, the cron-style schedule, the database and table where the data will be stored, and the data connector configuration file are required.

In addition to cron-style expressions such as "10 0 * * *", the `cron` parameter accepts these shorthand options: `@hourly`, `@daily`, and `@monthly`.

By default, the schedule is set up in the UTC timezone. You can set the schedule in a different timezone using the -t or --timezone option. The `--timezone` option supports only extended timezone formats such as 'Asia/Tokyo' and 'America/Los_Angeles'. Timezone abbreviations such as PST and CST are *not* supported and may lead to unexpected schedules.


Code Block
$ td connector:create \
    daily_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml

It’s also recommended to specify the --time-column option because Treasure Data’s storage is partitioned by time.

...

Code Block
$ td connector:delete daily_import

...

Modes for the out plugin

You can specify file import mode in the out section of the load.yml file.

The out: section controls how data is imported into a Treasure Data table.
For example, you may choose to append data or replace data in an existing table in Treasure Data.

Output modes are ways to modify the data as the data is placed in Treasure Data.

  • Append (default): Records are appended to the target table.

  • Replace (available in td 0.11.10 and later): Replaces data in the target table. Any manual schema changes made to the target table remain intact.

Examples:

Code Block
in:
  ...
out:
  mode: append
Code Block
in:
  ...
out:
  mode: replace
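
Conceptually, the two modes differ only in what happens to the rows that are already in the target table. The following JavaScript sketch models a table as a plain array to show that difference. It is an in-memory illustration under that assumption, not how Treasure Data stores or replaces data, and the function name applyOutMode is hypothetical.

```javascript
// Illustrative sketch only: applyOutMode is a hypothetical helper that models
// a table as an array of row objects. It does not call any Treasure Data API.
function applyOutMode(existingRows, incomingRows, mode) {
  if (mode === 'append') {
    // Append: keep the existing rows and add the newly imported ones.
    return existingRows.concat(incomingRows);
  }
  if (mode === 'replace') {
    // Replace: discard the existing rows and keep only the new import.
    return incomingRows.slice();
  }
  throw new Error('Unsupported mode: ' + mode);
}

var existing = [{ id: 1 }, { id: 2 }];
var incoming = [{ id: 3 }];

console.log(applyOutMode(existing, incoming, 'append').length);  // 3
console.log(applyOutMode(existing, incoming, 'replace').length); // 1
```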




How Incremental Loading works

Incremental loading uses the last-modified date of files to load records monotonically, importing only files that were added or updated after the most recent execution.

At the first execution, this connector loads all files matching the Filename Regex and Modified After. If incremental: true is set, the latest modified DateTime is saved as the new Modified After value.

Example:

...

Then the files File0011.csv, File0012.csv, File0013.csv, and File0014.csv are imported because they match the Filename Regex and all have a last update > 2019-05-01T10:00:00.00Z.

...

At the next execution, only files having the last update > 2019-05-08T10:00:00.123Z are imported.
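
The selection rule described in this section can be sketched in JavaScript as follows. This is a minimal illustration, not the connector's code. The function name selectFiles, the shape of the file objects, the regular expression, and the per-file timestamps are all assumptions. Only the four imported file names and the two Modified After values come from the example above.

```javascript
// Illustrative sketch only: selectFiles, the file objects, the regex, and the
// per-file timestamps are hypothetical. Only the imported file names and the
// two Modified After values are taken from the example in the text.
function selectFiles(files, filenameRegex, modifiedAfter) {
  var threshold = Date.parse(modifiedAfter);
  // Keep files whose name matches and whose last update is after the threshold.
  var imported = files.filter(function (file) {
    return filenameRegex.test(file.name) &&
      Date.parse(file.lastModified) > threshold;
  });
  // The newest last update among the imported files becomes the next threshold.
  var nextModifiedAfter = imported.reduce(function (latest, file) {
    return Date.parse(file.lastModified) > Date.parse(latest)
      ? file.lastModified
      : latest;
  }, modifiedAfter);
  return {
    imported: imported.map(function (file) { return file.name; }),
    nextModifiedAfter: nextModifiedAfter
  };
}

var files = [
  { name: 'File0010.csv', lastModified: '2019-04-30T09:00:00.000Z' },
  { name: 'File0011.csv', lastModified: '2019-05-02T10:00:00.000Z' },
  { name: 'File0012.csv', lastModified: '2019-05-04T10:00:00.000Z' },
  { name: 'File0013.csv', lastModified: '2019-05-06T10:00:00.000Z' },
  { name: 'File0014.csv', lastModified: '2019-05-08T10:00:00.123Z' },
  { name: 'notes.txt', lastModified: '2019-05-09T10:00:00.000Z' }
];
var pattern = /^File\d+\.csv$/;

var firstRun = selectFiles(files, pattern, '2019-05-01T10:00:00.000Z');
var secondRun = selectFiles(files, pattern, firstRun.nextModifiedAfter);
console.log(firstRun.imported, firstRun.nextModifiedAfter);
console.log(secondRun.imported);
```

In this sketch the second run imports nothing, because no matching file has a last update later than the saved Modified After value.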

...