# Salesforce DMP Krux Import Integration

You can import Media Campaign, Paid Search Campaign, Site Campaign, User Audience Segment Map, Segment Mapping File, or Dissent Lists data from Salesforce DMP (Krux) into Treasure Data.

## Prerequisites

- Basic knowledge of Treasure Data, including the [Toolbelt](https://toolbelt.treasuredata.com/) and [JavaScript SDK](https://docs.treasuredata.com/smart/project-product-documentation/getting-started-with-website-tracking)
- S3 credentials (access key ID and secret access key)
- Client name from Salesforce DMP


## Integration Overview

This integration has two parts:

1. **Cookie-syncing between Salesforce DMP and Treasure Data CDP**: Required to create a mapping between the Salesforce DMP ID and the Treasure Data IDs (`td_global_id` and `td_client_id`).
2. **Data import from Salesforce DMP into Treasure Data CDP**: Several data feeds can be imported. For data enrichment, the key file is the mapping between segment IDs and their names.


![](/assets/image-20190923-192248.037c2a6dff6b3b60c8a7538021bc396756b5712ea081e6ab29c40c4415cfd243.d081385e.png)

## Implement a Cookie-Syncing Tag

You must first set up Treasure Data's JavaScript tag as documented [in Getting Started with Website Tracking](/products/customer-data-platform/integration-hub/streaming/td-javascript-sdk/getting-started-with-tracking-and-the-td-javascript-sdk) under "Setting up website tracking and install the Treasure Data JavaScript SDK".

Next, add the following piece of code into the website where Salesforce DMP's tag is already installed.


```javascript
(function(window, document, td){

// Collect the localStorage entries written by the Salesforce DMP (Krux) tag.
var kruxProperties = {};
for ( var k in window.localStorage ) {
    if ( k.startsWith('YOUR KRUX PREFIX HERE') ) {
        kruxProperties[k] = window.localStorage.getItem(k);
    }
}
// Record the Krux values in the table that holds the Krux ID/TD ID map.
td.trackEvent('<TD TABLE NAME FOR TRACKING KRUX ID/TD ID map>', kruxProperties);

// Called with td_global_id; fires the Salesforce DMP user-match pixel.
var successCb = function(tdGlobalId) {
  // This is createImage in TDWrapper
  var el = document.createElement('img');
  el.src = '//beacon.krxd.net/usermatch.gif?partner=treasuredata&partner_uid=' + tdGlobalId;
  el.width=1;
  el.height=1;
  el.style.display='none';
  document.body.appendChild(el);
};

// Chrome and Edge user agents also contain "safari", so exclude them explicitly.
function isSafari() {
  var ua = window.navigator.userAgent.toLowerCase();
  return ua.indexOf('safari') !== -1 && ua.indexOf('chrome') === -1 && ua.indexOf('edge') === -1;
}

if (isSafari() ) {
  // TODO: Safari-specific handling due to ITP 2.1
} else {
  td.fetchGlobalID(successCb, function(err) { console.log(err); });
}

})(window, document, td);
```

The preceding code sample does not include cookie-syncing for Safari browsers. Safari's Intelligent Tracking Prevention (ITP) feature makes third-party cookie-based visitor identification less reliable. We are actively planning a solution around this.
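The key-collection step of the tag can be exercised outside a browser. The following is a minimal sketch, not part of the Treasure Data SDK: `collectByPrefix`, `fakeStorage`, and the `kxexample_` prefix are hypothetical names used only for illustration. It relies solely on the standard Web Storage methods `length`, `key`, and `getItem`.

```javascript
// Hypothetical helper: gather every storage entry whose key starts with a prefix.
function collectByPrefix(storage, prefix) {
  const result = {};
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key !== null && key.startsWith(prefix)) {
      result[key] = storage.getItem(key);
    }
  }
  return result;
}

// Minimal in-memory stand-in for window.localStorage (for testing only).
function fakeStorage(entries) {
  const keys = Object.keys(entries);
  return {
    length: keys.length,
    key: (i) => (i < keys.length ? keys[i] : null),
    getItem: (k) => (k in entries ? entries[k] : null),
  };
}

// 'kxexample_' is a made-up prefix; use the prefix of your own Salesforce DMP tag.
const storage = fakeStorage({ kxexample_user: 'abc123', other: 'x' });
console.log(JSON.stringify(collectByPrefix(storage, 'kxexample_')));
// prints {"kxexample_user":"abc123"}
```

In a browser, passing `window.localStorage` as the first argument would produce the same kind of object that the tag sends with `td.trackEvent`.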

## Use the TD Console to Create Your Connection

### Create a New Connection

Go to Integrations Hub > Catalog, then search for and select Salesforce DMP.

![](/assets/image-20190923-192557.6c99bd10dd317889671b412d06bdef80e38689e2d74d8cda7c505eaee34c2546.d081385e.png)

Select **Create**. You are creating an authenticated connection.

The following dialog opens.

![](/assets/image-20190923-192618.915080c94fa2666ffba006e78f7878c8775d6adc8a6b76fc702ea352866e68d9.d081385e.png)

Enter the client name, access key ID, and secret access key that you retrieved from Salesforce DMP.

Select **Continue**.

![](/assets/image-20190923-192703.fa238aa47270a321506bdf6ca59a252c45aa7e208dc8c0db8f819840eb983743.d081385e.png)

Name your new Salesforce DMP Connection. Select **Done**.

### Transfer Your Data to Treasure Data

After creating the authenticated connection, you are automatically taken to the Authentications tab. Look for the connection you created and select **New Source**.

Specify the data that you want to import:

- Segment Mapping File
- User Audience Segment Map
- Media Campaign, Paid Search Campaign, Site Campaign, or Dissent Lists


### Import Segment Mapping File

For the `Source`, choose Segment Mapping File.

![](/assets/mceclip6.84539345d3104f7ba8fea0da416cd42e52159467ee6af627c19f4cbad1f6b790.d081385e.png)

### Import User Audience Segment Map

For the `Source`, choose User Audience Segment Map.

![](/assets/image-20190923-192736.c4342b0d6c3059d373e070f6b1af33c0d8c2f149e0e694956f1fc310f81b8a04.d081385e.png)

Parameters:

- **Import Date**: Import data created from this date.


### Import Media Campaign, Paid Search Campaign, Site Campaign, Dissent Lists

For the `Source`, choose Media Campaign, Paid Search Campaign, Site Campaign, or Dissent Lists.

![](/assets/image-20190923-192809.d9c6a496fcb11e4e13a5dd38d791e513dbbd0bc06240d160c38a1d3374bed22f.d081385e.png)

Parameters:

- **Start Date**: Import data that has been created since this date.
- **End Date**: Import data that has been created up to this date.
- **Incremental Loading**: When importing data based on a schedule, the time window of the fetched data automatically shifts forward on each run. For example, if you specify the initial start date as January 1 and end date as January 10, the first run fetches data from January 1 to January 10, the second run fetches from January 11 to January 20, and so on.
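The window shift described for Incremental Loading can be sketched as date arithmetic. This is an illustration of the January example only, under the assumption that both dates are inclusive and that the next window starts on the day after the previous end date; `nextWindow` is a hypothetical helper, not part of the connector.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse "YYYY-MM-DD" as midnight UTC to avoid local-timezone drift.
function parseDay(s) {
  return new Date(s + 'T00:00:00Z');
}

// Format a Date back to "YYYY-MM-DD" (UTC).
function formatDay(d) {
  return d.toISOString().slice(0, 10);
}

// Given an inclusive [start, end] window, return the next window of equal length.
function nextWindow(start, end) {
  const startMs = parseDay(start).getTime();
  const endMs = parseDay(end).getTime();
  const spanMs = endMs - startMs;      // 9 days for January 1 to January 10
  const nextStartMs = endMs + DAY_MS;  // the day after the previous end date
  return {
    start: formatDay(new Date(nextStartMs)),
    end: formatDay(new Date(nextStartMs + spanMs)),
  };
}

const second = nextWindow('2019-01-01', '2019-01-10');
console.log(`${second.start} to ${second.end}`);
// prints 2019-01-11 to 2019-01-20
```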


### Preview

Data preview is optional. You can safely select **Next** to go to the next page of the dialog.

1. To display a preview of your data before running the import, select **Generate Preview**.
   The data shown in the preview is approximated from your source. It is not the actual data that is imported.
2. Verify that the data looks approximately as you expect.
   ![](/assets/snippet-data-preview-2024-02-09.27dc5fd8772fca4f7f44ab28c00476ae1894744fe1e75d06932628929cc7bff1.4e139be3.png)
3. Select **Next**.


### Advanced Settings

![](/assets/image-20190923-193141.27f12a027e6398552418665e03dcf3532cab2158f3e260d642ab6f38d7d47b61.d081385e.png)

You can specify the following parameters:

- Maximum retry times. Specifies the maximum number of retries for each API call.
  - Type: number
  - Default: 7
- Initial retry interval millisecond. Specifies the wait time, in milliseconds, before the first retry.
  - Type: number
  - Default: 1000
- Maximum retry interval milliseconds. Specifies the maximum wait time, in milliseconds, between retries.
  - Type: number
  - Default: 120000
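To get a feel for how these three settings interact, the following sketch computes a retry schedule. It assumes the wait doubles after each retry (exponential backoff) and is capped at the maximum interval. That growth rule is an assumption for illustration; this page only names the settings and their defaults. `retryIntervals` is a hypothetical helper.

```javascript
// Assumption: each wait is double the previous one, capped at maxMs.
function retryIntervals(maxRetries, initialMs, maxMs) {
  const intervals = [];
  let wait = initialMs;
  for (let i = 0; i < maxRetries; i++) {
    intervals.push(Math.min(wait, maxMs));
    wait *= 2;
  }
  return intervals;
}

// Defaults from the list above: 7 retries, 1000 ms initial, 120000 ms maximum.
console.log(retryIntervals(7, 1000, 120000).join(', '));
// prints 1000, 2000, 4000, 8000, 16000, 32000, 64000
```

Under this assumption the default maximum of 120000 ms is only reached if the retry count is raised to 8 or more.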


### Choose the Target Database and Table

Choose an existing database and table, or create new ones.

![](/assets/image-20190923-193250.c96b094df49f14d2621028c14175a20717b9a6d70bf6fdadb7cfb3f936aa4f96.d081385e.png)

Create a new database and give your database a name. Complete similar steps for **Create new table**.

Select whether to **append** records to an existing table or **replace** your existing table.

If you want to set a different **partition key seed** rather than use the default key, you can specify one using the popup menu.

### Scheduling

In the **When** tab, you can specify a one-time transfer, or schedule an automated recurring transfer.

Parameters:

- **Once now**: Run the transfer one time, immediately.
- **Repeat…**
  - **Schedule**: Accepts *@hourly*, *@daily*, *@monthly*, or a custom *cron* expression.
  - **Delay Transfer**: Add a delay to the execution time.
- **TimeZone**: Supports extended timezone formats like 'Asia/Tokyo'.

![](/assets/image-20190923-194054.e58d3768035b844efdc69f8fff99a48edcc931b40428fd406fbb3c2965274ffc.d081385e.png)


### Details

Name your Transfer and select **Done** to start.

![](/assets/mceclip13.5ca8bfbad25bde2c23b6bfbf04bbf8f9f368e270270b3390f08502f263c47fac.d081385e.png)

After your transfer has run, you can see the results of your transfer in the **Databases** tab.

## Use the Command Line to Create Your Salesforce DMP Connection

As an alternative to the TD Console, you can use the TD Toolbelt from the command line to configure your connection.

### Install the Treasure Data Toolbelt

Install the newest [TD Toolbelt](https://toolbelt.treasuredata.com/).

### Create a Configuration File (load.yml)

The configuration file includes an `in:` section where you specify what comes into the connector from Salesforce DMP and an `out:` section where you specify what the connector puts out to the database in Treasure Data. For more details on available out modes, see the Appendix.

The following example shows how to import Media Campaign data without incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: mc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: false
out:
  mode: append
```

The following example shows how to import Media Campaign data with incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: mc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: true
out:
  mode: append
```

The following example shows how to import Paid Search Campaign data without incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: psc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: false
out:
  mode: append
```

The following example shows how to import Paid Search Campaign data with incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: psc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: true
out:
  mode: append
```

The following example shows how to import Site Campaign data without incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: sc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: false
out:
  mode: append
```

The following example shows how to import Site Campaign data with incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: sc
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: true
out:
  mode: append
```

The following example shows how to import Dissent Lists data without incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: dl
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: false
out:
  mode: append
```

The following example shows how to import Dissent Lists data with incremental scheduling.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: dl
  start_date: 2019-01-17
  end_date: 2019-01-27
  incremental: true
out:
  mode: append
```

The following example shows how to import the User Audience Segment Map.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: uasm
  import_date: 2019-01-17
out:
  mode: append
```

The following example shows how to import the Segment Mapping File.


```yaml
in:
  type: krux_dmp
  access_key_id: xxxxxxxxxxx
  secret_access_key: xxxxxxxxxxx
  client_name: xxxxxxxxxxx
  target: smf
out:
  mode: append
```

### Preview the Data to be Imported (Optional)

You can preview the data to be imported by using the `td connector:preview` command.


```bash
$ td connector:preview load.yml
```

### Execute the Load Job

Use `td connector:issue` to execute the job.

Before you execute the load job, you must specify the database and table where you want to store the data, for example `td_sample_db` and `td_sample_table`.


```
$ td connector:issue load.yml \
     --database td_sample_db \
     --table td_sample_table \
     --time-column date_time_column
```

We recommend specifying the `--time-column` option because Treasure Data’s storage is partitioned by time. If the option is not given, the data connector selects the first `long` or `timestamp` column as the partitioning time. The column specified by `--time-column` must be of type `long` or `timestamp`. Use the preview results to check the available column names and types. Generally, most data types have a `last_modified_date` column.
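The default selection rule ("the first `long` or `timestamp` column") can be illustrated with a short sketch. The function name `defaultTimeColumn` and the sample schema are hypothetical; only `last_modified_date` is a column name mentioned on this page, and the real column list should come from the preview output.

```javascript
// Hypothetical model of the documented default: pick the first column whose
// type is long or timestamp, or null when no such column exists.
function defaultTimeColumn(columns) {
  const match = columns.find((c) => c.type === 'long' || c.type === 'timestamp');
  return match ? match.name : null;
}

// Made-up schema for illustration.
const previewColumns = [
  { name: 'campaign_name', type: 'string' },
  { name: 'last_modified_date', type: 'timestamp' },
  { name: 'impressions', type: 'long' },
];

console.log(defaultTimeColumn(previewColumns));
// prints last_modified_date
```

If column order placed `impressions` first, it would be chosen instead, which is one reason to pass `--time-column` explicitly.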

If your data doesn’t have a time column, you can add one by using the add_time filter option. For details, see the [add_time filter](https://docs.treasuredata.com/smart/project-product-documentation/add_time-filter-function) plugin.

The `td connector:issue` command assumes that you have already created the database (`td_sample_db`) and the table (`td_sample_table`). If the database or the table does not exist in TD, `td connector:issue` fails. Therefore, you must either create the database and table manually or use `--auto-create-table` with `td connector:issue` to create them automatically.


```
$ td connector:issue load.yml \
     --database td_sample_db \
     --table td_sample_table \
     --time-column date_time_column \
     --auto-create-table
```

From the command line, submit the load job. Processing might take a couple of hours depending on the data size.

### Scheduled Running of the Integration

You can schedule periodic data connector execution to import Media Campaign, Paid Search Campaign, and Site Campaign data on a recurring basis. We configure our scheduler carefully to ensure high availability. By using this feature, you no longer need a `cron` daemon in your local data center.

Scheduled execution supports configuration parameters that control the behavior of the data connector during its periodic attempts to fetch data from Salesforce DMP:

- `incremental`: Controls the load mode, which governs how the data connector fetches data from Salesforce DMP based on one of the native timestamp fields associated with each object.
- `columns`: Defines a custom schema for the data to be imported into Treasure Data. You can list only the columns that you are interested in, but make sure they exist in the object that you are fetching. Otherwise, these columns aren’t available in the result.
- `last_record`: Controls the last record from the previous load job. It requires the object to include a `key` for the column name and a `value` for the column’s value. The `key` must match the Salesforce DMP data column name.


See How Incremental Loading works for details and examples.

### Create the Schedule

Create a new schedule by using the `td connector:create` command. You must provide the name of the schedule, a cron-style schedule, the database and table where the data will be stored, and the data connector configuration file.

The `cron` parameter accepts `@hourly`, `@daily`, `@monthly`, or a custom cron expression such as `"10 0 * * *"`.

By default, the schedule is set up in the UTC timezone. You can set the schedule in a different timezone by using the `-t` or `--timezone` option. The `--timezone` option only supports extended timezone formats like 'Asia/Tokyo', 'America/Los_Angeles', etc. Timezone abbreviations like PST, CST are *not* supported and may lead to unexpected schedules.


```
$ td connector:create \
    daily_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml
```

It’s also recommended to specify the `--time-column` option because Treasure Data’s storage is partitioned by time.


```
$ td connector:create \
    daily_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml \
    --time-column created_at
```

### List the Schedules

You can see the list of scheduled entries by entering the command `td connector:list`.


```
$ td connector:list
```

### Show the Schedule Settings and History of Schedules

`td connector:show` shows the execution settings of a schedule entry.


```bash
td connector:show daily_import
```

`td connector:history` shows the execution history of a schedule entry. To investigate the results of each individual execution, use `td job <jobid>`.


```bash
td connector:history daily_import
```

### Delete the Schedule

`td connector:delete` removes the schedule.


```bash
$ td connector:delete daily_import
```

### How Incremental Loading works

Incremental loading tracks the last-modified date of imported files so that records are loaded monotonically: each run imports only files that were added or updated after the most recent execution.

At the first execution, this connector loads all files matching the **Filename Regex** and **Modified After**. If `incremental: true` is set, the latest modified DateTime is saved as the new **Modified After** value.

Example:

- Import folder contains files:

```
+--------------+--------------------------+
|   Filename   |     Last update          |
+--------------+--------------------------+
| File0001.csv | 2019-05-04T10:00:00.123Z |
| File0011.csv | 2019-05-05T10:00:00.123Z |
| File0012.csv | 2019-05-06T10:00:00.123Z |
| File0013.csv | 2019-05-07T10:00:00.123Z |
| File0014.csv | 2019-05-08T10:00:00.123Z |
+--------------+--------------------------+
```

- Filename Regex: `File001.*.csv`
- Modified After: 2019-05-01T10:00:00.00Z


The files **File0011.csv**, **File0012.csv**, **File0013.csv**, and **File0014.csv** are imported because they match the Filename Regex and all have a last update later than 2019-05-01T10:00:00.00Z.

After the job finishes, the new value **Modified After = 2019-05-08T10:00:00.123Z** is saved.

At the next execution, only files with a last update later than **2019-05-08T10:00:00.123Z** are imported.

Example:

- Import folder has newly updated and added files:

```
+--------------+--------------------------+
|   Filename   |     Last update          |
+--------------+--------------------------+
| File0001.csv | 2019-05-04T10:00:00.123Z |
| File0011.csv | 2019-05-05T10:00:00.123Z |
| File0012.csv | 2019-05-06T10:00:00.123Z |
| File0013.csv | 2019-05-09T10:00:00.123Z |
| File0014.csv | 2019-05-08T10:00:00.123Z |
| File0015.csv | 2019-05-09T10:00:00.123Z |
+--------------+--------------------------+
```

- Filename Regex: `File001.*.csv`
- Modified After: **2019-05-08T10:00:00.123Z**


Only the files **File0013.csv** and **File0015.csv** are imported.
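The two runs in this section can be reproduced with a small model of the selection rule. This is a sketch for illustration, not the connector's implementation: `selectFiles` is a hypothetical function, and the regex is anchored here (`^...$`) as an assumption about how the Filename Regex is matched.

```javascript
// Select files that match the regex and were updated strictly after the cutoff,
// then report the newest timestamp as the next "Modified After" value.
function selectFiles(files, filenameRegex, modifiedAfter) {
  const cutoff = Date.parse(modifiedAfter);
  const selected = files.filter(
    (f) => filenameRegex.test(f.name) && Date.parse(f.lastUpdate) > cutoff
  );
  const nextModifiedAfter = selected.reduce(
    (latest, f) => (Date.parse(f.lastUpdate) > Date.parse(latest) ? f.lastUpdate : latest),
    modifiedAfter
  );
  return { names: selected.map((f) => f.name), nextModifiedAfter };
}

const regex = /^File001.*\.csv$/;
const firstRun = [
  { name: 'File0001.csv', lastUpdate: '2019-05-04T10:00:00.123Z' },
  { name: 'File0011.csv', lastUpdate: '2019-05-05T10:00:00.123Z' },
  { name: 'File0012.csv', lastUpdate: '2019-05-06T10:00:00.123Z' },
  { name: 'File0013.csv', lastUpdate: '2019-05-07T10:00:00.123Z' },
  { name: 'File0014.csv', lastUpdate: '2019-05-08T10:00:00.123Z' },
];
const first = selectFiles(firstRun, regex, '2019-05-01T10:00:00.000Z');
console.log(first.names.join(', '));
// prints File0011.csv, File0012.csv, File0013.csv, File0014.csv
console.log(first.nextModifiedAfter);
// prints 2019-05-08T10:00:00.123Z

// Second run: File0013.csv was updated and File0015.csv was added.
const secondRun = firstRun
  .map((f) => (f.name === 'File0013.csv' ? { ...f, lastUpdate: '2019-05-09T10:00:00.123Z' } : f))
  .concat([{ name: 'File0015.csv', lastUpdate: '2019-05-09T10:00:00.123Z' }]);
console.log(selectFiles(secondRun, regex, first.nextModifiedAfter).names.join(', '));
// prints File0013.csv, File0015.csv
```

Because the comparison is strict, File0014.csv, whose timestamp equals the saved **Modified After** value, is not imported a second time.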