
Sansan Data Hub Export Integration

Sansan Data Hub helps you organize and integrate customer data within your company and transform it into optimal data for marketing purposes. With this integration, you can export job results from Treasure Data into Sansan Data Hub.

What can you do with this Integration?

  • Send multiple upload requests to upload your data from Treasure Data to Sansan Data Hub.
  • Check the status of each upload request.

Prerequisites

  • Basic knowledge of Treasure Data.
  • Basic knowledge of Sansan Data Hub.

Static IP Address of Treasure Data Integration

If your security policy requires IP whitelisting, you must add Treasure Data's IP addresses to your allowlist to ensure a successful connection.

Please find the complete list of static IP addresses, organized by region, at the following link:
https://api-docs.treasuredata.com/en/overview/ip-addresses-integrations-result-workers/

Use the TD Console to Create a Connection

In Treasure Data, you must create and configure the data connection before running your query. As part of the data connection, you provide authentication to access the integration.

Create a New Authentication

  1. Open TD Console.
  2. Navigate to Integrations Hub > Catalog.
  3. Search for Sansan Data Hub and select Sansan Data Hub (Output).
  4. Select Create Authentication.
  5. Type the credentials to authenticate:

Parameter | Description
Client ID | Your account client ID
Client Secret | Your account client secret

  6. Select Continue.
  7. Type a name for your connection.
  8. Select Done.

Define your Query

  1. Navigate to Data Workbench > Queries.
  2. Select New Query.
  3. Run the query to validate the result set.

The Sansan Data Hub Output Connector skips any row in which a value contains a comma (,) character. Example:

marketoid,company,createdday,updatedday
1,company1,2022-06-07 12:18:06,2022-06-07 12:18:06
2,company2,2022-06-07 12:18:06,2022-06-07 12:18:06
3,company3withcomma,,2022-06-07 12:18:06,2022-06-07 12:18:06 #This row is skipped
4,company4,withcomma,2022-06-07 12:18:06,2022-06-07 12:18:06 #This row is skipped
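
Because rows with commas are skipped, it can help to check a result set before exporting it. The following is a minimal sketch of such a pre-check, not part of the connector; the function name `split_rows_by_comma` and the sample rows are hypothetical and only mirror the example above.

```python
from typing import Dict, List, Tuple


def split_rows_by_comma(rows: List[Dict[str, object]]) -> Tuple[List[dict], List[dict]]:
    """Separate comma-free rows from rows that contain a comma in any value."""
    kept, skipped = [], []
    for row in rows:
        # A row is flagged when any non-null value, rendered as text, contains a comma.
        has_comma = any("," in str(value) for value in row.values() if value is not None)
        (skipped if has_comma else kept).append(row)
    return kept, skipped


rows = [
    {"marketoid": 1, "company": "company1"},
    {"marketoid": 3, "company": "company3withcomma,"},
    {"marketoid": 4, "company": "company4,withcomma"},
]
kept, skipped = split_rows_by_comma(rows)
print([r["marketoid"] for r in kept])     # [1]
print([r["marketoid"] for r in skipped])  # [3, 4]
```

Rows reported in `skipped` can then be cleaned in the query itself before the export runs.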

Specify the Result Export Target

  1. Select Export Results.

  2. You can select an existing authentication for the external service for output. Choose one of the following:

Use Existing Integration

Field | Description
Data source ID | The target data source to export data to
Get the status of the file import job | Get the status of each batch upload request
Skip invalid records | Skip records that fail validation and batches that fail to upload
Maximum Retry | Maximum number of retries
Seconds to wait for first retry | Waiting time for the first retry in seconds
Seconds for max retry wait | Maximum waiting time for a retry in seconds
HTTP Connection Timeout | HTTP connection timeout
HTTP Read Timeout | HTTP read timeout
HTTP Write Timeout | HTTP write timeout
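
The three retry fields together bound how long a failing upload keeps being retried. This page does not describe the connector's actual backoff policy, so the sketch below assumes the wait doubles after each attempt and is capped at the maximum wait; that doubling rule and the function name `retry_waits` are assumptions made for illustration only. The default values are the CLI defaults listed later on this page.

```python
def retry_waits(max_retry=7, initial_retry_wait=15, max_retry_wait=180):
    """Return the assumed wait (in seconds) before each retry attempt."""
    waits = []
    wait = initial_retry_wait
    for _ in range(max_retry):
        # Assumption: exponential backoff, capped at max_retry_wait.
        waits.append(min(wait, max_retry_wait))
        wait *= 2
    return waits


schedule = retry_waits()
print(schedule)       # [15, 30, 60, 120, 180, 180, 180]
print(sum(schedule))  # 765
```

Under this assumption, the defaults would spend a little under 13 minutes waiting before the job gives up.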

Example Query

SELECT 
 * 
FROM 
 www_access

(Optional) Schedule Query Export Jobs

You can use Scheduled Jobs with Result Export to periodically write the output result to a target destination that you specify.

Treasure Data's scheduler feature supports periodic query execution to achieve high availability.

When two fields of a cron schedule conflict, the field that requests more frequent execution is followed and the other field is ignored.

For example, if the cron schedule is '0 0 1 * 1', then the 'day of month' and 'day of week' fields conflict: the former requires the job to run on the first day of each month at midnight (00:00), while the latter requires it to run every Monday at midnight (00:00). The 'day of week' field runs more often, so it is followed.
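
To see why the 'day of week' field wins in the '0 0 1 * 1' example, one can count how many days in a year each field would select on its own. This is a small illustration using only the Python standard library; the function name `count_matches` is hypothetical and the code does not reproduce the scheduler itself.

```python
from datetime import date, timedelta


def count_matches(year: int):
    """Count days matching 'day of month = 1' and 'day of week = Monday' in a year."""
    first_of_month = 0
    mondays = 0
    day = date(year, 1, 1)
    while day.year == year:
        if day.day == 1:
            first_of_month += 1
        # Python numbers Monday as 0; in cron, Monday is 1 (Sunday=0).
        if day.weekday() == 0:
            mondays += 1
        day += timedelta(days=1)
    return first_of_month, mondays


print(count_matches(2023))  # (12, 52)
```

The weekly field selects roughly four times as many days as the monthly one, which is why it is the one that is followed.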

Scheduling your Job Using TD Console

  1. Navigate to Data Workbench > Queries.

  2. Create a new query or select an existing query.

  3. Next to Schedule, select None.

  4. In the drop-down, select one of the following schedule options:

    Drop-down Value | Description
    Custom cron... | Review Custom cron... details.
    @daily (midnight) | Run once a day at midnight (00:00 am) in the specified time zone.
    @hourly (:00) | Run every hour at 00 minutes.
    None | No schedule.

Custom cron... Details

Cron Value | Description
0 * * * * | Run once an hour.
0 0 * * * | Run once a day at midnight.
0 0 1 * * | Run once a month at midnight on the morning of the first day of the month.
"" | Create a job that has no scheduled run time.

 *    *    *    *    *
 -    -    -    -    -
 |    |    |    |    |
 |    |    |    |    +----- day of week (0 - 6) (Sunday=0)
 |    |    |    +---------- month (1 - 12)
 |    |    +--------------- day of month (1 - 31)
 |    +-------------------- hour (0 - 23)
 +------------------------- min (0 - 59)

The following named entries can be used:

  • Day of Week: sun, mon, tue, wed, thu, fri, sat.
  • Month: jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec.

A single space is required between each field. The values for each field can be composed of:

Field Value | Example | Example Description
A single value, within the limits displayed above for each field. | |
A wildcard '*' to indicate no restriction based on the field. | '0 0 1 * *' | Configures the schedule to run at midnight (00:00) on the first day of each month.
A range '2-5', indicating the range of accepted values for the field. | '0 0 1-10 * *' | Configures the schedule to run at midnight (00:00) on the first 10 days of each month.
A list of comma-separated values '2,3,4,5', indicating the list of accepted values for the field. | '0 0 1,11,21 * *' | Configures the schedule to run at midnight (00:00) every 1st, 11th, and 21st day of each month.
A periodicity indicator '*/5' to express how often, based on the field's valid range of values, a schedule is allowed to run. | '30 */2 1 * *' | Configures the schedule to run on the 1st of every month, every 2 hours starting at 00:30. '0 0 */5 * *' configures the schedule to run at midnight (00:00) every 5 days starting on the 5th of each month.
A comma-separated list of any of the above except the '*' wildcard is also supported, for example '2,*/5,8-10'. | '0 0 5,*/10,25 * *' | Configures the schedule to run at midnight (00:00) every 5th, 10th, 20th, and 25th day of each month.

  5. (Optional) You can delay the start time of a query by enabling Delay execution.
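
The field rules above (five fields, a single space between them, and the value forms in the table) can be checked before a schedule is saved. The following is a minimal sketch of such a check, based only on the rules listed on this page; the names `validate_cron`, `FIELDS`, and `NAMES` are hypothetical, and the sketch does not confirm that the scheduler accepts every combination it lets through. The empty string "" (no schedule) is treated as a separate case and is not handled here.

```python
import re

# (name, lowest value, highest value), in the order shown in the diagram above.
FIELDS = [
    ("min", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 6),
]
NAMES = {
    "month": "jan feb mar apr may jun jul aug sep oct nov dec".split(),
    "day of week": "sun mon tue wed thu fri sat".split(),
}


def _to_int(token, name, low):
    """Convert a number or a named entry (e.g. 'mon', 'jan') to an integer."""
    names = NAMES.get(name, [])
    if token.lower() in names:
        return names.index(token.lower()) + low
    return int(token)  # raises ValueError for anything non-numeric


def _valid_part(part, name, low, high):
    if part == "*":
        return True
    if re.fullmatch(r"\*/\d+", part):  # periodicity indicator such as */5
        return int(part[2:]) >= 1
    try:
        bounds = [_to_int(t, name, low) for t in part.split("-")]
    except ValueError:
        return False
    if len(bounds) > 2 or any(not low <= b <= high for b in bounds):
        return False
    return bounds == sorted(bounds)  # a range must not run backwards


def _valid_field(field, name, low, high):
    parts = field.split(",")
    if len(parts) > 1 and "*" in parts:
        return False  # a bare wildcard may not appear inside a list
    return all(_valid_part(p, name, low, high) for p in parts)


def validate_cron(expr):
    fields = expr.split(" ")  # exactly one space between fields
    if len(fields) != 5 or "" in fields:
        return False
    return all(_valid_field(f, *spec) for f, spec in zip(fields, FIELDS))


print(validate_cron("0 0 5,*/10,25 * *"))  # True
print(validate_cron("0  0 1 * *"))         # False (two spaces)
print(validate_cron("0 24 * * *"))         # False (hour out of range)
```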

Execute the Query

Save the query with a name and run, or just run the query. Upon successful completion of the query, the query result is automatically exported to the specified destination.

Scheduled jobs that continuously fail due to configuration errors may be disabled on the system side after several notifications.


Activate a Segment in Audience Studio

You can also send segment data to the target platform by creating an activation in the Audience Studio.

  1. Navigate to Audience Studio.
  2. Select a parent segment.
  3. Open the target segment, right-click, and then select Create Activation.
  4. In the Details panel, enter an Activation name and configure the activation according to the previous section on Configuration Parameters.
  5. Customize the activation output in the Output Mapping panel.

  • Attribute Columns
    • Select Export All Columns to export all columns without making any changes.
    • Select + Add Columns to add specific columns for the export. The Output Column Name pre-populates with the same Source column name. You can update the Output Column Name. Continue to select + Add Columns to add new columns for your activation output.
  • String Builder
    • Select + Add string to create strings for export. Select from the following values:
      • String: Choose any value; use text to create a custom value.
      • Timestamp: The date and time of the export.
      • Segment Id: The segment ID number.
      • Segment Name: The segment name.
      • Audience Id: The parent segment number.
  6. Set a Schedule.

  • Select the values to define your schedule and optionally include email notifications.
  7. Select Create.

If you need to create an activation for a batch journey, review Creating a Batch Journey Activation.

(Optional) Configure Export Results in Workflow

Within Treasure Workflow, you can specify the use of this integration to export data.

Learn more at Using Workflows to Export Data with the TD Toolbelt.

(Optional) Export Integration Using the CLI

You can also use the CLI (Toolbelt) for Result Export to Sansan Data Hub.

Specify the export settings for Sansan Data Hub in the --result option of the td query command. For details about the td query command, refer to this article.

The format of the option is JSON and the general structure is as follows.

{
  "type": "sansan_datahub",
  "client_id": "xxx",
  "client_secret": "xxx",
  "data_source_id": "xxxx",
  "job_status": true,
  "skip_on_invalid_records": true,
  "max_retry": 7,
  "initial_retry_wait": 15,
  "max_retry_wait": 180,
  "connection_timeout": 300,
  "write_timeout": 300,
  "read_timeout": 300
}

Parameters

Name | Description | Value | Default Value | Required
type | The name of the destination service for the export | sansan_datahub | N/A | Yes
client_id | The client ID provided by Sansan | client ID | N/A | Yes
client_secret | The client secret provided by Sansan | client secret | N/A | Yes
data_source_id | The target data source to export data to | data source ID | N/A | Yes
job_status | Check the status of each batch upload | true or false | false | No
skip_on_invalid_records | Skip invalid records and don't fail the job | true or false | false | No
max_retry | The maximum number of retries | Number of retries | 7 | No
initial_retry_wait | The wait time before the first retry | Time in seconds | 15 | No
max_retry_wait | The maximum wait time for a retry | Time in seconds | 180 | No
connection_timeout | HTTP connection timeout | HTTP connection timeout | 300 | No
write_timeout | HTTP write timeout | HTTP write timeout | 300 | No
read_timeout | HTTP read timeout | HTTP read timeout | 300 | No
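
When the --result value is generated by a script, the required parameters and defaults from the table above can be enforced before the job is submitted. The following is a minimal sketch of that idea; the function name `build_result_option` and the constants `REQUIRED` and `DEFAULTS` are hypothetical helpers, not part of the TD Toolbelt, and the defaults are copied from the table.

```python
import json

REQUIRED = ("client_id", "client_secret", "data_source_id")
DEFAULTS = {
    "job_status": False,
    "skip_on_invalid_records": False,
    "max_retry": 7,
    "initial_retry_wait": 15,
    "max_retry_wait": 180,
    "connection_timeout": 300,
    "write_timeout": 300,
    "read_timeout": 300,
}


def build_result_option(**settings):
    """Return the JSON string for --result, rejecting missing or unknown keys."""
    missing = [key for key in REQUIRED if not settings.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    unknown = set(settings) - set(REQUIRED) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {', '.join(sorted(unknown))}")
    # Explicit settings override the documented defaults.
    config = {"type": "sansan_datahub", **DEFAULTS, **settings}
    return json.dumps(config)


print(build_result_option(
    client_id="xxx",
    client_secret="xxx",
    data_source_id="xxxx",
    job_status=True,
))
```

Rejecting unknown keys catches settings copied from another integration's configuration that this connector does not list.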

Usage Example

$ td query --result '{"type":"sansan_datahub","client_id":"xxx","client_secret":"xxx","data_source_id":"id","job_status":true,"skip_on_invalid_records":true}' -d sample_datasets "select * from www_access" -T presto
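
Quoting a JSON document by hand inside a shell command is easy to get wrong. The sketch below assembles the same td query invocation as an argument list, so the JSON is produced by `json.dumps` and the shell quoting by `shlex.join`. The function name `td_query_command` is hypothetical; the flags are the ones used in the command above, and the sketch only prints the command rather than running it.

```python
import json
import shlex


def td_query_command(result_config, database, query, engine="presto"):
    """Build the argument list for a td query call with a --result export."""
    return [
        "td", "query",
        "--result", json.dumps(result_config),
        "-d", database,
        query,
        "-T", engine,
    ]


command = td_query_command(
    {
        "type": "sansan_datahub",
        "client_id": "xxx",
        "client_secret": "xxx",
        "data_source_id": "id",
        "job_status": True,
        "skip_on_invalid_records": True,
    },
    database="sample_datasets",
    query="select * from www_access",
)
print(shlex.join(command))
```

On a machine where the TD Toolbelt is installed, the same list could be passed to `subprocess.run(command, check=True)` to submit the job.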