You can use the CLI to configure your connection.

Install the Treasure Data Toolbelt

Install the newest Treasure Data Toolbelt. Then open a terminal and run the following command to confirm the installed version.

$ td --version
0.11.10

Guess and Preview are supported for the Leads within List and Leads within Program targets.

Create a Configuration File (seed.yml)

Prepare seed.yml as shown in the following example, replacing the <parameter: value> placeholder with your own parameters:

in:
  type: marketo
  <parameter: value>
out:
  mode: append

Parameters

Each parameter is listed with its value type in parentheses, followed by its description.

account_id, client_id, client_secret (string): These values are available on the Admin > Web Services page in Marketo. If needed, you can find more detailed information on getting access to your credentials in Marketo's documentation: http://developers.marketo.com/blog/quick-start-guide-for-marketo-rest-api/

target (string): The Marketo object to import. The following targets are supported:

  • lead

  • activity

  • campaign

  • all_lead_with_list_id

  • all_lead_with_program_id

  • program

  • custom_object

  • program_members

list_ids (string): target: all_lead_with_list_id. Comma-separated List ID(s). Leave this field blank to import all person records.

program_ids (string): target: all_lead_with_program_id|program_members. Comma-separated Program ID(s). Leave blank to import all members of all programs.

activity_type_ids (string): target: activity. Import only the specified activity types. All types are fetched if not specified.

incremental (boolean): target: lead|activity|program. When the value is true, each run imports new records only.

use_updated_at (boolean): target: lead. By default, the Leads import filters by the createdAt column. When the value is true, it filters by the updatedAt column instead. This feature is not available for all Marketo subscriptions. See https://developers.marketo.com/rest-api/bulk-extract/bulk-lead-extract/#filters

from_date (string): target: lead|activity. Import records whose createdAt or updatedAt value is after the specified date.

fetch_days (integer): target: lead|activity. Use from_date and fetch_days together to define an import date range.

escape (string): target: lead|activity. Escape character of the Marketo result CSV file (default \").

quote (string): target: lead|activity. Quote character of the Marketo result CSV file (default \").

query_by (string): target: program. Supported values: tag_type|date_range. Query programs by tag type or by date range. Leave blank to import all programs. query_by: tag_type does not support incremental.

tag_type (string): target: program. Type of Marketo tag to filter by.

tag_value (string): target: program. Tag value used to filter Marketo programs.

earliest_updated_at (string): target: program. Exclude programs updated before this date. Must be a valid ISO-8601 string. See Datetime field type description.

latest_updated_at (string): target: program. Exclude programs updated after this date. Must be a valid ISO-8601 string. See Datetime field type description.

filter_type (string): target: program. The following filter types are supported:

  • id

  • programId

  • folderId

  • workspace

filter_values (string array): target: program. The filter values.

custom_object_api_name (string): target: custom_object. The API name of the custom object.

custom_object_fields (string): target: custom_object. Comma-separated API names of fields of the custom object (optional).

custom_object_filter_type (string): target: custom_object. The API name of the custom object field used to filter results. Only integer fields are supported.

custom_object_filter_values (string): target: custom_object. Comma-separated list of field values to match. If this value is set, custom_object_filter_from_value and custom_object_filter_to_value are ignored.

custom_object_filter_from_value (integer): target: custom_object. Import custom object records whose filter field value is greater than this value.

custom_object_filter_to_value (integer): target: custom_object. Import custom object records whose filter field value is smaller than this value. If not set, only records with a value greater than custom_object_filter_from_value are returned. The job stops if no record is found in 300 consecutive values.

included_fields (string array): List of Lead fields to include in the data import. Only affects the Lead family of targets.

marketo_limit_interval_milis (integer): Time to wait before the next call when a request reaches the Marketo concurrent limit (default 20000 milliseconds, about 20 seconds).

maximum_retries (integer): Maximum number of times to retry a Marketo request when an error occurs (default 7).

batch_size (integer): Marketo REST API batch size (default 300).

max_return (integer): Maximum number of records returned in a single request. The program endpoint uses offset for paging (default 200).

bulk_job_timeout_second (integer): Total time to wait for a bulk extract before failing the job (default 3600 seconds, 1 hour).

polling_interval_second (integer): Interval between job status polls (default 60 seconds).

read_timeout_millis (integer): Time to wait for a Marketo response (default 60000 milliseconds, 60 seconds).

This configuration imports the Marketo object specified in the target field and appends the records to the destination table, because append mode is specified.

For more details on available out modes, see Modes for Out Plugin at the end of this topic.
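
As an illustration of how the parameters combine, the following shell sketch writes a seed.yml for an incremental activity import. Only the parameter names come from the table above. The MARKETO_* environment variable names, the from_date value and its format, and the 30-day window are assumptions chosen for the example.

```shell
#!/bin/sh
# Sketch: generate seed.yml for an incremental activity import.
# The MARKETO_* variable names and all values below are placeholders
# (assumptions); replace them with your own Marketo credentials.
cat > seed.yml <<EOF
in:
  type: marketo
  account_id: ${MARKETO_ACCOUNT_ID:-your_account_id}
  client_id: ${MARKETO_CLIENT_ID:-your_client_id}
  client_secret: ${MARKETO_CLIENT_SECRET:-your_client_secret}
  target: activity
  from_date: "2024-01-01"   # assumed date format
  fetch_days: 30            # together with from_date, defines the date range
  incremental: true         # later runs import new records only
out:
  mode: append
EOF
```

Reading the credentials from environment variables keeps them out of shell history and version control. The generated file can then be passed to td connector:guess.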

Guess Fields (Generate load.yml)

Use connector:guess. This command automatically reads the target data, guesses its format, and writes the result to load.yml. The file load.yml includes a schema for Lead.

$ td connector:guess seed.yml -o load.yml

If you open load.yml, you will see the guessed format definitions, which can include file formats, encodings, column names, and types.

You can then preview how the system parses the data by using the td connector:preview command.

$ td connector:preview load.yml

If the system detects your column name or type incorrectly, modify load.yml directly and preview again.

The Data Connector supports parsing of "boolean", "long", "double", "string", and "timestamp" types.

Execute Load Job

Submit the load job. It may take a couple of hours, depending on the data size. Specify the database and table where the data will be stored.

$ td connector:issue load.yml \
    --database td_sample_db \
    --table td_sample_table \
    --time-column activity_date_time

The connector:issue command assumes you have already created a database (td_sample_db) and a table (td_sample_table). If the database or the table does not exist in TD, this command fails. Either create the database and table manually, or use the --auto-create-table option with the td connector:issue command to create them automatically:

$ td connector:issue load.yml \
    --database td_sample_db \
    --table td_sample_table \
    --time-column activity_date_time \
    --auto-create-table

You can assign a Time Format column to the "Partitioning Key" by using the "--time-column" option.
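
The guess, preview, and issue steps can be chained in a small shell helper. This is a sketch under assumptions: the function name marketo_import and its argument order are hypothetical, and it uses only the td subcommands and options shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the td CLI): runs guess, preview,
# and issue in order, stopping at the first step that fails.
# Arguments: seed file, database, table, time column.
marketo_import() {
    seed_file="$1"
    database="$2"
    table="$3"
    time_column="$4"

    td connector:guess "$seed_file" -o load.yml &&
    td connector:preview load.yml &&
    td connector:issue load.yml \
        --database "$database" \
        --table "$table" \
        --time-column "$time_column" \
        --auto-create-table
}

# Example call (requires the td command and a valid seed.yml):
# marketo_import seed.yml td_sample_db td_sample_table activity_date_time
```

Chaining with && means a failed guess or preview prevents the load job from being submitted. In practice you may still want to inspect the preview output by hand before issuing the job.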

Scheduled Execution

You can schedule periodic Data Connector execution to import Marketo data at regular intervals. The scheduler is configured for high availability, so you no longer need to run a cron daemon in your own data center.

For the scheduled import, Data Connector for Marketo imports all records.

Create the Schedule

You can create a new schedule by using the td connector:create command. The name of the schedule, a cron-style schedule, the database and table where the data will be stored, and the Data Connector configuration file are required.

$ td connector:create \
    daily_marketo_leads_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml

The cron parameter also accepts three special options: `@hourly`, `@daily` and `@monthly`. For more details, see Scheduled Jobs.

By default, the schedule is set up in the UTC timezone. You can set the schedule in another timezone by using the -t or --timezone option. The --timezone option supports only extended timezone formats such as 'Asia/Tokyo' and 'America/Los_Angeles'. Timezone abbreviations such as PST and CST are not supported and may lead to unexpected schedules.
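
Because a timezone abbreviation can silently produce an unexpected schedule, it can help to check the value before creating the schedule. The sketch below uses a deliberately simple heuristic that treats any value of the form Area/Location as an extended timezone name. The function name is hypothetical, and the check does not verify that the timezone actually exists.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of the td CLI): accept only names
# shaped like Area/Location, such as Asia/Tokyo. This is a heuristic
# and does not confirm that the timezone exists.
is_extended_timezone() {
    case "$1" in
        ?*/?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example: guard a schedule definition with the check.
timezone="Asia/Tokyo"
if is_extended_timezone "$timezone"; then
    echo "ok: $timezone"
else
    echo "rejected: $timezone" >&2
fi
```

If the check passes, the value can be supplied through the --timezone option described above.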

List the Schedules

You can see the list of scheduled entries by running td connector:list.

$ td connector:list
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+
| Name                              | Cron          | Timezone | Delay | Database        | Table                |
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+
| daily_marketo_leads_import        | 10 0 * * *    | UTC      | 0     | td_sample_db    | td_sample_table      |
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+

Show the Setting and History of Schedules

td connector:show shows the execution setting of a schedule entry.

$ td connector:show daily_marketo_leads_import
Name     : daily_marketo_leads_import
Cron     : 10 0 * * *
Timezone : UTC
Delay    : 0
Database : td_sample_db
Table    : td_sample_table

td connector:history shows the execution history of a schedule entry. To investigate the results of each individual execution, use td job <jobid>.

$ td connector:history daily_marketo_leads_import
+--------+---------+---------+--------------+-----------------+----------+---------------------------+----------+
| JobID  | Status  | Records | Database     | Table           | Priority | Started                   | Duration |
+--------+---------+---------+--------------+-----------------+----------+---------------------------+----------+
| 578066 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-18 00:10:05 +0000 | 160      |
| 577968 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-17 00:10:07 +0000 | 161      |
| 577914 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-16 00:10:03 +0000 | 152      |
| 577872 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-15 00:10:04 +0000 | 163      |
| 577810 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-14 00:10:04 +0000 | 164      |
| 577766 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-13 00:10:04 +0000 | 155      |
| 577710 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-12 00:10:05 +0000 | 156      |
| 577610 | success | 10000   | td_sample_db | td_sample_table | 0        | 2015-04-11 00:10:04 +0000 | 157      |
+--------+---------+---------+--------------+-----------------+----------+---------------------------+----------+
8 rows in set
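
When a schedule has a long history, the table can be filtered for runs that need attention. The following sketch defines a hypothetical shell function that reads the table printed by td connector:history on standard input and prints the JobID of every row whose Status is anything other than success. It assumes the column layout shown above.

```shell
#!/bin/sh
# Hypothetical filter (not part of the td CLI): print the JobID of each
# history row whose Status column is anything other than "success".
# Assumes the table layout shown above (JobID first, Status second).
failed_jobs() {
    awk -F'|' '
        NF >= 4 {
            id = $2
            status = $3
            gsub(/ /, "", id)
            gsub(/ /, "", status)
            if (id != "JobID" && status != "success") print id
        }'
}

# Example call (requires the td command):
# td connector:history daily_marketo_leads_import | failed_jobs
```

Border lines and the row-count footer contain no column separators, so the NF check skips them. Each printed ID can then be investigated individually as described above.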

Delete the Schedule

td connector:delete removes the schedule.

$ td connector:delete daily_marketo_leads_import

Modes for Out Plugin

You can specify the file import mode in the out section of seed.yml.

append (default)

This is the default mode and records are appended to the target table.

in:
  ...
out:
  mode: append

replace (In td 0.11.10 and later)

This mode replaces data in the target table. Any manual schema changes made to the target table remain intact with this mode.

in:
  ...
out:
  mode: replace