Marketo Import Connection Using CLI

You can use the CLI to configure your connection.

Limitations

  • Updating the connector configuration by using td connector:update with the -c, --config CONFIG_FILE option is not supported.

Install the Treasure Data Toolbelt

Install the newest Treasure Data Toolbelt, then open a terminal and run the following command to confirm the installed version.

$ td --version
0.11.10

Guess and Preview are supported for the Leads within List and Leads within Program targets.

Create a Configuration File (seed.yml)

Prepare seed.yml as shown in the following example, replacing the placeholders with your parameters.

in:
  type: marketo
  <parameter: value>
out:
  mode: append

| Parameter | Value | Description |
|---|---|---|
| account_id | string | Available on the Admin > Web Services page in Marketo. For more details on getting access to your credentials, see Marketo's documentation: http://developers.marketo.com/blog/quick-start-guide-for-marketo-rest-api/ |
| client_id | string | See account_id. |
| client_secret | string | See account_id. |
| target | string | Supported import targets: lead, activity, campaign, all_lead_with_list_id, all_lead_with_program_id, program, custom_object, program_members. |
| list_ids | string | target: all_lead_with_list_id. Comma-separated list ID(s). If left blank, all person records are imported. |
| program_ids | string | target: all_lead_with_program_id |
| activity_type_ids | string | target: activity. Filters on only certain types of activities. Fetches all types if not specified. |
| incremental | boolean | target: lead |
| use_updated_at | boolean | target: lead. By default, the Leads import filters by the createdAt column. If selected, data is filtered by the updatedAt column. This feature is not available for all Marketo subscriptions. See https://developers.marketo.com/rest-api/bulk-extract/bulk-lead-extract/#filters |
| from_date | string | target: lead |
| fetch_days | integer | target: lead |
| escape | string | target: lead |
| quote | string | target: lead |
| query_by | string | target: program. Supported values: tag_type |
| tag_type | string | target: program. Type of Marketo tag to filter by. |
| tag_value | string | target: program. Tag value used to filter Marketo programs. |
| earliest_updated_at | string | target: program. Excludes programs updated before this date. Must be a valid ISO-8601 string. See Datetime field type description. |
| latest_updated_at | string | target: program. Excludes programs updated after this date. Must be a valid ISO-8601 string. See Datetime field type description. |
| filter_type | string | target: program. Supported filter types: id, programId, folderId, workspace. |
| filter_values | string array | target: program. The filter values. |
| custom_object_api_name | string | target: custom_object. The API name of the custom object. |
| custom_object_fields | string | target: custom_object. Comma-separated API names of the custom object fields (optional). |
| custom_object_filter_type | string | target: custom_object. The API name of the custom object field used to filter results. Only integer fields are supported. |
| custom_object_filter_values | string | target: custom_object. Comma-separated list of field values to match. If this value is set, custom_object_filter_from_value and custom_object_filter_to_value are ignored. |
| custom_object_filter_from_value | integer | target: custom_object. Returns custom object records whose filter field value is greater than this value. |
| custom_object_filter_to_value | integer | target: custom_object. Returns custom object records whose filter field value is smaller than this value. If not set, only records with a value greater than "From Value" are returned. The job stops if no record is found for 300 consecutive values. |
| included_fields | string array | A list of Lead fields to include in the data import. Only affects the Lead family of targets. |
| marketo_limit_interval_milis | integer | Time to wait before the next call when a request reaches the Marketo concurrent limit (default 20000, about 20 seconds). |
| maximum_retries | integer | Maximum number of times to retry a Marketo request when an error occurs (default 7). |
| batch_size | integer | Marketo REST API batch size (default 300). |
| max_return | integer | Maximum number of records returned in a single request. The Program endpoint uses an offset for paging (default 200). |
| bulk_job_timeout_second | integer | Total time to wait for a bulk extract before failing the job (default 3600, about 1 hour). |
| polling_interval_second | integer | Interval at which to poll the job status (default 60 seconds). |
| read_timeout_millis | integer | Time to wait for a Marketo response (default 60000, about 60 seconds). |

This configuration dumps the Marketo object specified in the target field and appends the records to the target table, because "append" mode is specified.

For more details on available out modes, see the Appendix.
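As an illustration of how the parameters fit together, the following is a minimal sketch of a seed.yml for the all_lead_with_list_id target. The credential values and list IDs are hypothetical placeholders, and the two tuning parameters are shown only with the defaults listed in the parameter table; adjust or omit them as needed.

```yaml
in:
  type: marketo
  # Credentials from Admin > Web Services in Marketo (placeholder values)
  account_id: YOUR_ACCOUNT_ID
  client_id: YOUR_CLIENT_ID
  client_secret: YOUR_CLIENT_SECRET
  target: all_lead_with_list_id
  # Hypothetical list IDs, comma separated; leave blank to import all person records
  list_ids: "1001,1002"
  # Optional tuning, set here to the documented defaults
  maximum_retries: 7
  read_timeout_millis: 60000
out:
  mode: append
```

Because this target is one of those supported by Guess and Preview, a file like this can be passed to connector:guess in the next step.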

Guess Fields (Generate load.yml)

Use connector:guess. This command automatically reads the target, assesses (uses logic to guess) its format, and writes the result to load.yml. The load.yml file includes a schema for Lead.

td connector:guess seed.yml -o load.yml

If you open up load.yml, you’ll see assessed file format definitions including, in some cases, file formats, encodings, column names, and types.

You can then preview how the system parses the data by using the connector:preview command.

td connector:preview load.yml

If the system detects your column name or type incorrectly, modify load.yml directly and preview again.

The Data Connector supports parsing of "boolean", "long", "double", "string", and "timestamp" types.
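A hand-edited load.yml might look like the following sketch. It assumes that the guessed file lists columns under the in section as name and type pairs, which is a common layout for connector configurations but is not confirmed by this page. The column names are hypothetical, so compare against the load.yml that connector:guess actually produced.

```yaml
in:
  type: marketo
  target: all_lead_with_list_id
  # ... credentials and other parameters carried over from seed.yml ...
  columns:
  # Hypothetical column names; use the ones in your generated load.yml
  - {name: id, type: long}
  - {name: email, type: string}
  # Type corrected by hand (for example, from string to boolean) after previewing
  - {name: unsubscribed, type: boolean}
out:
  mode: append
```

After saving the change, run td connector:preview load.yml again to check that the column is parsed as expected.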

Execute Load Job

Submit the load job. It may take a couple of hours, depending on the data size. You must specify the database and table where the data will be stored.

td connector:issue load.yml \
--database td_sample_db \
--table td_sample_table \
--time-column activity_date_time

The connector:issue command assumes you have already created a database (td_sample_db) and a table (td_sample_table). If the database or the table does not exist in TD, this command fails. Either create the database and table manually, or use the --auto-create-table option with the td connector:issue command to create them automatically:

td connector:issue load.yml \
--database td_sample_db \
--table td_sample_table \
--time-column activity_date_time \
--auto-create-table

You can assign a Time Format column to the "Partitioning Key" by using the "--time-column" option.

Scheduled Execution

You can schedule periodic Data Connector execution for recurring Marketo imports. We configure our scheduler carefully to ensure high availability. By using this feature, you no longer need a cron daemon in your local data center.

For the scheduled import, Data Connector for Marketo imports all records.

Create the Schedule

You can create a new schedule by using the td connector:create command. The name of the schedule, a cron-style schedule, the database and table where the data will be stored, and the Data Connector configuration file are required.

$ td connector:create \
    daily_marketo_leads_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml

The cron parameter also accepts three special options: @hourly, @daily, and @monthly. For more details, see Scheduled Jobs.

By default, the schedule is set up in the UTC timezone. You can set the schedule in a different timezone by using the -t or --timezone option. The --timezone option supports only extended timezone formats such as 'Asia/Tokyo' and 'America/Los_Angeles'. Timezone abbreviations such as PST and CST are not supported and may lead to unexpected schedules.

List the Schedules

You can see the list of scheduled entries by running td connector:list.

$ td connector:list
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+
| Name                              | Cron          | Timezone | Delay | Database        | Table                |
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+
| daily_marketo_leads_import        | 10 0 * * *    | UTC      | 0     | td_sample_db    | td_sample_table      |
+-----------------------------------+---------------+----------+-------+-----------------+----------------------+

Show the Setting and History of Schedules

td connector:show shows the execution setting of a schedule entry.

$ td connector:show daily_marketo_leads_import

td connector:history shows the execution history of a schedule entry. To investigate the results of an individual execution, use td job:show <jobid>.

td connector:history daily_marketo_leads_import

Delete the Schedule

td connector:delete removes the schedule.

$ td connector:delete daily_marketo_leads_import

Modes for Out Plugin

You can specify the file import mode in the out section of seed.yml.

append (default)

This is the default mode and records are appended to the target table.

in:
  ...
out:
  mode: append

replace (In td 0.11.10 and later)

This mode replaces data in the target table. Any manual schema changes made to the target table remain intact with this mode.

in:
  ...
out:
  mode: replace