Facebook Lead Ads Import Integration CLI

You can use the Facebook Lead Ads connector to import Leads data from your Facebook Page or Ad Account into Treasure Data.

Prerequisites

  • Basic knowledge of Treasure Data, including the TD Toolbelt
  • A Facebook Page/Ad account with Leads retrieval permission
  • Authorized Treasure Data account access

Use Command Line to Create a Connection

Install ‘td’ Command

Install the newest TD Toolbelt.

Create Seed Config File (seed.yml)

enable_guess_schema enabled:

in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: true
out:
  mode: append

enable_guess_schema disabled:

in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: false
  form_fields:
  - {name: ocupation, type: string}
  - {name: last_name, type: string}
  - {name: cell_Phone, type: string}
  - {name: email, type: string}
  - {name: name, type: string}
  - {name: opt_in_marketing, type: boolean}

out:
  mode: append

Configuration keys and descriptions are as follows:

| Config key | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | yes | Connector type "facebook_leads" |
| app_secret | string | no | Facebook App Secret |
| access_token | string | yes | Facebook long-lived access token, see instruction here |
| ad_account_id | string | no | Facebook Ad Account ID |
| id | string | yes | Leads data can be imported by Ad ID or Form ID. See Appendix for details on how to get the Ad ID and Form ID |
| time_created | string | no | Import Leads data submitted from this time until the current time. The field accepts ISO 8601 date-time format, e.g. 2020-01-01T00:00:00+0700 |
| incremental | boolean | no | Only import new data each time. See How Incremental Loading Works |
| form_fields | array | no | Required when enable_guess_schema=false. Facebook Lead Form field names and their data types |
| - name | string | yes | Form field name. See Appendix for how to find a Lead's form fields |
| - type | string | yes | Field data type |
| - format | string | no | Timestamp format, e.g. %Y-%m-%dT%H:%M:%S%z |
| - skip_invalid_records | boolean | no | Skip invalid Leads data and continue to import the rest. If not set, the job fails when invalid data is encountered |
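The time_created examples on this page use the timestamp format %Y-%m-%dT%H:%M:%S%z. The following Python sketch shows one way to produce and check such a value before writing it into seed.yml. The helper names format_time_created and parse_time_created are hypothetical and are not part of the connector or the TD Toolbelt.

```python
from datetime import datetime, timedelta, timezone

# Timestamp format shown in this page's examples (assumed to match time_created values).
TIME_CREATED_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def format_time_created(moment: datetime) -> str:
    """Render a timezone-aware datetime as a time_created string."""
    if moment.tzinfo is None:
        # A naive datetime would silently drop the UTC offset.
        raise ValueError("time_created needs a timezone-aware datetime")
    return moment.strftime(TIME_CREATED_FORMAT)


def parse_time_created(value: str) -> datetime:
    """Parse a time_created string; raises ValueError if it is malformed."""
    return datetime.strptime(value, TIME_CREATED_FORMAT)


utc_value = format_time_created(datetime(2020, 1, 28, 15, 46, 25, tzinfo=timezone.utc))
offset_value = parse_time_created("2020-01-01T00:00:00+0700")
```

Here utc_value is the string used in the seed.yml examples, and offset_value carries a UTC offset of seven hours.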

For more details on available out modes, see Appendix.

Guess Fields (Generate load.yml)

Use connector:guess. This command automatically reads the target data, and intelligently guesses the data format.

$ td connector:guess seed.yml -o load.yml

If you open the load.yml file, you see the guessed format definitions, including column names, data types, and timestamp formats.

---
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 514618966066498
  time_created: '2019-12-01T15:46:25+0000'
  incremental: false
  columns:
  - {name: id, type: long}
  - {name: created_time, type: timestamp, format: "%Y-%m-%dT%H:%M:%S%z"}
  - {name: ad_id, type: string}
  - {name: ad_name, type: string}
  - {name: adset_id, type: string}
  - {name: adset_name, type: string}
  - {name: campaign_id, type: string}
  - {name: campaign_name, type: string}
  - {name: form_id, type: long}
  - {name: platform, type: string}
  - {name: is_organic, type: boolean}
  - {name: name, type: string}
  - {name: surname, type: string}
  - {name: email, type: string}
out: {mode: append}
exec: {}
filters:
- from_value: {mode: upload_time}
  to_column: {name: time}
  type: add_time

Then you can preview the data by using the td connector:preview command.

td connector:preview load.yml

If the system guesses a column type incorrectly, modify load.yml directly and preview again.

The data connector supports parsing of "boolean", "long", "double", "string", and "timestamp" types.
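As a quick sanity check after editing load.yml by hand, the column list can be compared against the supported types listed above. This is a minimal sketch that assumes the columns have already been loaded into a list of dictionaries (for example with a YAML parser). The function name unsupported_columns is hypothetical.

```python
# Types the page lists as supported by the data connector.
SUPPORTED_TYPES = {"boolean", "long", "double", "string", "timestamp"}


def unsupported_columns(columns):
    """Return the names of columns whose type is missing or not supported."""
    return [col["name"] for col in columns if col.get("type") not in SUPPORTED_TYPES]


columns = [
    {"name": "id", "type": "long"},
    {"name": "created_time", "type": "timestamp", "format": "%Y-%m-%dT%H:%M:%S%z"},
    {"name": "is_organic", "type": "boolean"},
    {"name": "score", "type": "float"},  # deliberately invalid for the demo
]
bad = unsupported_columns(columns)
```

An empty result means every column uses one of the supported types.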

Execute Load Job

Submit the load job. It may take a couple of hours depending on the data size. You must specify the database and table where the data will be stored.

td connector:issue load.yml --database td_sample_db --table td_sample_table

The preceding command assumes that you have already created the database (td_sample_db) and the table (td_sample_table). If the database or the table does not exist in TD, the command fails, so either create the database and table manually or use the --auto-create-table option with the td connector:issue command to create them automatically:

td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at --auto-create-table 

You can assign a time-format column as the partitioning key by using the --time-column option.

Scheduled Execution

You can schedule periodic data connector execution to import Leads on a recurring basis. We carefully configure our scheduler to ensure high availability. By using this feature, you no longer need a cron daemon in your local data center.

Create the Schedule

A new schedule can be created by using the td connector:create command. The name of the schedule, a cron-style schedule, the database and table where the data will be stored, and the data connector configuration file are required.

$ td connector:create \
    daily_leads_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml 

The cron parameter also accepts three special options: @hourly, @daily and @monthly.

Incremental Scheduling

You can load records incrementally by setting the incremental option to true.

in:
 type: facebook_leads
 app_secret: app_secret
 access_token: long-lived-access_token
 id: 33056800448180
 time_created: '2020-01-28T15:46:25+0000'
 incremental: true
out:
 mode: append

If you’re using scheduled execution, the connector automatically saves the time_created value of the last import and holds it internally. That value is then used at the next scheduled execution.

in:
  type: facebook_leads
  ...
out:
  ...

Config Diff
---
in:
  time_created: '2020-02-02T15:46:25Z'

List the Schedules

You can see the list of scheduled entries by using td connector:list.

td connector:list

FAQ for Import from Facebook Lead Ads

What Facebook App scopes or permissions are required for this Connector?

  • The following permissions are required:

    • email
    • public_profile
    • leads_retrieval
    • pages_manage_ads, pages_manage_metadata, pages_read_engagement, pages_read_user_content (if you want to import by Form ID)
    • ads_management (if you want to import by Ad ID)
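The permission list above can be turned into a small pre-flight check. The sketch below assumes you already know which permissions your access token has been granted. The function name missing_permissions and the import_by values "form_id" and "ad_id" are hypothetical labels used only for this illustration.

```python
# Permissions taken from the list above.
BASE_PERMISSIONS = ("email", "public_profile", "leads_retrieval")
FORM_ID_PERMISSIONS = (
    "pages_manage_ads",
    "pages_manage_metadata",
    "pages_read_engagement",
    "pages_read_user_content",
)
AD_ID_PERMISSIONS = ("ads_management",)


def missing_permissions(granted, import_by):
    """Return the required permissions that are absent from `granted`."""
    if import_by == "form_id":
        required = BASE_PERMISSIONS + FORM_ID_PERMISSIONS
    elif import_by == "ad_id":
        required = BASE_PERMISSIONS + AD_ID_PERMISSIONS
    else:
        raise ValueError("import_by must be 'form_id' or 'ad_id'")
    return [perm for perm in required if perm not in granted]


granted = {"email", "public_profile", "leads_retrieval"}
missing_for_ad = missing_permissions(granted, "ad_id")
missing_for_form = missing_permissions(granted, "form_id")
```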

How Incremental Loading Works

If incremental: true is set, the connector loads all records created since time_created when time_created is set, or all available data up to the current time when time_created is not set. Each subsequent job execution imports only the records created since the previous execution. This mode is useful when you want to fetch just the Leads created since the previously scheduled run.

For example, suppose the first job execution uses the following config:

in:
 ...
 time_created: '2020-01-28T15:46:25+0000'
 incremental: true

When bulk data loading finishes successfully, the connector outputs the time_created parameter (for example, '2020-02-02T15:46:25+0000') as config-diff so that the next execution can use it.
At the next execution, even if time_created is also set in the config, the plugin uses the time_created from config-diff and ignores the original value, so the new job runs with the following config:

in:
 ...
 time_created: '2020-02-02T15:46:25+0000'
 incremental: true

That way, each time the job runs, it only imports new records.
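The override behavior described above can be modeled in a few lines of Python. This is only an illustrative model of the documented behavior, not the connector's actual implementation. The function name effective_in_config and the dictionary layout of the config-diff are assumptions made for the sketch.

```python
import copy
from typing import Optional


def effective_in_config(job_in: dict, config_diff: Optional[dict]) -> dict:
    """Return the `in` section a run would use once config-diff is applied."""
    merged = copy.deepcopy(job_in)  # leave the original job config untouched
    if config_diff:
        # Values stored in config-diff take precedence over the original config.
        merged.update(config_diff.get("in", {}))
    return merged


job_in = {
    "type": "facebook_leads",
    "time_created": "2020-01-28T15:46:25+0000",
    "incremental": True,
}

# First run: no config-diff yet, so the configured time_created is used.
first_run = effective_in_config(job_in, None)

# After a successful run, the last import time is stored as config-diff.
config_diff = {"in": {"time_created": "2020-02-02T15:46:25+0000"}}
second_run = effective_in_config(job_in, config_diff)
```

In this model the second run starts from the stored time_created, while the original configuration stays unchanged.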

For Ad Account level Leads import (enabled by setting ad_account_id), a list of Lead Ad IDs and their latest time_created values is stored as config-diff for the next job execution, e.g.

in:
  ...
  incremental: true
  list_time_created: {
    '23845900031': '2020-11-01T02:46:45Z',
    '23845899651': '2020-11-02T21:25:33Z',
    '23846121121': '2020-11-30T05:21:03Z',
    '23845899651': '2020-11-01T12:39:53Z',
    '23845900031': '2020-11-02T04:13:19Z',
    '23845899651': '2020-11-04T01:39:58Z'
  }
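To illustrate what "latest time_created per Lead Ad ID" means, the sketch below reduces a list of (ad ID, time_created) pairs to one entry per ad, keeping the most recent timestamp. It reuses the sample values shown above. The function name latest_time_created is hypothetical, and this is an assumption about how such a mapping could be derived, not the connector's code. Parsing the trailing Z with %z requires Python 3.7 or later.

```python
from datetime import datetime


def latest_time_created(leads):
    """Keep only the most recent time_created string for each ad ID."""
    latest = {}
    for ad_id, created in leads:
        parsed = datetime.strptime(created, "%Y-%m-%dT%H:%M:%S%z")
        if ad_id not in latest or parsed > latest[ad_id][0]:
            latest[ad_id] = (parsed, created)
    # Return the original strings, not the parsed datetimes.
    return {ad_id: text for ad_id, (_, text) in latest.items()}


sample = [
    ("23845900031", "2020-11-01T02:46:45Z"),
    ("23845899651", "2020-11-02T21:25:33Z"),
    ("23846121121", "2020-11-30T05:21:03Z"),
    ("23845899651", "2020-11-01T12:39:53Z"),
    ("23845900031", "2020-11-02T04:13:19Z"),
    ("23845899651", "2020-11-04T01:39:58Z"),
]
per_ad_latest = latest_time_created(sample)
```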