You can connect Facebook Lead Ads Connector with your Facebook Page or Ad Account to import Leads data into Treasure Data.
- Basic knowledge of Treasure Data, including the TD Toolbelt
- A Facebook Page/Ad account with Leads retrieval permission
- Authorized Treasure Data account access
Install the newest TD Toolbelt.
enable_guess_schema enabled:
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: true
out:
  mode: append
enable_guess_schema disabled:
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: false
  form_fields:
  - {name: ocupation, type: string}
  - {name: last_name, type: string}
  - {name: cell_Phone, type: string}
  - {name: email, type: string}
  - {name: name, type: string}
  - {name: opt_in_marketing, type: boolean}
out:
  mode: append
Configuration keys and descriptions are as follows:
| Config key | Type | Required | Description |
|---|---|---|---|
| type | string | yes | Connector type "facebook_leads" |
| app_secret | string | no | Facebook App Secret |
| access_token | string | yes | Facebook long-lived Access token, see instruction here |
| ad_account_id | string | no | Facebook Ad Account ID |
| id | string | yes | Leads data can be imported by Ad ID or Form ID. See Appendix for details on how to get the Ad ID and Form ID |
| time_created | string | no | Import Leads data submitted since this time until the current time. The field accepts ISO 8601 date-time format. E.g. 2020-01-01T00:00:00+0700 |
| incremental | boolean | no | Only import new data each time. See How Incremental works |
| form_fields | array | no | Required when enable_guess_schema=false. Facebook Lead Form field names and their data types |
| - name | string | yes | Form field name. See Appendix for how to find a Lead's Form Fields |
| - type | string | yes | Field data type |
| - format | string | no | Timestamp format. E.g. %Y-%m-%dT%H:%M:%S%z |
| - skip_invalid_records | boolean | no | Skip invalid Leads data and continue importing the rest. If disabled, the job fails when invalid data is encountered |
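The required and conditional keys in the table can be checked before a job is submitted. The following is a minimal sketch, not part of the connector: `validate_in_config` is a hypothetical helper, and it follows the table literally (for example, it treats `id` as always required).

```python
def validate_in_config(cfg):
    """Return a list of problems found in an `in:` section (hypothetical helper)."""
    problems = []
    # Keys marked "Required: yes" in the table above.
    for key in ("type", "access_token", "id"):
        if not cfg.get(key):
            problems.append(f"missing required key: {key}")
    if cfg.get("type") not in (None, "facebook_leads"):
        problems.append('type must be "facebook_leads"')
    # form_fields is required only when schema guessing is turned off.
    if cfg.get("enable_guess_schema") is False:
        fields = cfg.get("form_fields") or []
        if not fields:
            problems.append("form_fields is required when enable_guess_schema is false")
        for field in fields:
            for sub in ("name", "type"):
                if sub not in field:
                    problems.append(f"form_fields entry missing {sub}: {field}")
    return problems


if __name__ == "__main__":
    cfg = {
        "type": "facebook_leads",
        "access_token": "long-lived-access_token",
        "id": "33056800448180",
        "enable_guess_schema": False,
    }
    # Reports that form_fields is missing, because guessing is disabled.
    print(validate_in_config(cfg))
```

An empty list means the section satisfies the checks encoded here; it does not guarantee that the connector itself will accept the configuration.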
For more details on available out modes, see Appendix.
Use connector:guess. This command automatically reads the target data, and intelligently guesses the data format.
$ td connector:guess seed.yml -o load.yml
If you open the load.yml file, you see the guessed file format definitions, including column names, data types, and formats.
---
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 514618966066498
  time_created: '2019-12-01T15:46:25+0000'
  incremental: false
  columns:
  - {name: id, type: long}
  - {name: created_time, type: timestamp, format: "%Y-%m-%dT%H:%M:%S%z"}
  - {name: ad_id, type: string}
  - {name: ad_name, type: string}
  - {name: adset_id, type: string}
  - {name: adset_name, type: string}
  - {name: campaign_id, type: string}
  - {name: campaign_name, type: string}
  - {name: form_id, type: long}
  - {name: platform, type: string}
  - {name: is_organic, type: boolean}
  - {name: name, type: string}
  - {name: surname, type: string}
  - {name: email, type: string}
out: {mode: append}
exec: {}
filters:
- from_value: {mode: upload_time}
  to_column: {name: time}
  type: add_time
Then you can preview the data by using the preview command.
td connector:preview load.yml
If the system detects a column type incorrectly, modify load.yml directly and preview again.
The data connector supports parsing of "boolean", "long", "double", "string", and "timestamp" types.
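To illustrate what the five column types mean for a raw value, here is a small Python sketch. It is not the connector's parser: `cast_value` is a hypothetical helper, the rule for which strings count as true is an assumption, and Python's `strptime` directives merely happen to match the format string shown in load.yml.

```python
from datetime import datetime


def cast_value(raw, type_name, fmt=None):
    """Convert one raw string to a declared column type (illustrative only)."""
    if type_name == "boolean":
        # Assumption: only the literal "true" (any case) is treated as True.
        return raw.strip().lower() == "true"
    if type_name == "long":
        return int(raw)
    if type_name == "double":
        return float(raw)
    if type_name == "timestamp":
        # fmt corresponds to the column's `format` key, e.g. "%Y-%m-%dT%H:%M:%S%z".
        return datetime.strptime(raw, fmt)
    if type_name == "string":
        return raw
    raise ValueError(f"unsupported type: {type_name}")


if __name__ == "__main__":
    print(cast_value("514618966066498", "long"))
    print(cast_value("2019-12-01T15:46:25+0000", "timestamp", "%Y-%m-%dT%H:%M:%S%z"))
```

A value that cannot be converted raises an exception here, which loosely mirrors a job failing on invalid data when skip_invalid_records is not set.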
Submit the load job. It may take a couple of hours, depending on the data size. You must specify the database and table where the data is to be stored.
td connector:issue load.yml --database td_sample_db --table td_sample_table
The preceding command assumes that you have already created the database (td_sample_db) and the table (td_sample_table). If the database or the table does not exist in TD, the command fails, so either create them manually or use the --auto-create-table option with the td connector:issue command to create them automatically:
td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at --auto-create-table
You can assign a time-format column as the partitioning key by using the --time-column option.
You can schedule periodic data connector execution for recurring Leads imports. We carefully configure our scheduler to ensure high availability. By using this feature, you no longer need a cron daemon in your local data center.
A new schedule can be created by using the td connector:create command. The name of the schedule, a cron-style schedule, the database and table where the data is stored, and the data connector configuration file are required.
$ td connector:create \
    daily_leads_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml
The cron parameter also accepts three special options: @hourly, @daily and @monthly.
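As a reading aid for the cron-style schedule, the sketch below splits an expression into its five conventional fields. `describe_cron` is a hypothetical helper, and the expansions of the three shortcuts follow the usual cron convention, which is assumed rather than confirmed here.

```python
# Conventional cron meanings of the shortcuts (assumed, not taken from this document).
SHORTCUTS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@monthly": "0 0 1 * *",
}
FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expr):
    """Map a five-field cron expression (or a shortcut) to named fields."""
    expr = SHORTCUTS.get(expr, expr)
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    # "10 0 * * *" from the example: minute 10 of hour 0, every day.
    print(describe_cron("10 0 * * *"))
    print(describe_cron("@daily"))
```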
You can load records incrementally by setting the incremental option to true.
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: true
out:
  mode: append
If you're using scheduled execution, the connector automatically saves the time_created value of the last import and holds it internally. That value is then used at the next scheduled execution.
in:
  type: facebook_leads
  ...
out:
  ...
Config Diff
---
in:
  time_created: '2020-02-02T15:46:25Z'
You can see the list of scheduled entries by running td connector:list.
td connector:list
What Facebook App scopes or permissions are required for this Connector?
The following permissions are required:
- public_profile
- leads_retrieval
- pages_manage_ads, pages_manage_metadata, pages_read_engagement, pages_read_user_content (if you want to import by Form ID)
- ads_management (if you want to import by Ad ID)
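The permission list above can be expressed as a simple set comparison. This is only an illustrative sketch: `missing_permissions` and the `"form"` / `"ad"` mode names are hypothetical, and obtaining the set of scopes actually granted to a token is outside its scope.

```python
# Permissions listed above, grouped by when they apply.
BASE = {"public_profile", "leads_retrieval"}
BY_FORM_ID = {
    "pages_manage_ads",
    "pages_manage_metadata",
    "pages_read_engagement",
    "pages_read_user_content",
}
BY_AD_ID = {"ads_management"}


def missing_permissions(granted, import_by):
    """Return the required permissions that are absent from `granted`."""
    required = set(BASE)
    if import_by == "form":
        required |= BY_FORM_ID
    elif import_by == "ad":
        required |= BY_AD_ID
    else:
        raise ValueError('import_by must be "form" or "ad"')
    return sorted(required - set(granted))


if __name__ == "__main__":
    granted = ["public_profile", "leads_retrieval"]
    print(missing_permissions(granted, "ad"))
```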
If incremental: true is set, the connector loads all records created since time_created when time_created is set, or imports all available data up to the current time when time_created is not set. Each subsequent job execution imports only the records created since the previous execution. This mode is useful when you want to fetch just the Leads created since the previous scheduled run.
For example, suppose the first job execution uses the following configuration:
in:
  ...
  time_created: '2020-01-28T15:46:25+0000'
  incremental: true
When bulk data loading finishes successfully, it outputs the time_created parameter (e.g. '2020-02-02T15:46:25+0000') as config-diff so that the next execution uses it.
At the next execution, even though time_created is still set in the original configuration, the plugin uses the time_created value from config-diff and ignores the original value, so the new job runs with:
in:
  ...
  time_created: '2020-02-02T15:46:25+0000'
  incremental: true
That way, each time the job runs, it only imports new records.
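The override behaviour described above can be pictured as a dictionary merge in which config-diff values win. The sketch below is a simplified model, not the scheduler's implementation; `next_run_config` is a hypothetical name.

```python
def next_run_config(in_config, config_diff):
    """Return the `in:` section for the next run, letting config-diff override it."""
    merged = dict(in_config)  # copy, so the original config stays untouched
    merged.update(config_diff.get("in", {}))
    return merged


if __name__ == "__main__":
    first_run = {
        "type": "facebook_leads",
        "time_created": "2020-01-28T15:46:25+0000",
        "incremental": True,
    }
    # Value emitted as config-diff after the first successful load.
    diff = {"in": {"time_created": "2020-02-02T15:46:25+0000"}}
    print(next_run_config(first_run, diff)["time_created"])
```

The original configuration is left unchanged, which matches the idea that the saved value is held internally rather than written back to load.yml.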
For Ad Account level Leads import (enabled by setting ad_account_id), a list of Lead Ad IDs and their latest time_created values is stored as config-diff for the next job execution, e.g.
in:
  ...
  incremental: true
  list_time_created: {
    '23845900031': '2020-11-01T02:46:45Z',
    '23845899651': '2020-11-02T21:25:33Z',
    '23846121121': '2020-11-30T05:21:03Z',
    '23845899651': '2020-11-01T12:39:53Z',
    '23845900031': '2020-11-02T04:13:19Z',
    '23845899651': '2020-11-04T01:39:58Z'
  }
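One way to picture how a per-ad map like this could be consumed is a lookup keyed by Ad ID. The sketch below is an assumption-laden illustration: `start_time_for` is hypothetical, and falling back to the top-level time_created for an Ad ID that has no stored entry is a guess, not documented behaviour.

```python
def start_time_for(ad_id, in_config):
    """Pick the time from which to import Leads for one Ad ID (illustrative)."""
    per_ad = in_config.get("list_time_created", {})
    # Assumption: an ad without a stored entry falls back to time_created (may be None).
    return per_ad.get(ad_id, in_config.get("time_created"))


if __name__ == "__main__":
    in_config = {
        "incremental": True,
        "time_created": "2020-10-01T00:00:00Z",
        "list_time_created": {
            "23845900031": "2020-11-01T02:46:45Z",
            "23846121121": "2020-11-30T05:21:03Z",
        },
    }
    print(start_time_for("23846121121", in_config))
    print(start_time_for("99999999999", in_config))  # hypothetical Ad ID with no entry
```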