# Facebook Lead Ads Import Integration CLI

You can connect the Facebook Lead Ads Connector with your Facebook Page or Ad Account to import Leads data into Treasure Data.

## Prerequisites

- Basic knowledge of Treasure Data, including the [TD Toolbelt](https://toolbelt.treasuredata.com/)
- A Facebook Page/Ad account with Leads retrieval permission
- Authorized Treasure Data account access

## Use Command Line to Create a Connection

### Install ‘td’ Command

Install the newest [TD Toolbelt](https://toolbelt.treasuredata.com/).

### Create Seed Config File (seed.yml)

With enable_guess_schema enabled:

```yaml
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: true
out:
  mode: append
```

With enable_guess_schema disabled:

```yaml
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: false
  enable_guess_schema: false
  form_fields:
    - {name: ocupation, type: string}
    - {name: last_name, type: string}
    - {name: cell_Phone, type: string}
    - {name: email, type: string}
    - {name: name, type: string}
    - {name: opt_in_marketing, type: boolean}
out:
  mode: append
```

Configuration keys and descriptions are as follows:

| **Config key** | **Type** | **Required** | **Description** |
| --- | --- | --- | --- |
| type | string | yes | Connector type: "facebook_leads" |
| app_secret | string | no | Facebook App Secret |
| access_token | string | yes | Facebook long-lived access token; see the instructions [here](/int/facebook-ads-insights-import-integration#h2__2004725160) |
| ad_account_id | string | no | Facebook Ad Account ID |
| id | string | yes | Leads data can be imported by Ad ID or Form ID. See the [Appendix](/int/facebook-lead-ads-import-integration-cli#h1__1835053169) for details on how to get the Ad ID and Form ID |
| time_created | string | no | Import Leads data submitted from this time until the current time. Accepts ISO 8601 date-time format, e.g. 2020-01-01T00:00:00+0700 |
| incremental | boolean | no | Only import new data each time. See [How Incremental Loading Works](http://docs.treasuredata.com/display/PD/Facebook+Lead+Ads+Import+Integration+Beta#How-Incremental-Loading-Works) |
| form_fields | array | no | Required when enable_guess_schema=false. Facebook Lead Form field names and their data types |
| - name | string | yes | Form field name. See the Appendix for [how to find the Lead Form’s field names](https://tddocs.atlassian.net/wiki/spaces/CW/pages/826835011/Facebook+Lead+Ads+Import+Integration#How-to-get-the-Lead-Form%E2%80%99s-Field-Name) |
| - type | string | yes | Field data type |
| - format | string | no | Timestamp format, e.g. %Y-%m-%dT%H:%M:%S%z |
| - skip_invalid_records | boolean | no | Skip invalid Leads data and continue importing the rest. If set to false, the job fails when invalid data is encountered |

For more details on available `out` modes, see the [Appendix](/int/facebook-lead-ads-import-integration-cli#h1__1835053169).

### Guess Fields (Generate load.yml)

Use `connector:guess`. This command reads the target data and automatically guesses the data format.

```bash
$ td connector:guess seed.yml -o load.yml
```

If you open the `load.yml` file, you see the guessed file format definitions, including column names, data types, and formats.
```yaml
---
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 514618966066498
  time_created: '2019-12-01T15:46:25+0000'
  incremental: false
  columns:
  - {name: id, type: long}
  - {name: created_time, type: timestamp, format: "%Y-%m-%dT%H:%M:%S%z"}
  - {name: ad_id, type: string}
  - {name: ad_name, type: string}
  - {name: adset_id, type: string}
  - {name: adset_name, type: string}
  - {name: campaign_id, type: string}
  - {name: campaign_name, type: string}
  - {name: form_id, type: long}
  - {name: platform, type: string}
  - {name: is_organic, type: boolean}
  - {name: name, type: string}
  - {name: surname, type: string}
  - {name: email, type: string}
out: {mode: append}
exec: {}
filters:
- from_value: {mode: upload_time}
  to_column: {name: time}
  type: add_time
```

Then you can preview the data with the `preview` command.

```
td connector:preview load.yml
```

If the system guesses a column type you don't expect, modify `load.yml` directly and preview again. The data connector supports parsing of "boolean", "long", "double", "string", and "timestamp" types.

### Execute Load Job

Submit the load job. It may take a couple of hours depending on the data size. You need to specify the database and table where the data will be stored.

```bash
td connector:issue load.yml --database td_sample_db --table td_sample_table
```

The preceding command assumes that you have already created the database (*td_sample_db*) and the table (*td_sample_table*).
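If you have not created them yet, the TD Toolbelt can create both from the command line. A minimal sketch using the sample names from above (these commands run against your Treasure Data account):

```bash
# Create the destination database, then the table inside it.
td db:create td_sample_db
td table:create td_sample_db td_sample_table
```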
If the database or the table does not exist in TD, this command will not succeed, so create the database and table [manually](https://docs.treasuredata.com/smart/project-product-documentation/data-management) or use the `--auto-create-table` option with the `td connector:issue` command to create them automatically:

```bash
td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at --auto-create-table
```

You can assign a timestamp column as the partitioning key with the `--time-column` option.

### Scheduled Execution

You can schedule periodic data connector execution for periodic Leads imports. We carefully configure our scheduler to ensure high availability. With this feature, you no longer need a `cron` daemon in your local data center.

#### Create the Schedule

Create a new schedule with the `td connector:create` command. The schedule name, a cron-style schedule, the destination database and table, and the data connector configuration file are required.

```
$ td connector:create \
    daily_leads_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml
```

The `cron` parameter also accepts three special options: `@hourly`, `@daily`, and `@monthly`.

### Incremental Scheduling

You can load records incrementally by setting the `incremental` option to true.

```
in:
  type: facebook_leads
  app_secret: app_secret
  access_token: long-lived-access_token
  id: 33056800448180
  time_created: '2020-01-28T15:46:25+0000'
  incremental: true
out:
  mode: append
```

If you’re using [scheduled execution](http://docs.treasuredata.com/display/PD/Facebook+Lead+Ads+Import+Integration+Beta#Scheduled-Execution), the connector automatically saves the last imported `time_created` value and holds it internally, then uses it at the next scheduled execution.

```
in:
  type: facebook_leads
  ...
out:
  ...
Config Diff
---
in:
  time_created: '2020-02-02T15:46:25Z'
```

### List the Schedules

You can see the list of scheduled entries with `td connector:list`.

```
td connector:list
```

## FAQ for Import from Facebook Lead Ads

**What Facebook App scopes or permissions are required for this Connector?**

- The following permissions are required:
  - email
  - public_profile
  - leads_retrieval
  - pages_manage_ads, pages_manage_metadata, pages_read_engagement, pages_read_user_content (if you want to import by Form ID)
  - ads_management (if you want to import by Ad ID)

### How Incremental Loading Works

If `incremental: true` is set, this connector loads all records created since **time_created** if it is set, or imports all available data up to the current time if **time_created** is not set. The next job execution then imports only records created since the last job execution. This mode is useful when you want to fetch just the Leads created since the previously scheduled run.

For example, for the first job execution you specify the config as:

```
in:
  ...
  time_created: '2020-01-28T15:46:25+0000'
  incremental: true
```

When the bulk data load finishes successfully, it outputs a **time_created** parameter, e.g. '2020-02-02T15:46:25+0000', as a config-diff so that the next execution uses it. At the next execution, even though **time_created** is still set in the original config, the plugin uses the **time_created** from the config-diff and ignores the original value, so the new job config effectively runs as:

```
in:
  ...
  time_created: '2020-02-02T15:46:25+0000'
  incremental: true
```

That way, each time the job runs, it imports only new records.

For Ad Account level Leads import (by setting ad_account_id), a list of Lead Ad IDs and the latest **time_created** for each are stored as the config-diff for the next job execution, e.g.:

```
in:
  ...
  incremental: true
  list_time_created: {
    '23845900031': '2020-11-02T04:13:19Z',
    '23845899651': '2020-11-04T01:39:58Z',
    '23846121121': '2020-11-30T05:21:03Z'
  }
```
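The stored map keeps one entry per Lead Ad ID: when the same ID is seen across runs, only the newest **time_created** survives. A minimal sketch of that latest-wins rule in plain shell (illustrative only, not a td command; the (id, time) pairs are sample values):

```bash
# Collapse repeated (ad_id, time_created) pairs, keeping the newest time per ID.
# ISO 8601 timestamps with the same offset compare correctly as plain strings.
result=$(
  printf '%s\n' \
    '23845900031 2020-11-01T02:46:45Z' \
    '23845899651 2020-11-02T21:25:33Z' \
    '23846121121 2020-11-30T05:21:03Z' \
    '23845899651 2020-11-01T12:39:53Z' \
    '23845900031 2020-11-02T04:13:19Z' \
    '23845899651 2020-11-04T01:39:58Z' |
  awk '{ if ($2 > latest[$1]) latest[$1] = $2 }
       END { for (id in latest) print id, latest[id] }' |
  sort
)
echo "$result"
```

Because the timestamps share a fixed format and offset, a lexicographic comparison is enough; no date parsing is needed.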