# Gigya Import Integration CLI

You can use the CLI to configure your connection.

## Install the Treasure Data Toolbelt

Download and install the newest [Treasure Data Toolbelt](https://toolbelt.treasuredata.com/), which provides the `td` command used in the steps below.

## Prepare *load.yml* File

Prepare `load.yml`. The `in:` section specifies what the connector reads from Gigya, and the `out:` section specifies what the connector writes to the database in Treasure Data.

Provide your Gigya account access information as follows:


```yaml
in:
  type: gigya
  data_center: US1
  authentication_mode: key_secret
  application_key: your_application_user_key
  secret_key: your_application_secret_key
  api_key: your_api_key
  data_source: account
  query: SELECT * FROM accounts
  fields_to_exclude: "XffFirstIp, httpReq"
  batch_size: 1000
```
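
If you prefer not to hard-code credentials, the configuration above can be generated from environment variables. The following is a minimal sketch, not part of the connector: the `write_load_yml` helper and the `GIGYA_*` variable names are assumptions made for illustration, and the placeholder values are used when a variable is unset.

```shell
#!/bin/sh
# Sketch: write load.yml from environment variables.
# write_load_yml and the GIGYA_* variable names are hypothetical.
write_load_yml() {
  target=${1:-load.yml}
  # Unquoted heredoc delimiter so that ${...} expansions are applied.
  cat > "$target" <<EOF
in:
  type: gigya
  data_center: ${GIGYA_DATA_CENTER:-US1}
  authentication_mode: key_secret
  application_key: ${GIGYA_APPLICATION_KEY:-your_application_user_key}
  secret_key: ${GIGYA_SECRET_KEY:-your_application_secret_key}
  api_key: ${GIGYA_API_KEY:-your_api_key}
  data_source: account
  query: SELECT * FROM accounts
  fields_to_exclude: "XffFirstIp, httpReq"
  batch_size: 1000
out:
  mode: append
EOF
}

write_load_yml load.yml
```

Values are written unquoted, so a key or secret that contains YAML special characters would need quoting before use.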

Configuration keys and descriptions are as follows:

| **Config key** | **Type** | **Required** | **Description** |
|  --- | --- | --- | --- |
| type | string | yes | connector type |
| data_center | string | yes | specifies your data center location (available values are **US1, EU1, AU1, RU1, CN1**) |
| authentication_mode | string | no | Authentication method; currently only **key_secret** is supported |
| application_key | string | yes | Your application's user key |
| secret_key | string | yes | Your application's secret key |
| api_key | string | yes | Your API key |
| data_source | string | no | Your target data source (available values are **account**, **profile**, **data_store**, **audit**) |
| query | string | yes | Your custom Gigya query |
| fields_to_exclude | string | no | Columns to remove from the result. Because of Gigya's API specification, the columns to include cannot be specified in the SELECT statement; use this parameter to remove unnecessary columns. |
| batch_size | number | no | Maximum number of records in a single batch |


## Preview the Data to be Imported

You can preview the data to be imported using the `td connector:preview` command.


```
$ td connector:preview load.yml
```

## Execute the Load Job

Use `td connector:issue` to execute the job. Processing might take a couple of hours depending on the data size. The following are required:

- the database and table where the data will be stored
- the data connector configuration file


```bash
td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at
```

The preceding command assumes that you have already created the database (*td_sample_db*) and the table (*td_sample_table*). If the database or the table does not exist in TD, the command fails. Create the database and table [manually](https://docs.treasuredata.com/smart/project-product-documentation/data-management), or use the `--auto-create-table` option with `td connector:issue` to create them automatically:


```bash
td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at --auto-create-table
```

Specifying the `--time-column` option is recommended because Treasure Data’s storage is partitioned by time. If the option is not given, the data connector selects the first long or timestamp column as the partitioning time. The column specified by `--time-column` must be of type long or timestamp. Use the preview results to check the available column names and types. Generally, most data types have a last_modified_date column.

A time column is available at the end of the output.


```bash
td connector:issue load.yml --database td_sample_db --table td_sample_table \
  --time-column created_at
```

If your data doesn’t have a time column, you can add one using the `add_time` filter. Adding the filter to your configuration file as follows creates the `time` column:


```yaml
in:
  type: xxxxx
  ...
filters:
- type: add_time
  from_value:
    mode: upload_time
  to_column:
    name: time
out:
  type: td
```

Find more information at [add_time filter plugin](https://docs.treasuredata.com/smart/project-product-documentation/add_time-filter-function).

If your data already has a field called `time`, you do not have to specify the `--time-column` option.


```
$ td connector:issue load.yml --database td_sample_db --table td_sample
```
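
The option rules above can be summarized in a small helper. This is only a sketch: `issue_command` is a hypothetical function, not part of the `td` CLI, and it prints the command for review instead of running it. It adds `--time-column` only when a column name is passed and always appends `--auto-create-table`.

```shell
#!/bin/sh
# Sketch: assemble a `td connector:issue` command line and print it.
# issue_command is a hypothetical helper; it does not call td itself.
issue_command() {
  config=$1
  database=$2
  table=$3
  time_column=${4:-}   # optional; omit when the data already has a "time" field
  set -- td connector:issue "$config" --database "$database" --table "$table"
  if [ -n "$time_column" ]; then
    set -- "$@" --time-column "$time_column"
  fi
  # Create the database and table if they do not exist yet.
  echo "$@" --auto-create-table
}

# With an explicit time column
issue_command load.yml td_sample_db td_sample_table created_at
# Without one, for data that already has a "time" field
issue_command load.yml td_sample_db td_sample_table
```

To run a printed command, copy it into the terminal once the database, table, and column names look right.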

## Scheduled Execution

You can schedule the data connector to run periodically for recurring Gigya imports. The scheduler is configured for high availability, so you no longer need a cron daemon in your local data center.

For the scheduled import, the data connector for Gigya imports all objects that match the specified target.

Scheduled execution supports additional configuration parameters that control the behavior of the data connector during its periodic attempts to fetch data from Gigya:

- `incremental`: Controls the load mode, which governs how the data connector fetches data from Gigya based on one of the native timestamp or numeric fields associated with each object.
- `incremental_column`: Defines the column that incremental loading is based on. You can define only one column for this field. Suggested values are **created, createdTimestamp, updated, updatedTimestamp**.


Here’s an example of a load file using incremental mode:


```yaml
in:
  type: gigya
  data_center: US1
  authentication_mode: key_secret
  application_key: your_application_user_key
  secret_key: your_application_secret_key
  api_key: your_api_key
  data_source: account
  batch_size: 1000
  query: SELECT * FROM accounts
  incremental: true
  incremental_column: created
filters:
- type: add_time
  from_value:
    mode: upload_time
  to_column:
    name: time
```

**Create the schedule**

Create a new schedule using the `td connector:create` command. The name of the schedule, a cron-style schedule, the database and table where the data will be stored, and the data connector configuration file are required.

The `cron` parameter accepts these options: `@hourly`, `@daily` and `@monthly`.

By default, the schedule is set up in the UTC time zone. You can set a different time zone using the `-t` or `--timezone` option. The `--timezone` option supports only extended time zone formats such as 'Asia/Tokyo' and 'America/Los_Angeles'. Time zone abbreviations such as PST and CST are *not* supported and may lead to unexpected schedules.

For example, the following `td connector:create` command creates a scheduled import job that runs daily:


```
td connector:create connector_name @daily connector_database connector_table load.yml
```
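
The same command with an explicit time zone can be sketched as follows. `schedule_command` is a hypothetical helper that only prints the command, and the schedule, database, and table names are placeholders. The check for a `/` in the time zone name is a rough heuristic for the extended format, an assumption made here rather than the validation that `td` performs.

```shell
#!/bin/sh
# Sketch: print a `td connector:create` command, optionally with --timezone.
# schedule_command is a hypothetical helper; it does not call td itself.
schedule_command() {
  name=$1
  cron=$2
  database=$3
  table=$4
  config=$5
  timezone=${6:-}   # optional; the schedule uses UTC when omitted
  set -- td connector:create "$name" "$cron" "$database" "$table" "$config"
  if [ -n "$timezone" ]; then
    case $timezone in
      */*) set -- "$@" --timezone "$timezone" ;;
      *)
        # Heuristic only: abbreviations such as PST contain no "/".
        echo "use an extended time zone name such as Asia/Tokyo, not '$timezone'" >&2
        return 1
        ;;
    esac
  fi
  echo "$@"
}

schedule_command daily_gigya_import @daily td_sample_db td_sample_table load.yml Asia/Tokyo
```

Rejecting abbreviations before the schedule is created avoids the unexpected schedules mentioned above.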

## Modes for the Plugin

Specify the import mode in the `out:` section of the `load.yml` file.

The `out:` section controls how data is imported into a Treasure Data table. For example, you can append data to an existing table or replace the data in it.

The following output modes are available:

- **Append** (default): Records are appended to the target table.
- **Replace** (available in td 0.11.10 and later): Replaces the data in the target table. Any manual schema changes made to the target table remain intact.


Examples:


```yaml
# Example 1: append (default)
in:
  ...
out:
  mode: append

# Example 2: replace (a separate load.yml)
in:
  ...
out:
  mode: replace
```