OneTrust Import Integration

As data protection laws continue to be enacted around the world, ensuring compliance is a priority. OneTrust is a privacy management and marketing compliance company. Its services are used by organizations to comply with global regulations such as the GDPR.

The OneTrust import integration collects customers' consent data and loads it into Treasure Data (TD). Access to OneTrust data on the Treasure Data platform enables your marketing team to optimally enrich your data.

Prerequisites

  • Basic knowledge of Treasure Data
  • Basic knowledge of OneTrust
  • (Optional) The GUID of a single Collection Point to limit the imported data. If not provided, data is collected from all Collection Points.

Obtain your API Key

  1. Navigate to https://app.onetrust.com/integrations/api-keys.

  2. Sign on to the OneTrust application if necessary.

  3. Select Add New.

  4. Type the name that you want for the Connection Name.

  5. Select Install.

Retrieve the Collection Point GUID

  1. Navigate to https://app.onetrust.com/consent/collection-points. The Collection Point screen displays.

  2. Select the corresponding Collection Point. The GUID appears in the URL of the Collection Point.

Obtain your OAuth Access Token

Create a OneTrust token to store the Client ID and Secret. This is a short-lived token.

  1. Navigate to https://app.onetrust.com/settings/client-credentials/list.

  2. Select Add.

  3. Type a name and description for your token.

  4. Select an appropriate Access Token Lifetime. The default lifetime is one hour.

  5. Navigate to https://app.onetrust.com/settings/client-credentials/list.

  6. Select your credential.

  7. Select Generate Token.

Use TD Console to Create Your Connection

Create a New Connection

When you configure a data connection, you provide authentication to access the integration. In Treasure Data, you configure the authentication and specify the source information.

  1. Open TD Console.

  2. Navigate to Integrations Hub > Catalog.

  3. Search for and select OneTrust.

  4. Type the name of the access token that you created in the OneTrust application.

  5. Select Continue.

  6. Type a name for your connection.

  7. Select Done.

Transfer Your OneTrust Account Data to Treasure Data

After creating the authenticated connection, you are automatically taken to Authentications.

  1. Search for the connection you created.

  2. Select New Source.

  3. Type a name for the data transfer.

  4. Select Next. The Source Table dialog opens.

  5. Edit the following parameters:

Data Type
  • Data Subject Profile. Fetch Data Subject Profile data.
  • Collection Point. Fetch Collection Point data.
  • Data Subject Profile (API V4). Fetch Data Subject Profile data using OneTrust API V4.
  • Link Token (API V4). Fetch Link Token data using OneTrust API V4.
  • Purpose (API V4). Fetch Purpose data using OneTrust API V4.

Collection Point GUID (Optional)
  GUID of a single Collection Point to limit the data. If not provided, data is collected from all Collection Points.

Incremental Loading
  Enables incremental loading with automatic calculation of the new Start Time. For example, if you start incremental loading with Start Time = 2014-10-02T15:01:23Z and End Time = 2014-10-03T15:01:23Z, the next job runs with a new Start Time of 2014-10-03T15:01:23Z.

Start Time (Required when Incremental Loading is selected; required for every API V4 Data Type)
  For UI configuration, you can pick the date and time in a supported browser, or type the date in the format the browser expects. For example, Chrome provides a calendar to select the year, month, day, hour, and minute; on Safari, you type text such as 2020-10-25T00:00. For CLI configuration, provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds, for example: "2014-10-02T15:01:23Z".

Incremental By Modifications of
  • Data Subject. Incremental by the last update of the data subject.
  • Consent Information. Incremental by the last consent date of the consent information.

End Time (Required when Incremental Loading is selected; required for every API V4 Data Type)
  The date and time format is the same as for Start Time.

Properties (Optional)
  Comma-separated setting that adds the properties query parameter when fetching the Data Subject Profile Data Type. It is shown only when the Data Subject Profile Data Type is selected. Things to know:
  • It is critical that all queries include properties=ignoreCount. Omitting ignoreCount significantly decreases performance. If you need the count, include it only in the initial query, not in subsequent page calls.
  • The values passed in the properties query parameter can change the response of this API. A fast response on large data sets can be obtained by passing any of the following values: linkTokens, ignoreCount, ignoreTopics, ignoreCustomPreferences.
  • It is strongly recommended to pass the requestContinuation parameter returned in the response of this API in the next API request to paginate. Including it is crucial for better performance when dealing with multiple pages of data subject records. For more information, see Understanding & Implementing Pagination.
  • Recommended parameters: ignoreCount, requestContinuation. Reference: Get List of Data Subjects.

Request Continues Paging (Optional)
  If checked, the ingestion process fetches data that is paginated by request continuation tokens. This option is useful when the data volume is large.

Data Settings

  1. Select Next. The Data Settings page opens.
  2. Skip this page of the dialog.

Data Preview

You can see a preview of your data before running the import by selecting Generate Preview. Data preview is optional and you can safely skip to the next page of the dialog if you choose to.

  1. Select Next. The Data Preview page opens.
  2. If you want to preview your data, select Generate Preview.
  3. Verify the data.

Data Placement

For data placement, select the target database and table where you want your data placed and indicate how often the import should run.

  1. Select Next. Under Storage, select an existing database or create a new one, and select an existing table or create a new one, where you want to place the imported data.

  2. Select a Database. Select an existing database or Create New Database.

  3. Optionally, type a database name.

  4. Select a Table. Select an existing table or Create New Table.

  5. Optionally, type a table name.

  6. Choose the method for importing the data.

    • Append (default): Data import results are appended to the table. If the table does not exist, it is created.
    • Always Replace: Replaces the entire content of an existing table with the result output of the query. If the table does not exist, a new table is created.
    • Replace on New Data: Replaces the entire content of an existing table with the result output only when there is new data.
  7. Select the Timestamp-based Partition Key column. If you want to set a partition key seed other than the default key, you can specify a long or timestamp column as the partitioning time. By default, the time column uses upload_time with the add_time filter.

  8. Select the Timezone for your data storage.

  9. Under Schedule, you can choose when and how often you want to run this query.

Run once

  1. Select Off.
  2. Select Scheduling Timezone.
  3. Select Create & Run Now.

Repeat Regularly

  1. Select On.
  2. Select the Schedule. The UI provides these four options: @hourly, @daily, and @monthly, or custom cron.
  3. You can also select Delay Transfer and add a delay of execution time.
  4. Select Scheduling Timezone.
  5. Select Create & Run Now.

After your transfer has run, you can see the results of your transfer in Data Workbench > Databases.

Import from OneTrust via CLI (Toolbelt)

Before setting up the integration, install the latest version of the TD Toolbelt.

Prepare a Load File

Create a configuration file (for example, load.yml) for the import:

in:
  type: onetrust
  base_url: ***************
  auth_method: oauth
  access_token: ***************
  data_type: data_subject_profile
  incremental: false
  start_time: 2025-01-30T00:49:04Z
  end_time: 2025-02-28T17:00:00.000Z
  thread_count: 5
out:
  mode: replace

This example gets a list of Data Subject Profile objects. The start_time specifies the date from which the import starts getting data. In this case, the import pulls data starting from January 30th, 2025 at 00:49 UTC.
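
For an API V4 data type, start_time and end_time are required. The following is a minimal sketch, not the only valid configuration, assuming api_key authentication is enabled for your account; the masked value and the sample time range are placeholders you replace with your own:

in:
  type: onetrust
  base_url: app.onetrust.com
  auth_method: api_key
  api_key: ***************
  data_type: purpose_api_v4
  incremental: false
  start_time: 2025-01-01T00:00:00Z
  end_time: 2025-02-01T00:00:00Z
out:
  mode: replace

If you use OAuth instead, set auth_method: oauth and provide access_token, as in the previous example.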

Parameters Reference

type
  Description: The source of the import.
  Value: "onetrust"
  Required: Yes

base_url
  Description: Base URL of the OneTrust server.
  Value: String. Default: "app.onetrust.com"
  Required: Yes

auth_method
  Description: Authentication method, either "oauth" or "api_key".
  Value: String. Default: "oauth"
  Required: Yes

access_token
  Description: OAuth access token, required when the oauth auth method is used.
  Value: String
  Required: Yes when auth_method is "oauth"

api_key
  Description: API key, required when the api_key auth method is used.
  Value: String
  Required: Yes when auth_method is "api_key"

data_type
  Description: The Data Type to fetch from OneTrust.
    • Data Subject Profile. Fetch Data Subject Profile data.
    • Collection Point. Fetch Collection Point data.
    • Data Subject Profile (API V4). Fetch Data Subject Profile data from API V4.
    • Link Token (API V4). Fetch Link Token data from API V4.
    • Purpose (API V4). Fetch Purpose data from API V4.
  Value: String. Supported values: data_subject_profile, collection_point, data_subject_profile_api_v4, link_token_api_v4, purpose_api_v4
  Required: Yes

collection_point_guid
  Description: (Optional) GUID of a single Collection Point to limit the data. If not provided, data is collected from all Collection Points. Applies when the Data Type is Data Subject Profile or Purpose (API V4).
  Value: String
  Required: No

incremental
  Description: Enables incremental loading with automatic calculation of the new Start Time. For example, if you start incremental loading with a Start Time of 2014-10-02T15:01:23Z and an End Time of 2014-10-03T15:01:23Z, the next job runs with a new Start Time of 2014-10-03T15:01:23Z.
  Value: Boolean. Default: false
  Required: Yes

start_time
  Description: For UI configuration, you can pick the date and time in a supported browser, or type the date in the format the browser expects. For example, Chrome provides a calendar to select the year, month, day, hour, and minute; on Safari, you type text such as 2020-10-25T00:00. For CLI configuration, provide a timestamp in RFC 3339 UTC "Zulu" format, accurate to nanoseconds, for example: "2014-10-02T15:01:23Z".
  Value: Timestamp
  Required: Yes when Incremental Loading is selected. No for the API V1 Data Types (data_subject_profile and collection_point). Yes for every API V4 Data Type (data_subject_profile_api_v4, link_token_api_v4, and purpose_api_v4).

incremental_type
  Description: Selects the time field by which data is fetched incrementally from OneTrust.
    • data_subject_profile. Incremental by the last update of the data subject.
    • collection_point. Incremental by the last consent date of the consent information.
  Value: String. Default: "data_subject_profile"
  Required: Yes when Incremental Loading is selected for the data_subject_profile Data Type.

data_subject_properties
  Description: (Optional) Comma-separated setting that adds the properties query parameter when fetching the Data Subject Profile Data Type. Applies only when the Data Subject Profile Data Type is selected.
  Value: String
  Required: No

end_time
  Description: The date and time format is the same as for start_time.
  Value: Timestamp
  Required: No for the API V1 Data Types (data_subject_profile and collection_point). Yes for every API V4 Data Type (data_subject_profile_api_v4, link_token_api_v4, and purpose_api_v4).

request_continues_paging
  Description: If true, the ingestion process fetches data that is paginated by request continuation tokens. This option is useful when the data volume is large.
  Value: Boolean. Default: false
  Required: No

ingest_duration_minutes
  Description: Sets the time range, in minutes, for the ingestion duration when the target is data_subject_profile and request_continues_paging is true.
  Value: Integer. Default: 1440
  Required: No
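
As an illustration of how these parameters fit together, here is a sketch of an incremental Data Subject Profile load file that uses the recommended properties values; the masked token and the sample time range are placeholders, and you should adjust every value to your own environment:

in:
  type: onetrust
  base_url: app.onetrust.com
  auth_method: oauth
  access_token: ***************
  data_type: data_subject_profile
  incremental: true
  incremental_type: data_subject_profile
  start_time: 2025-01-01T00:00:00Z
  end_time: 2025-02-01T00:00:00Z
  data_subject_properties: ignoreCount,requestContinuation
  request_continues_paging: true
  ingest_duration_minutes: 1440
out:
  mode: append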

To preview the data, use the td connector:preview command.

td connector:preview load.yml

Execute the Load Job

It might take a couple of hours for the load job to complete, depending on the size of the data. Be sure to specify the Treasure Data database and table where the data should be stored. Treasure Data also recommends specifying the --time-column option because Treasure Data’s storage is partitioned by time (see data partitioning). If this option is not provided, the data connector chooses the first long or timestamp column as the partitioning time. The column specified by --time-column must be of either long or timestamp type.

If your data doesn’t have a time column, you can add a time column by using the add_time filter option. For more details see the documentation for the add_time Filter Function.
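
A minimal sketch of an add_time filter entry that you could add to load.yml, assuming the standard add_time filter options (adapt the column name and mode to your data):

filters:
  - type: add_time
    to_column:
      name: time
      type: timestamp
    from_value:
      mode: upload_time

This adds a time column whose value is set to the upload time of the job.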

$ td connector:issue load.yml --database td_sample_db --table td_sample_table \
    --time-column created_at

The connector:issue command assumes that you have already created a database (td_sample_db) and a table (td_sample_table). If the database or the table does not exist in TD, this command fails. Create the database and table manually, or use the --auto-create-table option with the td connector:issue command to create them automatically.

$ td connector:issue load.yml --database td_sample_db --table td_sample_table \
    --time-column created_at --auto-create-table

The data connector does not sort records on the server side. To use time-based partitioning effectively, sort records in files beforehand.

If you have a field called time, you don’t have to specify the --time-column option.

$ td connector:issue load.yml --database td_sample_db --table td_sample_table

Import Modes

Specify the file import mode in the out: section of the load.yml file. The out: section controls how data is imported into a Treasure Data table. For example, you may choose to append data or replace data in an existing table.

Append
  Records are appended to the target table.
  out:
    mode: append

Always Replace
  Replaces data in the target table. Any manual schema changes made to the target table remain intact.
  out:
    mode: replace

Replace on New Data
  Replaces data in the target table only when there is new data to import.
  out:
    mode: replace_on_new_data

Scheduling Executions

You can schedule periodic data connector execution for incremental file import. The Treasure Data scheduler is optimized to ensure high availability.

For the scheduled import, you can import all files that match the specified prefix and one of these conditions:

  • If use_modified_time is disabled, the last path is saved for the next execution. On the second and subsequent runs, the integration only imports files that come after the last path in alphabetical order.
  • Otherwise, the time that the job is executed is saved for the next execution. On the second and subsequent runs, the connector only imports files that were modified after that execution time in alphabetical order.

Create a Schedule Using the TD Toolbelt

A new schedule can be created using the td connector:create command.

$ td connector:create daily_import "10 0 * * *" \
    td_sample_db td_sample_table load.yml

Treasure Data also recommends specifying the --time-column option because Treasure Data’s storage is partitioned by time (see data partitioning).

$ td connector:create daily_import "10 0 * * *" \
    td_sample_db td_sample_table load.yml \
    --time-column created_at

The cron parameter also accepts three special options: @hourly, @daily, and @monthly.

By default, the schedule is set up in the UTC timezone. You can set the schedule in a different timezone using the -t or --timezone option. The --timezone option supports only extended timezone formats like Asia/Tokyo, America/Los_Angeles, etc. Timezone abbreviations like PST and CST are not supported and might lead to unexpected schedules.