You can ingest Accounts, Profile Management, Data Store, and Audit Log data from Gigya (SAP Customer Data Cloud) into Treasure Data.

Prerequisites

  • A Gigya (SAP Customer Data Cloud) account with an API Key, User Key, and Secret Key
  • Access to the TD Console


Query Syntax Limitation

Treasure Data supports the following SQL query syntax for Gigya (example queries follow the lists below):

  • SELECT clause
  • FROM clause
  • WHERE clause
  • GROUP BY clause

The following limitations apply:

  • The START, CONTAINS, and WITH keywords are not supported.
  • COUNTERS is not supported.
  • A LIMIT clause in the query is removed automatically. For example, SELECT * FROM ACCOUNTS LIMIT 100 returns all records and ignores the limit.
  • A query that combines aggregate functions (sum, min, max, avg, sum_of_squares, variance, std) with a GROUP BY clause can only ingest the first page of results.
  • Column names are case-sensitive.
  • Object names are case-insensitive.
  • The incremental column must be of numeric or timestamp type.
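
For example, queries of the following shapes are accepted. The filter column and the numeric threshold shown here are illustrative only; use a column that exists in your own schema and an appropriate value:

    SELECT * FROM accounts
    SELECT * FROM accounts WHERE lastUpdatedTimestamp > 1577836800000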

Data Source Limitation

For Account Data Source

  • Your data query is limited to two objects: accounts and emailAccounts (see the example queries after this list).
  • If you enter an invalid object in a FROM clause, the query automatically falls back to the accounts object.
    For example, SELECT * FROM <does_not_exist> substitutes accounts for <does_not_exist>.
  • Supported incremental columns for the accounts object are: [lastLogin, registered, oldestDataUpdatedTimestamp, lastUpdated, verifiedTimestamp, oldestDataUpdated, lastUpdatedTimestamp, created, createdTimestamp, verified, registeredTimestamp, lastLoginTimestamp, lockedUntil]
  • Supported incremental columns for the emailAccounts object are: [lastUpdated, lastUpdatedTimestamp, created, createdTimestamp]
  • Referenced link: accounts.search REST
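
For example, both of the following queries are valid for the Account data source; any other object name falls back to accounts:

    SELECT * FROM accounts
    SELECT * FROM emailAccounts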

For Profile Management Data Source

  • Your data query is limited to two objects: accounts and emailAccounts.
  • If you enter an invalid object in a FROM clause, the query automatically falls back to the accounts object.
    For example, SELECT * FROM <does_not_exist> substitutes accounts for <does_not_exist>.
  • Supported incremental columns for the accounts object are: [lastLogin, registered, oldestDataUpdatedTimestamp, lastUpdated, verifiedTimestamp, oldestDataUpdated, lastUpdatedTimestamp, created, createdTimestamp, verified, registeredTimestamp, lastLoginTimestamp, lockedUntil]
  • Supported incremental columns for the emailAccounts object are: [lastUpdated, lastUpdatedTimestamp, created, createdTimestamp]
  • Referenced link: ids.search REST

For Data Store Data Source

  • If you enter an invalid object in a FROM clause, you will receive the error [400006] Invalid parameter value: Invalid argument: accounts type not allowed.
  • The available incremental columns vary depending on the schema of the target Data Store. If you enter an invalid column name, the error returned in the TD Console indicates which column names are acceptable.
  • Referenced link: ds.search REST
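
For example, if your Data Store contains an object named my_data (a placeholder for an object in your own schema), you could ingest it with:

    SELECT * FROM my_data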

For Audit Log Data Source

  • Supported incremental columns are: [@timestamp]
  • Referenced link: audit.search
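
For example, a full export of the audit log queries the auditLog object; when Incremental is enabled, later runs are restricted by the @timestamp column automatically:

    SELECT * FROM auditLog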


Incremental Loading and Numeric and Timestamp Columns

Incremental loading uses the maximum value (max value) in the specified incremental column. The first execution loads all records up to the max value; each subsequent run imports records from (max value + 1) of the previous run up to the time the job runs, which becomes the new max value.

Support for:

  • Incremental columns of numeric or timestamp type
  • Incremental columns for the accounts object: [lastLogin, registered, oldestDataUpdatedTimestamp, lastUpdated, verifiedTimestamp, oldestDataUpdated, lastUpdatedTimestamp, created, createdTimestamp, verified, registeredTimestamp, lastLoginTimestamp, lockedUntil]
  • Incremental columns for the emailAccounts object: [lastUpdated, lastUpdatedTimestamp, created, createdTimestamp]
  • Incremental columns for the auditLog object: [@timestamp]
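
Conceptually, if createdTimestamp is chosen as the incremental column for the accounts object, the first run behaves like the first query below and each later run behaves like the second, where <previous max value> is the largest createdTimestamp seen in the prior run (the exact query issued by the connector may differ):

    SELECT * FROM accounts
    SELECT * FROM accounts WHERE createdTimestamp > <previous max value>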


Use the TD Console to Create Your Connection

Obtain your API Key, User Key, and Secret Key From Gigya

  1. Follow the instructions in Creating and Managing Applications to create your own application and obtain the App User Key and Secret Key.
  2. Follow the instructions in API Key and Site Setup to obtain your API Key.
  3. Follow the instructions to determine your Data Center.

Create a New Connection

When you configure a data connection, you provide authentication to access the integration. In Treasure Data, you configure the authentication and then specify the source information.

1. Open TD Console.
2. Navigate to Integrations Hub > Catalog.
3. Search for and select Gigya (SAP Customer Data Cloud).

A configuration dialog opens.

4. Choose your account Data Center.
5. Type values for the following:
    • API Key
    • User Key
    • User Secret

6. Select Continue.
7. Enter a name for your connection and select Done.

Transfer Your Gigya Accounts Data to Treasure Data

After creating the authenticated connection, you are automatically taken to the Authentications tab.

1. Search for the connection you created and select New Source.

2. Name the Source.
3. Select Next.
4. In the Source Table, edit the parameters.

Parameters

Data Source

The target data source. Currently supported: Accounts, Profile Management, Data Store, and Audit Log.

Query

The Gigya query used to ingest data. The query varies depending on your target object.

For the Account and Profile Management data sources, only the accounts and emailAccounts objects are supported (sample query: SELECT * FROM accounts).

For the Data Store data source, query your own object (sample query: SELECT * FROM my_data).

For the Audit Log data source, only the auditLog object is supported (sample query: SELECT * FROM auditLog).

Batch Size

The maximum number of records to fetch in a single API call. The maximum value is 10000 and the minimum value is 10. When choosing a batch size, keep in mind that a smaller value makes each API call return faster but results in more API calls.

Incremental

When running on a schedule, the next import ingests only the data that was updated after the last run based on the value of the Incremental Column.

Incremental Column

The column of the data object on which to perform the incremental transfer.

For the Account and Profile Management data sources, suggested values are created, createdTimestamp, lastUpdated, and lastUpdatedTimestamp.

For the Data Store data source, use a numeric or datetime column from your schema.

For the Audit Log data source, the suggested value is @timestamp.

Data Settings

5. In this dialog, you can edit data settings or opt to skip this step.


Data Preview 


You can see a preview of your data before running the import by selecting Generate Preview.

Data shown in the data preview is approximated from your source. It is not the actual data that is imported.

  1. Select Next.
    Data preview is optional and you can safely skip to the next page of the dialog if you want.

  2. To preview your data, select Generate Preview. Optionally, select Next.

  3. Verify that the data looks approximately like you expect it to.


  4. Select Next.

Data Placement

For data placement, select the target database and table where you want your data placed and indicate how often the import should run.

  1. Select Next. Under Storage, create a new database and table, or select existing ones, where you want to place the imported data.

  2. Select a Database > Select an existing database or Create New Database.

  3. Optionally, type a database name.

  4. Select a Table > Select an existing table or Create New Table.

  5. Optionally, type a table name.

  6. Choose the method for importing the data.

    • Append (default): Data import results are appended to the table.
      If the table does not exist, it is created.

    • Always Replace: Replaces the entire content of an existing table with the result output of the query. If the table does not exist, a new table is created.

    • Replace on New Data: Replaces the entire content of an existing table with the result output only when there is new data.

  7. Select the Timestamp-based Partition Key column.
    If you want to set a partition key seed other than the default, you can specify a long or timestamp column as the partitioning time. By default, upload_time is used as the time column with the add_time filter.

  8. Select the Timezone for your data storage.

  9. Under Schedule, you can choose when and how often you want to run this query.

    • Run once:
      1. Select Off.

      2. Select Scheduling Timezone.

      3. Select Create & Run Now.

    • Repeat the query:

      1. Select On.

      2. Select the Schedule. The UI provides these four options: @hourly, @daily, @monthly, or custom cron.

      3. You can also select Delay Transfer to add a delay to the execution time.

      4. Select Scheduling Timezone.

      5. Select Create & Run Now.

 After your transfer has run, you can see the results of your transfer in Data Workbench > Databases.



