VTEX is an eCommerce solution used by large brands in more than 40 countries. With this integration, you can ingest your customer information via Master Data as well as Orders from this eCommerce platform.

This Data Connector is in Beta. For more information, contact support@treasuredata.com.


Prerequisites

  • Basic knowledge of Treasure Data

  • Basic knowledge of VTEX eCommerce platform

  • A valid VTEX account

Supported Filters

For Orders, Custom Filters are supported so that you can categorize the data further and reduce the data size per load. However, because f_creationDate is derived from the From Date and To Date parameters, you must not include it in your Custom Filters; otherwise the response data will not behave as expected. Do not use f_creationDate to filter Orders.
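
For illustration, the sketch below shows roughly how Custom Filters combine with the connector-generated date filter in a List Orders request. The account name and filter values are placeholders, and the request assembly itself is an assumption about the connector's internals:

    # Sketch: Custom Filters are appended to the List Orders request,
    # while f_creationDate is generated from From Date / To Date.
    from urllib.parse import urlencode

    base = "https://myaccount.vtexcommercestable.com.br/api/oms/pvt/orders"
    params = {
        # Generated by the connector -- never set f_creationDate yourself:
        "f_creationDate": "creationDate:[2020-10-11T01:43:47.900Z TO 2020-11-11T01:43:47.900Z]",
        # Your Custom Filters, e.g. q=rodrigo.cunha@vtex.com:
        "q": "rodrigo.cunha@vtex.com",
    }
    print(f"{base}?{urlencode(params)}")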


Incremental Loading for VTEX

Incremental Loading lets you import only new data on each job execution, with no duplicate or missing data.

Orders Data Type

By default, the Orders data type uses the creationDate field for its incremental loading. After each run, the latest value of creationDate is used as the From Date for the next job execution.

For example, suppose you set up an Orders import with:

From Date = 2020-10-11T01:43:47.900Z and

To Date = 2020-11-11T01:43:47.900Z

  • After the first run, the latest value of the creationDate field is 2020-11-10T08:00:00.000Z

  • The next job execution will start with:
    From Date = 2020-11-10T08:00:00.001Z and
    To Date = current execution time

  • And so on…

One millisecond is added to the creationDate for the next job execution because the From Date boundary is inclusive; without it, the last record from the previous run would be imported again.
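
The window arithmetic can be sketched as follows (a hypothetical helper, not the connector's actual code):

    # A sketch of the Orders incremental window logic described above.
    from datetime import datetime, timedelta, timezone

    TS_FMT = "%Y-%m-%dT%H:%M:%S.%f%z"  # e.g. 2020-11-10T08:00:00.000Z

    def _fmt(d: datetime) -> str:
        # Render with millisecond precision and a literal "Z" suffix.
        return d.strftime("%Y-%m-%dT%H:%M:%S.") + f"{d.microsecond // 1000:03d}Z"

    def next_window(latest_creation_date: str) -> tuple:
        # Add 1 ms because From Date is inclusive; otherwise the last
        # record of the previous run would be imported twice.
        latest = datetime.strptime(latest_creation_date, TS_FMT)
        return _fmt(latest + timedelta(milliseconds=1)), _fmt(datetime.now(timezone.utc))

    # next_window("2020-11-10T08:00:00.000Z")
    # -> ("2020-11-10T08:00:00.001Z", <current execution time>)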

Master Data

Master Data uses the Incremental Field for its incremental loading. Implicitly, API requests for Master Data are ordered by the Incremental Field in descending order, for example _sort=creationDate desc.

The first value in the API response is used as the filter parameter for the next execution.

E.g. _where=creationDate>2020-11-11T01:43:47.900Z

For example, suppose the Incremental Field is creationDate:

  • On the first run, the request URL is: https://…./scroll?_sort=creationDate desc

    • The first value of the response has creationDate=2020-11-11T01:43:47.900Z

  • On the next run, the request URL will be: https://…./scroll?_sort=creationDate desc&_where=creationDate>2020-11-11T01:43:47.900Z

  • And so on…

Because the data is ordered by the Incremental Field, the field must meet the following conditions:

  • Must be a timestamp or numeric field.

  • Must not be null.

  • Must be an indexed field.
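
Putting it together, the Master Data requests can be sketched like this (the account name, entity acronym, and credentials are placeholders; the header names are VTEX's standard app-key/app-token headers):

    # Sketch of the Master Data incremental requests described above.
    import requests

    ACCOUNT = "myaccount"
    ACRONYM = "CL"  # e.g. the Client data entity
    URL = f"https://{ACCOUNT}.vtexcommercestable.com.br/api/dataentities/{ACRONYM}/scroll"
    HEADERS = {
        "X-VTEX-API-AppKey": "your-app-key",
        "X-VTEX-API-AppToken": "your-app-token",
    }

    def fetch(last_seen=None):
        params = {"_sort": "creationDate desc"}
        if last_seen:  # first value from the previous run's response
            params["_where"] = f"creationDate>{last_seen}"
        return requests.get(URL, params=params, headers=HEADERS).json()

    # First run:  fetch()
    # Next run:   fetch("2020-11-11T01:43:47.900Z")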

Use the TD Console to Create Your Connection

Create a New Connection

In Treasure Data, you must create and configure the data connection prior to running your query. As part of the data connection, you provide authentication to access the integration.

  1. Open TD Console.

  2. Navigate to Integrations Hub > Catalog.

  3. Search for and select VTEX.

  4. The authentication dialog opens.

  5. Enter your VTEX Account Name, App Key, and App Token, and select the Environment (see the sketch after these steps).

  6. Select CONTINUE.
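
For reference, these fields map onto VTEX API calls roughly as follows (a sketch with placeholder values; the header names are VTEX's standard app-key/app-token headers):

    account = "myaccount"                # VTEX Account Name
    environment = "vtexcommercestable"   # selected Environment
    base_url = f"https://{account}.{environment}.com.br"
    headers = {
        "X-VTEX-API-AppKey": "your-app-key",     # App Key
        "X-VTEX-API-AppToken": "your-app-token", # App Token
    }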

Name Your Connection

  1. Type a name for your connection.

  2. Select Done.

Transfer Your VTEX Account Data to Treasure Data

After creating the authenticated connection, you are automatically taken to Authentications.

  1. Search for the connection you created. 

  2. Select New Source.

Connection

  1. Type a name for your Source in the Data Transfer field.

Source Table

  1. Select Next.
    The Source Table dialog opens.

  2. Edit the following parameters:

  • Data Type: The data type to import. Select one of:
    • Orders
    • Master Data V2

  • From Date: Load Orders from this date. Supported timestamp format: yyyy-MM-dd'T'HH:mm:ss.SSSZ

  • To Date (Optional): Load Orders up to this date. If left blank, Orders are imported up to the current date.

  • Custom Filters (Optional): Additional custom filters for the Orders import. This is the request parameter segment of the List Orders API request. You can input any supported parameter in the form param_name=value, for example q=rodrigo.cunha@vtex.com&q=21133355524

  • Incremental Loading: Import only new data since the last run. See Incremental Loading for VTEX.

  • Data Entity Acronym: The acronym of the Data Entity to ingest data from. For example, AL (Address Support).

  • Filter Condition (Optional): The Master Data _where parameter value. Use this to filter the results. E.g. firstName=Jon OR lastName=Smith

  • Keywords (Optional): The Master Data _keyword parameter value. Use this to return only Master Data documents matching the keyword. E.g. *Maria*

  • Schema (Optional): The Master Data schema name, used to filter documents by schema compatibility.

  • Incremental Field: When Incremental Loading is selected for the Master Data type, you must enter the field name. The field must be a timestamp or numeric field, must be indexed, and must not be null.

Data Settings

  1. Select Next.
    The Data Settings page opens.

  2. Optionally, edit the data settings or skip this page of the dialog. 

  • Retry Limit: The maximum number of retries for each API call.

  • Initial retry time wait: The wait time before the first retry.

  • Max retry wait: The maximum wait time between retries.

  • HTTP Connect Timeout: The amount of time allowed for establishing a connection before an API call times out.

  • HTTP Read Timeout: The amount of time to wait when reading data from the response.

  • HTTP Write Timeout: The amount of time to wait when writing data to the request.
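
These settings combine in a standard retry-with-backoff pattern. A minimal sketch (the doubling factor and default values are assumptions; the connector's exact backoff curve is not documented here):

    import time

    def call_with_retries(call, retry_limit=7, initial_wait=1.0, max_wait=60.0):
        wait = initial_wait
        for attempt in range(retry_limit + 1):
            try:
                return call()
            except Exception:
                if attempt == retry_limit:
                    raise                       # Retry Limit exhausted
                time.sleep(wait)                # wait before retrying
                wait = min(wait * 2, max_wait)  # growth capped by Max retry wait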

Data Preview 

You can see a preview of your data before running the import by selecting Generate Preview.

Data shown in the data preview is approximated from your source. It is not the actual data that is imported.

  1. Select Next.
    Data preview is optional and you can safely skip to the next page of the dialog if you want.

  2. To preview your data, select Generate Preview. Otherwise, select Next.

  3. Verify that the data looks approximately like you expect it to.


  4. Select Next.

Data Placement

For data placement, select the target database and table where you want your data placed and indicate how often the import should run.

  1. Select Next. Under Storage, create a new database and table or select existing ones to hold the imported data.

  2. Select a Database > Select an existing or Create New Database.

  3. Optionally, type a database name.

  4. Select a Table > Select an existing or Create New Table.

  5. Optionally, type a table name.

  6. Choose the method for importing the data.

    • Append (default): Data import results are appended to the table.
      If the table does not exist, it is created.

    • Always Replace: Replaces the entire content of an existing table with the result output of the query. If the table does not exist, a new table is created.

    • Replace on New Data: Replaces the entire content of an existing table with the result output only when there is new data.

  7. Select the Timestamp-based Partition Key column.
    If you want to set a partition key seed other than the default, you can specify a long or timestamp column as the partitioning time. By default, upload_time with the add_time filter is used as the time column.

  8. Select the Timezone for your data storage.

  9. Under Schedule, you can choose when and how often you want to run this query.

    • Run once:
      1. Select Off.

      2. Select Scheduling Timezone.

      3. Select Create & Run Now.

    • Repeat the query:

      1. Select On.

      2. Select the Schedule. The UI provides these four options: @hourly, @daily, @monthly, or custom cron (see the example after these steps).

      3. You can also select Delay Transfer and add a delay to the execution time.

      4. Select Scheduling Timezone.

      5. Select Create & Run Now.
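
A custom cron schedule takes a standard cron expression, for example (illustrative only):

    0 9 * * 1    # every Monday at 09:00 in the selected Scheduling Timezone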

After your transfer has run, you can see the results in Data Workbench > Databases.

