You can import Magento (v2) objects into Treasure Data.
Basic knowledge of Treasure Data
Basic knowledge of Magento 2
Create a New Authentication
When you configure a data connection, you provide authentication to access the integration. In Treasure Data, you configure the authentication and then specify the source information.
Open the TD Console
Navigate to Integrations Hub -> Catalog
Search for Magento.
The following dialog opens. Choose one of the Authentication Modes. Edit the required credentials.
Authentication Mode Options
Using Login Credentials to Authenticate
Select Login Credential and enter your Admin-level user name and password.
Using an Integration Access Token to Authenticate
To get the access token, in your Magento Admin view, go to System > Integrations > Add New Integration (or open an existing one).
Select the appropriate permissions for your target resources.
Activate the integration and then copy the Access Token.
In Treasure Data, paste the name of the access token. For Base URL, set the value to be your Magento 2 REST API endpoint (for example,
Name your Authentication. Click Done.
Transfer Your Magento Data to Treasure Data
After creating the authenticated connection, you are automatically taken to the Authentications tab. Look for the connection you created and click New Source.
The following dialog opens.
Complete the details.
API URL: The desired resource (for example,
/V1/orders). The target API must conform to Magento Search API:
Incremental: Specify columns and column type. Set a begin and end value. These values are used as the initial fetching range and are automatically increased while keeping the same range (or duration for timestamp type).
When running on schedule, the time window of the fetched data automatically shifts forward on each run. For example, if the initial config is January 1, with ten days in duration, the first run fetches data modified from January 1 to January 10, the second run fetches from January 11 to January 20, and so on.
Review "How Incremental Loading Work: in the Appendix below.
Click Next to preview the data.
The Magento data connector must sample a response to guess for the resource's schema, and the Magento API potentially omits the decimal part of decimal numbers if they are rounded. The Magento data connector might incorrectly guess a column's data type as
long instead of
double and cause loss of precision. You can adjust to the correct types in Advance Settings.
Click Advanced Settings to see the schema settings.
Choose the target database and table
Choose an existing or create a new database and table where you want to transfer data to:
In the Schedule tab, you can specify a one-time transfer, or you can schedule an automated recurring transfer.
If you select Once now, click Start Transfer. If you select Repeat, specify your schedule options, then click Schedule Transfer.
After your transfer has run, you can see the results of your transfer in Data Workbench > Databases.
Use the Command Line to Create your Magento Connection
You can use the CLI to configure your connection.
Install the Treasure Data Toolbelt
Open a terminal and run the following command to install the newest Treasure Data Toolbelt.
Prepare load.yml file
Prepare load.yml. The in: section is where you specify what comes into the connector from Magento and the out: section is where you specify what the connector puts out to the database in Treasure Data.
Provide your Magento account access information as follows:
Either password or token
The Access Token for auth_method: token
Admin username for auth_method: username
Admin password for auth_method: password
Magento REST API endpoint, example: https://my-magento-server/rest
the target resource URL, example: /V1/products
should data import be continuous or once, default as false.
which column should be used as the cursor for incremental loading
the data type of incremental_column, either timestamp or number
initial start value for incremental_column_type: timestamp
initial end value for incremental_column_type: timestamp
initial start value for incremental_column_type: number
initial end value for incremental_column_type: number
Preview data to import (optional)
You can preview data to be imported using the command
Execute the Load Job
Use td connector:issue to execute the job. Processing might take a couple of hours depending on the data size. The following are required:
name of the schedule
database and table where their data will be stored
the Data Connector configuration file
The preceding command assumes you have already created database(td_sample_db) and table(td_sample_table). If the database or the table do not exist in TD, this command will not succeed so create the database and table manually or use -auto-create-table option with
td connector:issue command to auto create the database and table:
It is recommended to specify --time-column option, because Treasure Data’s storage is partitioned by time. If the option is not given, the data connector selects the first long or timestamp column as the partitioning time. The type of the column, specified by --time-column, must be either of long or timestamp type (use Preview results to check for the available column name and type. Generally, most data types have a last_modified_date column).
A time column is available at the end of the output.
If your data doesn’t have a time column you may add it using the add_time filter. You add the "time" column by adding the add_time filter to your configuration file as follows.
More details at add_time filter plugin.
If you have a field called
time, you do not have to specify -time-column option.
You can schedule a periodic Data Connector execution for a periodic Magento import. We manage our scheduler to ensure high availability. By using this feature, you no longer need a
cron daemon on your local data center.
Create the schedule
A new schedule can be created using the
td connector:create command. The name of the schedule, cron-style schedule, the database and table where their data will be stored, and the Data Connector configuration file are required.
The `cron` parameter also accepts these three options: `@hourly`, `@daily` and `@monthly`.
By default, the schedule is setup in the UTC timezone. You can set the schedule in a timezone using -t or --timezone option. The `--timezone` option only supports extended timezone formats like 'Asia/Tokyo', 'America/Los_Angeles' etc. Timezone abbreviations like PST, CST are *not* supported and may lead to unexpected schedules.
You can load records incrementally by specifying columns in your table using
incremental_end_timestamp respectively if
If you’re using scheduled execution, the connector automatically remembers the last value state and holds it internally. Then you can use it at the next scheduled execution. See Appendix: How Incremental Loading Works for details.
List the Schedules
You can see the list of currently scheduled entries by
Show the Setting and History of Schedules
td connector:show shows the execution setting for a schedule entry.
td connector:history shows the execution history of a schedule entry. To investigate the results of each individual execution, use
td job <jobid>.
Delete the Schedule
td connector:delete removes the schedule.
How can I get all the items from all the active carts on my Magento Site?
For those API that Magento 2 doesn't support off the shelf. Since TD Magento Connector simply requests from its REST API, you have to expose your own API and set the corresponding "Resource URL" (
resource_url) configuration. Keep in mind that API have to implement SearchCriteriaInterface for Magento Connector to work.
How Incremental Loading works
Incremental loading requires a monotonically increasing unique column (
incremental_column) to load records that were inserted (or updated) after the last execution.
This mode is useful when you want to fetch just the object targets that have changed since the previously scheduled run. For example, if
updated_at option is set, the request is as follows:
When bulk data loading finishes successfully, it outputs
incremental_run_count(number of jobs ran, starting at 0) parameters as config-diff so that the next execution uses it.
At the next execution, when those parameters are also set, this plugin uses
incremental_last_value as the new lower bound,
condition_type will also switch from
gteq. For upper bound, it is calculated by:
incremental_start_<timestamp/number> + incremetal_run_count * (incremental_end_<timestamp/number> - incremental_start_<timestamp/number>),
condition_type stays as
Currently, only strings, timestamp and integers are supported as incremental_columns.
If incremental: false is set, the data connector fetches all the records of the specified Magento object type, regardless of when they were last updated. This mode is best combined with writing data into a destination table using ‘replace’ mode.