Magento is an open-source e-commerce platform written in PHP that helps you create customized shopping experiences for your customers.
With this integration, you can import Magento (v2) objects into Treasure Data.
- Basic knowledge of Treasure Data
- Basic knowledge of Magento 2 (Treasure Data supports v. 2.4.6)
When you configure a data connection, you provide authentication to access the integration. In Treasure Data, you configure the authentication and then specify the source information.
1. Open the TD Console.
2. Navigate to Integrations Hub > Catalog.
3. Search for Magento.
4. Select Create Authentication.

5. The following dialog opens. Choose one of the Authentication Modes. Edit the required credentials.
6. Select Continue.
- Select Login Credential and enter your Admin-level username and password.

1. To get the access token, in your Magento Admin view, go to System > Integrations > Add New Integration (or open an existing one).
2. Select the appropriate permissions for your target resources.
3. Activate the integration and then copy the Access Token. You can verify the token from a terminal, as shown after these steps.
4. In Treasure Data, paste the access token. For Base URL, set the value to your Magento 2 REST API endpoint (for example, http://my-magento-server/rest/).
5. Select Continue.
6. Name your Authentication. Click Done.
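To confirm that the token works as a standalone Bearer token, you can issue a test request against any searchable resource. The following is a minimal sketch using curl, assuming your server exposes the standard /V1/products endpoint and the token has permission to read it:

curl -g -H "Authorization: Bearer <access-token>" \
  "http://my-magento-server/rest/V1/products?searchCriteria[pageSize]=1"

(The -g flag stops curl from interpreting the square brackets.) If this request returns a 401 error, check the Allow OAuth Access Tokens to be used as standalone Bearer tokens setting described in the note at the end of this page.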
After creating the authenticated connection, you are automatically taken to the Authentications tab. Look for the connection you created and click New Source.
The following dialog opens.

Complete the details.

API URL: The desired resource (for example, /V1/orders). The target API must conform to the Magento Search API.
Incremental: Specify the column and column type, and set a begin and end value. These values are used as the initial fetching range and are automatically increased while keeping the same range (or duration, for the timestamp type). When running on a schedule, the time window of the fetched data automatically shifts forward on each run. For example, if the initial config is January 1 with ten days in duration, the first run fetches data modified from January 1 to January 10, the second run fetches from January 11 to January 20, and so on.
Review "How Incremental Loading Works" in the Appendix below.
Page Size: Number of records to fetch per API call.
Click Next to preview the data.
The Magento data connector must sample a response to guess the resource's schema, and the Magento API potentially omits the decimal part of decimal numbers if they are rounded. The Magento data connector might therefore incorrectly guess a column's data type as long instead of double, causing a loss of precision. You can adjust to the correct types in Advanced Settings.
Click Advanced Settings to see the schema settings.

Choose an existing database and table, or create new ones, as the destination for the transferred data:

In the Schedule tab, you can specify a one-time transfer, or you can schedule an automated recurring transfer.
If you select Once now, click Start Transfer. If you select Repeat, specify your schedule options, then click Schedule Transfer.

After your transfer has run, you can see the results of your transfer in Data Workbench > Databases.
You can use the CLI to configure your connection.
Open a terminal and install the newest Treasure Data Toolbelt. Run the following command to verify the installation:

$ td --version
0.16.3

Prepare load.yml. The in: section is where you specify what comes into the connector from Magento, and the out: section is where you specify what the connector puts out to the database in Treasure Data.
Provide your Magento account access information as follows:
in:
  type: magento
  auth_method: token
  token: xxxxxxxxxxxxx
  base_url: https://your-magento-server/rest/
  resource_url: /V1/products

| Keys | Description |
|---|---|
| auth_method | Either password or token |
| token | The Access Token, for auth_method: token |
| username | Admin username, for auth_method: password |
| password | Admin password, for auth_method: password |
| base_url | Magento REST API endpoint, for example: https://my-magento-server/rest |
| resource_url | The target resource URL, for example: /V1/products |
| incremental | Whether the data import is continuous or runs once; defaults to false |
| incremental_column | The column to use as the cursor for incremental loading |
| incremental_column_type | The data type of incremental_column, either timestamp or number |
| incremental_start_timestamp | Initial start value, for incremental_column_type: timestamp |
| incremental_end_timestamp | Initial end value, for incremental_column_type: timestamp |
| incremental_start_number | Initial start value, for incremental_column_type: number |
| incremental_end_number | Initial end value, for incremental_column_type: number |
| page_size | Number of records to fetch per API call. Default is 200; the value must be between 10 and 1000 |
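As a point of comparison, a minimal in: section using password authentication, based on the keys in the table above (the credential values are placeholders), might look like this:

in:
  type: magento
  auth_method: password
  username: admin_user
  password: xxxxxxxxxxxxx
  base_url: https://your-magento-server/rest/
  resource_url: /V1/orders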
You can preview data to be imported using the command td connector:preview.
$ td connector:preview load.yml
+-----------------+-------------+----
| product_id:long | name:string | ...
+-----------------+-------------+----
| 1               | "Chocolate" |
| 2               | "Cookie"    |
| 3               | "Bike"      |
+-----------------+-------------+----

Use td connector:issue to execute the job. Processing might take a couple of hours depending on the data size. The following are required:
- the database and table where the data will be stored
- the Data Connector configuration file

$ td connector:issue load.yml --database td_sample_db --table td_sample_table --time-column created_at

The preceding command assumes you have already created the database (td_sample_db) and table (td_sample_table). If the database or the table does not exist in TD, this command will not succeed, so create the database and table manually or use the --auto-create-table option with the td connector:issue command to auto-create them:
td connector:issue load.yml \
  --database td_sample_db \
  --table td_sample_table \
  --time-column created_at \
  --auto-create-table

It is recommended to specify the --time-column option, because Treasure Data's storage is partitioned by time. If the option is not given, the data connector selects the first long or timestamp column as the partitioning time. The type of the column specified by --time-column must be either long or timestamp (use the preview results to check the available column names and types; most resources have a last_modified_date column).
For example, to specify created_at as the time column:
td connector:issue load.yml \
  --database td_sample_db \
  --table td_sample_table \
  --time-column created_at

If your data doesn't have a time column, you can add one using the add_time filter. You add the "time" column by adding the add_time filter to your configuration file as follows.
in:
  type: xxxxx
  ...
filters:
  - type: add_time
    from_value:
      mode: upload_time
    to_column:
      name: time
out:
  type: td

More details are available at the add_time filter plugin documentation.
If you have a field called time, you do not have to specify the --time-column option.
You can schedule periodic Data Connector execution for periodic Magento imports. We manage our scheduler to ensure high availability. By using this feature, you no longer need a cron daemon on your local data center.
A new schedule can be created using the td connector:create command. The name of the schedule, cron-style schedule, the database and table where their data will be stored, and the Data Connector configuration file are required.
$ td connector:create \
    daily_magento_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml

The cron parameter also accepts these three options: @hourly, @daily, and @monthly. By default, the schedule is set up in the UTC timezone. You can set the schedule in a different timezone using the -t or --timezone option. The --timezone option only supports extended timezone formats like 'Asia/Tokyo', 'America/Los_Angeles', etc. Timezone abbreviations like PST and CST are *not* supported and may lead to unexpected schedules.
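For example, to run the same schedule at 00:10 Tokyo time instead of UTC, a sketch of the command would be:

$ td connector:create \
    daily_magento_import \
    "10 0 * * *" \
    td_sample_db \
    td_sample_table \
    load.yml \
    --timezone Asia/Tokyo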
You can load records incrementally by specifying a cursor column in your table using incremental_column.

in:
  type: magento
  base_url: ...
  resource_url: ...
  auth_method: token
  token: ...
  incremental: true
  incremental_column: ...
  incremental_column_type: number
  incremental_start_number: ...
  incremental_end_number: ...
out:
  mode: append
exec: {}

Replace incremental_start_number and incremental_end_number with incremental_start_timestamp and incremental_end_timestamp respectively if incremental_column_type: timestamp.
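For instance, a timestamp-based variant of the in: section might look like the following sketch (the column name updated_at and the timestamp format are assumptions; check your resource's schema):

in:
  type: magento
  base_url: https://your-magento-server/rest/
  resource_url: /V1/orders
  auth_method: token
  token: xxxxxxxxxxxxx
  incremental: true
  incremental_column: updated_at
  incremental_column_type: timestamp
  incremental_start_timestamp: "2024-01-01 00:00:00"
  incremental_end_timestamp: "2024-01-11 00:00:00"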
If you're using scheduled execution, the connector automatically remembers the last value and holds that state internally so it can be used at the next scheduled execution. See Appendix: How Incremental Loading Works for details.
You can see the list of currently scheduled entries with the td connector:list command.
$ td connector:list
+-----------------------+--------------+----------+-------+--------------+-----------------+------------------------------+
| Name | Cron | Timezone | Delay | Database | Table | Config |
+-----------------------+--------------+----------+-------+--------------+-----------------+------------------------------+
| daily_magento_import | 10 0 * * * | UTC | 0 | td_sample_db | td_sample_table | {"type"=>"magento", ... } |
+-----------------------+--------------+----------+-------+--------------+-----------------+------------------------------+

Some APIs are not supported by Magento 2 off the shelf. Because the TD Magento Connector simply makes requests against the REST API, you have to expose your own API and set the corresponding "Resource URL" (resource_url) configuration. Keep in mind that the API has to implement SearchCriteriaInterface for the Magento Connector to work.
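For illustration, if you expose a custom searchable endpoint at /V1/my-custom-resource (a hypothetical path), the connector configuration would point at it like this:

in:
  type: magento
  auth_method: token
  token: xxxxxxxxxxxxx
  base_url: https://your-magento-server/rest/
  resource_url: /V1/my-custom-resource

The endpoint must accept searchCriteria query parameters, as shown in the requests below.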
Incremental loading requires a monotonically increasing unique column (incremental_column) to load records that were inserted (or updated) after the last execution.
This mode is useful when you want to fetch just the objects that have changed since the previously scheduled run. For example, if the incremental_column: updated_at option is set, the request is as follows:
GET https://your-magento-server/rest/V1/searchable-resource
&searchCriteria[filter_groups][0][filters][0][field]=updated_at
&searchCriteria[filter_groups][0][filters][0][value]=incremental_start_timestamp
&searchCriteria[filter_groups][0][filters][0][condition_type]=gt
&searchCriteria[filter_groups][1][filters][0][field]=updated_at
&searchCriteria[filter_groups][1][filters][0][value]=incremental_end_timestamp
&searchCriteria[filter_groups][1][filters][0][condition_type]=lt

When bulk data loading finishes successfully, it outputs the incremental_last_value and incremental_run_count (number of jobs run, starting at 0) parameters as a config-diff so that the next execution uses them. At the next execution, when those parameters are set, this plugin uses incremental_last_value as the new lower bound, and condition_type switches from gt to gteq. The upper bound is calculated as incremental_start_<timestamp/number> + incremental_run_count * (incremental_end_<timestamp/number> - incremental_start_<timestamp/number>), and condition_type stays as lt.
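As a worked example of the upper-bound formula: with incremental_start_number: 0, incremental_end_number: 1000, and incremental_run_count: 2, the next upper bound is 0 + 2 * (1000 - 0) = 2000, while the new lower bound is whatever incremental_last_value the previous run recorded. The resulting next-run request looks like this: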
GET https://your-magento-server/rest/V1/searchable-resource
&searchCriteria[filter_groups][0][filters][0][field]=updated_at
&searchCriteria[filter_groups][0][filters][0][value]=incremental_last_value
&searchCriteria[filter_groups][0][filters][0][condition_type]=gteq
&searchCriteria[filter_groups][1][filters][0][field]=updated_at
&searchCriteria[filter_groups][1][filters][0][value]=increased_end_value_with_the_mentioned_formula
&searchCriteria[filter_groups][1][filters][0][condition_type]=lt

Currently, only strings, timestamps, and integers are supported as incremental_column types.
If incremental: false is set, the data connector fetches all the records of the specified Magento object type, regardless of when they were last updated. This mode is best combined with writing data into a destination table using replace mode, as in the sketch below.
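A minimal sketch of such a full-reload configuration, reusing the token credentials from earlier (values are placeholders):

in:
  type: magento
  auth_method: token
  token: xxxxxxxxxxxxx
  base_url: https://your-magento-server/rest/
  resource_url: /V1/products
  incremental: false
out:
  mode: replace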
When you use Integration Access Token as the Authentication Mode, you need to set the Allow OAuth Access Tokens to be used as standalone Bearer tokens option to Yes in the Magento Admin. You can find this option via Stores > Configuration > Services > OAuth > Consumer Settings. If you don't, you will encounter unexpected 401 errors.