Skip to content
Last updated

Amplitude Export Integration

Learn more about Amplitude Import Integration.

Use queries to create Amplitude Events from the data you store in Treasure Data.

Prerequisites

  • Basic knowledge of Treasure Data, including the TD Toolbelt.
  • An Amplitude account that can grant permissions to Treasure Data.

Create an Amplitude Account

If you do not have an Amplitude account, go to https://amplitude.com and register. Then, create an organization and a project. When you create a project, you receive an API Key for the project.

Amplitude User Activity

In the Amplitude console, you can view User Activity to explore Amplitude events:

Usage

Go to the TD Console, and Queries and select New Query to access the query editor. You can pick the query language and write your query. Select Output results.

Create or Select an Existing Amplitude Connection

OptionTask StepsTip
Create a connector- Go to Connections. - Select on the Amplitude icon from Sources Catalog. - Complete the required fields on the first pane. - In the API Key field, enter the key that you use in Amplitude. - In the Secret Key field, delete the default entry and leave the field empty. Next you complete the Configuration fields.Sometimes you need to define the column mapping before writing the query.
Select a connector- Go to Treasure Data console. - Go to Query Editor. - Access the query that you plan to use to export data. - Select Output results. The Choose Saved Connection dialog opens. - Type the connection name in the search box to filter and select the connection that you want. Next you complete the Configuration fields.You can also create a new connection from here by selecting on Create New Connection.

Creating a new connection:

You must enter your API Key. For output to Amplitude, the Secret Key field must be empty. If necessary, delete the default entry the Secret Key field. Region: support the data center for each region in the EU and the US (default)

Selecting an existing connection:

Optional: Complete Additional Configuration

After creating a new amplitude connection or selecting an existing one, you see the Additional configuration popup, for example:

ParametersDescriptionDefault values
TargetTarget Amplitude object to which data needs to be uploaded. Supported: - Event - IdentifyEvent
Disable auto-generation of Insert IDs for deduplicationBy default, TD automatically generates the insert_ids for deduplication. See Event Deduplication for details.Unchecked
OperationOperation to perform on Amplitude. Supported: - Set - Set Once - Add - Append - Prepend - Unset - Pre Insert - Post Insert - RemoveAdd
RegionSupport the data center for each region in the EU and the US (default).US
Number of events per batchNumber of records that TD connector sends as a batch to Amplitude10

Write a Quick Query to Populate Amplitude Events

In the following example, the Amplitude project has no events before the output of a query result:

In Treasure Data, run the following query with Amplitude as Result Export:

SELECT
  user_id,
  device_id,
  event_type,
  event_properties,
  country,
  time
FROM (
  VALUES(
    'a@gmail.com',
    'f925065b-4d56-486f-a2be-e97952e62925',
    'game_open',
    '{"load_time":0.0619166532,"dates":["sunday","monday"],"source":"notification"}',
    'United States',
    1515083919310
  ),
  (
    'b@gmail.com',
    'b36c8c80-71a0-4de4-8bcb-c801f5c1c6c3',
    'select_avatar',
    '{"load_time":0.0517593568,"dates":["friday","sunday"],"source":"notification"}]',
    'United States',
    1515083943092
  ),
  (
    'c@gmail.com',
    'e44a1024-c4bb-4eef-a02a-f2a7b3bf68b3',
    'watch_tutorial',
    '{"load_time":0.0689998562,"dates":["sunday","monday"],"source":"notification"}',
    'United States',
    1515083970894
  ),
  (
    'd@gmail.com',
    '8274fee1-7222-4e41-83ed-fb5521d5f4b8',
    'game_start',
    '{"load_time":0.0976384392,"dates":["friday","monday"],"source":"notification"}',
    'United States',
    1515084000948
  ),
  (
    'e@gmail.com',
    '29ba193d-0955-4084-9640-bb4d8ba9a23f',
    'ad_clicked',
    '{"load_time":0.0708434955,"dates":["monday","saturday"],"source":"notification"}',
    'United States',
    1515084024759
  )
  ) tbl(
    user_id,
    device_id,
    event_type,
    event_properties,
    country,
    time
  )

The preceding query requires no source table, which makes testing easy, but you still must choose a database. Pick "sample_datasets" or any other arbitrary table. Also, make sure that Presto is chosen as the SQL dialect.

The query completes in a few seconds. Log into your Amplitude account and check User Activity. It should look similar to this:

All columns or aliases included in your query must use the names and data types as described in the Amplitude API specification.

Combine Multiple Values into a Single JSON Column Value

If you have multiple values, you can combine them into a single JSON column value as follows:

SELECT CAST(MAP(ARRAY['source', 'status', 'action'], ARRAY['notification', 'pending', 'add']) AS JSON) AS event_properties

You create a simple JSON object for event_properties: '{"action":"add","source":"notification","status":"pending"}'.

You can create a more complex column, as follows:

SELECT CAST(MAP(ARRAY['load_time', 'source', 'dates'], ARRAY[CAST (0.8371 AS JSON), CAST ('notification' AS JSON), CAST(ARRAY['monday', 'tuesday'] AS JSON)]) AS JSON) AS event_properties

You create the following JSON object as the event_properties value: '{"dates":["monday","tuesday"],"load_time":0.8371,"source":"notification"}'.

Write a Query to a TD Table to Populate Amplitude Events

An example of a TD database table that can be used to create records for Amplitude is as follows:

In the Treasure Data console, go to query editor and enter a query, for example:

SELECT user_id, device_id, event_type, event_properties, country, event_time AS time FROM events

Select Output results to send the data to Amplitude. Next, you select an existing connector, or create a new connection using your api_key.

When the connector has been selected or created, in the query editor, select Run to begin the data transfer.

Event Deduplication

It is highly recommended that an insert_id be sent with each event to prevent duplicate events from being saved by Amplitude. The insert_id is a unique identifier an event. Amplitude drops events that have the same insert_id if the IDs are sent within 7 days of each other.

Disable Auto-Generation of Insert IDs

If you do not select to disable auto-generation of insert IDs, then TD automatically generates a SHA256 insert_id for each record.

If you select to disable auto-generation of insert IDs, then insert_id’s must be provided by the user.

Amplitude Columns

The following table lists event columns and data types supported by Amplitude.

Event Standard Columns

KeyDescription
user_id string(required unless device_id is present) A readable ID specified by you. "datamonster@gmail.com"
device_id string(required unless user_id is present) A device specific identifier, such as the Identifier for Vendor on iOS. "C8F9E604-F01A-4BD9-95C6-8E5357DF265D"
event_type string(required) A unique identifier for your event. "watch_tutorial"
time longThe timestamp of the event in milliseconds since epoch. It will be set to the upload time by default. 1396381378123
event_properties4 json{"load_time": 0.8371, "source": "notification", "dates": ["monday", "tuesday"]}
user_properties4 json{"age": 25, "gender": "female", "interests": ["chess", "football", "music"]}
groups4 json(Enterprise only) {"company_id": "1", "company_name":["Amplitude", "DataMonster"]}
app_versionThe version of your application the user is on. "2.1.3"
platform1,2 stringPlatform of the device. "iOS", "Android", or "Web"
os_name1,2 stringThe mobile operating system or browser the user is on. "iOS", "Android", "Chrome"
os_version1,2 stringThe version of the mobile operating system or browser the user is on. "8.1", "4.2.2", "37"
device_brand1,2 stringThe device brand the user is on. "Verizon"
device_manufacturer1,2 stringThe device manufacturer the user is on. "Samsung", "Asus", "Apple"
device_model1,2 stringThe device model the user is on. "Mac", "iphone 9,1", "sm-g30f"
carrier1,2 stringCarrier of the device. "Verizon"
country1,3 stringThe country the user is in. "United States"
region1,3 stringThe geographical region the user is in. "California"
city1,3 stringWhat city the user is in. "San Francisco"
dma1,3 stringThe Designated Market Area of the user. "San Francisco-Oakland-San Jose, CA"
language1 stringWhat language the user has set. "English"
price5 float(required for revenue data if revenue is not sent) The price of the item purchased. You can use negative values to indicate refunds. 4.99, -1.99
quantity5 integer(required for revenue data, defaults to 1 if not specified) The quantity of the item purchased. 1,2
revenue5 floatrevenue = price * quantity If you send all three fields of price, quantity, and revenue, then (price * quantity) will take precedence and be the revenue value. You can use negative values to indicate refunds. 4.99, -1.99
productId5 stringAn identifier for the product. Note: You must send a price and quantity with this field. "Google Play Store Product Id", "Medium Bundle"
revenueType5 stringType of revenue. Note: You must send a price and quantity with this field. "tax", "refund"
location_lat floatLatitude of the user. 37.77
location_lng floatLongitude of the user. -122.39
ip1 stringIP address of the user. Use "$remote" to use your server’s IP address. "127.0.0.1"
idfa string(iOS) Identifier for Advertiser. "AEBE52E7-03EE-455A-B3C4-E57283966239"
idfv string(iOS) Identifier for Vendor. "BCCE52E7-03EE-321A-B3D4-E57123966239"
adid string(Android) Google Play Services advertising ID (AdID). "AEBE52E7-03EE-455A-B3C4-E57283966239"

View footnotes for 1,2,3,4,5

Event Optional Columns

Optional Amplitude Specific KeyDescription
event_id intAn incrementing counter to distinguish events with the same user_id and timestamp from each other. Note: We recommend you send an event_id, increasing over time, especially if you suspect any events to occur simultaneously. An event_id is not required to send a session_id though. 1
session_id longThe start time of the session is in milliseconds since it uses epoch (Unix Timestamp), which is necessary if you want to associate events with a particular session (a session_id of -1 implies that you not sending a session_id, and so no session metrics will be tracked). Learn more about tracking sessions in Amplitude. 1396381378123
insert_id stringA unique identifier for the event being inserted; we will deduplicate events with the same insert_id sent within 7 days of each other. Some combination of device_id, user_id, session_id, event_type, and event_id or time, would likely serve as a sufficient insert_id value. "f47ac10b-58cc-4372-a567-0e02b2c3d479"

Identify Schema and data type

Column nameData typeRequired?Description
user_idStringConditionalRequired unless device_id is present. A UUID (unique user ID) specified by you. If you send a request with a user_id that's not in the Amplitude system yet, then the user tied to the user_id isn't marked new until their first event.
device_idStringConditionalRequired unless user_id is present. A device-specific identifier, such as the Identifier for Vendor (IDFV) on iOS.
user_properties_xString, Plattent to build an arrayOptionalA dictionary of key-value pairs that represent data tied to the user. Each distinct value appears as a user segment on the Amplitude dashboard. Object depth may not exceed 40 layers. You can store property values in an array, and date values are transformed into string values. Amplitude compares dates as strings, so Amplitude recommends that you use the ISO 8601 format (YYYY-MM-DDTHH:mm: ss). This lets you perform date comparisons in the web app. For example, '2016-01-31' > '2016-01-01'). This also applies to datetime values like '2017-08-07T10:09:08' > '2017-08-07T01:07:00'.
app_versionStringOptionalThe version of the app the user is on.
platformStringOptionalThe platform that's sending the data.
os_nameStringOptionalThe mobile operating system or browser the user is on.
os_versionStringOptionalThe version of the mobile operating system or browser the user is on.
device_brandStringOptionalThe device brand the user is on.
device_manufacturerStringOptionalThe device manufacturer of the device that the user is using.
device_modelStringOptionalThe device model the user is on.
carrierStringOptionalThe carrier the user has.
countryStringOptionalThe country that the user is in.
regionStringOptionalThe geographical region the user is in.
cityStringOptionalThe city the user is in.
dmaStringOptionalThe Designated Market Area of the user.
languageStringOptionalThe language the user has set.
payingStringOptionalWhether the user is paying.
start_versionStringOptionalThe version of the app the user was on first.

User Properties Fields (Identify Target Only)

User Property (Key)DescriptionSample ValueType
cohortBehavioral segment or experiment group"Beta Users"string
subscription_planUser subscription level"Pro"string
signup_dateThe user’s signup date (ISO format)"2025-01-01"timestamp
onboarding_completeWhether the user completed onboardingtrueboolean
login_countTotal number of logins5long
referrer_codeInvite/referral code used"ABC123"string
utm_sourceMarketing campaign source"Facebook"string
utm_mediumType of campaign channel"email"string
utm_campaignCampaign name"BlackFriday2025"string
last_purchase_dateLast successful purchase date"2025-05-01"timestamp
has_active_subscriptionThe user has an active subscriptionfalseboolean
features_enabledList of enabled feature flags["chat", "dark_mode"]string
interestsUser interests["music", "sports"]string
feedback_ratingFeedback rating from user (e.g., 1 to 5)4double
survey_completedWhether the user submitted a surveytrueboolean
preferred_languageUI/display language preference"en"string
lifetime_valueTotal value the user has generated (e.g., in USD)129.99double
has_push_enabledWhether push notifications are enabledtrueboolean
demo_attendedWhether the user joined a demo call/webinarfalseboolean
team_sizeTeam or organization size (for B2B use cases)50long

Or any field that not in Identify Schema list column is the user_properties columns.

(Optional) Schedule Query Export Jobs

You can use Scheduled Jobs with Result Export to periodically write the output result to a target destination that you specify.

Treasure Data's scheduler feature supports periodic query execution to achieve high availability.

When two specifications provide conflicting schedule specifications, the specification requesting to execute more often is followed while the other schedule specification is ignored.

For example, if the cron schedule is '0 0 1 * 1', then the 'day of month' specification and 'day of week' are discordant because the former specification requires it to run every first day of each month at midnight (00:00), while the latter specification requires it to run every Monday at midnight (00:00). The latter specification is followed.

Scheduling your Job Using TD Console

  1. Navigate to Data Workbench > Queries

  2. Create a new query or select an existing query.

  3. Next to Schedule, select None.

  4. In the drop-down, select one of the following schedule options:

    Drop-down ValueDescription
    Custom cron...Review Custom cron... details.
    @daily (midnight)Run once a day at midnight (00:00 am) in the specified time zone.
    @hourly (:00)Run every hour at 00 minutes.
    NoneNo schedule.

Custom cron... Details

Cron ValueDescription
0 * * * *Run once an hour.
0 0 * * *Run once a day at midnight.
0 0 1 * *Run once a month at midnight on the morning of the first day of the month.
""Create a job that has no scheduled run time.
 *    *    *    *    *
 -    -    -    -    -
 |    |    |    |    |
 |    |    |    |    +----- day of week (0 - 6) (Sunday=0)
 |    |    |    +---------- month (1 - 12)
 |    |    +--------------- day of month (1 - 31)
 |    +-------------------- hour (0 - 23)
 +------------------------- min (0 - 59)

The following named entries can be used:

  • Day of Week: sun, mon, tue, wed, thu, fri, sat.
  • Month: jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec.

A single space is required between each field. The values for each field can be composed of:

Field ValueExampleExample Description
A single value, within the limits displayed above for each field.
A wildcard '*' to indicate no restriction based on the field.'0 0 1 * *'Configures the schedule to run at midnight (00:00) on the first day of each month.
A range '2-5', indicating the range of accepted values for the field.'0 0 1-10 * *'Configures the schedule to run at midnight (00:00) on the first 10 days of each month.
A list of comma-separated values '2,3,4,5', indicating the list of accepted values for the field.0 0 1,11,21 * *'Configures the schedule to run at midnight (00:00) every 1st, 11th, and 21st day of each month.
A periodicity indicator '*/5' to express how often based on the field's valid range of values a schedule is allowed to run.'30 */2 1 * *'Configures the schedule to run on the 1st of every month, every 2 hours starting at 00:30. '0 0 */5 * *' configures the schedule to run at midnight (00:00) every 5 days starting on the 5th of each month.
A comma-separated list of any of the above except the '*' wildcard is also supported '2,*/5,8-10'.'0 0 5,*/10,25 * *'Configures the schedule to run at midnight (00:00) every 5th, 10th, 20th, and 25th day of each month.
  1. (Optional) You can delay the start time of a query by enabling the Delay execution.

(Optional) Export Integration Using the CLI

You can also use CLI (Toolbelt) to export results to Shopify by specifying the information you want to export to your Shopify server using the --result option of the td query command. For more information about the td query command, refer to TD Toolbelt: Query Commands.

Options for the yml configuration files used for TD Toolbelt are formatted in JSON, and the general structure is as follows.

  • For Event Target
out:
  type: amplitude
  api_key: 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
  target: 'event'
  disable_auto_generation_insert_ids: false
  region: 'US'
  events_per_batch: 10
  • For Identify Target
out:
  type: amplitude
  target: identify
  api_key: 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
  region: EU
  operation: set
  skip_invalid_records: false

Parameters

NameDescriptionValueDefault ValueRequired
typeConnector typeamplitudeN/AYes
api_keyAPI key to export dataN/AN/AYes
targetTarget Amplitude object to which data needs to be uploaded.Supported values: - event - identifyeventYes
regionThe API region for Amplitude export data.Supported values: - US - EUUSYes
operationOperation to perform on Amplitude.Supported values: - set - set_once - add - append - prepend - unset - pre_insert - post_insert - removeaddYes
disable_auto_generation_insert_idsDisable auto generation insert idsFalseN/A
events_per_batchEvent batch sizeInteger10N/A
skip_invalid_recordsEnable skipping of invalid records.True or FalseTrueN/A

CLI Usage Example

Upload Event

td query \
--result '{"type":"amplitude","api_key":"api_key","target":"event","region":"US","disable_auto_generation_insert_ids":"false","events_per_batch":10}'  \
-d sample_database \
"select user_id,device_id,event_type,value from tbl_shop" \
-T presto

Upload Identify

td query --result \
'{"type":"amplitude","api_key":"api_key","target":"identify","region":"US","operation":"add","skip_invalid_records":true}'\
-d sample_database \
"select user_id,device_id,user_properties,value from tbl_shop" \
-T presto

See Also