Writing Job Results into your MySQL Tables.

This article explains how to write job results to your existing MySQL instance.

Table of Contents

Prerequisites

  • Basic knowledge of Treasure Data, including the toolbelt.
  • A MySQL instance.
  • Treasure Data must have SELECT, INSERT, DELETE, CREATE, DROP and ALTER privileges.

Reference Architecture

A front-end application streamingly collects data to Treasure Data via Treasure Agent. Treasure Data periodically runs jobs on the data, then writes the job results to your MySQL tables.



Shown above is a fairly common architecture. Here are a few examples.

Example 1: Ranking: What are the “Top N of X?”

Every social/mobile application calculates the “top N of X” (ex: top 5 movies watched today). Treasure Data already handles the raw data warehousing; the “write-to-MySQL” feature enables Treasure Data to find the “top N” data as well.

Example 2: Dashboard Application

If you’re a data scientist, you need to keep track of a range of metrics every hour/day/month and make them accessible via visualizations. Using this “write-to-MySQL” feature, you can streamline the process and focus on your queries and your visualizations of the query results.

Untitled-3
You can limit the access to your database by using a list of static IPs. Please contact support@treasuredata.com if you need it.

Basic Usage

For On-demand Jobs

For on-demand jobs, just add the --result option to the td query command. After the job is finished, the results are written into your table.

$ td query --result 'mysql://user:password@host/database/table' \
  -w -d testdb \
  "SELECT code, COUNT(1) FROM www_access GROUP BY code"      

For Scheduled Jobs

For scheduled jobs, just add the --result option when scheduling a job. Every time the job runs, the results are written into mytbl.

$ td result:create mydb mysql://user:password@host/database
$ td sched:create hourly_count_example "0 * * * *" -d testdb "select count(*) from www_access" --result mydb:mytbl

Four Modes to Modify Data (Append/Replace/Truncate/Update)

You can add or delete data in four ways.

mysql://user:password@host/database/table               # append
mysql://user:password@host/database/table?mode=append   # append
mysql://user:password@host/database/table?mode=replace  # replace
mysql://user:password@host/database/table?mode=truncate # truncate
mysql://user:password@host/database/table?mode=update   # update

All four modes are atomic.

In the “host” section, you can specify the port as well. Here is an example:

mysql://superadmin:superstrongpassword@data.science.com:27013/transaction/user_info

APPEND

This is the default mode. The query results are appended to a table. If the table does not exist yet, a new table will be created.

REPLACE

If the table already exists, the rows of the existing table are replaced with the query results. If the table does not exist yet, a new table will be created.

We achieve atomicity (so that a consumer of the table always has consistent data) by performing the following three steps in a single transaction.

  1. Create a temporary table.
  2. Write to the temporary table.
  3. Replace the existing table with the temporary table using ALTER TABLE RENAME.

TRUNCATE

The system first truncates the existing table, then inserts the query results. If the table does not exist yet, a new table will be created.

Unlike REPLACE, TRUNCATE retains the indexes of your table.

UPDATE

This mode inserts new data and update existing data by performing the following steps.

  1. Create a temporary table.
  2. Write to the temporary table.
  3. Update data from temporary table to destination table.
  4. Insert the rest of the data from temporary table to destination table.

In order to specify a unique key, you can set unique parameter, which is comma-separated list of keys. If the values for all keys match all existing record, that record will be replaced. Here is an example:

mysql://superadmin:superstrongpassword@data.science.com:27013/transaction/user_infomode=update&unique=column1%2Ccolumn2
Untitled-3
On CLI, Comma in the paramter must be URL encoded. ("," should be "%2C")

Options

Writing Job Results to the MySQL server supports following options that can be specified as optional URL parameters.

ssl option

ssl option determines whether to use SSL or not for connecting to the MySQL server.

ssl=true

Use SSL from Treasure Data to the MySQL server connection. The MySQL server must be configured to accept SSL connection.

mysql://user:password@host/database/table?ssl=true

ssl=false (default)

Do not use SSL from Treasure Data to the MySQL server.

mysql://user:password@host/database/table?ssl=false

Character Encoding

Treasure Data always uses UTF-8 as character encoding. You also need to setup your database to use UTF-8 by default. If you came across any encoding related issues, please check the following configurations.

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Last modified: Mar 21 2017 06:30:56 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.