Embulk for MySQL

This article explains how to import data from MySQL database to Treasure Data using embulk-input-mysql input plugin.

Table of Contents

Prerequisites

  • Basic knowledge of Treasure Data.
  • Basic knowledge of Embulk
  • Embulk and embulk-output-td plugin installed on your machine.

Step 0: Install embulk-input-mysql plugin

To install embulk-input-mysql plugin, run the following command:

$ embulk gem install embulk-input-mysql

Step 1: Create seed configuration file

Using your favorite text editor, create embulk config file (for eg:seed.yml) defining input(MySQL) and ouput(TD) parameters. Example:

in:
    type: mysql
    host: localhost
    port: 3306
    user: username
    password: password
    database: mysql_db
    select: "col1, col2, datecolumn"
    where: "col4 != 'a'"
out:
    type: td
    apikey: xxxxxxxxxxxx
    endpoint: api.treasuredata.com
    database: dbname
    table: tblname
    time_column: datecolumn
    mode: replace 
    #by default mode: append is used, if not defined. Imported records are appended to the target table with this mode.
    #mode: replace, replaces existing target table
    default_timestamp_format: '%d/%m/%Y'

For further details about additional parameters and other examples are available, refer Embulk Input MySQL

Step 2: Guess Fields (Generate load.yml)

Embulk guess option uses seed.yml to read the target file and automatically guesses the column types/settings and creates a new file load.yml with this information.

$embulk guess seed.yml -o load.yml

Now, you may preview the data using embulk preview load.yml command. If any of the column types or data seems incorrect you may edit load.yml file directly and preview again to verify. If guess option doesn’t yield satisfactory results, you may change parameters in load.yml according to your requirement manually.

Step 3: Execute Load Job

Finally, issue the import job by running the following command:

$embulk run load.yml

It may take few mins to hours for the job to complete, depending on the size of the data.


Last modified: Mar 31 2016 21:47:24 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.