You can import data using Treasure Data’s open-source bulk data loader, Embulk. Embulk helps transfer data between various databases, storage locations, file formats, and cloud services. For more information, see the Embulk documentation.


Prerequisites

  • Basic knowledge of Treasure Data.

  • Basic knowledge of Embulk.

  • Embulk is a Java application. Make sure that Java is installed.
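A quick way to confirm the Java prerequisite is to look for the java command on your PATH. The following is a minimal sketch, not an official check; the variable name java_status is made up for illustration.

```shell
# Check whether a Java runtime is on the PATH before installing Embulk.
# java_status is a hypothetical variable used only in this sketch.
if command -v java >/dev/null 2>&1; then
  java_status="found"
  # java -version writes to stderr, so redirect it to show the first line.
  java -version 2>&1 | head -n 1
else
  java_status="missing"
  echo "Java not found; install a Java runtime before installing Embulk." >&2
fi
```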


Installing Embulk from the Command Line


Linux, Mac, and BSD (UNIX)

Execute the following commands:

  • curl --create-dirs -o ~/.embulk/bin/embulk -L "http://dl.embulk.org/embulk-latest.jar"

  • chmod +x ~/.embulk/bin/embulk

  • echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc

  • source ~/.bashrc

Windows

Using PowerShell

PowerShell -Command "& {Invoke-WebRequest http://dl.embulk.org/embulk-latest.jar -OutFile embulk.bat}"


Installing the Embulk Treasure Data Plugin

You can use Embulk plugins to load data to or from various systems and file formats. For the available plugins, see the list of Embulk plugins by category.

The following command installs the embulk-output-td plugin, which imports records to Treasure Data:

embulk gem install embulk-output-td
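The run commands in this topic take a config.yml file. The following is a minimal sketch of one, assuming a local CSV file as the input. The out keys (apikey, endpoint, database, table, time_column) are options of the embulk-output-td plugin as commonly documented; every value, path, and column name here is a hypothetical placeholder, so check the plugin documentation before relying on it.

```shell
# Write a sample config.yml. All values below are placeholders (assumptions),
# not settings taken from this topic.
config_file="config.yml"

# The quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > "$config_file" <<'EOF'
in:
  type: file
  path_prefix: ./data/sample_        # hypothetical input file prefix
  parser:
    type: csv
    skip_header_lines: 1
    columns:
      - {name: id, type: long}
      - {name: created_at, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
      - {name: comment, type: string}
out:
  type: td
  apikey: YOUR_API_KEY               # replace with your Treasure Data API key
  endpoint: api.treasuredata.com
  database: my_database              # hypothetical database name
  table: my_table                    # hypothetical table name
  time_column: created_at
EOF

echo "Wrote $config_file"
```

With a file like this in place, embulk preview config.yml shows how the input would be parsed, and embulk run config.yml performs the import.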

Using a Proxy Server

If uploads fail, check whether your network routes traffic through a proxy server. If it does, pass the proxy settings to Embulk by using the following command-line options:

Linux:
  embulk -J-Dhttp.proxyHost=xxxx -J-Dhttp.proxyPort=xxxx -J-Dhttp.proxyUser=xxxx -J-Dhttp.proxyPassword=xxxx run config.yml
Windows:
  embulk.bat "-J-Dhttps.proxyHost=xxxx" "-J-Dhttps.proxyPort=xxxx" "-J-Dhttp.proxyUser=xxxx" "-J-Dhttp.proxyPassword=xxxx" run config.yml
Or, calling Java directly:
  "java" -Dhttps.proxyHost="host" -Dhttps.proxyPort="port" -jar embulk.bat run config.yml


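The proxy examples in this topic mix the http.* and https.* Java system properties. One way to avoid guessing which scheme applies is to set both from the same values. The following is a sketch under that assumption; the host and port are hypothetical placeholders, and the script only prints the command so that you can review it before running it.

```shell
# Hypothetical placeholder values; replace with your proxy's details.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Build the JVM options once so the same proxy applies to HTTP and HTTPS traffic.
# Setting both schemes is an assumption of this sketch, not a requirement
# stated in this topic.
proxy_opts="-J-Dhttp.proxyHost=${PROXY_HOST} -J-Dhttp.proxyPort=${PROXY_PORT}"
proxy_opts="${proxy_opts} -J-Dhttps.proxyHost=${PROXY_HOST} -J-Dhttps.proxyPort=${PROXY_PORT}"

# Print the command instead of running it, so it can be checked first.
proxy_cmd="embulk ${proxy_opts} run config.yml"
echo "$proxy_cmd"
```

If your proxy requires authentication, add the user and password options shown in the commands above in the same way.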