You can import data using Treasure Data’s open-source bulk data loader Embulk. Embulk helps transfer data between various databases, storage locations, file formats, and cloud services. For more information, see the Embulk documentation.
Contents
- Prerequisites
- Installing Embulk from the Command Line
- Installing the Embulk Treasure Data Plugin
- Using a Proxy Server

Prerequisites
- Basic knowledge of Treasure Data
- Basic knowledge of Embulk
- Java installed (Embulk is a Java application)
- JRuby installed and configured (Embulk v0.10.50 and v0.11.0 do not include JRuby; see the "JRuby" section of the article "Embulk v0.11 is coming soon")
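
The Java prerequisite can be checked from a shell before installing. This is a minimal sketch, assuming a POSIX shell; `require_cmd` is a hypothetical helper name used here for illustration and is not part of Embulk.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Embulk is a Java application, so check for the java launcher first.
require_cmd java || echo "Install Java before installing Embulk."
```

The sketch only confirms that a `java` launcher is on the PATH. It does not check the Java version or the JRuby configuration.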

Installing Embulk from the Command Line

| Platform | Steps |
|---|---|
| Linux / macOS / BSD (UNIX) | Run:<br>`curl --create-dirs -o ~/.embulk/bin/embulk -L "http://dl.embulk.org/embulk-latest.jar"`<br>`chmod +x ~/.embulk/bin/embulk`<br>`echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc`<br>`source ~/.bashrc` |
| Windows (PowerShell) | Run:<br>`Invoke-WebRequest http://dl.embulk.org/embulk-latest.jar -OutFile embulk.bat` |

Installing the Embulk Treasure Data Plugin
Embulk plugins load data to or from various systems and file formats. See the list of Embulk plugins.
Install the embulk-output-td plugin (imports records to Treasure Data):
embulk gem install embulk-output-td

Using a Proxy Server
If you cannot upload, verify whether your network uses a proxy. Set the proxy with command-line options:
Linux:
embulk -J-Dhttp.proxyHost=HOST -J-Dhttp.proxyPort=PORT -J-Dhttp.proxyUser=USER -J-Dhttp.proxyPassword=PASS run config.yml
Windows:
embulk.bat "-J-Dhttps.proxyHost=HOST" "-J-Dhttps.proxyPort=PORT" "-J-Dhttp.proxyUser=USER" "-J-Dhttp.proxyPassword=PASS" run config.yml
Or run via Java directly:
java -Dhttps.proxyHost=HOST -Dhttps.proxyPort=PORT -jar embulk.bat run config.yml
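
When the proxy settings are stored in variables, the options can be assembled by a small shell function instead of being typed out each time. This is a minimal sketch, assuming a POSIX shell; `proxy_opts` is a hypothetical helper, and `proxy.example.com` and `8080` are placeholder values. It mirrors the `http.*` property names from the Linux command above (the Windows command uses `https.*` for host and port), and it prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the Embulk -J proxy options for a host and port.
# Optional third and fourth arguments add the proxy user and password.
proxy_opts() {
  host=$1
  port=$2
  user=${3:-}
  pass=${4:-}
  opts="-J-Dhttp.proxyHost=$host -J-Dhttp.proxyPort=$port"
  if [ -n "$user" ]; then
    opts="$opts -J-Dhttp.proxyUser=$user -J-Dhttp.proxyPassword=$pass"
  fi
  echo "$opts"
}

# Placeholder values; replace them with your own proxy settings.
echo "embulk $(proxy_opts proxy.example.com 8080) run config.yml"
```

The helper joins the options with spaces, so it is not suitable for values that themselves contain spaces.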