Data Import from REST API on Heroku

This article describes how to run td-agent on Heroku. td-agent is a data-import daemon for Treasure Data Hadoop. td-agent itself can accept data from various sources, but because of the Heroku platform’s limitations it can accept only HTTP input when it runs there.

Treasure Data Toolbelt setup

First, please download and install the Treasure Data Toolbelt for your development environment.

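+----------------+------------------------------+
| If you have... | Please refer to...           |
+----------------+------------------------------+
| Mac OS         | Download Mac Installer       |
| Windows        | Download Win Installer       |
| Linux          | Redhat/CentOS, Debian/Ubuntu |
| Ruby gem       | `gem install td`             |
+----------------+------------------------------+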

Heroku CLI setup

The heroku-td CLI plugin is also required as a bridge between the heroku CLI and the td CLI. Once you install the CLI plugin, you will be able to execute the heroku td family of commands.

$ heroku plugins:install https://github.com/treasure-data/heroku-td.git
$ heroku td
usage: heroku td [options] COMMAND [args]
Please make sure to use the Heroku toolbelt. On December 1, 2012, Heroku stopped releasing new updates to the heroku gem. We have also stopped supporting the heroku gem.

Create Your App

You’re now ready to set up td-agent as a separate Heroku application. td-agent will become your central log aggregation server. We provide a boilerplate repository to get you started.

Follow the steps below to create td-agent as a Heroku application.

# Clone
$ git clone https://github.com/treasure-data/heroku-td-agent.git
$ cd heroku-td-agent
$ rm -fR .git
$ git init
$ git add .
$ git commit -m 'initial commit'

# Create the app & deploy
$ heroku create --stack cedar
$ git push heroku master

# Enable Addon
$ heroku addons:add treasure-data

# Confirm your API key
$ heroku td apikey:show
2d14b1872cb789ed3733d4aa26ae8e6aba30bbbd

# Set your API key: paste the key shown above into td-agent.conf
$ vi td-agent.conf
$ git commit -a -m 'update apikey'

# Deploy
$ git push heroku master

Test

You can now confirm that the log aggregation server is set up correctly. The example below sends a GET request to the log server at http://td-agent-on-heroku.herokuapp.com to log the event { "json": "message" } in the table sample_table of the database sample_db. Note that the JSON-formatted data is URL-encoded and passed as the value of the json query parameter.

# HTTP
$ curl "http://td-agent-on-heroku.herokuapp.com/td.sample_db.sample_table?json=%7B%22json%22%3A%22message%22%7D"

# HTTPS
$ curl "https://td-agent-on-heroku.herokuapp.com/td.sample_db.sample_table?json=%7B%22json%22%3A%22message%22%7D"

In general, the request URL has the following format:

http://{YOUR LOG SERVER DOMAIN}/td.{DB_NAME}.{TABLE_NAME}?json={JSON_FORMATTED_DATA}
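
The format above can be sketched as a small shell helper that percent-encodes the JSON and assembles the URL. This is only an illustration: urlencode and td_event_url are hypothetical names, not part of the td or heroku CLIs, and the encoder assumes ASCII input.

```shell
#!/bin/sh
# Sketch only: urlencode and td_event_url are hypothetical helper names,
# not part of the td or heroku CLIs.

# Percent-encode every character outside the RFC 3986 unreserved set.
# Assumes ASCII input; runs in a subshell so LC_ALL does not leak out.
urlencode() (
  LC_ALL=C
  rest=$1
  out=
  while [ -n "$rest" ]; do
    tail=${rest#?}            # everything after the first character
    c=${rest%"$tail"}         # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    rest=$tail
  done
  printf '%s\n' "$out"
)

# Build http://{domain}/td.{db}.{table}?json={encoded JSON}
td_event_url() {
  printf 'http://%s/td.%s.%s?json=%s\n' "$1" "$2" "$3" "$(urlencode "$4")"
}

# Prints the same URL as the HTTP example in the Test section.
td_event_url td-agent-on-heroku.herokuapp.com sample_db sample_table '{"json":"message"}'
```

To send the event, pass the result to curl, for example curl "$(td_event_url DOMAIN DB TABLE JSON)". Alternatively, curl can do the encoding itself: curl -G --data-urlencode 'json={"json":"message"}' followed by the URL without the query string should produce an equivalent request.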

Once you have sent the GET request, run the command heroku td tables to confirm that data is getting uploaded successfully.

$ heroku td tables
+------------------+---------------+------+-------+--------+
| Database         | Table         | Type | Count | Schema |
+------------------+---------------+------+-------+--------+
| sample_db        | sample_table  | log  | 1     |        |
+------------------+---------------+------+-------+--------+
1 rows in set

Logs are uploaded every 5 minutes, so a newly sent event may take up to that long to appear in the table.
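
If you want to script this check, the record count can be read from the heroku td tables output. The following is a minimal sketch that assumes the column layout shown above (Database, Table, Type, Count, Schema); td_table_count is a hypothetical helper name, not a td or heroku command.

```shell
#!/bin/sh
# Sketch only: td_table_count is a hypothetical helper, not a td or heroku command.
# It reads `heroku td tables` output on stdin and prints the Count column for the
# given database and table (nothing is printed if the pair is not listed).
td_table_count() {
  awk -F'[|]' -v db="$1" -v tbl="$2" '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    NF >= 5 && trim($2) == db && trim($3) == tbl { print trim($5) }
  '
}

# Demo using the sample output from this article instead of a live call.
td_table_count sample_db sample_table <<'EOF'
+------------------+---------------+------+-------+--------+
| Database         | Table         | Type | Count | Schema |
+------------------+---------------+------+-------+--------+
| sample_db        | sample_table  | log  | 1     |        |
+------------------+---------------+------+-------+--------+
EOF
```

Against a live application, the same helper would be used as heroku td tables | td_table_count sample_db sample_table.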


Last modified: Feb 24 2017 09:27:52 UTC
