Rails Apps on Heroku

Table of Contents

Prerequisites

  • Basic knowledge of Ruby, Gems, and Bundler.
  • Basic knowledge of Heroku, including the Heroku toolbelt.
  • Ruby 1.8 or higher (for local testing).

Provisioning the add-on

The Treasure Data addon can be attached to a Heroku application via the CLI:

$ heroku addons:add treasure-data:free
Addint treasure-data:free to <your_app_name>... done (free)

Treasure Data Toolbelt setup

First, please download and install the Treasure Data Toolbelt for your development environment.

If you have... Please refer to...
Mac OS Download Mac Installer
Windows Download Win Installer
Linux Redhat/CentOS, Debian/Ubuntu
Ruby gem `gem install td`

Heroku CLI setup

The heroku-td CLI plugin is also required as a bridge between the heroku CLI and the td CLI. Once you install the CLI plugin, you will be able to execute the heroku td family of commands.

$ heroku plugins:install https://github.com/treasure-data/heroku-td.git
$ heroku td
usage: heroku td [options] COMMAND [args]
Untitled-3
Please make sure to use the Heroku toolbelt. On December 1, 2012, Heroku stopped releasing new updates to the heroku gem. We have also stopped supporting the heroku gem.

Data Import

Step 1: Import Data By Printing to STDOUT (Yes, that simple!)

You can import data to Treasure Data by simply writing to STDOUT in a specific format. The format is:

@[database.table] JSON-IN-ONE-LINE

Thus, in Rails,

class FooController
  # Example1: login event
  def login
    puts "@[development.login] #{{'uid'=>123}.to_json}"
  end

  # Example2: follow event
  def follow
    puts "@[development.follow] #{{'uid'=>123, 'from'=>'@TreasureData', 'to'=>'@Heroku'}.to_json}"
  end

  # Example3: pay event
  def pay
    puts "@[development.pay] #{{'uid'=>123, 'item_name'=>'Stone of Jordan', 'category'=>'ring', 'price'=>100, 'count'=>1}.to_json}"
  end
end
Untitled-3
The action name only supports alphanumeric characters and underscores. Hyphens cannot be used (an exception will be thrown).

Once you have inserted your event-logging code, push the modifications to Heroku!

$ git commit -a -m "Added Treasure Data Plugin"
$ git push heroku master
Untitled-3
In addition to writing to STDOUT, you can also use the logger library to post events to Treasure Data.

Step 2: Access Your Application

First, open your application on Heroku. The recorded events are first buffered locally, then periodically uploaded into the cloud. In the current implementation, the buffered data is uploaded every 5 minutes.

$ heroku open

Step 3: Check Your Uploaded Data

Database Structure

Treasure Data Hadoop’s data structure is like RDBMS: tables inside databases. In order to see the list of available databases, use the command: heroku td dbs.

$ heroku td dbs
+-------------+-------+
| Name        | Count |
+-------------+-------+
| development | 3     |
+-------------+-------+
1 row in set

Check Your Data

You can see the list of available tables by using the heroku td tables command.

$ heroku td tables
+-------------+--------+------+-------+--------+---------------------------+--------+
| Database    | Table  | Type | Count | Size   | Last import               | Schema |
+-------------+--------+------+-------+--------+---------------------------+--------+
| development | follow | log  | 1     | 0.0 GB | 2013-01-22 21:49:51 -0800 |        |
| development | login  | log  | 1     | 0.0 GB | 2013-01-22 21:49:51 -0800 |        |
| development | pay    | log  | 1     | 0.0 GB | 2013-01-22 21:49:52 -0800 |        |
+-------------+--------+------+-------+--------+---------------------------+--------+
3 rows in set
Untitled-3
If you don't see your data uploaded to TD, then your stdout may be buffered. See here for details.

To confirm that your application data has been uploaded properly, check the “Count” column. If any of the “Count” entries are non-zero, your event logs been transferred successfully.

Once your data has been uploaded properly, use the heroku td table:tail command to see the recent entries of a specific table.

$ heroku td table:tail development follow
{"uid":123,"time":1358920091,"to":"@Heroku","from":"@TreasureData"}

Analyze Your Data

Our service lets you to analyze your data using a SQL-style language. When your data is sent to us, your logs are imported into a Hadoop/Hive cluster. Hive lets its users query big data through a SQL-like interface.

Use the heroku td query command to issue queries to Treasure Data Hadoop; Treasure Data Hadoop accepts and executes queries on the cloud.

The example query below counts the number of “follow” actions that were generated by user id 12345:

$ heroku td query -w -d development \
  "SELECT COUNT(1) FROM follow WHERE uid = 123"
...
(some MapReduce messages)
...
+-----+
| _c0 |
+-----+
| 1   |
+-----+
1 row in set

The following example query counts the total number of logged actions for each day.

$ heroku td query -w -d development \
  "SELECT to_date(from_unixtime(time)) AS day, count(1) FROM follow GROUP BY to_date(from_unixtime(time)) ORDER BY day"
...
(some MapReduce messages)
...
+------------+-----+
| day        | _c1 |
+------------+-----+
| 2013-01-23 | 2   |
+------------+-----+
1 row in set

The heroku td query -format csv command outputs the results in csv.

Issue Queries from Ruby API

Below is an example of issuing a query from a Ruby program. The query API is asynchronous, and you can wait for the query to complete by polling the job periodically (e.g. by issuing job.finished? and job.update_progress! calls).

require 'td'
require 'td-client'
cln = TreasureData::Client.new(ENV['TREASURE_DATA_API_KEY'])
job = cln.query('testdb', 'SELECT COUNT(1) FROM www_access')
until job.finished?
  sleep 2
  job.update_progress!
end
if job.success?
  job.update_status!  # get latest info
  job.result_each { |row| p row }
end

Does Uploading Data Impact App Performance?

When using the td gem, the posted records are buffered locally at first, and the data is uploaded every 5 minutes. Because a dedicated thread uploads the data into the cloud, it doesn’t affect your application’s response time.

The local buffer also has a size limit. If the local data exceeds this limit, the records will be uploaded immediately.

Next Steps

The Heroku Addon Notes document explains the limitations of Heroku addons. We recommend that you review this information before moving on to other articles.

We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive Query Language.

Appendix: Importing Data To Treasure Data Using The Logger Library

Step 1: Import Data Using the td gem

First, add the td (Treasure Data) gem to your Gemfile.

gem 'td', "~> 0.10.22"

Next, install the gem locally via bundler.

$ bundle install

Step 2: Create treasure_data.yml

Download this file and save it to config/treasure_data.yml (or use the command below). You can modify this file if you wish to tweak your Treasure Data configuration.

$ curl -L 'http://bit.ly/UKfiIY' > config/treasure_data.yml

Step 3: Insert Logging Code

The ‘td’ gem comes with a built-in library for recording in-app events. Insert code as shown below to record events from your app:

class FooController
  # Example1: login event
  def login
    TD.event.post('login', {:uid=>123})
  end

  # Example2: follow event
  def follow
    TD.event.post('follow', {:uid=>123, :from=>'TD', :to=>'Heroku'})
  end

  # Example3: pay event
  def pay
    TD.event.post('pay',
                  {:uid=>123, :item_name=>'Stone of Jordan',
                   :category=>'ring', :price=>100, :count=>1})
  end
end

Your event-logging code should be placed near its corresponding event-generating code. Further details regarding the event logger API can be found here.

Development Environment Setup (for TD Logger Library ONLY)

In order to use Treasure Data on a development environment, please modify config/treasure_data.yml. It is disabled by default.

development:
  apikey: "<%= `heroku td apikey:show`.strip %>"
  database: rails_development
  debug_mode: true

Last modified: Feb 24 2017 09:27:52 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.