Java Apps on Heroku

Prerequisites

  • Basic knowledge of Java.
  • Basic knowledge of Heroku, including the Heroku toolbelt.
  • Java 1.6 or higher.

Provisioning the add-on

The Treasure Data add-on can be attached to a Heroku application via the CLI:

$ heroku addons:add treasure-data:nano
Adding treasure-data:nano on <your_app_name>... done, v3 (free)

Treasure Data Toolbelt setup

First, please download and install the Treasure Data Toolbelt for your development environment.

If you have...   Please refer to...
Mac OS           Download Mac Installer
Windows          Download Win Installer
Linux            Redhat/CentOS, Debian/Ubuntu
Ruby gem         `gem install td`

Heroku CLI setup

The heroku-td CLI plugin is also required as a bridge between the heroku CLI and the td CLI. Once you install the CLI plugin, you will be able to execute the heroku td family of commands.

$ heroku plugins:install https://github.com/treasure-data/heroku-td.git
$ heroku td
usage: heroku td [options] COMMAND [args]
Please make sure you are using the Heroku Toolbelt rather than the legacy heroku gem. Heroku stopped releasing updates to the heroku gem on December 1, 2012, and we no longer support it.

Data Import

Step 1: Import Data By Printing to STDOUT (Yes, that simple!)

You can import data to Treasure Data by simply writing to STDOUT in a specific format. The format is:

@[database.table] JSON-IN-ONE-LINE

For example, here is a simple Java app (taken and modified from Heroku’s Java quickstart):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.*;

public class HelloWorld extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.getWriter().print("Hello from Java!\n");

        /* This is where we are writing to STDOUT with said format */
        System.out.println("@[development.login] {\"uid\":123}");
        System.out.println("@[development.follow] {\"uid\":123,\"from\":\"@TreasureData\",\"to\":\"@Heroku\"}");
        System.out.println("@[development.pay] {\"uid\":123,\"item_name\":\"Stone of Jordan\",\"category\":\"ring\",\"price\":100,\"count\":1}");
    }

    public static void main(String[] args) throws Exception{
        Server server = new Server(Integer.valueOf(System.getenv("PORT")));
        ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        server.setHandler(context);
        context.addServlet(new ServletHolder(new HelloWorld()),"/*");
        server.start();
        server.join();
    }
}
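
Hard-coding escaped JSON strings works for a quick test, but it is error-prone in real applications. The sketch below shows one way to build the @[database.table] line programmatically; the TdStdoutLogger class and its escape helper are hypothetical names introduced here for illustration and are not part of any Treasure Data library.

import java.util.LinkedHashMap;
import java.util.Map;

/* Hypothetical helper (not part of any Treasure Data library): builds the
 * "@[database.table] JSON-IN-ONE-LINE" string and prints it to STDOUT. */
public class TdStdoutLogger {

    public static void log(String database, String table, Map<String, Object> record) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : record.entrySet()) {
            if (!first) json.append(",");
            first = false;
            json.append("\"").append(escape(e.getKey())).append("\":");
            Object value = e.getValue();
            if (value instanceof Number || value instanceof Boolean) {
                json.append(value);                       // numbers and booleans are unquoted
            } else {
                json.append("\"").append(escape(String.valueOf(value))).append("\"");
            }
        }
        json.append("}");
        // One event per line, in the format Treasure Data expects on STDOUT.
        System.out.println("@[" + database + "." + table + "] " + json);
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, Object> event = new LinkedHashMap<String, Object>();
        event.put("uid", 123);
        event.put("item_name", "Stone of Jordan");
        event.put("price", 100);
        TdStdoutLogger.log("development", "pay", event);
        // Prints: @[development.pay] {"uid":123,"item_name":"Stone of Jordan","price":100}
    }
}

A full JSON library such as Jackson would handle escaping and nested values more robustly; the point here is only that each event must end up as a single line in the expected format.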

That’s it! You’re now ready to deploy your changes.

$ git commit -a -m "Added Treasure Data Plugin"
$ git push heroku master

Step 2: Access Your Application

Next, open your application on Heroku. The recorded events are first buffered locally, then uploaded periodically into the cloud. In the current implementation, the buffered data is uploaded every 5 minutes.

$ heroku open

Step 3: Check Your Uploaded Data

Database Structure

Treasure Data Hadoop organizes data much like an RDBMS: tables inside databases. To see the list of available databases, use the heroku td dbs command.

$ heroku td dbs
+-------------+-------+
| Name        | Count |
+-------------+-------+
| development | 3     |
+-------------+-------+
1 row in set

Checking Your Data

To see the tables inside the available databases, use the heroku td tables command.

$ heroku td tables
+-------------+--------+------+-------+--------+---------------------------+--------+
| Database    | Table  | Type | Count | Size   | Last import               | Schema |
+-------------+--------+------+-------+--------+---------------------------+--------+
| development | follow | log  | 1     | 0.0 GB | 2013-01-22 21:49:51 -0800 |        |
| development | login  | log  | 1     | 0.0 GB | 2013-01-22 21:49:51 -0800 |        |
| development | pay    | log  | 1     | 0.0 GB | 2013-01-22 21:49:52 -0800 |        |
+-------------+--------+------+-------+--------+---------------------------+--------+
3 rows in set

To confirm that your application data has been uploaded properly, check the “Count” column. A non-zero Count for a table means its event logs have been transferred successfully.

Once your data has been uploaded properly, use the heroku td table:tail command to see the recent entries of a specific table.

$ heroku td table:tail development follow
{"uid":123,"time":1358920091,"to":"@Heroku","from":"@TreasureData"}

Step 4: Analyze Your Data

You can analyze your data with a Hive-compatible query language. (When your data is sent to Treasure Data Hadoop, your logs are imported into a Hadoop/Hive cluster.)

Use the heroku td query command to issue queries to Treasure Data Hadoop; Treasure Data Hadoop accepts and executes queries on the cloud.

The example query below counts the number of “follow” actions that were generated by user id 123:

$ heroku td query -w -d development \
  "SELECT COUNT(1) FROM follow WHERE uid = 123"
...
(some MapReduce messages)
...
+-----+
| _c0 |
+-----+
| 1   |
+-----+
1 row in set

The following example query counts the total number of logged actions for each day.

$ heroku td query -w -d development \
  "SELECT to_date(from_unixtime(time)) AS day, count(1) FROM follow GROUP BY to_date(from_unixtime(time)) ORDER BY day"
...
(some MapReduce messages)
...
+------------+-----+
| day        | _c1 |
+------------+-----+
| 2013-01-23 | 2   |
+------------+-----+
1 row in set

The heroku td query --format csv command outputs the results in CSV format.
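
For example, the daily-count query above can be rerun with CSV output by adding that flag (shown here combined with the same -w and -d options used earlier):

$ heroku td query -w -d development --format csv \
  "SELECT to_date(from_unixtime(time)) AS day, count(1) FROM follow GROUP BY to_date(from_unixtime(time)) ORDER BY day"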

Does Uploading Data Impact App Performance?

The td-logger library buffers data locally at first, and the data is uploaded every 5 minutes. Because a dedicated thread uploads the data into the cloud, it doesn’t affect your application’s response time.

The local buffer also has a size limit. If the local data exceeds this limit, the records will be uploaded immediately.
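
To illustrate the pattern (this is a minimal sketch, not the actual td-logger internals; all class and method names are hypothetical), here is a buffered logger that flushes on a background thread every few minutes, or immediately when the buffer grows past a size limit:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/* Illustrative sketch only -- not the td-logger implementation. */
public class BufferedEventLogger {

    private static final int MAX_BUFFERED_EVENTS = 1000;   // size limit before an immediate flush
    private final List<String> buffer = new ArrayList<String>();
    private final ScheduledExecutorService flusher =
            Executors.newSingleThreadScheduledExecutor();

    public BufferedEventLogger() {
        // Dedicated background thread: the request thread never blocks on upload.
        flusher.scheduleAtFixedRate(new Runnable() {
            public void run() { flush(); }
        }, 5, 5, TimeUnit.MINUTES);
    }

    public synchronized void log(String line) {
        buffer.add(line);
        if (buffer.size() >= MAX_BUFFERED_EVENTS) {
            flush();                                        // over the limit: flush immediately
        }
    }

    public synchronized void flush() {
        if (buffer.isEmpty()) return;
        // A real logger would upload the batch to the cloud here;
        // this sketch just writes it to STDOUT.
        for (String line : buffer) {
            System.out.println(line);
        }
        buffer.clear();
    }
}

The important point is that the flush happens off the request path, so logging adds only an in-memory append to each request.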

Next Steps

The Heroku Addon Notes document explains the limitations of Heroku add-ons. We recommend that you review this information before moving on to other articles.

We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive Query Language.

Appendix: Importing Data To Treasure Data Using The Logger Library

Alternatively, you can choose to use the Java td-logger library to log data to Treasure Data. The Java td-logger library supports two different upload methods:
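
Whichever upload method you use, basic usage of the library looks roughly like the sketch below. The class and method names (TreasureDataLogger.getLogger, log) are assumptions based on the td-logger-java README and should be verified against the library’s documentation; configuration such as the API key and database is typically supplied through a properties file or environment variables.

import java.util.HashMap;
import java.util.Map;

// Assumed import path for the td-logger-java library; verify against its README.
import com.treasure_data.logger.TreasureDataLogger;

public class FollowEventLogger {

    // Assumption: getLogger() takes the target database name and reads the
    // API key from the library's configuration (properties file or environment).
    private static final TreasureDataLogger LOG =
            TreasureDataLogger.getLogger("development");

    public static void logFollow(int uid, String from, String to) {
        Map<String, Object> record = new HashMap<String, Object>();
        record.put("uid", uid);
        record.put("from", from);
        record.put("to", to);
        // Assumption: log(table, record) buffers the event and uploads it asynchronously.
        LOG.log("follow", record);
    }

    public static void main(String[] args) {
        logFollow(123, "@TreasureData", "@Heroku");
    }
}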

