Treasure Data provides td-agent, a daemon that collects server-side logs and events. You can use it to import data from Scala applications.


Prerequisites

  • Basic knowledge of Scala.

  • Basic knowledge of Treasure Data, including the TD Toolbelt.

  • JVM, Scala, sbt v0.11 or later.

Installing td-agent

Install td-agent on your application servers. td-agent sits within your application servers and focuses on uploading application logs to the cloud.

The td-logger-java library enables Scala applications to post records to their local td-agent. td-agent, in turn, uploads the data to the cloud every 5 minutes. Because the daemon runs on a local node, the logging latency is negligible.

td-agent Install Options

To install td-agent, run one of the following commands based on your environment. The agent is installed automatically through each platform's package format, such as rpm, deb, or dmg.

RHEL/CentOS 5, 6, 7

$ curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

Ubuntu and Debian

# 18.04 Bionic
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent3.sh | sh
# 16.04 Xenial (64bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent3.sh | sh
# Legacy support for EOL versions is still available
# 14.04 Trusty
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent3.sh | sh
# 12.04 Precise
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent3.sh | sh
# Debian Stretch (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-stretch-td-agent3.sh | sh
# Debian Jessie (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent3.sh | sh
# Debian Squeeze (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-squeeze-td-agent2.sh | sh

Amazon Linux

You can choose Amazon Linux 1 or Amazon Linux 2. Refer to Installing td-agent on Amazon Linux.

MacOS X 10.11+

$ open 'https://td-agent-package-browser.herokuapp.com/3/macosx/td-agent-3.1.1-0.dmg'


MacOS X 10.11.1 (El Capitan) introduced some security changes. After installing td-agent, edit the /Library/LaunchDaemons/td-agent.plist file and change /usr/sbin/td-agent to /opt/td-agent/usr/sbin/td-agent.

Windows Server 2012+

The Windows installation requires multiple steps. Complete the steps documented:

Opscode Chef (repository)

You can read more about the repository.

$ echo 'cookbook "td-agent"' >> Berksfile
$ berks install

AWS Elastic Beanstalk is also supported. Windows is not supported.

Modifying /etc/td-agent/td-agent.conf

Next, specify your API key by setting the apikey option in your td-agent.conf file. You can retrieve your API key from your profile in TD Console.

# Treasure Data Input and Output
<source>
  type forward
  port 24224
</source>
<match td.*.*>
  type tdlog
  endpoint api.treasuredata.com
  apikey YOUR_API_KEY
  auto_create_table
  buffer_type file
  buffer_path /var/log/td-agent/buffer/td
  use_ssl true
</match>

Replace YOUR_API_KEY with your actual API key string. Using a [write-only API key](access-control#rest-apis-access) is recommended.

Restart your agent after the preceding lines are in place.

# Linux
$ sudo /etc/init.d/td-agent restart

# MacOS X
$ sudo launchctl unload /Library/LaunchDaemons/td-agent.plist
$ sudo launchctl load /Library/LaunchDaemons/td-agent.plist

td-agent now accepts data via port 24224, buffers the data (/var/log/td-agent/buffer/td), and automatically uploads the data into the cloud.
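Before wiring up the application, it can help to confirm that something is listening on the forward port. The following is a minimal sketch, not part of td-agent or td-logger-java: AgentCheck and isReachable are hypothetical names, and the host and port are taken from the td-agent.conf shown earlier. It only checks that a TCP connection can be opened; it does not verify that the listener is td-agent.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

object AgentCheck {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def isReachable(host: String, port: Int, timeoutMs: Int = 1000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Connection refused, timeout, and unknown host all surface as IOException.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // 24224 is the forward port configured in the <source> section.
    val ok = isReachable("localhost", 24224)
    println(if (ok) "td-agent port is open" else "td-agent port is not reachable")
  }
}
```

If the port is not reachable after a restart, the agent's own log file is the next place to look.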

Using td-logger-java

First, add the following lines to build.sbt. The logger’s revision information can be found in CHANGES.txt.

If you need an all-in-one jar file, we provide one at http://central.maven.org/maven2/com/treasuredata/.

/* in build.sbt */
// Dependencies
libraryDependencies ++= Seq(
  "com.treasuredata" % "td-logger" % "${logger.version}"
)

Next, configure your treasure-data.properties file with the following settings:

td.logger.agentmode=true
td.logger.agent.host=localhost
td.logger.agent.port=24224
td.logger.agent.tag=td
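A missing or misspelled key in this file is easy to overlook. As a hedged sketch, the four keys above can be checked at startup with the standard java.util.Properties class. LoggerSettings and missingKeys are hypothetical names introduced here for illustration; they are not part of td-logger-java.

```scala
import java.io.StringReader
import java.util.Properties

object LoggerSettings {
  // Keys used in the treasure-data.properties example.
  val RequiredKeys: Seq[String] = Seq(
    "td.logger.agentmode",
    "td.logger.agent.host",
    "td.logger.agent.port",
    "td.logger.agent.tag"
  )

  /** Parses properties text and returns the required keys that are absent. */
  def missingKeys(text: String): Seq[String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    RequiredKeys.filterNot(key => props.containsKey(key))
  }

  def main(args: Array[String]): Unit = {
    val sample =
      """td.logger.agentmode=true
        |td.logger.agent.host=localhost
        |td.logger.agent.port=24224
        |td.logger.agent.tag=td
        |""".stripMargin
    // The result is empty when all four keys are present.
    println(missingKeys(sample))
  }
}
```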

Finally, insert the following lines into your application to initialize and post records. You can read more information about the API.

import java.util.Properties
import com.treasure_data.logger.TreasureDataLogger
import scala.collection.JavaConverters._

object Main {
  def main(args: Array[String]): Unit = {
    // Merge the logger settings from the classpath into the system properties.
    val props: Properties = System.getProperties
    props.load(getClass.getResourceAsStream("treasure-data.properties"))

    // The logger name is the target database.
    val log = TreasureDataLogger.getLogger("test_db")

    // log() takes a java.util.Map[String, Object]; the label is the target table.
    val record = Map[String, AnyRef]("from" -> "userA", "to" -> "userB")
    log.log("follow", record.asJava)
  }
}

This example expects the following structure.

  • project_dir/build.sbt

  • project_dir/src/main/scala/Main.scala

  • project_dir/src/main/resources/treasure-data.properties
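The log call in the example takes a java.util.Map[String, java.lang.Object], so records that mix strings and numbers need their values boxed. One possible way to do this without a cast is a small conversion helper. This is a sketch only: Records and toJavaRecord are hypothetical names, not part of td-logger-java.

```scala
import java.util.{HashMap => JHashMap, Map => JMap}

object Records {
  /** Copies a Scala Map into a java.util.Map[String, Object], boxing primitive values. */
  def toJavaRecord(fields: Map[String, Any]): JMap[String, AnyRef] = {
    val out = new JHashMap[String, AnyRef]()
    fields.foreach { case (key, value) =>
      // asInstanceOf[AnyRef] boxes Int, Long, Double, and so on.
      out.put(key, value.asInstanceOf[AnyRef])
    }
    out
  }

  def main(args: Array[String]): Unit = {
    val record = toJavaRecord(Map("from" -> "userA", "to" -> "userB", "count" -> 1))
    println(record.get("from"))
  }
}
```

The returned map could then be passed as the second argument of the log call, assuming that signature matches the one used in the example.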

Confirming Data Import

First, execute the preceding program.

$ sbt compile run

Sending a SIGUSR1 signal flushes td-agent’s buffer. Upload starts immediately.

# Linux
$ kill -USR1 `cat /var/run/td-agent/td-agent.pid`

# MacOS X
$ sudo kill -USR1 `sudo launchctl list | grep td-agent | cut -f 1`

To confirm the data upload, use td tables.

$ td tables
+------------+------------+------+-----------+
| Database   | Table      | Type | Count     |
+------------+------------+------+-----------+
| test_db    | follow     | log  | 1         |
+------------+------------+------+-----------+
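The database and table names in this output follow from how the record is tagged: the tag prefix from treasure-data.properties (td), the logger name (test_db), and the label (follow) form a three-part tag that matches the td.*.* pattern in td-agent.conf. The sketch below illustrates that naming convention as inferred from this example. TagRouting and its functions are hypothetical and do not reproduce the plugin's implementation.

```scala
object TagRouting {
  /** Joins prefix, database, and table with dots, as the example implies. */
  def buildTag(prefix: String, database: String, table: String): String =
    Seq(prefix, database, table).mkString(".")

  /** Splits a three-part tag into (database, table); None if the shape differs. */
  def databaseAndTable(tag: String): Option[(String, String)] =
    tag.split('.') match {
      case Array(_, database, table) => Some((database, table))
      case _                         => None
    }

  def main(args: Array[String]): Unit = {
    val tag = buildTag("td", "test_db", "follow")
    println(tag)
    println(databaseAndTable(tag))
  }
}
```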

Production Deployments

High-Availability Configurations of td-agent

For high-traffic websites (more than 5 application nodes), use a high availability configuration of td-agent to improve data transfer reliability and query performance.

Monitoring td-agent

Monitoring td-agent itself is also important. Refer to the following document for general monitoring methods for td-agent:

td-agent is fully open-sourced under the Fluentd project.

Next Steps

We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive and Presto Query Languages.
