Treasure Data provides td-agent to collect server-side logs and events and to import the data from Scala applications.
- Basic knowledge of Scala.
- Basic knowledge of Treasure Data, including the TD Toolbelt.
- JVM, Scala, sbt v0.11 or later.
Install td-agent on your application servers. td-agent runs on each application server and uploads application logs to the cloud.
The td-logger-java library enables Scala applications to post records to their local td-agent. td-agent, in turn, uploads the data to the cloud every 5 minutes. Because the daemon runs on a local node, the logging latency is negligible.
To install td-agent, run one of the following commands based on your environment. The agent is installed through each platform's package format, such as rpm, deb, or dmg.
$ curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh
# 18.04 Bionic
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent3.sh | sh
# 16.04 Xenial (64bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent3.sh | sh
# 14.04 Trusty
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent3.sh | sh
# 12.04 Precise
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent3.sh | sh
# Debian Stretch (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-stretch-td-agent3.sh | sh
# Debian Jessie (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent3.sh | sh
# Debian Squeeze (64-bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-squeeze-td-agent2.sh | sh
You can choose Amazon Linux 1 or Amazon Linux 2. Refer to Installing td-agent on Amazon Linux.
$ open 'https://td-agent-package-browser.herokuapp.com/3/macosx/td-agent-3.1.1-0.dmg'
MacOS X 10.11.1 (El Capitan) introduced some security changes. After installing td-agent, edit the /Library/LaunchDaemons/td-agent.plist file and change /usr/sbin/td-agent to /opt/td-agent/usr/sbin/td-agent.
The Windows installation requires multiple steps. Complete the steps documented:
You can read more about the repository.
$ echo 'cookbook "td-agent"' >> Berksfile
$ berks install
AWS Elastic Beanstalk is also supported. Windows is not supported.
Next, specify your API key by setting the apikey option in your td-agent.conf file. You can retrieve your API key from your profile in TD Console.
# Treasure Data Input and Output
<source>
type forward
port 24224
</source>
<match td.*.*>
type tdlog
endpoint api.treasuredata.com
apikey YOUR_API_KEY
auto_create_table
buffer_type file
buffer_path /var/log/td-agent/buffer/td
use_ssl true
</match>
Replace YOUR_API_KEY with your actual API key string. You can retrieve your API key from your profile in TD Console. Using a write-only API key is recommended.
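The match pattern td.*.* in the configuration selects events by tag. In Fluentd, * matches exactly one dot-separated tag part, and the Treasure Data output reads the second and third parts as the database and table names. The following Scala sketch illustrates that convention only; TagPattern, matches, and destination are hypothetical names for illustration and are not part of td-agent or td-logger.

```scala
object TagPattern {
  /** True if `tag` matches `pattern`, where `*` stands for exactly one non-empty tag part. */
  def matches(pattern: String, tag: String): Boolean = {
    val patternParts = pattern.split('.')
    val tagParts = tag.split('.')
    patternParts.length == tagParts.length &&
      patternParts.zip(tagParts).forall { case (p, t) =>
        (p == "*" && t.nonEmpty) || p == t
      }
  }

  /** Extracts (database, table) from a tag shaped like td.<database>.<table>. */
  def destination(tag: String): Option[(String, String)] =
    tag.split('.') match {
      case Array("td", db, table) => Some((db, table))
      case _                      => None
    }
}
```

Under this convention, a record tagged td.test_db.follow would be routed to the follow table in the test_db database.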
Restart your agent once the preceding lines are in place.
# Linux
$ sudo /etc/init.d/td-agent restart
# MacOS X
$ sudo launchctl unload /Library/LaunchDaemons/td-agent.plist
$ sudo launchctl load /Library/LaunchDaemons/td-agent.plist
td-agent now accepts data on port 24224, buffers it in /var/log/td-agent/buffer/td, and automatically uploads it to the cloud.
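Before wiring up the logger, it can help to confirm that something is listening on the forward port. The following is a minimal sketch that uses only java.net from the JDK; AgentCheck and isReachable are hypothetical names, and a successful TCP connection shows only that the port is open, not that td-agent is configured correctly.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

object AgentCheck {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def isReachable(host: String, port: Int, timeoutMs: Int = 1000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Covers connection refused, timeouts, and unresolved host names.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }
}
```

For example, AgentCheck.isReachable("localhost", 24224) should return true once td-agent has restarted with the forward source configured on port 24224.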
First, add the following lines to build.sbt. The logger’s revision information can be found in CHANGES.txt.
If you need an all-in-one jar file, we provide one at http://central.maven.org/maven2/com/treasuredata/.
/* in build.sbt */
// Dependencies
libraryDependencies ++= Seq(
"com.treasuredata" % "td-logger" % "${logger.version}"
)
Next, configure your treasure-data.properties file with the following settings:
td.logger.agentmode=true
td.logger.agent.host=localhost
td.logger.agent.port=24224
td.logger.agent.tag=td
Finally, insert the following lines into your application to initialize and post records. You can read more information about the API.
import java.util.Properties
import com.treasure_data.logger.TreasureDataLogger
import scala.collection.JavaConverters._

object Main {
  def main(args: Array[String]): Unit = {
    // Load the td.logger.* settings into the JVM system properties.
    val props = System.getProperties
    props.load(getClass.getResourceAsStream("treasure-data.properties"))

    // Create a logger for the test_db database.
    val LOG = TreasureDataLogger.getLogger("test_db")

    // Post one record to the follow table through the local td-agent.
    val record = Map("from" -> "userA", "to" -> "userB")
    LOG.log("follow", record.asJava.asInstanceOf[java.util.Map[String, java.lang.Object]])
  }
}
This example expects the following structure.
- project_dir/build.sbt
- project_dir/src/main/scala/Main.scala
- project_dir/src/main/resources/treasure-data.properties
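If treasure-data.properties is not on the classpath, Class.getResourceAsStream returns null and the subsequent load call fails with a NullPointerException. The sketch below shows one way to check a properties source before handing it to the logger. It reads from a java.io.Reader so it can be tried without a classpath resource. LoggerConfig, expectedKeys, and load are hypothetical names, and treating all four keys from the example as mandatory is an assumption; the library may accept defaults for some of them.

```scala
import java.io.Reader
import java.util.Properties

object LoggerConfig {
  // Keys taken from the treasure-data.properties example in this guide.
  val expectedKeys: Seq[String] = Seq(
    "td.logger.agentmode",
    "td.logger.agent.host",
    "td.logger.agent.port",
    "td.logger.agent.tag"
  )

  /** Loads properties from `reader`; returns them, or the expected keys that are missing. */
  def load(reader: Reader): Either[Seq[String], Properties] = {
    val props = new Properties()
    props.load(reader)
    val missing = expectedKeys.filter(key => props.getProperty(key) == null)
    if (missing.isEmpty) Right(props) else Left(missing)
  }
}
```

A caller could wrap the resource stream in a java.io.InputStreamReader after a null check, then pass the reader to LoggerConfig.load and report any missing keys before creating the logger.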
First, execute the preceding program.
$ sbt compile run
Sending a SIGUSR1 signal flushes td-agent’s buffer. Upload starts immediately.
# Linux
$ kill -USR1 `cat /var/run/td-agent/td-agent.pid`
# MacOS X
$ sudo kill -USR1 `sudo launchctl list | grep td-agent | cut -f 1`
To confirm the data upload, use td tables.
$ td tables
+------------+------------+------+-----------+
| Database | Table | Type | Count |
+------------+------------+------+-----------+
| test_db | follow | log | 1 |
+------------+------------+------+-----------+
For high-traffic websites (more than 5 application nodes), use a high availability configuration of td-agent to improve data transfer reliability and query performance.
Monitoring td-agent itself is also important. Refer to the following document for general monitoring methods for td-agent:
td-agent is fully open-sourced under the Fluentd project.
We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive and Presto Query Languages.