td-agent was discontinued in December 2023 and has been replaced by fluent-package. The fluent-package is the official successor maintained by the Cloud Native Computing Foundation.
You can use Fluentd to continuously import JSON-formatted logs, such as access logs, into the cloud.
Fluentd handles log-rotation. Fluentd keeps a record of the last position of the log, ensuring that each line is read exactly once even if the Fluentd process goes down. However, because the information is kept in a file, the "exactly once" guarantee breaks down if the file becomes corrupted.
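The position-tracking idea can be illustrated with a small shell sketch. This is a simplified illustration only, not Fluentd's implementation or its pos_file format: the function name `read_new_lines` and the one-number position file are assumptions made for the example.

```shell
# read_new_lines LOG POS
# Print the bytes appended to LOG since the offset stored in POS,
# then store the new offset in POS. Hypothetical helper for illustration.
read_new_lines() {
  log_file=$1
  pos_file=$2
  offset=0
  # Resume from the saved offset if a position file exists.
  if [ -s "$pos_file" ]; then
    offset=$(cat "$pos_file")
  fi
  size=$(wc -c < "$log_file" | tr -d ' ')
  # A file smaller than the saved offset is assumed to have been rotated.
  if [ "$size" -lt "$offset" ]; then
    offset=0
  fi
  # Emit only the bytes between the saved offset and the measured size.
  if [ "$size" -gt "$offset" ]; then
    tail -c "+$((offset + 1))" "$log_file" | head -c "$((size - offset))"
  fi
  printf '%s\n' "$size" > "$pos_file"
}
```

Calling the function twice without new writes prints nothing the second time. If the position file is deleted or holds a wrong offset, lines are re-read or skipped, which is the failure mode described above.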
- Basic knowledge of Fluentd, and its installation.
- The fluentd:fluentd user must have permission to read the logs.
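One way to check the read permission is a small helper run as the service account. This is a sketch: the function name `can_read` is hypothetical, and the account name `fluentd` is taken from the text above and may differ on your installation.

```shell
# can_read FILE
# Report whether FILE is readable by the user running this function.
# Hypothetical helper; to test the service account, run it as that user,
# or use the one-liner: sudo -u fluentd test -r /path/to/the/file/foo.json
can_read() {
  if [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "NOT readable: $1"
    return 1
  fi
}
```

A non-zero exit status means the log file, or one of its parent directories, needs its ownership or mode adjusted.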
For migration guidance from td-agent, see the Fluentd Installation Guide.
To install fluent-package, run one of the following commands based on your environment.
# fluent-package 6 LTS (recommended)
curl -fsSL https://fluentd.cdn.cncf.io/sh/install-redhat-fluent-package6-lts.sh | sh
# Ubuntu 24.04 Noble - fluent-package 6 LTS
curl -fsSL https://fluentd.cdn.cncf.io/sh/install-ubuntu-noble-fluent-package6-lts.sh | sh
# Ubuntu 22.04 Jammy - fluent-package 6 LTS
curl -fsSL https://fluentd.cdn.cncf.io/sh/install-ubuntu-jammy-fluent-package6-lts.sh | sh
# Debian Bookworm - fluent-package 6 LTS
curl -fsSL https://fluentd.cdn.cncf.io/sh/install-debian-bookworm-fluent-package6-lts.sh | sh
# Amazon Linux 2023 - fluent-package 6 LTS
curl -fsSL https://fluentd.cdn.cncf.io/sh/install-amazon2023-fluent-package6-lts.sh | sh

Download the MSI installer from:
After installation:
- Edit the configuration file at C:/opt/fluent/etc/fluent/fluentd.conf
- Start the service using net start fluentdwinsvc or via the Services administrative tool
fluent-package for macOS is planned to be available via Homebrew. For current installation options, see Fluentd Installation Guide.
After installation, start and verify the Fluentd service.
sudo systemctl start fluentd.service
sudo systemctl status fluentd.service
The configuration file is located at /etc/fluent/fluentd.conf.
net start fluentdwinsvc
The configuration file is located at C:\opt\fluent\etc\fluent\fluentd.conf.
fluentd -c /path/to/fluentd.conf
For more details, see the Fluentd Documentation.
Specify your authentication key by setting the apikey option. You can view your API key from the TD Console.
Edit /etc/fluent/fluentd.conf (for fluent-package) to set the apikey option.
Replace YOUR_API_KEY with your API key string.
# Tailing the JSON formatted Logs
<source>
  @type tail
  tag td.production.foo
  path /path/to/the/file/foo.json
  pos_file /var/log/fluent/foo.pos
  <parse>
    @type json
  </parse>
</source>
# Treasure Data Input and Output
<match td.*.*>
  @type tdlog
  endpoint api.treasuredata.com
  apikey YOUR_API_KEY
  auto_create_table
  use_ssl true
  <buffer>
    @type file
    path /var/log/fluent/buffer/td
  </buffer>
</match>
Restart the Fluentd service once these lines are in place.
sudo systemctl restart fluentd.service
Fluentd tails the file, buffers the log (/var/log/fluent/buffer/td), and automatically uploads the log into the cloud.
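If nothing is uploaded after the restart, one thing worth ruling out is an apikey line that still holds the placeholder. A minimal sketch, assuming the configuration layout shown above; the function name `has_placeholder_apikey` is hypothetical.

```shell
# has_placeholder_apikey CONF
# Succeed (exit 0) if CONF still contains the literal placeholder
# "apikey YOUR_API_KEY". Hypothetical helper for illustration.
has_placeholder_apikey() {
  grep -Eq '^[[:space:]]*apikey[[:space:]]+YOUR_API_KEY[[:space:]]*$' "$1"
}

# Example usage:
#   if has_placeholder_apikey /etc/fluent/fluentd.conf; then
#     echo "apikey has not been set yet"
#   fi
```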
The following example is a sample log file. Every time a new line is appended to the log file, Fluentd parses the line and adds the record to the buffer. Fluentd uploads the data into the cloud every 5 minutes. To upload the data immediately, send a SIGUSR1 signal.
$ tail -n 5 /path/to/the/file/foo.json
{"a":"b","c":"d"}
{"a":"b","c":"d","e":1}
{"a":"b","c":"d","e":1,"f":2.0}
{"a":"b","c":"d"}
{"a":"b","c":"d","e":1}
Issue the following commands to confirm that everything is configured correctly.
# append new entries
$ tail -n 3 /path/to/the/file/foo.json > sample.txt # take the last three lines of the log...
$ cat sample.txt >> /path/to/the/file/foo.json # and append them to the log file to trigger the tail plugin.
# flush the buffer
$ kill -USR1 $(cat /var/run/fluent/fluentd.pid)
To confirm that your data has been uploaded successfully, issue the td tables command as follows.
$ td tables
+------------+------------+------+-----------+
| Database   | Table      | Type | Count     |
+------------+------------+------+-----------+
| production | foo        | log  | 3         |
+------------+------------+------+-----------+
Check /var/log/fluent/fluentd.log if it's not working correctly. The fluentd:fluentd user must have permission to read the logs.
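When checking the log, filtering for warning and error lines is usually quicker than reading the whole file. The sketch below assumes Fluentd's default text log format, in which each line carries a level marker such as [warn] or [error]; the function name `recent_problems` is hypothetical.

```shell
# recent_problems [LOG] [COUNT]
# Print the last COUNT (default 20) warn/error/fatal lines from LOG
# (default /var/log/fluent/fluentd.log). Hypothetical helper.
recent_problems() {
  log_file=${1:-/var/log/fluent/fluentd.log}
  count=${2:-20}
  grep -E '\[(warn|error|fatal)\]' "$log_file" | tail -n "$count"
}
```

Permission problems on the tailed file and rejected API keys are the kinds of issues that would be expected to surface in these lines.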
We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive Query Language.