Server-Side Agent with PHP Apps

Treasure Data provides a server-side agent called Treasure Agent (td-agent) to collect server-side logs and events. This article explains, in 4 steps, how to stream data from PHP applications into Treasure Data through Treasure Agent.

Prerequisites

  • Basic knowledge of PHP.
  • Basic knowledge of Treasure Data, including the toolbelt.
  • PHP 5.3 or higher (for local testing).
The fluent-logger-php library does not work in Heroku (here's why).

What is Treasure Agent?

First of all, Treasure Agent (td-agent) needs to be installed on your application servers. Treasure Agent is an agent program that runs on your application servers and focuses on uploading application logs to the cloud.



The fluent-logger-php library enables PHP applications to post records to their local td-agent. td-agent in turn uploads the data to the cloud every 5 minutes. Because the daemon runs on a local node, the logging latency is negligible.

How to install Treasure Agent?

Step 1: Installing Treasure Agent

To install Treasure Agent (td-agent), please execute one of the commands below, depending on your environment. The agent is installed automatically through each platform's package format (rpm, deb, or dmg).

RHEL/CentOS 5,6,7

$ curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh

Ubuntu & Debian

# 14.04 Trusty (64bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
# 12.04 Precise
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent2.sh | sh
# 10.04 Lucid
$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-lucid-td-agent2.sh | sh

# Debian Squeeze (64bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-squeeze-td-agent2.sh | sh
# Debian Wheezy (64bit only)
$ curl -L https://toolbelt.treasuredata.com/sh/install-debian-wheezy-td-agent2.sh | sh

Amazon Linux

$ curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh

MacOS X 10.11+

$ open 'https://packages.treasuredata.com/2/macosx/td-agent-2.3.0-0.dmg'
MacOS X 10.11.1 (El Capitan) introduced some security changes, and we are still testing the changes we made to td-agent for this OS version. For now, once td-agent is installed, please edit the /Library/LaunchDaemons/td-agent.plist file to change /usr/sbin/td-agent to /opt/td-agent/usr/sbin/td-agent.

Windows Server 2012+

Windows installation requires multiple steps. Please refer to the Windows installation documentation.



Opscode Chef (repository)

$ echo 'cookbook "td-agent"' >> Berksfile
$ berks install

AWS Elastic Beanstalk is also supported. Windows is currently NOT supported.

Step 2: Modifying /etc/td-agent/td-agent.conf

Next, please specify your API key by setting the apikey option. You can view your API key in the console.

# Unix Domain Socket Input
<source>
  type unix
  path /var/run/td-agent/td-agent.sock
</source>

# Treasure Data Output
<match td.*.*>
  type tdlog
  endpoint api.treasuredata.com
  apikey YOUR_API_KEY
  auto_create_table
  buffer_type file
  buffer_path /var/log/td-agent/buffer/td
  use_ssl true
</match>
YOUR_API_KEY should be your actual API key string, which you can retrieve from the console. Using a write-only API key is recommended.

Please restart your agent once these lines are in place.

# Linux
$ sudo /etc/init.d/td-agent restart

# MacOS X
$ sudo launchctl unload /Library/LaunchDaemons/td-agent.plist
$ sudo launchctl load /Library/LaunchDaemons/td-agent.plist

td-agent will now accept data via the Unix domain socket at /var/run/td-agent/td-agent.sock, buffer it (/var/log/td-agent/buffer/td), and automatically upload it into the cloud.
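Before posting from PHP, it can help to confirm that the agent created its socket after the restart. The following is a minimal sketch, assuming the socket path from the source block in the configuration above; the function name tdAgentSocketReady is hypothetical and not part of any library.

```php
<?php
// Hypothetical helper, not part of fluent-logger-php.
// Returns true when $path exists and is a Unix domain socket.
function tdAgentSocketReady($path = '/var/run/td-agent/td-agent.sock')
{
    // Clear PHP's stat cache so a freshly restarted agent is detected.
    clearstatcache(true, $path);
    return file_exists($path) && filetype($path) === 'socket';
}

if (!tdAgentSocketReady()) {
    error_log('td-agent socket not found; is the agent running?');
}
```

This only checks that the socket file is present. It does not prove that the agent is accepting records.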

Step 3: Using fluent-logger-php

To use fluent-logger-php, please use Composer as the package manager. First, create composer.json in your project directory with the following content.

{
  "require": {
    "fluent/logger": "v1.0.0"
  }
}

Then, please install Composer and use it to install the necessary libraries.

$ curl -sS https://getcomposer.org/installer | php
$ php composer.phar install

Next, initialize the logger and post records as follows. Save the script as test.php in the same directory.

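<?php
// Load Composer's autoloader so the FluentLogger class can be found.
require_once __DIR__.'/vendor/autoload.php';
use Fluent\Logger\FluentLogger;

// Connect to the local td-agent through its Unix domain socket.
$logger = new FluentLogger("unix:///var/run/td-agent/td-agent.sock");

// The tag has the form td.<database>.<table>.
$logger->post("td.test_db.test_table", array("hello"=>"world"));
$logger->post("td.test_db.follow", array("from"=>"userA", "to"=>"userB"));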
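In application code, it may be convenient to build the tag in one place and to notice when a record could not be handed to the agent. The sketch below is one possible wrapper, not the library's own API: postToTable is a hypothetical name, and it assumes that post() returns false on failure. It accepts any object that exposes post($tag, array $data), such as the FluentLogger instance created above.

```php
<?php
// Hypothetical wrapper, not part of fluent-logger-php.
// $logger is any object with a post($tag, array $data) method.
function postToTable($logger, $database, $table, array $record)
{
    // Tags must match the <match td.*.*> pattern in td-agent.conf.
    $tag = 'td.' . $database . '.' . $table;

    // Assumption: post() returns false when the record could not be sent.
    $ok = $logger->post($tag, $record);
    if (!$ok) {
        error_log("failed to post record to $tag");
    }
    return (bool) $ok;
}
```

With the logger from the script above, a call would look like postToTable($logger, 'test_db', 'follow', array('from' => 'userA', 'to' => 'userB')).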

Step 4: Confirming Data Import

First, please execute the program above.

$ php test.php

Sending a SIGUSR1 signal will flush td-agent’s buffer; upload will start immediately.

# Linux
$ kill -USR1 `cat /var/run/td-agent/td-agent.pid`

# MacOS X
$ sudo kill -USR1 `sudo launchctl list | grep td-agent | cut -f 1`

To confirm that your data has been uploaded successfully, issue the td tables command as shown below.

$ td tables
+------------+------------+------+-----------+
| Database   | Table      | Type | Count     |
+------------+------------+------+-----------+
| test_db    | test_table | log  | 1         |
| test_db    | follow     | log  | 1         |
+------------+------------+------+-----------+
The first argument of post() determines the database name and table name. If you specify td.test_db.test_table, the data will be imported into the table test_table within the database test_db. Both are created automatically at upload time.
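The tag convention can be illustrated with a few lines of PHP. This is only a sketch of the naming rule described above: splitTdTag is a hypothetical helper, td-agent performs the actual mapping itself, and applications never need to call anything like it.

```php
<?php
// Hypothetical helper illustrating the td.<database>.<table> convention.
// Returns null for tags that would not match <match td.*.*>.
function splitTdTag($tag)
{
    $parts = explode('.', $tag);
    if (count($parts) !== 3 || $parts[0] !== 'td') {
        return null;
    }
    if ($parts[1] === '' || $parts[2] === '') {
        return null; // empty database or table name
    }
    return array('database' => $parts[1], 'table' => $parts[2]);
}
```

For example, splitTdTag('td.test_db.follow') yields the database test_db and the table follow, while a tag such as app.access yields null because it lacks the td prefix and the third part.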

Tips on Production Deployment

Use Apache and mod_php

We recommend that you use Apache and mod_php. Other setups have not been fully validated.

Use Apache prefork MPM

Please use Apache prefork MPM. Other MPMs such as worker MPM should not be used. You can confirm your current settings with the apachectl -V command.

$ apachectl -V | grep MPM:
Server MPM:     Prefork

Set MaxRequestsPerChild

We recommend that you periodically restart your PHP processes by setting MaxRequestsPerChild in your Apache conf.

<IfModule mpm_prefork_module>
  StartServers          32
  MinSpareServers       32
  MaxSpareServers       32
  MaxClients            32
  MaxRequestsPerChild 4096
</IfModule>
Do not set MaxRequestsPerChild to zero.

Production Deployments

High-Availability Configurations of td-agent

For high-traffic websites (more than 5 application nodes), we recommend using a high availability configuration of td-agent. This will improve data transfer reliability and query performance.

Monitoring td-agent

Monitoring td-agent itself is also important. Please refer to this document for general monitoring methods for td-agent.

td-agent is fully open-sourced under the fluentd project.

FAQs

“Resource temporarily unavailable” warning message appears in my PHP application

This problem occurs when you have a relatively high data volume or an old Linux kernel version. In either case, the Linux kernel needs some tuning.

Step 1: Increase Max # of File Descriptors

First, please increase the maximum number of file descriptors per process. If the ulimit -n command reports 1024, raise the limit to 65535.
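A PHP process can also inspect the limit it is actually running under, which may differ from what an interactive shell reports. The sketch below assumes that the POSIX extension is loaded and that posix_getrlimit() reports the soft limit under the key 'soft openfiles'; openFileLimitTooLow is a hypothetical helper name.

```php
<?php
// Hypothetical check, assuming the POSIX extension is available.
function openFileLimitTooLow($softLimit, $minimum = 65535)
{
    // posix_getrlimit() reports an unlimited value as the string "unlimited".
    if ($softLimit === 'unlimited') {
        return false;
    }
    return (int) $softLimit < $minimum;
}

if (function_exists('posix_getrlimit')) {
    $limits = posix_getrlimit();
    if (isset($limits['soft openfiles'])
        && openFileLimitTooLow($limits['soft openfiles'])) {
        error_log('open file limit is ' . $limits['soft openfiles']);
    }
}
```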

Step 2: Optimize Kernel parameters

Please add these parameters to your /etc/sysctl.conf file. Then either run sysctl -p or reboot your node for the changes to take effect. Root permission is required.

net.core.somaxconn = 1024
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240    65535
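To confirm that the values are in effect, the running kernel can be queried from PHP. This sketch relies on the Linux convention that each sysctl key is exposed as a file under /proc/sys, with the dots in the key replaced by slashes; sysctlPath and readSysctl are hypothetical helper names.

```php
<?php
// Hypothetical helpers for reading live kernel parameters on Linux.
function sysctlPath($key)
{
    return '/proc/sys/' . str_replace('.', '/', $key);
}

function readSysctl($key)
{
    $path = sysctlPath($key);
    if (!is_readable($path)) {
        return null; // not Linux, or the key is not present
    }
    // Collapse tabs and spaces so "10240\t65535" compares as "10240 65535".
    return preg_replace('/\s+/', ' ', trim(file_get_contents($path)));
}

// Values recommended in this section.
$expected = array(
    'net.core.somaxconn'           => '1024',
    'net.ipv4.tcp_tw_reuse'        => '1',
    'net.ipv4.ip_local_port_range' => '10240 65535',
);
foreach ($expected as $key => $want) {
    $have = readSysctl($key);
    if ($have !== $want) {
        error_log("$key is " . var_export($have, true) . ", expected $want");
    }
}
```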

Next Steps

We offer a schema mechanism that is more flexible than that of traditional RDBMSs. For queries, we leverage the Hive Query Language.


Last modified: Jan 17 2017 23:13:49 UTC
