Writing Job Results into Elastic Cloud (BETA)

This article explains how to write job results directly to your Elastic Cloud instance. This feature is currently in beta, and any feedback would be appreciated. This result output will replace the Elasticsearch result output after Jun 30, 2017.

Prerequisites

  • Basic knowledge of Treasure Data, including the toolbelt.
  • Data imported into Treasure Data that you wish to export into Elasticsearch.
  • A working knowledge of SQL, Hive, or Presto.
  • A working Elastic Cloud instance. Version 2.0 or later is recommended.
  • Alternatively, you can use this feature with your own Elasticsearch instance running in your own environment.

Familiarity with the following Elasticsearch hierarchy is also helpful:

  • Cluster: A collection of one or more servers (nodes) that collectively holds and provides search and indexing functionality for your entire dataset.
  • Node: A single server that is part of (or all of) your cluster.
  • Index: This is analogous to a database. An index is a collection of documents with somewhat similar characteristics.
  • Type: This is analogous to a table. One or more types are defined within an index. A type is a logical category or partition of your index.
  • ID: A column containing the name (identifier) of each row/record. In the Elasticsearch result export, this setting is optional.
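
To make the hierarchy concrete, the sketch below shows how an index, a type, and an ID combine into the REST address of a single document in Elasticsearch versions that use types. The helper name doc_url and the sample values are hypothetical and only serve as an illustration.

```shell
# Build the REST path of one document: <base>/<index>/<type>/<id>.
# doc_url is a hypothetical helper, not part of any Treasure Data or Elastic tool.
doc_url() {
  # $1 = base URL (a trailing slash is stripped), $2 = index, $3 = type, $4 = ID
  printf '%s/%s/%s/%s\n' "${1%/}" "$2" "$3" "$4"
}

# Example values (assumed, not taken from a real cluster).
doc_url "http://example.com:9200" "my_index" "my_type" "42"
```

The printed address is the one a GET request would use to fetch that single record.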

Basic Usage



Open the Treasure Data console, go to the query editor, and enter your query.

Next, click Add for Result Export and select Elastic Cloud. Fill out all of the fields listed below.



  • Nodes: comma-separated list of nodes
  • Use SSL?: whether to connect over SSL
  • Auth Method: select either Basic or None
  • Username: username for basic authentication
  • Password: password for the above user
  • Mode: select either insert or replace
  • Index: the name of the index
  • Type: the name of the type
  • ID: (optional) the name of the ID column

Once you execute your query, the Treasure Data query result is automatically imported into Elastic Cloud. Currently, this feature supports basic authentication, including the “Security” feature (formerly “Shield”) of Elastic Cloud. It does not support the LDAP and Active Directory authentication provided by “Security”.
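
For reference, HTTP basic authentication sends the username and password as a Base64-encoded Authorization header; Base64 is an encoding, not encryption. A minimal sketch, in which the helper name basic_auth_header and the credentials user/pass are placeholders:

```shell
# Build the HTTP header that basic authentication sends with each request.
# basic_auth_header is a hypothetical helper for illustration only.
basic_auth_header() {
  # Base64-encode "username:password" (printf avoids a trailing newline).
  token=$(printf '%s:%s' "$1" "$2" | base64)
  printf 'Authorization: Basic %s\n' "$token"
}

# Placeholder credentials, not real ones.
basic_auth_header "user" "pass"
# prints: Authorization: Basic dXNlcjpwYXNz
```

Because the encoded value can be decoded by anyone who sees it, the Use SSL? setting matters whenever Basic is selected.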

Querying Your Results from Your Elastic Cloud Instance

You can sanity-check the data in your Elasticsearch index with a simple query. Assuming the host and port of your Elastic Cloud instance are ‘example.com:9200’, the following command searches every index and writes the response to a file (by default, Elasticsearch returns only the first 10 matching documents):

$ curl -XGET -i 'http://example.com:9200/*/_search' --user <username>:<password> > dump.txt

The output file contains the HTTP response headers (because of the -i option) followed by a JSON body with the column names and values of the data you previously exported. An example of what an Elasticsearch query may output is shown below.

HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 2283

{"took":4,"timed_out":false,"_shards":{"total":15,"successful":15,"failed":0},"hits":{"total":100024,"max_score":1.0,"hits":[{"_index":"embulk_20160205-141457","_type":"embulk_type","_id":"AVKxyShGu46fqokIoDTf","_score":1...
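
To narrow the check to a single index and type, the search address can include both names. The sketch below only builds and prints such a URL; the helper name search_url is hypothetical, the index and type names are copied from the sample response above, and size and pretty are standard Elasticsearch query-string parameters.

```shell
# Build a search URL restricted to one index and type.
# search_url is a hypothetical helper for illustration only.
search_url() {
  # $1 = base URL, $2 = index, $3 = type, $4 = maximum number of hits to return
  printf '%s/%s/%s/_search?size=%s&pretty\n' "${1%/}" "$2" "$3" "$4"
}

# Index and type names taken from the sample response above.
url=$(search_url "http://example.com:9200" "embulk_20160205-141457" "embulk_type" 5)
echo "$url"

# To run the query (requires a reachable instance and real credentials):
# curl -XGET "$url" --user '<username>:<password>'
```

Quoting the URL keeps the shell from treating the & in the query string as a background operator.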

For more information, please check the Elastic Cloud documentation and Elasticsearch documentation.


Last modified: Mar 31 2017 04:24:09 UTC
