Writing Job Results into Tableau Online

Using Tableau Online with Treasure Data allows you to interactively explore huge amounts of data, and share your data discoveries across your business organizations.



Table of Contents

Prerequisites

  • Basic knowledge of Treasure Data
  • An license and its installation of Tableau Online

Usage

CLI: One Time Query

Add the Tableau result output destination by using the -r / —result option for the td query command:

$ td query -d mydb -r 'tableau://username:password@host/?mode=replace' 'SELECT * FROM access'

where:

  • username: Your Tableau Online Username
  • password: Your Tableau Online Password
  • Host: Tableau Online Host on which your site exists (e.g. 10ay.online.tableau.com)

Console: One Time Query

Go to the query editor on the web console, and type in the query. The following example query uses the access log example data set, and calculates the distribution of HTTP method per day.

HIVE:

# HiveQL
SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS `dates`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO:

# Presto
SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS "Method",
  COUNT(1) AS "Count"
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method
Untitled-3
We're casting Datetime column to TIMESTAMP type from String type, for convenience in Tableau.

Tableau doesn’t support fractional seconds in a timestamp. Remove fractional seconds (for eg, using subtr() function) before casting it to TIMESTAMP type in the query.



Choose saved connection

A dialog Choose Saved Connection will be displayed. Select an existing Tableau online connection. If you do not have a Saved Connection already setup, follow the next step on how to create a new connection within the Sources Catalog.



Create a new connection

Visit Treasure Data Connections search and select Tableau. The dialog below will open



  • Username: Your Tableau Online Username
  • Password: Your Tableau Online Password
  • Host: Tableau Online Host on which your site exists (e.g. 10az.online.tableau.com)

Additional configuration

After create new Tableau connection or select an existing one, you will see the Additional configuration popup.



Parameters Description Default values
Datasource The name of TDE file on Tableau Online
Site If you don’t have specific Site, set EMPTY String for Tableau Online
Project Go to your Tableau Online to get a list of projects
Mode replace to replace TDE each time, append to append to existing TDE append
Untitled-3
Append Mode ignores new column if exporting data inserted an existing data source because of the specification of the Tableau. https://onlinehelp.tableau.com/current/pro/desktop/en-us/extracting_addfromfile.html

After completing all the fields, Submit the query. The system will execute the query, create the TDE (Tableau Data Extract) file, and upload TDE to Tableau Online.

Go to your Tableau Online, and click Data Sources at the top left bar. You can view the list of data sources, including your TDE file.



Cick New Workbook to create the charts and dashboard from the browser. Drag and drop the dimensions & measures from the left nav, to top right nav to create graphs. Then click Save to store the result.

Console: Scheduled Query

In the previous section, you have learned how to create the charts by one job. The next action is to periodically refresh the dataset by Scheduled Jobs feature. By using this feature, you can periodically execute the query on Treasure Data, and update TDE file on Tableau Server. You can specify schedules from the query editor by using TD_SCHEDULED_TIME() UDF, with append mode.

The following query calculates the # of records within last 24 hours, from the time that the query gets executed. By continuously running this scheduled query, you can avoid processing the entire data set every day.

HIVE:

# HiveQL
SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS `dates`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO:

# Presto
SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS "Method",
  COUNT(1) AS "Count"
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method

Options

Result output to Tableau supports various options. The options can be specified as URL parameters on the CLI or with the REST APIs or the Console where supported. The options are normally compatible with each other and can be combined. Where applicable, the default behavior is indicated.

ssl

The ssl option determines whether to use SSL for connecting to the Tableau server. When ‘true’, SSL is used. ssl=true is the default when this option is not specified.

tableau://username:password@host/?ssl=true

ssl_verify

The ssl_verify option determines whether to require certifcate verification for the SSL communication. When ‘true’, the certificate verification is required. ssl_verify=true is the default when this option is not specified.

tableau://username:password@host/?ssl=true&ssl_verify=true

Disabling certificate verification is useful when the Tableau server’s SSL certificate is self-signed.

Limitations

  • The maximum result record is 250,000,000 records. If it exceeded, the log outputs TDE file records limit exceeded: 250,000,000.
  • The lowest Timestamp value is 1000-01-01 00:00:00. If it exceeded, the log outputs invalid date value.

Last modified: Nov 24 2017 07:09:47 UTC

If this article is incorrect or outdated, or omits critical information, let us know. For all other issues, access our support channels.