Writing Job Results into Tableau Server

Using Tableau Server with Treasure Data allows users to interactively explore huge amounts of data, and also share data information across your organizations.



Table of Contents

Prerequisites

  • Basic knowledge of Treasure Data
  • An license and its installation of Tableau Server

Usage

CLI: One Time Query

1. Non-legacy Mode (new)

Add the Tableau result output destination by using the -r / —result option for the td query command:

$ td query -d mydb -r '{"type":"tableau","host":"my.tableauserver.com","username":"my_user","password":"passw0rd","ssl":true,"ssl_verify":false,"server_version":"10.0","datasource":"my_ds","site":"","project":"","mode":"replace","chunk_size_in_mb":50,"timezone":"PST"}' 'SELECT * FROM access'

where:

  • username: Your Tableau Server Username
  • password: Your Tableau Server Password
  • host: Your Tableau Server Hostname
  • site: The URL of the site to sign in to (optional)
    • To determine the value to use for the site attribute, sign in to Tableau Server and examine the value that appears after /site/ in the URL. For example, in the following URL, the site value is MarketingTeam:
      • https://MyServer/#/site/MarketingTeam/projects
    • If the site attribute is an empty string, you are signed in to the default site. Note that you always sign in to a specific site, even if you don’t specify a site when you sign in
    • Refer here for more details
  • datasource: target Tableau DataSource name
  • chunk_size_in_mb: Chunk file size (in MB) to be uploaded each time, default: 200, min: 100, max: 1024
  • timezone: timezone to convert from Timestamp data type to Tableau DateTime data type, default: UTC

2. Legacy Mode (not recommended)

Change the Tableau result output destination from JSON format to URL format as shown in the following example:

$ td query -d mydb -r 'tableau://username:password@my.tableauserver.com/?mode=replace' 'SELECT * FROM access'
$ td query -d mydb -r 'tableau://username:password@my.tableauserver.com:8443/?mode=replace&ssl=true' 'SELECT * FROM access'

3. Migrating Existing Output Configurations in order to prepare for Tableau Hyper

Tableau has updated its data engine with a technology called Hyper. We will support the new Hyper format soon, but in order to prepare for it, you must update your existing Treasure Data output configurations. New Treasure Data configurations automatically use the latest version.

To upgrade your existing configurations:

  1. Go to Queries and select your scheduled query.
  2. In Query Editor and click Output Results target.
  3. Choose your saved Tableau connection and uncheck the box Legacy mode.
  4. Input the Site ID value; it’s required for non-legacy mode (input EMPTY STRING if you sign in to the default site).
  5. Click Done to save your configuration.

Console: One Time Query

Go to the query editor on the web console, and type in the query. The following example query uses the access log example data set, and calculates the distribution of HTTP method per day.

HIVE:

# HiveQL
SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS "dates",
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO:

# Presto
SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method
Untitled-3
We're casting Datetime column to TIMESTAMP type from String type, for convenience in Tableau. NOTE: Tableau doesn't support fractional seconds in a timestamp. Remove fractional seconds (for eg, using subtr() function) before casting it to TIMESTAMP type in the query.

Choose saved connection

A dialog Choose Saved Connection will be displayed. Select an existing Tableau Server connection. If you do not have a Saved Connection already setup, follow the next step on how to create a new connection within the Sources Catalog.



Create a new connection

Visit Treasure Data Connections search and select Tableau. The dialog below will open



  • Username: Your Tableau Server Username
  • Password: Your Tableau Server Password
  • Host: Your Tableau Server Hostname (If you want to set specified port number, set HOST:PORT)

Additional configuration

After create new Tableau connection or select an existing one, you will see the additional Configuration popup.

1. Non-legacy Mode (new)



Parameters Description Default values
Datasource Name The name of TDE file on Tableau Server
Site ID If you don’t have specific Site, set EMPTY String for Tableau Server
Project Name Go to your Tableau Server to get a list of projects
Mode replace to replace TDE each time, append to append to existing TDE append
Chunk File Size In MB Extract File is split into chunks before uploading. This option defines file size of each chunk (min: 100, max: 1024) 200
Timezone Timezone ID to use when converting from Timestamp (timezone-independent) to Tableau DateTime (timezone-dependent) UTC

2. Legacy Mode (not recommended)



Untitled-3
Legacy Mode will be deprecated soon, due to incompatibility with latest Tableau version. Migrate your existing scheduled jobs to use Non-legacy Mode to ensure compatibility.
Untitled-3
Currently, you MUST setup https (SSL). Check here about how to set up SSL on your Tableau Server.
Untitled-3
Append Mode ignores new column if exporting data inserted an existing data source because of the specification of the Tableau. https://onlinehelp.tableau.com/current/pro/desktop/en-us/extracting_addfromfile.html

After completing all the fields, Submit the query. The system will execute the query, create the TDE (Tableau Data Extract) file, and upload TDE to Tableau Server.

Go to your Tableau Server, and click Data Sources at the top left bar. You can view the list of data sources, including your TDE file.



Click New Workbook to create the charts and dashboard from the browser. Drag and drop the dimensions and measures from the left nav, to top right nav to create graphs. Then click Save to store the result.

Console: Scheduled Query

In the previous section, you have learned how to create the charts by one job. The next action is to periodically refresh the dataset by Scheduled Jobs feature. By using this feature, you can periodically execute the query on Treasure Data, and update TDE file on Tableau Server. The schedules can be specified from the query editor. It’s useful is to use TD_SCHEDULED_TIME() UDF, with append mode.

The query below will calculate the # of records within last 24 hours, from the time where the query gets executed. By continuously running this scheduled query, you can avoid processing the entire data set every day.

HIVE:

# HiveQL
SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS `Datetime`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
WHERE
  TD_TIME_RANGE(time,
    TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1d'),
    TD_SCHEDULED_TIME())
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO:

# Presto
SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method

Options

Result output to Tableau Server supports various options. The options can be specified as URL parameters on the CLI or with the REST APIs or the Console where supported. The options are normally compatible with each other and can be combined. Where applicable, the default behavior is indicated.

ssl

The ssl option determines whether to use SSL or not for connecting to the Tableau server. When ‘true’, SSL is used. ssl=true is the default when this option is not specified.

tableau://username:password@my.tableauserver.com/?ssl=true
tableau://username:password@my.tableauserver.com:443/?ssl=true

ssl_verify

The ssl_verify option determines whether to require certifcate verification for the SSL communication or not. When ‘true’, the certificate verification is required. ssl_verify=true is the default when this option is not specified.

tableau://username:password@my.tableauserver.com/?ssl=true&ssl_verify=true

Disabling certificate verification is particularly useful when the Tableau server’s SSL certificate is self-signed.

Limitations

  • The maximum result record is 250,000,000 records. If it exceeded, the log outputs TDE file records limit exceeded: 250,000,000.
  • The lowest Timestamp value is 1000-01-01 00:00:00. If it exceeded, the log outputs invalid date value.

Last modified: Apr 10 2018 15:19:35 UTC

If this article is incorrect or outdated, or omits critical information, let us know. For all other issues, access our support channels.