Using Tableau Server with Treasure Data allows you to interactively explore very large amounts of data and share that information across your organization.

For sample workflows of exporting to Tableau Server, view Treasure Boxes.

Prerequisites

  • Basic knowledge of Treasure Data

  • A licensed installation of Tableau Server

  • HTTPS (SSL) must be set up. Configure SSL on your Tableau Server.

Limitations

  • Tableau doesn't support fractional seconds in a timestamp. Remove fractional seconds (for example, by using the substr() function) before casting the value to the TIMESTAMP type in the query.

  • The maximum result size is 250,000,000 records. If the result exceeds this limit, the log displays the message: Extract file records limit exceeded: 250000000.

  • The earliest supported Timestamp value is 1000-01-01 00:00:00. If a value is earlier than this, the log displays the message: invalid date value.

  • .hyper extracts published to Tableau Server 10.5 through the REST API have the .tde file extension on the Tableau Console. These extracts are still in the .hyper file format and function the same way as other .hyper extracts. The issue is fixed in 2018.1. For more details, see the Tableau Server 2018.1 Release Notes (search for Issue ID 754677).

  • A Data Source belonging to a nested Project cannot be appended by using a REST API. A Resource Not Found error is returned. Replace mode does not have this issue. A fix is planned for Tableau Server 2018.1.4. If you encounter this error, it's recommended that you upgrade to the latest version of Tableau Server.
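
The timestamp and record limits above can be checked before an export. The following is a minimal Python sketch, not part of the connector: the function names are hypothetical, and the truncation mirrors what a substr(value, 1, 19) call would do inside the query itself.

```python
from datetime import datetime

# Limits quoted from the Limitations list above.
MIN_TIMESTAMP = datetime(1000, 1, 1, 0, 0, 0)
MAX_RECORDS = 250_000_000


def strip_fractional_seconds(value: str) -> str:
    """Keep only 'YYYY-MM-DD HH:MM:SS' (the first 19 characters)."""
    return value[:19]


def is_exportable_timestamp(value: str) -> bool:
    """Return True if the truncated value is not earlier than the minimum."""
    parsed = datetime.strptime(strip_fractional_seconds(value), "%Y-%m-%d %H:%M:%S")
    return parsed >= MIN_TIMESTAMP


def within_record_limit(record_count: int) -> bool:
    """Return True if the row count stays within the extract record limit."""
    return record_count <= MAX_RECORDS
```

For example, strip_fractional_seconds("2019-02-01 12:34:56.789") returns "2019-02-01 12:34:56", which can then be cast to TIMESTAMP.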

Migrating Existing Output Configurations to Tableau Hyper

Tableau has updated its data engine with a technology called Hyper. To take advantage of this, you must update your existing Treasure Data output configurations. New Treasure Data configurations automatically use the latest version.

To upgrade your existing configurations:

  1. Go to Queries and select your scheduled query.

  2. In Query Editor, select the Export Results target.

  3. Choose your saved Tableau connection.

  4. Enter the Site ID value. It is required for a multi-tenant Tableau Server (leave it empty if you sign in to the Default site).

  5. Select 'hyper' for Data Source Type.

  6. Select Done to save your configuration.

You do not need to change the timezone configuration when migrating from the legacy Tableau to the current Tableau. You can leave the default value (UTC). See also the Timezone parameter later in this article.

Obtain Tableau Connection Information  

From the command line, from within the Tableau apps, or from your Tableau administrator, obtain the following:

  • Host
  • Username
  • Password
  • Server Version

The user profile that you use to connect to Tableau should have read and export rights. 

To determine the value to use for the site attribute, sign in to Tableau Server and examine the value that appears after /site/ in the URL. For example, in the following URL, the site value is MarketingTeam:

https://MyServer/#/site/MarketingTeam/projects

If the site attribute is an empty string, you are signed in to the default site. You are always signed in to a specific site, even if you don’t specify a site when you sign in.
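
The rule above for reading the site value from the URL can be sketched in Python. This is an illustrative helper only: the function name is hypothetical, and it assumes the URL has the same shape as the example above.

```python
from urllib.parse import urlparse


def site_id_from_url(url: str) -> str:
    """Return the value after /site/ in a Tableau Server URL, or '' for the default site."""
    parsed = urlparse(url)
    # Tableau Server puts the route after '#', so look at the fragment first
    # and fall back to the path for URLs without a fragment.
    route = parsed.fragment or parsed.path
    parts = [p for p in route.split("/") if p]
    if "site" in parts:
        index = parts.index("site")
        if index + 1 < len(parts):
            return parts[index + 1]
    return ""
```

With the example URL, site_id_from_url("https://MyServer/#/site/MarketingTeam/projects") returns "MarketingTeam". A URL without a /site/ segment yields an empty string, which corresponds to the default site.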

Read the REST API reference for Tableau Server.

Project

On Tableau Server, there should be at least one 'Default' project. You can't delete or rename this project. If you omit the Project Name option, the 'Default' project is used.

In some rare cases, especially on a non-English Tableau Server, the 'Default' project name is localized, and jobs fail if you don't provide a Project Name (because no project named 'Default' can be found).

Use the TD Console to Create Your Connection

Create a New Connection

In Treasure Data, you must create and configure the data connection prior to running your query. As part of the data connection, you provide authentication to access the integration.

1. Open TD Console.
2. Navigate to Integrations Hub Catalog.
3. Search for and select Tableau.

4. Select Create Authentication.
5. Type the credentials to authenticate.

6. Type a name for your connection.
7. Select Done.



Define your Query

  1. Complete the instructions in Creating a Destination Integration.
  2. Navigate to Data Workbench > Queries.

  3. Select a query for which you would like to export data.

  4. Run the query to validate the result set.

  5. Select Export Results.

  6. Select an existing integration authentication.
  7. Define any additional Export Results details. In your export integration content, review the integration parameters.
    Your Export Results screen might look different, or you might not have additional details to fill out.
  8. Select Done.

  9. Run your query.

  10. Validate that your data moved to the destination you specified.


Integration Parameters for Tableau



  • Datasource Name
    Description: The name you want to use for this data export.

  • Site ID
    Values: Required for a multi-tenant server configuration, such as Tableau Online. If you don’t have a specific site, set an empty string for Tableau Server.
    Description: The URL of the site to sign in to (optional).
    To determine the value to use for the site attribute, sign in to Tableau Server and examine the value that appears after /site/ in the URL. For example, in the following URL, the site value is MarketingTeam:
    https://MyServer/#/site/MarketingTeam/projects
    If the site attribute is an empty string, you are signed in to the default site. You are always signed in to a specific site, even if you don’t specify a site when you sign in.
    Read the REST API reference for Tableau Server.

  • Project Name
    Values: Required if your Tableau Server does not have a 'Default' project. Recommended for non-English locales.
    Description: Go to your Tableau Server to get a list of projects.

  • Mode
    Values: append, replace
    Description: Use replace to replace the Data Source each time. Use append to append to the existing Data Source.
    Because of a Tableau specification, append mode ignores new columns when it exports data into an existing data source.

  • Data Source Type
    Values: hyper, tde
    Description: Hyper files contain one or more tables worth of data. tde is a file format for Tableau data extraction.

  • Extract File Chunk Size
    Values: An extract file is split into chunks before it is exported to Tableau. Indicate a chunk size between 50 to 512 MB.
    Description: Chunk file size (in MB) to be uploaded each time, default: 200, min: 100, max: 1024.

  • HTTP read timeout
    Values: Value in milliseconds.

  • Timezone
    Values: The timezone in which timestamp values are stored in Tableau.
    Description: Timezone ID to use when converting from Timestamp (timezone-independent) to Tableau DateTime (timezone-dependent).
    A timestamp value, for example 1548979200, is timezone independent. A Tableau DateTime includes the day, hour, minute, and so on. To convert a timestamp value to a Tableau DateTime, the connector needs to know the target timezone. If your query contains a TIMESTAMP column, or you cast a DateTime column to TIMESTAMP, the value is exported to Tableau Server as a DateTime. This means that a conversion takes place and you need to provide a target timezone.
    Treasure Data stores the DateTime value using the UTC timezone. In most cases, leave the timezone configured as the default (UTC) to preserve the value from Treasure Data, unless you specifically want to convert the value to another timezone.
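
The timestamp-to-DateTime conversion described for the Timezone parameter can be illustrated with a short Python sketch. It is not the connector's implementation: the helper name is hypothetical, and a fixed UTC+9 offset stands in for a timezone ID such as Asia/Tokyo.

```python
from datetime import datetime, timedelta, timezone

EPOCH_SECONDS = 1548979200  # the example timestamp used above

# Fixed UTC+9 offset standing in for a timezone ID such as Asia/Tokyo (assumption).
UTC_PLUS_9 = timezone(timedelta(hours=9))


def to_wall_clock(epoch_seconds: int, tz: timezone) -> str:
    """Render a timezone-independent epoch value as a date-time in the given zone."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


print(to_wall_clock(EPOCH_SECONDS, timezone.utc))  # 2019-02-01 00:00:00
print(to_wall_clock(EPOCH_SECONDS, UTC_PLUS_9))    # 2019-02-01 09:00:00
```

The same timestamp produces two different date-time values, which is why the default (UTC) is the setting that preserves the value as stored in Treasure Data.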


Example Queries

The following example query uses the access log example data set and calculates the distribution of the HTTP method per day. For convenience, it casts the date string to the TIMESTAMP type so that the column is treated as a DateTime rather than a String in Tableau.

HIVE:

SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS `dates`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO:

For convenience, in Presto queries, we recommend using TD_TIME_FORMAT instead of TD_TIME_STRING.

SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS "Method",
  COUNT(1) AS "Count"
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method



Optionally Schedule the Query Export Jobs

You can use Scheduled Jobs with Result Export to periodically write the output result to a target destination that you specify.



1. Navigate to Data Workbench > Queries.
2. Create a new query or select an existing query.
3. Next to Schedule, select None.

4. In the drop-down, select one of the following schedule options.

Drop-down Value      Description
Custom cron...       Review Custom cron... details.
@daily (midnight)    Run once a day at midnight (00:00) in the specified time zone.
@hourly (:00)        Run every hour at 00 minutes.
None                 No schedule.

Custom cron... Details

Cron Value    Description
0 * * * *     Run once an hour
0 0 * * *     Run once a day at midnight
0 0 1 * *     Run once a month at midnight on the morning of the first day of the month
""            Create a job that has no scheduled run time.

 *    *    *    *    *
 -    -    -    -    -
 |    |    |    |    |
 |    |    |    |    +----- day of week (0 - 6) (Sunday=0)
 |    |    |    +---------- month (1 - 12)
 |    |    +--------------- day of month (1 - 31)
 |    +-------------------- hour (0 - 23)
 +------------------------- min (0 - 59)

The following named entries can be used:

  • Day of Week: sun, mon, tue, wed, thu, fri, sat

  • Month: jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec

A single space is required between each field. The values for each field can be composed of:

  • A single value, within the limits displayed above for each field.

  • A wildcard ‘*’ to indicate no restriction based on the field.
    Example: ‘0 0 1 * *’ configures the schedule to run at midnight (00:00) on the first day of each month.

  • A range ‘2-5’, indicating the range of accepted values for the field.
    Example: ‘0 0 1-10 * *’ configures the schedule to run at midnight (00:00) on the first 10 days of each month.

  • A list of comma-separated values ‘2,3,4,5’, indicating the list of accepted values for the field.
    Example: ‘0 0 1,11,21 * *’ configures the schedule to run at midnight (00:00) every 1st, 11th, and 21st day of each month.

  • A periodicity indicator ‘*/5’ to express how often, based on the field’s valid range of values, a schedule is allowed to run.
    Example: ‘30 */2 1 * *’ configures the schedule to run on the 1st of every month, every 2 hours starting at 00:30.
    Example: ‘0 0 */5 * *’ configures the schedule to run at midnight (00:00) every 5 days starting on the 5th of each month.

  • A comma-separated list of any of the above except the ‘*’ wildcard, for example ‘2,*/5,8-10’.
    Example: ‘0 0 5,*/10,25 * *’ configures the schedule to run at midnight (00:00) every 5th, 10th, 20th, and 25th day of each month.

5. (Optional) To delay the start time of a query, enable Delay execution.
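
The five-field cron layout described above can be sketched in Python to sanity-check an expression before saving a schedule. This is an illustrative helper, not the scheduler's parser. The function names are hypothetical, and it only covers the wildcard, single values, ranges, and comma-separated lists. Periodicity indicators such as ‘*/5’ and named entries such as mon are deliberately not handled and raise a ValueError.

```python
# Field names and ranges as given in the diagram above.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),
]


def split_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expression.split(" ")
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 fields separated by single spaces")
    return {name: part for (name, _, _), part in zip(CRON_FIELDS, parts)}


def expand_simple_field(field: str, low: int, high: int) -> list:
    """Expand '*', single values, ranges and comma lists; step values are not handled."""
    values = set()
    for item in field.split(","):
        if item == "*":
            values.update(range(low, high + 1))
        elif "-" in item:
            start, end = (int(x) for x in item.split("-"))
            values.update(range(start, end + 1))
        else:
            values.add(int(item))
    if any(v < low or v > high for v in values):
        raise ValueError(f"value out of range {low}-{high}: {field}")
    return sorted(values)
```

For example, split_cron("0 0 1,11,21 * *")["day_of_month"] is "1,11,21", and expanding that field over the range 1 to 31 gives the days 1, 11, and 21.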

Execute the Query

Save the query with a name and run it, or just run the query. Upon successful completion of the query, the query result is automatically exported to the specified destination.


Scheduled jobs that continuously fail due to configuration errors may be disabled on the system side after several notifications.




Optionally Configure Export Results in Workflow

Within Treasure Workflow, you can specify the use of this data connector to export data.

Learn more at Using Workflows to Export Data with the TD Toolbelt.

Example Workflow for Tableau

timezone: UTC

_export:
  td:
    database: sample_datasets
  tableau:
    datasource: datasource_name
    site_id: site_id
    project: project_name
    mode: append
    legacy: false
    datasource_type: hyper

+td-result-output-tableau:
  td>: queries/sample.sql
  result_connection: tableau_connection
  result_settings:
    datasource: ${tableau.datasource}
    site: ${tableau.site_id}
    project: ${tableau.project}
    mode: ${tableau.mode}
    legacy: ${tableau.legacy}
    target_type: ${tableau.datasource_type}



Validate the Data Export

  1. Open your Tableau Server.

  2. Select Data Sources in the top-left bar.
    You can view the list of data sources, including the extract file that you exported.


  3. Select New Workbook to create the charts and dashboard from the browser.

  4. Drag and drop the dimensions and measures from the left navigation pane to the top-right area to create graphs.

  5. Select Save to store the result.











