Using Tableau Online with Treasure Data allows you to interactively explore huge amounts of data, and share your data discoveries across your business organizations.

For sample workflows of exporting to Tableau Online, view Treasure Boxes.


Prerequisites

  • Basic knowledge of Treasure Data.

  • A licensed Tableau Online site.

  • Permission to publish data sources. The Creator or Admin user role satisfies this requirement.

  • A Personal Access Token name and secret, if you want to authenticate with a Personal Access Token.

Limitations

  • The maximum result size is 250,000,000 records.
    If the result exceeds this limit, the log displays the message: Extract file records limit exceeded: 250000000.

  • The lowest supported Timestamp value is 1000-01-01 00:00:00.
    If a value is earlier than this, the log displays the message: invalid date value.

  • Appending a large dataset to an existing Data Source may result in a timeout.
    The following message appears in the job logs:

    2019-04-10 19:20:41.460 +0000 [WARN] (0001:transaction): !!! Data Source Publish is timed out. This is a known issue of Tableau when you append a large extract file to existing Data Source.
    2019-04-10 19:20:41.460 +0000 [WARN] (0001:transaction): !!! Check Tableau Console for final result of the Publish.

    The job succeeds, but you must check the final result on Tableau Console (for example, by verifying the number of records in the Data Source).

  • Personal Access Tokens expire. When you authenticate with a Personal Access Token, the token expires if it is not used for 15 consecutive days. A token that is used at least every 15 days expires after 1 year, after which you must create a new token. Expired Personal Access Tokens are not displayed on the My Account Settings page.
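
The two token expiry rules can be sketched as a small check, which may help when deciding whether a stored token needs to be recreated. This is only an illustration of the rules as stated here: the function name `pat_is_expired` and its arguments are hypothetical, "1 year" is approximated as 365 days, and Tableau itself remains the authority on whether a token is still valid.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(days=15)      # token unused for 15 consecutive days
MAX_LIFETIME = timedelta(days=365)   # assumption: "1 year" approximated as 365 days


def pat_is_expired(created_at: datetime, last_used_at: datetime, now: datetime) -> bool:
    """Return True if either of the two expiry rules applies (hypothetical helper)."""
    idle_too_long = now - last_used_at >= IDLE_LIMIT
    too_old = now - created_at >= MAX_LIFETIME
    return idle_too_long or too_old


created = datetime(2024, 1, 1)
# Used 9 days ago, well within the first year
print(pat_is_expired(created, datetime(2024, 3, 1), datetime(2024, 3, 10)))
# Not used for 19 days
print(pat_is_expired(created, datetime(2024, 3, 1), datetime(2024, 3, 20)))
# Used recently, but created more than 365 days ago
print(pat_is_expired(created, datetime(2024, 12, 30), datetime(2025, 1, 2)))
```

A check like this could run before a scheduled export so that an expiring token is replaced before the job fails.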

Static IP Address of Treasure Data

The static IP address of Treasure Data is the access point and source of the linkage for this Integration. To determine the static IP address, contact your Customer Success representative or Technical support.

Use the TD Console to Create a Connection

In Treasure Data, you must create and configure the data connection before running your query. As part of the data connection, you provide authentication to access the integration.

There are two ways to create a connection:

  • Use your Tableau username and password

  • Use a personal access token

Create a New Authentication with Tableau Username and Password

1. Open TD Console.
2. Navigate to Integrations Hub > Catalog.
3. Search for and select Tableau.

4. Select Create Authentication.

5. Type the following credentials to authenticate.

Parameter | Description
Host | Tableau Online host where your site exists (e.g. 10az.online.tableau.com)
Username | Your Tableau Online username
Password | Your Tableau Online password

6. Select Continue.
7. Type a name for your connection.
8. Select Done.

Create a New Authentication with Personal Access Token

Obtain Personal Access Token from Tableau

1. Log in to your Tableau online server, and then go to My Account Settings.

2. From the Settings tab, enter a Personal Access Token name and then select Create new token.

3. Copy and store the Personal Access Token name and the Personal Access Token secret.

Log in to TD Console

4. Open TD Console.
5. Navigate to Integrations Hub > Catalog.
6. Search for and select Tableau.

7. Select Create Authentication.

8. Type the following credentials to authenticate.

Parameter | Description
Host | Tableau Online host where your site exists (e.g. 10az.online.tableau.com)
Auth method | Personal Access Token
Personal Access Token Name | Your Tableau Online Personal Access Token name
Personal Access Token Secret | Your Tableau Online Personal Access Token secret

9. Select Continue.
10. Type a name for your connection.
11. Select Done.


Define your Query

1. Navigate to Data Workbench > Queries.

2. Select New Query.

3. Run the query to validate the result set.

One Time Query CLI Usage

JSON-Style Config (New and Recommended)

  1. Add the Tableau result output destination by using the -r / --result option for the td query command:
Password Authentication
$ td query -d mydb -r '{"type":"tableau","host":"company.online.tableau.com","auth_method":"password","username":"my_user","password":"passw0rd","ssl":true,"ssl_verify":false,"server_version":"online","datasource":"my_ds","site":"MarketingTeam","project":"","mode":"replace","chunk_size_in_mb":50,"timezone":"America/Los_Angeles"}' 'SELECT * FROM access'

Personal Access Token Authentication
$ td query -d mydb -r '{"type":"tableau","host":"company.online.tableau.com","auth_method":"pat","pat_name":"pat_name","pat_secret":"pat_secret","ssl":true,"ssl_verify":false,"server_version":"online","datasource":"my_ds","site":"MarketingTeam","project":"","mode":"replace","chunk_size_in_mb":50,"timezone":"America/Los_Angeles"}' 'SELECT * FROM access'


Parameter | Description
auth_method | Authentication method. Supported values: password, pat. Default: password
username | Your Tableau Online username (required if auth_method is password)
password | Your Tableau Online password (required if auth_method is password)
pat_name | Your Personal Access Token name (required if auth_method is pat)
pat_secret | Your Personal Access Token secret (required if auth_method is pat)
host | Tableau Online host on which your site exists (e.g. 10ay.online.tableau.com)
datasource | Target Tableau Data Source name
site | The URL of the site to sign in to (see the note after this table)
mode | replace to replace the Data Source each time, append to append to the existing Data Source. Default: append
chunk_size_in_mb | Chunk file size (in MB) to be uploaded each time. Default: 100, min: 50, max: 512
read_timeout_millis | The time to wait for the response, in milliseconds (max: 7200000)
timezone | Timezone used to convert from the Timestamp data type to the Tableau DateTime data type. Default: UTC

To determine the value to use for the site attribute, sign in to Tableau Online, and examine the value that appears after /site/ in the URL. For example, in the following URL, the site value is MarketingTeam:

https://online.tableau.com/#/site/MarketingTeam/workbooks

For details, read the REST API reference for Tableau Server.
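
Writing the JSON configuration by hand is error-prone, so it can help to generate it. The following is a minimal sketch, assuming Personal Access Token authentication and using only parameter names from the examples in this section; the helpers `site_from_url` and `build_result_config` are hypothetical and are not part of the td CLI. Options omitted here (such as ssl, server_version, and project) would need to be added in the same way.

```python
import json
import shlex
from urllib.parse import urlparse


def site_from_url(url: str) -> str:
    """Extract the value that appears after /site/ in a Tableau Online URL."""
    # The site path lives in the URL fragment, e.g. "/site/MarketingTeam/workbooks".
    parts = urlparse(url).fragment.split("/")
    return parts[parts.index("site") + 1]


def build_result_config(host, datasource, site, pat_name, pat_secret,
                        mode="append", chunk_size_in_mb=100, timezone="UTC"):
    """Build the JSON string passed to the -r option (hypothetical helper)."""
    if mode not in ("append", "replace"):
        raise ValueError("mode must be 'append' or 'replace'")
    # CLI limits quoted in this section: min 50, max 512
    if not 50 <= chunk_size_in_mb <= 512:
        raise ValueError("chunk_size_in_mb must be between 50 and 512")
    config = {
        "type": "tableau",
        "host": host,
        "auth_method": "pat",
        "pat_name": pat_name,
        "pat_secret": pat_secret,
        "datasource": datasource,
        "site": site,
        "mode": mode,
        "chunk_size_in_mb": chunk_size_in_mb,
        "timezone": timezone,
    }
    return json.dumps(config)


site = site_from_url("https://online.tableau.com/#/site/MarketingTeam/workbooks")
result = build_result_config("company.online.tableau.com", "my_ds", site,
                             "pat_name", "pat_secret",
                             mode="replace", chunk_size_in_mb=50)
# shlex.quote keeps the JSON intact when the command is pasted into a POSIX shell.
print("td query -d mydb -r", shlex.quote(result), shlex.quote("SELECT * FROM access"))
```

Validating the mode and chunk size before the job is submitted surfaces configuration mistakes locally instead of in the job log.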

URL-Style (Not Recommended; Does Not Support Personal Access Token Authentication)

  1. Change the Tableau result output destination from JSON format to URL format as in the following example:
$ td query -d mydb -r 'tableau://username:password@host/datasource?mode=replace&site' 'SELECT * FROM access'

Migrating Existing Output Configurations to Tableau Hyper

Tableau has updated its data engine with a technology called Hyper. To take advantage of this, you must update your existing Treasure Data output configurations. New Treasure Data configurations automatically use the latest version.

Upgrade Your Existing Configurations

  1. Go to Queries and select your scheduled query.

  2. In the Query Editor, click the Export Results target.

  3. Choose your saved Tableau connection.

  4. Enter the Site ID value, which is required for Tableau Online.

  5. Select 'hyper' for Data Source Type.

  6. Click Done to save your configuration.

You do not need to change the timezone configuration when migrating from the legacy Tableau to the current Tableau. You can leave the default value (UTC). See also timezone in this article.

One Time Query using TD Console

Go to the query editor on the TD Console, and type in the query. The following example query uses the access log example data set, and calculates the distribution of HTTP method per day.

HIVE

-- HiveQL
SELECT
  CAST(TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00") AS TIMESTAMP) AS `dates`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, "yyyy-MM-dd 00:00:00"),
  method

PRESTO

-- Presto
SELECT
  CAST(TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00') AS TIMESTAMP) AS "dates",
  method AS "Method",
  COUNT(1) AS "Count"
FROM
  www_access
GROUP BY
  TD_TIME_FORMAT(time, 'yyyy-MM-dd 00:00:00'),
  method

In these queries, the date column is cast from the String type to the TIMESTAMP type so that it is exported to Tableau as a DateTime.

Tableau doesn’t support fractional seconds in a timestamp. Remove fractional seconds (for example, by using the substr() function) before casting to the TIMESTAMP type in the query.
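
The effect of removing fractional seconds can be illustrated outside the query. The sketch below assumes the timestamp is a string in the form yyyy-MM-dd HH:mm:ss.SSS and keeps only its first 19 characters, which is what a substring call in the query would do; the helper name `strip_fractional_seconds` is hypothetical.

```python
from datetime import datetime


def strip_fractional_seconds(ts: str) -> str:
    """Keep only 'yyyy-MM-dd HH:mm:ss', i.e. the first 19 characters."""
    return ts[:19]


raw = "2019-04-10 19:20:41.460"
clean = strip_fractional_seconds(raw)
# Parsing confirms that the result is a valid timestamp with whole seconds only.
parsed = datetime.strptime(clean, "%Y-%m-%d %H:%M:%S")
print(clean, parsed.microsecond)
```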


Choose Saved Connection

The Choose Integration dialog displays. Select an existing Tableau Online connection. If you do not have a saved integration set up, create a new connection from the Catalog as described in Use the TD Console to Create a Connection.

Additional Configuration

After you create a Tableau connection or select an existing one, you see the following Configuration popup.

Parameter | Description | Default value
Datasource Name | The name of the destination Data Source on Tableau Online | (none)
Site ID | The URL of the site to sign in to. Required for Tableau Online | (none)
Project Name | Go to your Tableau Online site to get a list of projects | Default
Mode | replace to replace the Data Source each time, append to append to the existing Data Source | append
Chunk File Size In MB | The extract file is split into chunks before uploading. This option defines the file size of each chunk (min: 100, max: 1024) | 200
HTTP read timeout in milliseconds | The time to wait for the response, in milliseconds (max: 7200000) | 7200000
Timezone | Timezone ID to use when converting from Timestamp (timezone-independent) to Tableau DateTime (timezone-dependent) | UTC

Because of how Tableau handles appends, Append mode ignores new columns when data is exported into an existing data source. See https://onlinehelp.tableau.com/current/pro/desktop/en-us/extracting_addfromfile.html
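
To get a feel for the Chunk File Size In MB setting, the number of uploads for a given extract can be estimated. This is a rough sketch that assumes the extract is divided by simple ceiling division; the connector's actual chunking logic is not documented here, and the function name `chunk_count` is hypothetical.

```python
import math

# Console limits and default quoted in the configuration parameters above
MIN_CHUNK_MB = 100
MAX_CHUNK_MB = 1024


def chunk_count(extract_size_mb: float, chunk_size_mb: int = 200) -> int:
    """Estimate how many chunks an extract file of the given size is split into."""
    if not MIN_CHUNK_MB <= chunk_size_mb <= MAX_CHUNK_MB:
        raise ValueError("chunk size must be between 100 and 1024 MB")
    # Assumption: plain ceiling division, with at least one chunk.
    return max(1, math.ceil(extract_size_mb / chunk_size_mb))


print(chunk_count(950))        # with the default 200 MB chunk size
print(chunk_count(950, 512))   # with a larger chunk size
```

Larger chunks mean fewer uploads per export, at the cost of more data per request.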

  1. After completing all the fields, submit the query. The system executes the query, creates the Tableau Data Extract file (.tde or .hyper), and uploads the extract file to Tableau Online.
  2. Go to your Tableau Online site, and click Data Sources in the top left bar. You can view the list of data sources, including your extract file.
  3. Select New Workbook to create charts and dashboards from the browser. Drag and drop the dimensions and measures from the left navigation to the top right navigation to create graphs. Select Save to store the result.

Migrate Authentication from Password to Personal Access Token

When you enable MFA on the Tableau online console, you cannot use username/password to authenticate anymore. Tableau requires a Personal Access Token authentication. You must migrate all connections that used the Password Authentication method to the Personal Access Token Authentication method.

Change Connection to Personal Access Token Authentication Method

1. Log in to TD Console and go to Authentication.

2. Change Auth method from Password to Personal Access Token and enter the token name and secret.


You can now use this connection to export data to Tableau with your Personal Access Token name and secret.

Update a Saved Query to Personal Access Token Authentication Method


1. Change the connector in the saved query from Password Authentication to Personal Access Token Authentication.
2. Navigate to Data Workbench > Queries.
3. Select your saved query and click Clone. Enter a new name for your query.


4. From the query editor, select Export Result and choose the connection that was edited in step 1, then fill in the export configuration again.

5. Go to the query list in TD Console and delete your old query.

(Optional) Schedule the Query Export Jobs

You can use Scheduled Jobs with Result Export to periodically write the output result to a target destination that you specify.


1. Navigate to Data Workbench > Queries.
2. Create a new query or select an existing query.
3. Next to Schedule, select None.

4. In the drop-down, select one of the following schedule options.

Drop-down Value | Description
Custom cron... | Review Custom cron... details.
@daily (midnight) | Run once a day at midnight (00:00 am) in the specified time zone.
@hourly (:00) | Run every hour at 00 minutes.
None | No schedule.

Custom cron... Details

Cron Value | Description
0 * * * * | Run once an hour
0 0 * * * | Run once a day at midnight
0 0 1 * * | Run once a month at midnight on the morning of the first day of the month
"" | Create a job that has no scheduled run time.

 *    *    *    *    *
 -    -    -    -    -
 |    |    |    |    |
 |    |    |    |    +----- day of week (0 - 6) (Sunday=0)
 |    |    |    +---------- month (1 - 12)
 |    |    +--------------- day of month (1 - 31)
 |    +-------------------- hour (0 - 23)
 +------------------------- min (0 - 59)

The following named entries can be used:

  • Day of Week: sun, mon, tue, wed, thu, fri, sat

  • Month: jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec

A single space is required between each field. The values for each field can be composed of:

Field Value | Example | Example Description
a single value, within the limits displayed above for each field. | |
a wildcard ‘*’ to indicate no restriction based on the field. | ‘0 0 1 * *’ | configures the schedule to run at midnight (00:00) on the first day of each month.
a range ‘2-5’, indicating the range of accepted values for the field. | ‘0 0 1-10 * *’ | configures the schedule to run at midnight (00:00) on the first 10 days of each month.
a list of comma-separated values ‘2,3,4,5’, indicating the list of accepted values for the field. | ‘0 0 1,11,21 * *’ | configures the schedule to run at midnight (00:00) every 1st, 11th, and 21st day of each month.
a periodicity indicator ‘*/5’ to express how often based on the field’s valid range of values a schedule is allowed to run. | ‘30 */2 1 * *’ | configures the schedule to run on the 1st of every month, every 2 hours starting at 00:30. ‘0 0 */5 * *’ configures the schedule to run at midnight (00:00) every 5 days starting on the 5th of each month.
a comma-separated list of any of the above except the ‘*’ wildcard is also supported ‘2,*/5,8-10’ | ‘0 0 5,*/10,25 * *’ | configures the schedule to run at midnight (00:00) every 5th, 10th, 20th, and 25th day of each month.
5. (Optional) If you enable Delay execution, you can delay the start time of a query.
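
The field syntax described above (single values, wildcards, ranges, lists, and periodicity indicators) can be explored with a small expander. This is a sketch for illustration only and is not Treasure Data's scheduler: the function `expand_field` is hypothetical, named entries such as sun or jan are not handled, and the periodicity indicator is interpreted as "multiples of the step within the field's range", which is an assumption based on the examples here and may differ from other cron implementations.

```python
def expand_field(expr: str, low: int, high: int) -> list[int]:
    """Expand one cron field into the sorted list of values it accepts."""
    values = set()
    for part in expr.split(","):
        if part == "*":
            # Wildcard: no restriction on this field
            values.update(range(low, high + 1))
        elif part.startswith("*/"):
            step = int(part[2:])
            # Assumption: multiples of the step inside the field's range,
            # e.g. '*/2' on the hour field gives 0, 2, 4, ...
            values.update(v for v in range(low, high + 1) if v % step == 0)
        elif "-" in part:
            start, end = (int(x) for x in part.split("-"))
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    out_of_range = [v for v in values if not low <= v <= high]
    if out_of_range:
        raise ValueError(f"values outside {low}-{high}: {sorted(out_of_range)}")
    return sorted(values)


print(expand_field("1-10", 1, 31))      # day of month: a range
print(expand_field("1,11,21", 1, 31))   # day of month: a list
print(expand_field("*/2", 0, 23))       # hour: a periodicity indicator
```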

Execute the Query

Save the query with a name and run it, or just run the query. Upon successful completion of the query, the query result is automatically exported to the specified Tableau Online destination.


Scheduled jobs that continuously fail due to configuration errors may be disabled on the system side after several notifications.




The following query counts the records from the last 24 hours, measured from the time that the query is executed. By running this query on a schedule, you avoid reprocessing the entire data set every day.

HIVE

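-- HiveQL
SELECT
  CAST(
    TD_TIME_FORMAT(time,
      "yyyy-MM-dd 00:00:00") AS TIMESTAMP
  ) AS `dates`,
  method AS `Method`,
  COUNT(1) AS `Count`
FROM
  www_access
WHERE
  -- Keep only the 24 hours before the scheduled execution time
  TD_TIME_RANGE(time,
    TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1d'),
    TD_SCHEDULED_TIME())
GROUP BY
  TD_TIME_FORMAT(time,
    "yyyy-MM-dd 00:00:00"),
  method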

PRESTO

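-- Presto
SELECT
  CAST(
    TD_TIME_FORMAT(time,
      'yyyy-MM-dd 00:00:00') AS TIMESTAMP
  ) AS "dates",
  method AS "Method",
  COUNT(1) AS "Count"
FROM
  www_access
WHERE
  -- Keep only the 24 hours before the scheduled execution time
  TD_TIME_RANGE(time,
    TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1d'),
    TD_SCHEDULED_TIME())
GROUP BY
  TD_TIME_FORMAT(time,
    'yyyy-MM-dd 00:00:00'),
  method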

Options

Result output to Tableau supports various options. The options can be specified as URL parameters on the CLI or with the REST APIs or the Console where supported. The options are normally compatible with each other and can be combined. Where applicable, the default behavior is indicated.

ssl Option

The ssl option determines whether to use SSL for connecting to the Tableau server. When ‘true’, SSL is used. ssl=true is the default when this option is not specified.

tableau://username:password@host/?ssl=true

ssl_verify Option

The ssl_verify option determines whether to require certificate verification for the SSL communication. When ‘true’, certificate verification is required. ssl_verify=true is the default when this option is not specified.

tableau://username:password@host/?ssl=true&ssl_verify=true

Disabling certificate verification is useful when the Tableau server’s SSL certificate is self-signed.
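
When several options are combined, the URL parameters can be assembled programmatically. The following sketch only reproduces the parameter syntax shown in the examples above, using the same placeholder URL; the function name `tableau_result_url` is hypothetical, and the example turns certificate verification off, as one might for a self-signed certificate.

```python
from urllib.parse import urlencode


def tableau_result_url(options: dict) -> str:
    """Append options as URL parameters to the placeholder URL used above."""
    # Booleans are written in lowercase ('true' / 'false') as in the examples.
    params = {k: str(v).lower() if isinstance(v, bool) else v
              for k, v in options.items()}
    return "tableau://username:password@host/?" + urlencode(params)


url = tableau_result_url({"ssl": True, "ssl_verify": False})
print(url)
```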

Timezone

  1. A timestamp value such as 1548979200 is timezone independent, whereas a Tableau DateTime consists of day, hour, minute, and so on. To convert from one to the other, the connector needs to know the target timezone.
  2. If your query contains a TIMESTAMP column, or you cast a datetime column to TIMESTAMP, the value is exported to the Tableau server as DateTime. This involves a conversion, so provide the target timezone as necessary.
  3. Treasure Data stores datetime values in the UTC timezone. In most cases, leave the timezone setting at its default (UTC) to preserve the values from Treasure Data, unless you specifically want to convert them to another timezone.
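
The effect of the target timezone can be seen by converting the example value 1548979200 by hand. This is a minimal sketch using Python's standard library (Python 3.9 or later, with time zone data available on the system); it illustrates the conversion in general and is not the connector's own code.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

epoch = 1548979200  # the example value from the text

# The same instant, rendered as wall-clock time in two timezones
as_utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
as_la = as_utc.astimezone(ZoneInfo("America/Los_Angeles"))

utc_text = as_utc.strftime("%Y-%m-%d %H:%M:%S")
la_text = as_la.strftime("%Y-%m-%d %H:%M:%S")
print(utc_text)  # wall-clock time in UTC
print(la_text)   # wall-clock time in Los Angeles
```

The two printed values differ by the UTC offset of the target timezone, which is why a DateTime column in Tableau shifts when a timezone other than UTC is configured.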

An example of configuring a timezone other than the default UTC is as follows:

From the CLI

$ td query "..." -r '{ "type": "tableau", ..., "timezone": "America/Los_Angeles" }'

As part of TD Workflow

host: "company.online.tableau.com"
ssl: true
ssl_verify: true
username: "my_user"
password: "passw0rd"
datasource: "my_ds"
site: "my_company"
project: "Default"
server_version: "online"
timezone: "America/Los_Angeles"

(Optional) Configure Export Results in Workflow


Within Treasure Workflow, you can specify the use of a data connector to export data.

Learn more at Using Workflows to Export Data with the TD Toolbelt.

Example Workflow for Password Authentication Method

_export:
  td:
    database: tableau_db
 
+tableau_export_task:
  td>: export_tableau.sql
  database: ${td.database}
  result_connection: new_created_tableau_auth
  result_settings:

    type: tableau
    host: "host"
    auth_method: "password"
    username: "xxxxxxxxx"
    password: "xxxxxxxxx"
    ssl: true
    sslVerify: true
    serverVersion: "online"
    datasource: "datasource"
    site: "site"
    project: "project"
    targetType: "hyper"
    chunkSizeInMb: 100
    timezone: "UTC"


Example Workflow for Personal Access Token Authentication Method

_export:
  td:
    database: tableau_db
 
+tableau_export_task:
  td>: export_tableau.sql
  database: ${td.database}
  result_connection: new_created_tableau_auth
  result_settings:

    type: tableau
    host: "host"
    auth_method: "pat"
    pat_name: "xxxxxxxxx"
    pat_secret: "xxxxxxxxx"
    ssl: true
    sslVerify: true
    serverVersion: "online"
    datasource: "datasource"
    site: "site"
    project: "project"
    targetType: "hyper"
    chunkSizeInMb: 100
    timezone: "UTC"


