Tableau + Treasure Data Reference Architecture

While Treasure Data is a BI tool agnostic service, customers like retailer MUJI are using Tableau for the BI / Visual Analytics. In this article, we want to introduce our customers’ usage pattern of combining Treasure Data, Tableau Desktop and Tableau Server. Let’s begin with understanding the characteristics of each solutions.

Table of Contents

What is Tableau?



Tableau is a business intelligence software that helps people see and understand their data. There are two major products provided by Tableau Software: Tableau Desktop and Tableau Server. Tableau Desktop is a Desktop Application (Windows or Mac) to visualize and analyze data. It helps us create workbooks, visualizations, dashboards, and stories.

Users can publish these to Tableau Server for sharing within an organization. In short, Tableau Desktop is a BI designer tool, while Tableau Server is a publishing environment to share the visualizations. Tableau Online is a hosted version of Tableau Server, which doesn’t require you to manage the BI server.

What is Treasure Data?

Treasure Data is a cloud-based managed service for data and analytics. Treasure Data empowers data-driven companies to focus on insights, not infrastructure.

Users can store trillions of records in the cloud by collecting semi-structured big data in real-time, and aggregate the data by using one of several query engines. Often times, those results will be fed to a data warehouse or reporting server for further consumption by end-users.

Tableau + Treasure Data Reference Architecture

So, why combine Treasure Data & Tableau? Treasure Data provides a scalable backend to handle new big data sources (application logs, web logs, mobile data, sensor data, etc), while Tableau provides flexible visual analytics for existing data sources (EDW, CRM, ERP, etc).

By combining Treasure Data and Tableau, you can quickly get the insights from any data sources of any size. Here’s a reference architecture diagram. Let’s see how it works step by step.



Step 1: Collect Big Data (Treasure Data)



First, let’s start collecting data into Treasure Data. Treasure Data provides various ways to collect data into the cloud in near-real-time. The data sources depicted here are ‘time-series’ data, which means there is historical data, produced in real time, and growing rapidly as your business scales. Here are the four main data collection capabilities provided by Treasure Data:

Treasure Data is now importing almost 1 Million records per second, and all Treasure Data customers benefit from such scale. Setting up the data collection usually takes only a couple of hours, or even a few minutes in some cases.

Step 2: Aggregate Big Data (Treasure Data)



Now we have raw data in the cloud. To provide a better experience for the BI consumers, it’s a good idea to summarize this raw data into smaller sets for performance reasons. By using one of Treasure Data’s embedded query engines, you can crunch big data into an aggregated format.

Treasure Data supports ‘Tableau Result Output’ so you can directly push the aggregated results into Tableau Server. You don’t need any additional infrastructure to do this. You can even automate this process by using Scheduled Jobs to periodically aggregate the data.

Treasure Data can push the query results as a ‘Tableau Data Extract‘ (TDE) file. TDE is a Tableau’s proprietary columnar file format, optimized for efficiently slicing and dicing data (see Why Use Tableau Data Extracts). The TDE file will be directly saved into Tableau Server.

Step 3: Design Workbooks (Tableau Desktop)



Now we have raw data access and aggregated data too. It’s time to explore the data using Tableau Desktop. It has a lot of built-in connectors for existing data sources (EDW, CRM, ERP, Excel, etc), so you can interact with them directly.

Tableau Desktop can also connect to Tableau Server to directly interact with the TDE files.

You can create the dashboard using TDE files on Tableau Server too. Every time you drag & drop the columns on your Desktop, the data is processed at the Server side. If you have too much network latency between Server and Desktop, you can download the TDE file to your local disk.

Third, Treasure Data provides an ODBC driver for Tableau Desktop so that data analyst can have raw data access.

Analysts can choose any of the above methods depending on the needs. You can also join across these data sources. For example, you can join between Salesforce.com data and a TDE file, or even join between multiple TDE files. Once the workbook is created, Tableau Desktop can publish it to Tableau Server.

Step 4: Share the Workbooks (Tableau Server)



Now everything is set. Analysts can publish workbooks to the server and the consumers can view these from their browsers. The beauty is analysts can quickly iterate on the data and reports by having access to all the data sources, so they’re now self-reliant.

Conclusion

Tableau + Treasure Data empowers data-driven companies to rapidly explore data and get insights. By combining these two solutions, people can focus on insights, not infrastructure, with an industry-leading visual analytics tool. If you have any questions, please contact us. For the next steps, please follow these links to learn how to connect to Treasure Data from each Tableau product.


Last modified: Feb 22 2017 23:40:42 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.