Data Partitioning in Treasure Data

Tables in Treasure Data are partitioned by the time column by default.

When a query filters records by the time column value, the query engine processes only the relevant records instead of scanning the entire data set. This query-time pruning of data makes processing more efficient.

In the following examples, only records that fit the specified time range are selected.

--example 1:
SELECT  
  ... 
WHERE 
  TD_TIME_RANGE(time, '2013-01-01', NULL, 'PDT');

--example 2:
SELECT 
  COUNT(1) 
FROM 
  table_name
WHERE 
  TD_TIME_RANGE(time, '2017-07-01', '2017-07-02', 'UTC');
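
To make the pruning behavior more concrete, the following sketch contrasts filters that compare the raw time column with a filter that wraps the column in a function. It reuses the placeholder table_name from the examples above. The Unix timestamps correspond to 2017-07-01 and 2017-07-02 at 00:00:00 UTC. The last query is an assumption about typical engine behavior: when the time column is transformed before comparison, the engine may not be able to prune partitions and may scan the whole table.

```sql
-- Likely to benefit from pruning: range filter on the raw time column.
-- The start time is inclusive and the end time is exclusive.
SELECT COUNT(1)
FROM table_name
WHERE TD_TIME_RANGE(time, '2017-07-01', '2017-07-02', 'UTC');

-- Same range written as a direct comparison against Unix timestamps
-- (seconds since the epoch, UTC).
SELECT COUNT(1)
FROM table_name
WHERE time >= 1498867200
  AND time <  1498953600;

-- Assumed to defeat pruning: the time column is transformed before
-- the comparison, so the engine may have to read every partition.
SELECT COUNT(1)
FROM table_name
WHERE TD_TIME_FORMAT(time, 'yyyy-MM-dd', 'UTC') = '2017-07-01';
```

As a rule of thumb, keep the time column bare on one side of the comparison and put any arithmetic or formatting on the constant side.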

The system dynamically merges or splits table partitions in the background to maintain optimal query performance and load balancing.

User-Defined Partitioning

User-defined partitioning is an alternative to timestamp-based partitioning. User-defined partitioning allows other data partitioning strategies that can improve performance when working with non-time-series data. For more information, see Defining Partitioning for Presto.
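
As a rough illustration only, a user-defined partitioned table might be created in Presto along the following lines. The table and column names (customers_udp, customers, customer_id) are hypothetical, and the bucketed_on and bucket_count table properties are assumed here rather than taken from this page, so check the exact syntax in Defining Partitioning for Presto before relying on it.

```sql
-- Hypothetical sketch: create a table bucketed on a non-time key.
-- The property names below are assumptions; verify them in the
-- Presto partitioning guide.
CREATE TABLE customers_udp
WITH (
  bucketed_on = array['customer_id'],
  bucket_count = 512
)
AS
SELECT * FROM customers;

-- A lookup on the bucketing key can then be limited to the matching
-- bucket instead of scanning the whole table.
SELECT *
FROM customers_udp
WHERE customer_id = '12345';
```

This layout suits lookups by key on data that is not naturally a time series, which is the case the section above describes.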

For examples of how to use time-based partitioning in Treasure Data, refer to: