# Trino Performance Tuning

Review the following information to learn how to tune Trino for the
best performance. These performance tuning tips assume the following level of experience:

- Basic knowledge of Treasure Data.
- Basic knowledge of the Trino query engine. Learn how to write [Trino Queries](/ja/products/customer-data-platform/data-workbench/queries/trino/writing_trino_queries).


## Tuning Tips

Review these tips and make adjustments to improve the performance of your Trino queries.

- Always use [TD_TIME_RANGE](/ja/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_time_range)
or [TD_INTERVAL](/ja/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_interval)
to limit the time range that the query scans.
- Avoid slow, memory-consuming operators, such as:
  - ORDER BY
  - COUNT(DISTINCT x)
    - Consider using the approximate aggregate function [approx_distinct](https://trino.io/docs/423/functions/aggregate.html#approx_distinct)(x) instead.
- Join processing
  - Join tables in order from the largest table to the smallest.
  - Non-equi join conditions slow down query processing.
- Columnar storage characteristics
  - Choosing too many columns slows down query processing.
- Query result size
  - Generating too many rows takes time. Consider writing result records to a table with CREATE TABLE AS... or INSERT INTO.
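For example, the last tip can be sketched as follows (the table and column names here are hypothetical):


```
-- Write a large aggregation result to a table instead of
-- downloading millions of rows to the client.
CREATE TABLE access_summary AS
SELECT user_id, COUNT(*) AS cnt
FROM access
WHERE TD_TIME_RANGE(time, '2024-01-01', '2024-02-01', 'UTC')
GROUP BY 1
```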


## Specifying the Columns you Need

Actual data in Treasure Data is stored in a columnar storage format,
which is optimized for queries that read only specific columns.
Restricting the accessed columns can improve your query performance
significantly. Specify only the columns you need instead of using a wildcard
(`*`).


```
[GOOD]: SELECT time,user,host FROM tbl
[BAD]:  SELECT * FROM tbl
```

## Leveraging Time-Based Partitioning

All imported data is automatically partitioned based on the `time` field within each data record.

By specifying the time range in your query, you can avoid reading
unnecessary data and can thus speed up your query significantly.

### Specify Time as an Integer

When the `time` field is specified within a WHERE clause, the query
parser automatically detects which partitions should be processed. This
auto-detection does **not** work if you specify the time as a `float`
instead of an `int`.


```
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020 + 3600
[GOOD]: SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020 - 3600
[BAD]:  SELECT field1, field2, field3 FROM tbl WHERE time > 13493930200 / 10
[BAD]:  SELECT field1, field2, field3 FROM tbl WHERE time > 1349393020.00
[BAD]:  SELECT field1, field2, field3 FROM tbl WHERE time BETWEEN 1349392000 AND 1349394000
[BAD]:  SELECT field1, field2, field3 FROM tbl WHERE time > (SELECT MAX(last_updated) FROM tbl2)
```

### Use TD_TIME_RANGE

You can use TD_TIME_RANGE to take advantage of time-based partitioning.


```
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01', 'UTC')
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01', NULL, 'UTC')
[GOOD]: SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01',
                                       TD_TIME_ADD('2013-01-01', '1day', 'UTC'))
```

However, if you use division in TD_TIME_RANGE, the time partition optimization does not work. For example, Treasure Data discourages the following SQL constructions because they disable the optimization.

**Poorly optimized SQL**


```
SELECT ... WHERE TD_TIME_RANGE(time, TD_SCHEDULED_TIME() / 86400 * 86400)
```

**Poorly optimized SQL**


```
SELECT ... WHERE TD_TIME_RANGE(time, 1356998401 / 86400 * 86400)
```

The TD_INTERVAL user-defined function is also available for partitioning.
TD_INTERVAL provides an intuitive way to specify a time range.
For example, to select the last 7 days:


```
SELECT ... WHERE TD_INTERVAL(time, '-7d')
```

## Considering the Cardinality within the GROUP BY Clause

It's possible to improve the performance of GROUP BY by carefully
ordering the list of fields within the GROUP BY clause from highest to
lowest cardinality.

| **SQL Example** |  |
|  --- | --- |
| Good | `SELECT ... GROUP BY uid, gender` |
| Bad | `SELECT ... GROUP BY gender, uid` |


Alternatively, use numbers instead of strings as GROUP BY keys, because
numbers require less memory and are faster to compare than strings.
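As a sketch, assuming a hypothetical `users` table that stores gender both as a string and as an integer code:


```
-- Grouping by the integer code (gender_id) compares faster and
-- uses less memory than grouping by the equivalent string column.
SELECT gender_id, COUNT(*) AS cnt
FROM users
GROUP BY gender_id
```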

## Using LIMIT with ORDER BY

ORDER BY requires that all rows be sent to a single worker, which then
sorts them, so ORDER BY can require a lot of memory on a Trino
worker. When you want to look up only the top or bottom N records, use LIMIT,
which reduces the sort cost and memory pressure.

| **SQL Example** |  |
|  --- | --- |
| Good | `SELECT * FROM tbl ORDER BY time LIMIT 100` |
| Bad | `SELECT * FROM tbl ORDER BY time` |


## Using Approximate Aggregate Functions

Trino has several [approximate aggregate
functions](https://trino.io/docs/350/functions/aggregate.html)
that give you significant performance improvements at the cost of a
small error. For example, by using the approx_distinct() function, you can get an
approximation of COUNT(DISTINCT x) with a standard error of 2.3%. The
following example gives an approximate count of the previous day's
unique users.


```
SELECT
  approx_distinct(user_id)
FROM
  access
WHERE
  TD_TIME_RANGE(time,
    TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1d', 'PDT'),
    TD_SCHEDULED_TIME())
```

## Aggregating a Series of LIKE Clauses into a Single regexp_like Clause

Trino's query optimizer cannot improve queries that contain many
LIKE clauses. As a consequence, the query execution might be slower than
expected.

To improve performance, you can substitute a series of LIKE clauses
chained with the OR condition with a single regexp_like clause,
which is native to Trino.

For example:


```
SELECT
  ...
FROM
  access
WHERE
  method LIKE '%GET%' OR
  method LIKE '%POST%' OR
  method LIKE '%PUT%' OR
  method LIKE '%DELETE%'
```

can be optimized by replacing the four LIKE clauses with a single
regexp_like clause:


```
SELECT
  ...
FROM
  access
WHERE
  regexp_like(method, 'GET|POST|PUT|DELETE')
```

## Choosing Trino Join and Sort Algorithms

### Join Algorithms

There are two types of join distributions:

- Partitioned: each node participating in the query builds a hash
table from a fraction of the data.
- Broadcast: each node participating in the query builds a hash table
from all of the data (the data is replicated to each node).


The default join algorithm in Trino is the distributed PARTITIONED join.
This algorithm partitions both the left and right tables using the hash
values of the join keys. The partitioned join uses multiple worker nodes for
processing queries; it is generally faster and uses less memory.

In some cases where one of the join tables is very small, the overhead
of redistributing data over the network for the PARTITIONED
join might exceed the cost of broadcasting the whole small table to all the
nodes participating in the join operation. In those cases, a BROADCAST
join might perform better. If you use the BROADCAST join, specify the
larger table first in the join clause. The BROADCAST join can be enabled by
specifying the following magic comment.


```
-- set session join_distribution_type = 'BROADCAST'
```

This option uses more memory because the right join table is copied
to all nodes.
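A fuller sketch of a BROADCAST join (the table names are hypothetical); the larger table is listed first so that the smaller right-hand table is the one replicated:


```
-- set session join_distribution_type = 'BROADCAST'
SELECT l.user_id, s.category
FROM large_table l           -- larger table first
JOIN small_table s           -- smaller table is broadcast to every node
  ON l.category_id = s.id
```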

### Sort Algorithms

For queries with an ORDER BY clause, Trino uses distributed sort by
default: the sorting operation runs in parallel on multiple nodes, and a
single node merges the final results. However, there might be cases
where a single-node sort performs better. For such cases,
distributed sort can be disabled with the following magic comment.


```
-- set distributed_sort=false
```

Disabling distributed sort uses more memory on a single node; the
query might fail if it exceeds the maximum memory limit.
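For example, a query with a small, LIMITed result might sort faster on a single node (the table name here is hypothetical):


```
-- set session distributed_sort = 'false'
SELECT user_id, time
FROM access
ORDER BY time DESC
LIMIT 100
```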

## Defining Partitioning for Trino

All tables in Treasure Data are partitioned based on the time column.
This is why queries that use TD_TIME_RANGE or similar predicates on the
time column are efficient in Treasure Data. Trino can eliminate
partitions that fall outside the specified time range without reading them.

User-defined partitioning (UDP) provides hash partitioning for a table
on one or more columns in addition to the time range partitioning.
A query that filters on the set of columns used as user-defined partitioning keys
can be more efficient because Trino can skip scanning partitions that
have matching values on that set of columns.

Refer to [User-defined partitioning](/ja/products/customer-data-platform/data-workbench/queries/sql-reference/udp) for the details of UDP.
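As a rough sketch of the idea (see the UDP documentation linked above for the exact syntax), a table can be bucketed on a frequently filtered column at creation time; the table, columns, and bucket count below are illustrative:


```
-- Hash-partition (bucket) the table on user_id in addition to
-- the automatic time partitioning.
CREATE TABLE customer_events (
  time bigint,
  user_id varchar,
  event varchar
)
WITH (
  bucketed_on = array['user_id'],
  bucket_count = 512
)
```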

## Managing Resources for Trino Queries

- Use resource pools
- Run more concurrent queries
- Use more CPU resources
- Use more memory (limited)


## Magic Comments

Magic comments are instructions inserted into Trino SQL queries as SQL comments. They override session settings and the execution plan that the query optimizer would otherwise select for a query.

Use magic comments when you know more about the data content and structure than the query optimizer does. In such cases, magic comments can improve execution time.
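Magic comments are written as SQL comments at the top of the query, before the statement they modify. A minimal sketch (the session property shown here is described later in this page; the table name is hypothetical):


```
-- set session query_max_execution_time = '1h'
SELECT COUNT(*) FROM access
```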

### Magic Comment Parsing Decimal Literals as Double

A Trino session-level property that controls whether decimal literals (for example, 1.2) are parsed as DECIMAL or DOUBLE values.

The *parse_decimal_literals_as_double* session property is supported as a magic comment.

By default, decimal literals without an explicit type specifier (e.g. 1.2) are treated as values of the DOUBLE type.

This magic comment disables that behavior so that decimal literals are treated as values of the DECIMAL type. For example:


```
-- set session parse_decimal_literals_as_double = 'false'
SELECT price * 1.2 AS adjusted_price  -- 1.2 is parsed as DECIMAL
FROM orders                           -- hypothetical table
```

### Magic Comment Distributed Sort

A Trino session property that, when enabled, allows the sort operator to execute in parallel on multiple nodes in the cluster.

Distributed sort allows sorting of data, which exceeds `query.max-memory-per-node`. Distributed sort is enabled through the distributed_sort session property, or distributed-sort configuration property set in `etc/config.properties` of the coordinator. Distributed sort is enabled by default.

When distributed sort is enabled, the sort operator executes in parallel on multiple nodes in the cluster. Partially sorted data from each worker node is then streamed to a single worker node for a final merge. This technique allows Trino to utilize the memory of multiple worker nodes for sorting. The primary purpose of distributed sort is to allow sorting of data sets that don't fit into a single node's memory. Performance improvement can be expected, but it won't scale linearly with the number of nodes, since the data needs to be merged by a single node.


```
-- set session distributed_sort = 'true'
SELECT * FROM large_table
ORDER BY time
```

### Magic Comment Time Partitioning Range

A TD-specific setting to customize partition size for INSERT/CTAS queries.

The time_partitioning_range session property is supported as a magic comment.


```
-- set session time_partitioning_range = 'value'
```

The value can be:

- none
  - no time partitioning
- number with unit
  - allowed units are `h, d, mo, q, y` (hour, day, month, quarter, year). For example, 12h, 2d, 1mo, 1q, 1y
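For example, to write a table with daily partitions from a CTAS query (the table names here are hypothetical):


```
-- set session time_partitioning_range = '1d'
CREATE TABLE daily_events AS
SELECT *
FROM events
WHERE TD_TIME_RANGE(time, '2024-01-01', '2024-02-01', 'UTC')
```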


### Magic Comment Limiting Execution Time

If you want to override any other limits for a single query, you can apply the following query hint at the beginning of the query:


```
-- set session query_max_execution_time = '2h'
SELECT
  COUNT(*)
FROM
  ALL_EVENTS
;
```

If limits are set in your account and you write a query with a hint, the smallest limit among the account, resource-group, query-hint, and global limits is used. Supported time units are h (hours), m (minutes), and s (seconds).

If the limit specified in the query hint is longer than the configured limit, then the shorter configured limit still applies. For example, if a 1-hour limit is configured for ad hoc queries, then you can use the magic comment to limit a query to 5 minutes (5m) but not to increase the limit to 2 hours (2h).

**Info:** All Trino queries are limited to a 6-hour runtime, regardless of any account-level policy or query hint.

### Magic Comment to Improve Result Output Performance

To improve the performance of jobs that produce massive results, Trino supports the `result_output_redirect` session property. By default, `result_output_redirect` is set to 'true'.


```
-- set session result_output_redirect='true'
```

Without this setting, even after the query results are ready, Trino waits for a single worker node to write them out sequentially, which impacts performance. With `result_output_redirect` enabled, Treasure Data redirects the query result to S3 in parallel, thereby improving the performance of the queries.

LIMITATION: There are SQL queries where this hint does not improve performance, including:

- When a query includes ORDER BY
- When a query is a CREATE TABLE or INSERT query
- When a query includes uppercase letters
- When a query has duplicated column names
- When a query has NULL without explicit type


## Trino Join Performance Improvement with Equi-Joins

Sometimes you can use equi-joins to improve the performance of joins in
your Trino queries. An *equi-join* concatenates tables by comparing
join keys using the equality (=) operator.

If this comparison involves complex expressions, join processing slows down.

For example, suppose you want to join two tables on a date string such as
'2015-10-01', but one of the tables only has separate columns for the year,
month, and day values. You can write the following query to generate the date
strings:


```
SELECT a.date, b.name FROM
left_table a
JOIN right_table b
ON a.date = CAST((b.year * 10000 + b.month * 100 + b.day) as VARCHAR)
```

This query slows down the join processing because the join condition
involves several expressions.

To make the example query faster, you can push this condition down into
a subquery to prepare a join key beforehand:


```
SELECT a.date, b.name
FROM
  left_table a
JOIN (
  SELECT
    -- generate the join key beforehand
    CAST((year * 10000 + month * 100 + day) AS VARCHAR) AS date,
    name
  FROM right_table
) b
ON a.date = b.date  -- simple equi-join
```

In this example, the join keys are the a.date and b.date VARCHAR columns.
Comparing two VARCHAR columns is much faster than comparing a VARCHAR
column with the result of an expression.

## Trino Query Simplification Options

The following are some of the ways that you can simplify your Trino
queries.

### Use WITH statement

If your query becomes complex or deeply nested, try to extract
subqueries using a WITH clause. For example, the following query has
a nested subquery:


```
SELECT a, b, c FROM (
   SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a
) tbl_alias
```

can be rewritten as follows:


```
WITH tbl_alias AS (SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a)
SELECT a, b, c FROM tbl_alias
```

You can also enumerate multiple subqueries in the WITH clause by using
commas:


```
WITH tbl1 AS (SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a),
     tbl2 AS (SELECT a, AVG(d) AS d FROM another_tbl GROUP BY a)
SELECT tbl1.*, tbl2.* FROM tbl1 JOIN tbl2 ON tbl1.a = tbl2.a
```

### Use WITH statement in CREATE TABLE statement

If your CREATE TABLE query becomes complex or deeply nested, try to
extract subqueries using a WITH clause. For example, a subquery can be
rewritten as follows:


```
CREATE TABLE tbl_new AS WITH tbl_alias AS (SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl1)
SELECT a, b, c FROM tbl_alias
```

You can also enumerate multiple subqueries in the WITH clause as in the
following:


```
CREATE TABLE tbl_new AS WITH tbl_alias1 AS (SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl1),
                             tbl_alias2 AS (SELECT a, AVG(d) AS d FROM tbl2)
SELECT tbl_alias1.*, tbl_alias2.* FROM tbl_alias1 JOIN tbl_alias2 ON tbl_alias1.a = tbl_alias2.a
```

### Specify GROUP BY targets with numbers

The GROUP BY clause requires the same expression as in the SELECT
statement:


```
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT') hour, count(*) cnt
FROM my_table
GROUP BY TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT')  -- redundant expression
```

You can simplify this query by using GROUP BY 1, 2, …:


```
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT') hour, count(*) cnt
FROM my_table
GROUP BY 1
```

These numbers correspond to the 1-based column indexes of the SELECT
statement.