# Trino Troubleshooting

## Troubleshooting Trino Query Runtime Limits and Query Hint Override

By default, the Trino service limits query execution time to 6 hours.
You can manage the Trino query execution time in several ways.
You might want to limit some or all Trino queries to run for a shorter period.
For example:

- You can limit execution time for ad hoc queries by default, to avoid
unexpectedly high Trino costs.
- You can push users to choose Hive over Trino for large and
long-running data pipeline jobs.


### Policies to Limit Query Execution Time

Contact Support to request custom limits on the following types of queries:

- Ad hoc queries from the Console, and queries submitted through ODBC and JDBC
- Queries that run as scheduled queries or as steps in workflows
- Queries explicitly assigned to any named resource pool


The limits apply to queries created by any user.

### Limiting Execution Time of a Single Query with Query Hints

If you want to override any other limits for a single query, you can
apply the following query hint at the beginning of the query:


```
-- SET SESSION query_max_execution_time = '6h'
SELECT COUNT(*) FROM ALL_EVENTS;
```

If limits are set in your account, and you write a query with a hint,
then the smallest limit between account, resource-group, query-hint, and
global is used.

Supported time measures are d (days), h (hours), m (minutes), s(seconds).

Note: All Trino queries are limited to 6 hours runtime, regardless of
any account-level policy or query hint.

If the limit specified in the query hint is longer than the configured
limit, then the shorter configured limit still applies. For example, if
a 1-hour limit is configured for ad hoc queries, then you can use the
magic comment to limit a query to 5 minutes (5m) but not to increase the
limit to 2 hours (2h).

## Troubleshooting Trino Queries

Typically, when Trino queries running on Treasure Data have problems
the following solutions can be tried:

- [Start with a smaller data set and review subqueries](#start-with-a-smaller-data-set-and-review-subqueries)
- [Create a Sample Data Set with a VALUES Clause](#create-a-sample-data-set-with-a-values-clause)


If they do not solve your particular issue, there is always e-mail to
[support@treasuredata.com](mailto:support@treasuredata.com) with the job IDs of your queries. If
possible, include information about the expected results and the meaning
of your data set.

### Start with a smaller data set and review subqueries

Trino can process millions of rows in a second. If you see query errors
or unexpected results, try to minimize your input data set. Here are
some typical ways you can refine your query:

- Narrow down the input data size by using
[TD_TIME_RANGE](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_time_range)
- Create a sample table by using CREATE TABLE AS SELECT … WHERE ….
  - Review: Use CREATE TABLE AS SELECT.
- Minimize your query:
  - Remove irrelevant columns from the SELECT statements
  - Reduce the number of joins
  - Replace SELECT c1, c2, … with SELECT count(*)
    - If you are still experiencing query errors, skipping the
query result generation with count(*) will reduce your
investigation time.
- Extract sub-queries and check their results one by one.
  - Running individual subqueries can help identify the performance
bottleneck.
- Check the conditions in the WHERE clause carefully.
  - Checking a NULL value using the equal (=) operator always
returns false. Instead, use IS NULL (or IS NOT NULL).
  - The following query scans the entire data set even if
TD_TIME_RANGE is used:



```
SELECT * FROM table1
WHERE
  col1 < 100
  OR col2 is TRUE
  AND TD_TIME_RANGE(time, '2015-11-01')
```

The full data set scan occurs because AND has a stronger precedence than
OR. This condition is equivalent to:


```
(col1 < 100) OR (col2 is TRUE AND TD_TIME_RANGE(time, '2015-11-01'))
```

The first condition does not specify any TD_TIME_RANGE, so it results in
scanning the whole table. To fix this, use parenthesis appropriately:


```
(col1 < 100 OR col2 is TRUE) AND TD_TIME_RANGE(time, '2015-11-01')
```

### Create a Sample Data Set with a VALUES Clause

To test Trino queries, use the VALUES clause to prepare a sample data
set. You don’t need to import a data set. For example:


```
SELECT * FROM (VALUES (1, 'apple'), (2, 'banana')) as fruit(id, name);

# This query gives the following table:
#  id |  name
#-----+--------
#   1 | apple
#   2 | banana
```

## Troubleshooting Trino Errors

Review the following sections to troubleshoot Trino error codes.

### Trino error - Concatenated string is too large

This error is caused by using the concat function ( `concat` or `||` ) to concatenate strings that have a combined length greater than 1048576 characters.

**Resolution**

As a workaround, break your concat statement into two or more statements that each yield fewer than 1048576 total characters.

### Trino error - mismatched input xxxx expecting - SYNTAX ERROR

**Description**

For example, the query below will simply cause the error in title:

**Example**


```
SELECT * from tbl SELECT;
```

**Error Message**


```
line 1:15: mismatched input 'SELECT'. Expecting: '(', 'LATERAL', 'UNNEST', <identifier>
```

**Cause**

A keyword is detected in SQL, while it should not be there as per SQL grammar. As result a compile error as titled, will be output.

**Resolution**

Locate the keyword by information in error message (in this case "line 1:15"), then check and fix it.


```
SELECT * from tbl;
```

### Trino Error - second argument of max n min n must be less than or equal to 10000 - found xxxx

**Description**

Trino can take a 2nd/3rd argument (n) to get largest value of all input
values in the following 4 functions:

- min(col, n)
- max(col, n)
- min_by(col1, col2, n)
- max_by(col1, col2, n)


However, Trino restricts n to 10000 or less since v0.194.

**Cause**

When calling the above functions, using a large value for *n* can cause
performance issues in Trino clusters. As a result, Trino will return
an error when a value greater than 10000 is for used for *n*.

**Resolution**

Assign a value less than 10000 to n, when you call any of the 4
functions.

**Related Information**

- [Trino Aggregate Functions](https://trino.io/docs/423/functions/aggregate.html)


### Trino error - column xxxx cannot be resolved - SYNTAX ERROR

If you receive this error message, you can take actions to resolve the
issue.

**Description**

The following is an example.

Table Definition:

![image](/assets/336100.24575ad2c0a3de6eed29e5304ad7aae68e5ef47ba4eee38479363bc66159c1ed.0053d791.png)

Based on the example, the following queries can cause the error: Column
'xxxx' cannot be resolved:


```
SELECT id1 FROM tbl;
Column 'id1' cannot be resolved
```

or


```
SELECT id FROM tbl WHERE id="123" ;
Column '123' cannot be resolved
```

**Cause**

Column 'id1' does not exist (the column specified in the SQL query is
incorrect).

or

Double quotations are used to contain the string in the WHERE clause.
Double quotations can be used only to specify the column name.

**Resolution**

Use correct column name in SQL query.


```
SELECT id FROM tbl;
```

or

Use single quotations around strings values.

Do not use any quotations when specifying a number.


```
SELECT id FROM tbl WHERE id=123 ;
```

### Trino error - Unexpected parameters for function yyyy

This example query would cause the following error:


```
SELECT td_time_format(time) from pageviews;
```

Error:


```
Unexpected parameters (bigint) for function td_time_format.
Expected: td_time_format(bigint, varchar, varchar) , td_time_format(bigint, varchar)
```

**Cause**

This error occurs when the number of arguments or the data type provided
in the function is different from those specified for that function.

**Resolution**

Provide the correct number and type of arguments required for the
function. See the notes for the correct number and type of arguments.

Fixed query:


```
SELECT td_time_format(time, 'yyyy-MM-dd HH:mm:ss', 'UTC') from pageviews;
```

### TOO MANY REQUESTS FAILED

**Error Example:**


```
Query 20160519_223238_81890_gf734 failed: TOO_MANY_REQUESTS_FAILED: Failed to read f1a04ceb-c217-4ee4-ba4f-17b10c78ec9f.
```

**Description**

The **TOO_MANY_REQUESTS_FAILED** error indicates that there are many
data reading errors. Trino itself does not have fault tolerance, so in
this situation, it will exit with an error once. By re-executing the
query in TD, the query itself will be successful. Therefore, the log
will be cleared once, but the Trino query will be re-executed after a
few minutes.

**Cause**

The message happens when our internal error increased. But, trino job
retries automatically, so the job succeeded.

**Resolution**

Trino job retries automatically or re-execute the query in TD.

### Trino Reliability and PAGE TRANSPORT TIMEOUT Error

Trino transfers the data through the network. Because connection
timeouts or problems in worker nodes, this network data transfer may
occasionally fail with the PAGE_TRANSPORT_TIMEOUT error.

Trino is designed for faster query processing when compared to Hive, so
it sacrifices fault-tolerance, to some extent. Typically, more than
99.5% of Trino queries finish without any error on the first run.

Treasure Data provides a query retry mechanism on query failures, so
nearly 100% of queries finish successfully after being retried.

Because of the nature of network-based distributed query processing, if
your query tries to process billions of records, the chance of hitting
network failures increases. If you start seeing the
PAGE_TRANSPORT_TIMEOUT error frequently, try to reduce the input data
size by narrowing down the TD_TIME_RANGE or reducing the number of
columns in SELECT statements.

### QUERY HAS TOO MANY STAGES Error

**Error Example:**


```
Query 20250314_041639_00000_vc9wt failed: Number of stages in the query (2515) exceeds the allowed maximum (2500).
```

**Description**

The **QUERY_HAS_TOO_MANY_STAGES** error indicates that the query has too many
stages. A large number of stages will introduce instability in the cluster.
The maximum number of stages in a query plan is capped at 2500 in TD.

**Cause**

The message happens when a query generates more stages than 2500.

**Resolution**

Rewrite the query not to exceed the allowed maximum. If the query contains
WITH clauses that are referenced more than once, please consider creating
temporary table(s) for the queries in those clauses.