# Writing Trino Queries

The following information can help you learn how to write Trino queries.

## Trino Queries

What follows is not a complete set of SQL statements. These are just some basic SQL functions that you might use in your queries.

Info
Before you begin writing your queries, review [Trino Query Runtime Limits and Query Hint Override](/products/customer-data-platform/data-workbench/queries/trino/troubleshooting#troubleshooting-trino-query-runtime-limits-and-query-hint-override).

### Select Data

You can use the SELECT statement to select data from a database. The data returned is stored in a results table, called the result-set.

SELECT … FROM … WHERE … LIMIT

**Example**


```
SELECT contact_id, last_name, first_name
FROM contacts
WHERE website = "Company_Organization.com"
ORDER BY contact_id DESC
LIMIT 5;
```

### Aggregate Data

Use aggregate functions to perform a calculation on one or more values and return a single value.

#### GROUP BY

Use the `GROUP BY` command to group the result set (used with aggregate functions: COUNT, MAX, MIN, SUM, AVG).

**Example**


```
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
```

#### MIN()/MAX()

The MIN() function returns the smallest value of the selected column.

**Example**


```
SELECT MIN(columnname)
FROM table_name
WHERE condition;
```

The MAX() function returns the largest value of the selected column.

**Example**


```
SELECT MAX(column_name)
FROM table_name
WHERE condition;
```

#### MIN_BY/MAX_BY

`MIN_BY` takes two arguments and returns the value of the first argument for which the value of the second argument is minimized.


```
SELECT MIN_BY(close_date, close_value) as date_of_min_sale
FROM sales_pipeline
```

`MAX_BY` takes two arguments and returns the value of the first argument for which the value of the second argument is maximized.

#### APPROX_DISTINCT

`APPROX_DISTINCT` calculates an approximate count of the number of distinct values.

**Example**


```
SELECT APPROX_DISTINCT(close_value)
  FROM sales_pipeline
```

#### APPROX_PERCENTILE

`APPROX_PRECENTILE` calculates the value at a given percentile of a distribution of values.

**Example**


```
SELECT APPROX_PERCENTILE(close_value, 0.75)
FROM sales_pipeline
```

### Work with Time Ranges

#### Choose a Target Data Time Range

Use [TD_TIME_RANGE](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_time_range) and [TD_INTERVAL](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_interval).

#### Format Date Time Strings

Use [TD_TIME_FORMAT](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_time_format) and [TD_TIME_STRING](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_time_string).

#### Truncate Time into Hour, Day, and Month

Use [TD_DATE_TRUNC](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_date_trunc)

#### Specify a Query Target Range

Use [TD_SCHEDULED_TIME](/products/customer-data-platform/data-workbench/queries/sql-reference/td_trino_function_reference#td_scheduled_time)

### Create and Update Tables from SQL Results

The following SQL statements can help you create and update tables in Trino.

#### CREATE TABLE … AS

Create a new table containing the result of a select query. Use CREATE TABLE to create an empty table. You might want to use these optional clauses:

- `IF NOT EXISTS` clause causes the error to be suppressed if the table already exists.
- `WITH` clause can be used to set properties on the newly created table.


**Example**


```
CREATE TABLE [ IF NOT EXISTS ] table_name [ ( column_alias, ... ) ]
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]
AS query
[ WITH [ NO ] DATA ]
```

#### INSERT INTO

Insert data into an existing table. The following example uses `person` as the name of the table.

**Example**


```
insert into person values (1, 'james', 10, '20190301');
```

### Perform Window Functions

Window functions perform calculations across rows of the query result. They run after the `HAVING` clause but before the `ORDER BY` clause. Invoking a window function requires special syntax using the `OVER` clause to specify the window. A window has these three components:

- **Partition Specification**. Separates the input rows into different partitions. This is analogous to how the `GROUP BY` clause separates rows into different groups for aggregate functions.
- **Ordering Specification**. Determines the order in which input rows are processed by the window function.
- **Window Frame**. Specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to `RANGE UNBOUNDED PRECEDING`, which is the same as `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. This frame contains all rows from the start of the partition up to the last peer of the current row.


For example, the following query ranks orders for each clerk by price.

**Example**


```
SELECT orderkey, clerk, totalprice,
       rank() OVER (PARTITION BY clerk
                    ORDER BY totalprice DESC) AS rnk
FROM orders
ORDER BY clerk, rnk
```

### Process Data

#### Data Type Conversion Functions

Trino implicitly converts numeric and character values to the correct type if such a conversion is possible.
Trino does not convert between character and numeric types. For example, a query that expects a `varchar` does not automatically convert a `bigint` value to an equivalent `varchar`.

When necessary, values can be explicitly cast to a particular type.

**Conversion Functions**

`cast(value AS type) → type`

Explicitly cast a value as a type. This can be used to cast a varchar to a numeric value type and vice versa.

`try_cast(value AS type) → type`

Like `cast()`, but returns null if the cast fails.

#### String Functions

You can write string functions to extract, find like, replace, and split data.

Learn more about [Trino Expression Functions](https://trino.io/docs/423/functions/regexp.html).

#### Process JSON

Use JSON functions and operators to process data.

Learn more about [Trino JSON Functions and Operators](https://trino.io/docs/423/functions/json.html)

#### Numbers (integer, float)

There are many mathematical operators available in Trino that you can use to process data.

Here are a few you might find useful:

- `ceil(x) → [same as input]`
This is an alias for `ceiling()`.
- `floor(x) → [same as input]`
Returns `x` rounded down to the nearest integer.
- `round(x, d) → [same as input]`
Returns `x` rounded to `d` decimal places


To review all mathematical operators, see [Trino Documentation: Mathematical Functions and Operators](https://trino.io/docs/423/functions/math.html).

## Trino Query Tips

### Use the WITH Clause for Nested Queries

The WITH clause is useful for nested queries as shown in this example
query:


```
SELECT a, b, c
FROM (
  SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a
) tbl_alias
```

The same query can be written with the WITH clause as follows:


```
WITH tbl_alias AS(
  SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a
)
SELECT a, b, c FROM tbl_alias
```

Notice that the query has been “de-nested”.

The following example shows multiple subqueries use WITH. Each subquery
is delimited by a comma.


```
WITH tbl1 AS(
  SELECT a, MAX(b) AS b, MIN(c) AS c FROM tbl GROUP BY a
),
tbl2 AS(
  SELECT a, AVG(d) AS d FROM another_tbl GROUP BY a
)
SELECT tbl1.*, tbl2.*
FROM tbl1 JOIN tbl2 ON tbl1.a = tbl2.a
```

### Use VALUES for Prototyping

If you want to quickly test Trino syntax, you can use VALUES to create
a table immediately.


```
SELECT a, b, c
FROM (
  VALUES
    (1, 2.0, 'x'),
    (3, 4.0, 'y'),
    (5, 6.0, 'z')
) tbl(a, b, c)
```

The above query returns the following table:

| **a** | **b** | **c** |
|  --- | --- | --- |
| 1 | 2.0 | 'x' |
| 3 | 4.0 | 'y' |
| 5 | 6.0 | 'z' |


### Clean up a Table

To clean up a table before using CREATE TABLE AS or INSERT INTO
statements, use multiple statements split by semi-colon.


```
DROP TABLE IF EXISTS mytable;

CREATE TABLE mytable AS SELECT ...;
```

## Eventual Consistency in Multiple Updates Job

Multiple-statement execution is not guarded by a transaction, multiple
update operations in a single job are not atomic but eventual consistency
is maintained. For example, assuming the following job failed after I1:


```
-- I1
INSERT INTO mytable SELECT ...;

-- I2
INSERT INTO mytable SELECT ...;
```

If the system finds a sporadic failure during the job execution, it will
start the job from scratch, but operations once succeeded were skipped.
So the actual execution could be I1 (success), I2 (failure), (retry from scratch),
I1 (skipped) and I2 (success). The final result will be correct.

## Trino Query Signatures

You can view query signatures from the Trino Utilization dashboard and
use the query signatures to identify and group similar queries.

The Trino Utilization dashboard is available upon request for most customers.

Trino query signatures are clusters of Trino queries that share common SQL
statements. Similar jobs, jobs with only a few differences (such as the
date range computed), are grouped together. You can then analyze the
clusters of queries and determine which queries require the most or
least resources in a given month.

The Trino Query Details table, which is posted on the Trino
Utilization dashboard, includes a table column: Query Names or
Signature.

- [Query Signature Definitions for Common Letters](#query-signature-definitions-for-common-letters)
- [Example of Query Signatures](#example-of-query-signatures)


### Query Signature Definitions for Common Letters

Query Signatures are subject to change. The following list defines
common query letters that help to decipher the query that the query
signature represents:

| Query Letter | Description |
|  --- | --- |
| S | Select `S[* ]` means `SELECT *` is used (retrieving all columns. This argument might be inefficient because our storage is columnar format.) |
| Cnt | count(x) |
| CntD | count(distinct x) (If query has this signature, it means a single node process = slow!) |
| T | table |
| G | Group by |
| J | Join |
| O | Order by (single node process = slow!) |
| L | Limit |
| CT | Create table |
| I | Insert Into |
| DT | Delete table |
| D | Delete |
| E | Duplicate Elimination (select distinct, union, etc.) |
| U | Union All |
| E(U(…)) | Union (with duplicate elimination) |


### Example of Query Signatures

The following example shows SQL statements mapped to query signatures:

| Query Signature | Results |
|  --- | --- |
| SELECT 1 | `S(V)` |
| SELECT * FROM t | `S *  t - &gt; #` |
| INSERT INTO A SELECT * FROM B | `B I(``S *``) B - &gt; A` |
| SELECT * FROM t, ‘query’ | `S[*](T)` |
| SELECT * FROM t, ‘table’ | `t - &gt; #` |