The following information can help you learn how to write Presto queries. Before you begin writing your queries, review Presto Query Runtime Limits and Query Hint Override.
There are many more SQL statements than what we document here. These are some basic SQL functions that you might use in your queries.
You can use the SELECT statement to select data from a database. The data returned is stored in a results table, called the result-set.
SELECT … FROM … WHERE … LIMIT
Use aggregate functions to perform a calculation on one or more values and return a single value.
GROUP BY command to group the result set (used with aggregate functions: COUNT, MAX, MIN, SUM, AVG).
The MIN() function returns the smallest value of the selected column.
The MAX() function returns the largest value of the selected column.
MIN_BY takes two arguments and returns the value of the first argument for which the value of the second argument is minimized.
MAX_BY takes two arguments and returns the value of the first argument for which the value of the second argument is maximized.
APPROX_DISTINCT calculates an approximate count of the number of distinct values.
APPROX_PRECENTILE calculates the value at a given percentile of a distribution of values.
Work with Time Ranges
Choose a Target Data Time Range
Format Date Time Strings
Truncate Time into Hour, Day, and Month
Specify a Query Target Range
Create and Update Tables from SQL Results
The following SQL statements can help you create and update tables in Presto.
CREATE TABLE … AS
Create a new table containing the result of a select query. Use CREATE TABLE to create an empty table. You might want to use these optional clauses:
IF NOT EXISTSclause causes the error to be suppressed if the table already exists.
WITHclause can be used to set properties on the newly created table.
Insert data into an existing table. The following example uses person as the name of the table.
Delete rows from a table. If you specify a
WHERE clause, only the matching rows are deleted. Otherwise, all rows from the table are deleted.
The WHERE clause also helps reduce unintentional DELETE commands.
DELETE statements are helpful for:
Cleaning up error records
Removing duplicate records
Removing unnecessary or unexpected results from result tables
Review known limitations for the Presto DELETE statement.
<table_name> is the name of the table for deletion
<condition> is the clause to specify a search condition
Delete all products shipped by boat.
Delete all products for low priority orders.
Delete all products.
Perform Window Functions
Window functions perform calculations across rows of the query result. They run after the
HAVING clause but before the
ORDER BY clause. Invoking a window function requires special syntax using the
OVER clause to specify the window. A window has these three components:
Partition Specification. Separates the input rows into different partitions. This is analogous to how the
GROUP BYclause separates rows into different groups for aggregate functions.
Ordering Specification. Determines the order in which input rows are processed by the window function.
Window Frame. Specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to
RANGE UNBOUNDED PRECEDING, which is the same as
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This frame contains all rows from the start of the partition up to the last peer of the current row.
For example, the following query ranks orders for each clerk by price.
Access Array Type Data
You can use the aggregate function, array_agg, with ARRAY data types that allow arrays to be created as part of the SELECT list of a query.
The following target table contains one row per employee and stores all phone numbers associated with an employee in an ARRAY type.
AWS Lambda is a compute service that lets you run code without provisioning or managing servers.
This example creates an array (fruits). The
has_elements flag identifies whether or not more elements need to be processed.
This example processes the array elements. We want to remove processed elements from the array. The
.shift() method removes the first element from the array.
Data Type Conversion Functions
Presto implicity converts numeric and character values to the correct type if such a conversion is possible. Presto does not convert between character and numeric types. For example, a query that expects a varchar does not automatically convert a bigint value to an equivalent varchar.
When necessary, values can be explicitly cast to a particular type.
cast(value AS type) → type
Explicitly cast a value as a type. This can be used to cast a varchar to a numeric value type and vice versa.
try_cast(value AS type) → type
cast(), but returns null if the cast fails.
You can write string functions to extract, find like, replace, and split data.
Learn more about Presto Regular Expression Functions.
Use JSON functions and operators process data.
Learn more about Presto JSON Functions and Operators.
Numbers (integer, float)
There are many mathematical operators available in Presto that you can use to process data.
Here are a few you might find useful:
ceil(x) → [same as input]
This is an alias for
floor(x) → [same as input]
x rounded down to the nearest integer.
round(x, d) → [same as input]
To review all mathematical operators, visit Presto Documentation: Mathematical Functions and Operators.