The following information can help you learn how to write Presto queries. Before you begin writing your queries, review Presto Query Runtime Limits and Query Hint Override.



SQL Basics

There are many more SQL statements than what we document here. These are some basic SQL functions that you might use in your queries.

Select Data

You can use the SELECT statement to select data from a database. The data returned is stored in a results table, called the result-set.

SELECT … FROM … WHERE … LIMIT

Example

SELECT contact_id, last_name, first_name
FROM contacts
WHERE website = "Company_Organization.com"
ORDER BY contact_id DESC
LIMIT 5;

Aggregate Data

Use aggregate functions to perform a calculation on one or more values and return a single value.

GROUP BY

Use the GROUP BY command to group the result set (used with aggregate functions: COUNT, MAX, MIN, SUM, AVG).

Example

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;

MIN()/MAX()

The MIN() function returns the smallest value of the selected column.

Example

SELECT MIN(column_name)
FROM table_name
WHERE condition;

The MAX() function returns the largest value of the selected column.

Example

SELECT MAX(column_name)
FROM table_name
WHERE condition;

MIN_BY/MAX_BY

MIN_BY takes two arguments and returns the value of the first argument for which the value of the second argument is minimized. 

SELECT MIN_BY(close_date, close_value) as date_of_min_sale
FROM sales_pipeline

MAX_BY takes two arguments and returns the value of the first argument for which the value of the second argument is maximized. 

APPROX_DISTINCT

APPROX_DISTINCT calculates an approximate count of the number of distinct values. 

Example

SELECT APPROX_DISTINCT(close_value)
  FROM sales_pipeline


APPROX_PERCENTILE

APPROX_PRECENTILE calculates the value at a given percentile of a distribution of values. 

Example

SELECT APPROX_PERCENTILE(close_value, 0.75)
FROM sales_pipeline

Work with Time Ranges

Choose a Target Data Time Range

Use TD_TIME_RANGE and TD_INTERVAL.

Format Date Time Strings

Use TD_TIME_FORMAT and TD_TIME_STRING.

Truncate Time into Hour, Day, and Month

Use TD_DATE_TRUNC.

Specify a Query Target Range

Use TD_SCHEDULED_TIME.

Create and Update Tables from SQL Results

The following SQL statements can help you create and update tables in Presto.

CREATE TABLE … AS

Create a new table containing the result of a select query. Use CREATE TABLE to create an empty table. You might want to use these optional clauses:

  • IF NOT EXISTS clause causes the error to be suppressed if the table already exists.

  • WITH clause can be used to set properties on the newly created table.

Example

CREATE TABLE [ IF NOT EXISTS ] table_name [ ( column_alias, ... ) ]
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]
AS query
[ WITH [ NO ] DATA ]

INSERT INTO

Insert data into an existing table. The following example uses person as the name of the table.

Example

insert into person values (1, 'james', 10, '20190301');

Delete Records

Delete rows from a table. If you specify a WHERE clause, only the matching rows are deleted. Otherwise, all rows from the table are deleted.

The WHERE clause also helps reduce unintentional DELETE commands.

DELETE statements are helpful for:

  • Cleaning up error records

  • Removing duplicate records

  • Removing unnecessary or unexpected results from result tables

Limitations

Review known limitations for the Presto DELETE statement.

Syntax

DELETE FROM <table_name> [ WHERE <condition> ]

Where:

  • <table_name> is the name of the table for deletion

  • <condition> is the clause to specify a search condition

Examples

Delete all products shipped by boat.

DELETE FROM products WHERE shipmode = 'BOAT';

Delete all products for low priority orders.

DELETE FROM products
WHERE orderkey IN (SELECT orderkey FROM orders WHERE priority = 'LOW');

Delete all products.

DELETE FROM products;

Perform Window Functions

Window functions perform calculations across rows of the query result. They run after the HAVING clause but before the ORDER BY clause. Invoking a window function requires special syntax using the OVER clause to specify the window. A window has these three components:

  • Partition Specification. Separates the input rows into different partitions. This is analogous to how the GROUP BY clause separates rows into different groups for aggregate functions.

  • Ordering Specification. Determines the order in which input rows are processed by the window function.

  • Window Frame. Specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This frame contains all rows from the start of the partition up to the last peer of the current row.

For example, the following query ranks orders for each clerk by price.

Example

SELECT orderkey, clerk, totalprice,
       rank() OVER (PARTITION BY clerk
                    ORDER BY totalprice DESC) AS rnk
FROM orders
ORDER BY clerk, rnk

Access Array Type Data

ARRAY_AGG

You can use the aggregate function, array_agg, with ARRAY data types that allow arrays to be created as part of the SELECT list of a query.

Example

The following target table contains one row per employee and stores all phone numbers associated with an employee in an ARRAY type.

 CREATE SET TABLE employeePhoneInfo
      (emp_id INTEGER, 
       emp_name VARCHAR(30), 
       emp_phone emp_phone_array);
 
   INSERT INTO employeePhoneInfo 
      SELECT emp_id, emp_name, 
         ARRAY_AGG(emp_phone, NEW emp_phone_array()) 
      FROM employee GROUP BY emp_id,emp_name 
         WHERE emp_id < 100;

AWS Lambda

AWS Lambda is a compute service that lets you run code without provisioning or managing servers.

Example

This example creates an array (fruits). The has_elements flag identifies whether or not more elements need to be processed.

exports.handler = (event, context, callback) => 
    var fruits = ['apple', 'orange', 'pinaple'];
    
    event.fruits = fruits;
    
    if (event.fruits.length > 0) 
        event.has_elements = true;
     else 
        event.has_elements = false;
    
    
    callback(null, event);
;

This example processes the array elements. We want to remove processed elements from the array. The .shift() method removes the first element from the array.

exports.handler = (event, context, callback) => 
    let fruit = event.fruits.shift();
    // Process array element
    console.log('Processing fruit: ' + fruit);
    
    // Array still has elements:
    if (event.fruits.length > 0) 
        event.has_elements = true;
     else 
        event.has_elements = false;
    
    
    // Log array elements (for demonstration purpose)
    console.log('Elements in array: ' + event.fruits);
    console.log('Array has more elements: ' + event.has_elements);
    
    callback(null, event);
;

Process Data

Data Type Conversion Functions

Presto implicity converts numeric and character values to the correct type if such a conversion is possible. Presto does not convert between character and numeric types. For example, a query that expects a varchar does not automatically convert a bigint value to an equivalent varchar.

When necessary, values can be explicitly cast to a particular type.

Conversion Functions

cast(value AS type) → type

Explicitly cast a value as a type. This can be used to cast a varchar to a numeric value type and vice versa.

try_cast(value AS type) → type

Like cast(), but returns null if the cast fails.

String Functions

You can write string functions to extract, find like, replace, and split data.

Learn more about Presto Regular Expression Functions.

Process JSON

Use JSON functions and operators process data.

Learn more about Presto JSON Functions and Operators.

Numbers (integer, float)

There are many mathematical operators available in Presto that you can use to process data.

Here are a few you might find useful:

  • ceil(x) → [same as input]

This is an alias for ceiling().

  • floor(x) → [same as input]

Returns x rounded down to the nearest integer.

  • round(xd) → [same as input]

    Returns x rounded to d decimal places

To review all mathematical operators, visit Presto Documentation: Mathematical Functions and Operators.


  • No labels