Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

All native Presto functions can also be used on Treasure Data. For a complete list of functions, see Presto Function and Operators pages.

Table of Contents
maxLevel1

...

For convenience, we recommend using TD_INTERVAL instead of TD_TIME_RANGE.

Signature

Code Block
linenumberstrue
boolean TD_TIME_RANGE(int/long unix_timestamp,
                      int/long/string start_time,
                      int/long/string end_time
                      [, string default_timezone = 'UTC'])

...

This example selects records with timestamps ‘2013-01-01 00:00:00 PDT’ or later. The time of day ('00:00:00') can be omitted. Alternately, the time of day can be specified up to seconds. In general, the time string should be formatted as 'YYYY-MM-DD' or 'YYYY-MM-DD hh:mm:ss'. For example, '2013-01-01' or '1999-01-01 07:00:00'.

Code Block
linenumberstrue
SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01 PDT')                 # OK
SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01', '2013-01-02','PDT') # OK
SELECT ... WHERE TD_TIME_RANGE(time, NULL, '2013-01-01', 'PDT')        # OK
SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01', NULL, 'PDT')        # OK
SELECT ... WHERE TD_TIME_RANGE(time, '2013-01-01', 'PDT')              # NG

...

TD_SCHEDULED_TIME

Signature

Code Block
linenumberstrue
long TD_SCHEDULED_TIME()

Description

...

TD_INTERVAL(time, interval_string, default_timezone)

Code Block
linenumberstrue
boolean TD_INTERVAL(int/long time,
                    string interval_string,
                    [, string default_timezone = 'UTC'])

...

These examples assume that the scheduled_time (or query start time) is 2018-08-14 01:23:45 (Tue, UTC):

Code Block
linenumberstrue
# The last 7 days [2018-08-07 00:00:00, 2018-08-14 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-7d')
# The last week. Monday is the beginning of the week (ISO standard) [2018-08-05 00:00:00, 2018-08-13 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-1w')
# Today [2018-08-14 00:00:00, 2018-08-15 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '1d')
# The last month [2018-07-01 00:00:00, 2018-08-01 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-1M')
# This month [2018-08-01 00:00:00, 2018-09-01 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '1M')
# This year [2018-01-01 00:00:00, 2019-01-01 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '1y')
# The last 15 minutes [2018-08-14 00:08:00, 2018-08-14 01:23:00)
SELECT  ... WHERE TD_INTERVAL(time, '-15m')
# The last 30 seconds [2018-08-14 01:23:15, 2018-08-14 01:23:45)
SELECT  ... WHERE TD_INTERVAL(time, '-30s')
# The last hour [2018-08-14 00:00:00, 2018-08-14 01:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-1h')
# From the last hour to now [2018-08-14 00:00:00, 2018-08-14 01:23:45)
SELECT  ... WHERE TD_INTERVAL(time, '-1h/now')
# The last hour since the beginning of today [2018-08-13 23:00:00, 2018-08-14 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-1h/0d')
# The last 7 days since 2015-12-25 [2015-12-18 00:00:00, 2015-12-25 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-7d/2015-12-25')
# The last 10 days since the beginning of the last month [2018-06-21 00:00:00, 2018-07-01 00:00:00)
SELECT  ... WHERE TD_INTERVAL(time, '-10d/-1M')
# The last 7 days in JST
SELECT  ... WHERE TD_INTERVAL(time, '-7d', 'JST')

...

interval_string must be a 'duration/offset' formatted string. The offset is optional and the UDF assumes offset is the current time (the job scheduled time actually) based on your browser timezone, if the offset is omitted. Also, support 'q' for quarters. For example, '-1d' means yesterday and '-3M' means the last 3 months. The interval is calculated in the specified time unit. This means '-30m' ls the last 30 minutes from the beginning of the latest minute, not from just now.

...

interval_string must be a 'duration/offset' formatted string. The offset is optional and the UDF assumes offset is the current time (the job scheduled time actually) based on your browser timezone, if the offset is omitted. Also, support 'q' for quarters.

time zone is used to interpret the timezone of interval_string. If interval_string specifies a timezone (e.g. '-1h/2017-01-23 01:00:00 +0700'), then time zone is ignored. If the time zone is not specified and interval_string does not have a timezone, then UDF uses 'UTC' as the timezone. A list of supported time zones can be found here.

...

TD_TIME_ADD

Signature

Code Block
linenumberstrue
long TD_TIME_ADD(int/long/string time,
                 string duration
                 [, string default_timezone = 'UTC'])

...

This example selects records with timestamps ‘2013-01-01 00:00:00 UTC’ or later but older than ‘2013-01-02 00:00:00 UTC’.

Code Block
linenumberstrue
SELECT ... WHERE TD_TIME_RANGE(time,
                               '2013-01-01',
                               TD_TIME_ADD('2013-01-01', '1d'))

...

For convenience, we recommend using TD_TIME_STRING instead of TD_TIME_FORMAT.

Signature

Code Block
linenumberstrue
string TD_TIME_FORMAT(long unix_timestamp,
                      string format
                      [, string timezone = 'UTC'])

...

This example formats a UNIX timestamp into a date formatted string:

Code Block
linenumberstrue
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH:mm:ss z') ... FROM ...
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH:mm:ss z', 'PST') ... FROM ...
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH:mm:ss z', 'JST') ... FROM ...

...

How does TD_TIME_FORMAT handle Leap Second?

Code Block
linenumberstrue
SELECT
 TD_TIME_FORMAT(1136073600, 'yyyy-MM-dd HH:mm:ss', 'JST') as st,
 TD_TIME_PARSE('2006-01-01 08:59:60', 'JST') as leap,
 TD_TIME_PARSE('2006-01-01 09:00:00', 'JST') as leap2

TD_TIME_PARSE

Signature

Code Block
linenumberstrue
long TD_TIME_PARSE(string time
                   [, string default_timezone = 'UTC'])

...

For convenience, we recommend TD_TIME_STRING over TD_TIME_FORMAT.

Code Block
linenumberstrue
TD_TIME_STRING(time, '(interval string)', time zone?)

...

  • time: unix time (bigint)

  • interval string:

    Code Block
    linenumberstrue
    [yqMwdhm](!)?

If the format string has ! as the suffix, it truncates the date time string at the specified unit.

...

If there is no !, the return value should be the same as:

Code Block
linenumberstrue
TD_TIME_FORMAT(TD_DATE_TRUNC('(interval unit)', time, timezone), 'yyyy-MM-dd HH:mm:ssZ', timezone)

...

TD_DATE_TRUNC

Signature

Code Block
linenumberstrue
long TD_DATE_TRUNC(string unit,
                   long time
                   [, string default_timezone = 'UTC'])

...

This function mimics the functionality of native Presto’s date_trunc function. However, Presto’s date_trunc does not allow specification of the timezone.

Example

Code Block
linenumberstrue
SELECT TD_DATE_TRUNC('day', time) FROM tbl

...

With the same value and timezone ‘PST’ instead,

Code Block
linenumberstrue
SELECT TD_DATE_TRUNC('day', time, 'PST') FROM tbl

the function returns 1416758400 because the start of the day for the ‘PST’ timezone is 8 hours behind the start of the day for ‘UTC’.

TD_SESSIONIZE_WINDOW

Signature

Code Block
linenumberstrue
string TD_SESSIONIZE_WINDOW(int/long unix_timestamp, int timeout)

...

Sessionization of a table of event data groups a series of event rows associated with users into individual sessions for analysis. The series of events to be grouped into a session must be associated with the same user identifier (typically IP address, email, cookie, or similar identifier) and events are separated by no more than a chosen timeout interval.

...

  • The time field specified in the UNIX epoch

  • A timeout interval in seconds (when this amount of time elapses between events, it indicates the start of a new session)

...

The following example is equivalent to the SELECT statement example in the deprecated TD_SESSIONIZE.

Code Block
linenumberstrue
SELECT
  TD_SESSIONIZE_WINDOW(time, 3600) 
    OVER (PARTITION BY ip_address ORDER BY time) 
    as session_id,
  time,
  ip_address,
  path
FROM
  web_logs

...

TD_PARSE_USER_AGENT

Signature

Code Block
linenumberstrue
string TD_PARSE_USER_AGENT(user_agent string [, options string])

...

The example shows the result of parsing user agent from access log.

Code Block
linenumberstrue
SELECT TD_PARSE_USER_AGENT(agent) AS agent FROM www_access
> {user_agent: {family: "IE", major: "9", minor: "0", patch: null}, os: {family: "Windows 7", major: null, minor: null, patch: null, patch_minor: null}, device: {family: "Other"}}
SELECT TD_PARSE_USER_AGENT(agent, 'os') AS agent_os FROM www_access
> {family: "Windows 7", major: null, minor: null, patch: null, patch_minor: null}
SELECT TD_PARSE_USER_AGENT(agent, 'os_family') AS agent_os_family FROM www_access
> Windows 7

...

This UDF returns a Map value of results to parse a user agent string. The UDF is implemented by Woothee.

Signature

Code Block
linenumberstrue
MAP(varchar,varchar) TD_PARSE_AGENT(user_agent varchar)

...

The example shows the result of parsing the user agent from an access log. If you want to extract a specific ‘key’ from the user agent map. TD recommends using the element_at presto function because it is tolerant of non-existent keys. Extracting keys with the [] operator (e.g. TD_PARSE_AGENT(<agent_string>)[‘<keyname>’]) will throw an error if the sought after the key is not present in the map.

Code Block
linenumberstrue
SELECT TD_PARSE_AGENT(agent) AS parsed_agent, agent FROM www_access
> {"os":"Windows 7","vendor":"Google","os_version":"NT 6.1","name":"Chrome","category":"pc","version":"16.0.912.77"},
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7

SELECT element_at(TD_PARSE_AGENT(agent), 'os') AS os FROM www_access
> Windows 7 => os from user-agent, or carrier name of mobile phones

SELECT element_at(TD_PARSE_AGENT(agent), 'vendor') AS vendor FROM www_access
> Google // => name of vendor

SELECT element_at(TD_PARSE_AGENT(agent), 'os_version') AS os_version FROM www_access
> NT 6.1 // => "NT 6.3" (for Windows), "10.8.3" (for OSX), "8.0.1" (for iOS), ....

SELECT element_at(TD_PARSE_AGENT(agent), 'name') AS name FROM www_access
> Chrome // => name of browser (or string like name of user-agent)

SELECT element_at(TD_PARSE_AGENT(agent), 'category') AS category FROM www_access
> pc // => "pc", "smartphone", "mobilephone", "appliance", "crawler", "misc", "unknown"

SELECT element_at(TD_PARSE_AGENT(agent), 'version') AS version FROM www_access
> 16.0.912.77 => version of browser, or terminal type name of mobile phones

SELECT TD_PARSE_AGENT(agent)['nonexistentkey'] FROM www_access
! The *query errors out* because the <tt>nonexistentkey</tt> key is not present
! in the map returned by <tt>TD_PARSE_AGENT(agent)</tt>.

...

TD_MD5

Signature

Code Block
linenumberstrue
string TD_MD5(col)

Description

This UDF calculates the MD5 hash digest from a given string.

Example

Code Block
linenumberstrue
SELECT TD_MD5(column) FROM tbl

...

URL Decoding

Signature

Code Block
linenumberstrue
string TD_URL_DECODE(col)

...

TD_URL_DECODE applies URL decoding for a given string. This UDF returns half-width space if a character is \r or \n, or \t. This UDF is similar to URL_DECODE(col).

Example

Code Block
linenumberstrue
SELECT TD_URL_DECODE(column) FROM tbl

URL EUC-KR

Signature

Code Block
linenumberstrue
string TD_URL_DECODE(url [, local])

...

TD_URL_DECODE applies URL decoding for a given URL.

 Example

Code Block
linenumberstrue
SELECT TD_URL_DECODE('%BA%ED%B7%E7%C5%F5%BD%BA', 'ko')

SMART_DIGEST

Signature

Code Block
linenumberstrue
string SMART_DIGEST(col [,weight = 1.0])

...

This UDF calculates the variable-length digest from a given string. It usually generates 6-10 characters of digest from the given string. Due to the higher compression ratio, there is a higher collision ratio, around 5% on average. If you want to avoid the collisions, increase the value of the weight parameter.

Example

Code Block
linenumberstrue
SELECT SMART_DIGEST(column) FROM tbl
SELECT SMART_DIGEST(column, 1.5) FROM tbl

...

TD_CURRENCY_CONV

Signature

Code Block
linenumberstrue
string TD_CURRENCY_CONV(string date, string from_currency, string to_currency, float value)

...

Example

Code Block
linenumberstrue
SELECT TD_CURRENCY_CONV('2015-01-01', 'USD', 'JPY', 1.0)

...

jobid

type

td_ip_to_city_name_v6

td_ip_to_latitude_v6

td_ip_to_longitude_v6

td_ip_to_postal_code_v6

218018944

hive

Tokyo

35.685

139.7514

102-0082

218019099

presto

35.6594

139.8533

134-0087

Signature

Code Block
linenumberstrue
string TD_IP_TO_COUNTRY_CODE(string ip)

...

This UDF converts IP address to country code. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_COUNTRY_CODE('106.142.252.8') AS ipv4,
    TD_IP_TO_COUNTRY_CODE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_COUNTRY_NAME

Signature

Code Block
linenumberstrue
string TD_IP_TO_COUNTRY_NAME(string ip)

...

This UDF converts IP address to country code. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_COUNTRY_NAME('106.142.252.8') AS ipv4,
    TD_IP_TO_COUNTRY_NAME('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_SUBDIVISION_NAMES

Signature

Code Block
linenumberstrue
array<string> TD_IP_TO_SUBDIVISION_NAMES(string ip)

...

This UDF converts IP address to a list of subdivisions (e.g. US states, JP prefectures, etc). This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_SUBDIVISION_NAMES('106.142.252.8') AS ipv4,
    TD_IP_TO_SUBDIVISION_NAMES('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_MOST_SPECIFIC_SUBDIVISION_NAME

Signature

Code Block
linenumberstrue
string TD_IP_TO_MOST_SPECIFIC_SUBDIVISION_NAME(string ip)

...

This UDF converts IP address to the most specific subdivisions (e.g. US states, JP prefectures, etc). This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_MOST_SPECIFIC_SUBDIVISION_NAME('106.142.252.8') AS ipv4,
    TD_IP_TO_MOST_SPECIFIC_SUBDIVISION_NAME('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_LEAST_SPECIFIC_SUBDIVISION_NAME

Signature

Code Block
linenumberstrue
string TD_IP_TO_LEAST_SPECIFIC_SUBDIVISION_NAME(string ip)

...

This UDF converts IP address to the least specific subdivisions (e.g. US states, JP prefectures, etc). This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_LEAST_SPECIFIC_SUBDIVISION_NAME('106.142.252.8') AS ipv4,
    TD_IP_TO_LEAST_SPECIFIC_SUBDIVISION_NAME('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_CITY_NAME

Signature

Code Block
linenumberstrue
string TD_IP_TO_CITY_NAME(string ip)

...

This UDF converts IP address to city name. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_CITY_NAME('106.142.252.8') AS ipv4,
    TD_IP_TO_CITY_NAME('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_LATITUDE

Signature

string
Code Block
linenumberstrue
string TD_IP_TO_LATITUDE(string ip)

...

This UDF converts IP address to latitude. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_LATITUDE('106.142.252.8') AS ipv4,
    TD_IP_TO_LATITUDE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_LONGITUDE

Signature

Code Block
linenumberstrue
string TD_IP_TO_LONGITUDE(string ip)

...

This UDF converts IP address to longitude. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_LONGITUDE('106.142.252.8') AS ipv4,
    TD_IP_TO_LONGITUDE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_METRO_CODE (US Only)

Signature

Code Block
linenumberstrue
string TD_IP_TO_METRO_CODE(string ip)

...

This UDF converts IP address to metro code (US Only). This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_METRO_CODE('106.142.252.8') AS ipv4,
    TD_IP_TO_METRO_CODE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_TIME_ZONE

Signature

Code Block
linenumberstrue
string TD_IP_TO_TIME_ZONE(string ip)

...

This UDF converts IP address to time zone. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_TIME_ZONE('106.142.252.8') AS ipv4,
    TD_IP_TO_TIME_ZONE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_POSTAL_CODE

Signature

Code Block
linenumberstrue
string TD_IP_TO_POSTAL_CODE(string ip)

...

This UDF converts IP address to postal code. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_POSTAL_CODE('106.142.252.8') AS ipv4,
    TD_IP_TO_POSTAL_CODE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_CONNECTION_TYPE

Signature

Code Block
linenumberstrue
string TD_IP_TO_CONNECTION_TYPE(string ip)

...

This UDF converts IP address to connection type. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_CONNECTION_TYPE('106.142.252.8') AS ipv4,
    TD_IP_TO_CONNECTION_TYPE('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_IP_TO_DOMAIN

Signature

Code Block
linenumberstrue
string TD_IP_TO_DOMAIN(string ip)

...

This UDF converts IP address to domain. This UDF supports IPv4 and IPv6.

Example

Code Block
linenumberstrue
SELECT
    TD_IP_TO_DOMAIN('106.142.252.8') AS ipv4,
    TD_IP_TO_DOMAIN('2001:268:d005:f5be:c43e:af35:81f:8f60') AS ipv6

...

TD_LAT_LONG_TO_COUNTRY

Signature

Code Block
linenumberstrue
string TD_LAT_LONG_TO_COUNTRY(string type, double latitude, double longitude)

Description

This UDF converts geo location geolocation information (latitude/longitude) to the country name.

Example

Code Block
linenumberstrue
SELECT
  TD_LAT_LONG_TO_COUNTRY('FULL_NAME',                 37, -122)
  TD_LAT_LONG_TO_COUNTRY('THREE_LETTER_ABBREVIATION', 37, -122)
  TD_LAT_LONG_TO_COUNTRY('POSTAL_ABBREVIATION',       37, -122)
  TD_LAT_LONG_TO_COUNTRY('SORTABLE_NAME',             37, -122)

...

This function is not supported.

Signature

Code Block
linenumberstrue
string SORTED_GROUP_CONCAT(column, delimiter, orderkey)

...

This UDF returns the concatenation of column with delimiter ordered by orderkeyin a group of values.

Example

Code Block
linenumberstrue
SELECT groupkey, SORTED_GROUP_CONCAT(column, '.', time)
FROM table
GROUP BY groupkey

...

Geometry types are the building blocks of geospatial queries and calculations. They are used as arguments to for geospatial functions. Geometry type is the product of a constructor function.

...

ST_Point

Signature

Code Block
linenumberstrue
point ST_Point(double, double)

...

A constructor function that returns a geometry type point object with the given coordinate values.

Example

Code Block
linenumberstrue
SELECT s.school_Name as School, s.phoneNumber as Phone
FROM schools as s
JOIN counties as c
ON st_contains(c.geo_shape, st_point(s.loc_lng, s.loc_lang))

...

ST_Polygon

Signature

Code Block
linenumberstrue
polygon ST_Polygon(varchar)

...

Returns a geometry type polygon object from WKT representation.

Example

Code Block
linenumberstrue
SELECT r.resevoirName as Name, r.area as Area
FROM resevoirs as r
JOIN parks as p
ON st_contains(p.geo_shape, ST_Polygon(r.geo_shape))

...

ST_Intersection and ST_Intersects are examples of Geospatial Relationship Functions. Relationship functions allow you to find relationships between two different geometric inputs. They return a boolean result type.

...

ST_Intersection

Signature

Code Block
linenumberstrue
Geometry ST_Intersection(Geometry, Geometry) 

...

Returns the geometry value that represents the point set intersection of two geometries.

Example

Code Block
linenumberstrue
SELECT ST_AsText(ST_INTERSECTION(ST_POINT(1,1), ST_POINT(1,1))
FROM tbl1

...

Returns the intersection of 2 points as a text coordinates.

ST_Intersects

Signature

Code Block
linenumberstrue
boolean ST_Intersects(Geometry, Geometry) 

...

Returns true if the given geometries spatially intersect in two dimensions (share any portion of space) and false if they do not (they are disjoint).

Example

Code Block
linenumberstrue
SELECT ST_INTERSECTS(ST_LINE('linestring(8 7, 7 8)'), ST_POLYGON('polygon((1 1, 4 1, 4 4, 1 4))'))
FROM tbl1

...