# Presto 0 205 to 317 Migration 2020

Treasure Data’s Presto release is based on the Presto 317 open source release. Our previous release was based on Presto 0.205. This article includes:

* [New Magic Comment Features](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h1_1921683457)
* [parse_decimal_literals_as_double](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__584086195)
* [Distributed_Sort](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__1974809204)
* [time_partitioning_range](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__1734486412)
* [Deprecated Features](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h1__1821394898)
* [Migrate Your Distributed Joins](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_2028709490)
* [Updating Presto](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h1__1516417081)
* [About Side-by-Side Environments](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__1521992254)
* [Running Code in Presto 317 vs. Presto 0.205](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__1437983654)
* [Query Syntax Fixes](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h1__87043302)
* [Having too many OR predicate. Maximum 500 OR allowed](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_1448312120)
* [Key not present in map: XXX](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_756987314)
* [Array subscript out of bounds](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__599545120)
* [Column alias list has 1 entry but 't' has N columns available'](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_1644950305)
* [Window does not support distinct](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__285762603)
* [ARRAY comparison not supported for arrays with null elements](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__1682355132)
* ['XXX' must be an aggregate expression or appear in a GROUP BY clause](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_2060302521)
* [Zero-length delimited identifier not allowed](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__852805306)
* [Unexpected parameters (char(1)) for function codepoint. Expected: codepoint(varchar(1))](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2__190112671)
* [Invalid format: "XXX" is malformed / Invalid timezone: "XXX" is malformed](/products/customer-data-platform/data-workbench/queries/presto-0-205-to-317-migration-2020#h2_1996712627)


We have done internal testing to identify compatibility and performance regression issues.

# New Magic Comment Features

## parse_decimal_literals_as_double

The **parse_decimal_literals_as_double** session property is supported as a magic comment.

Decimal literals without an explicit type specifier (e.g. 1.2) are treated as the values of the DOUBLE type by default.

This magic comment can disable the behavior and use decimal literals as the values of the DECIMAL type. For example:


```sql
-- set session parse_decimal_literals_as_double = 'false'
    SELECT typeof(1.1);
```

## Distributed_Sort

The**distributed_sort** session property is supported as a magic comment.

Distributed sort allows sorting of data, which exceeds `query.max-memory-per-node`. Distributed sort is enabled through the `distributed_sort` session property, or `distributed-sort` configuration property set in `etc/config.properties` of the coordinator. Distributed sort is enabled by default.

When distributed sort is enabled, the sort operator executes in parallel on multiple nodes in the cluster. Partially sorted data from each worker node is then streamed to a single worker node for a final merge. This technique allows to utilization of memory of multiple worker nodes for sorting. The primary purpose of the distributed sort is to allow for the sorting of data sets that don’t normally fit into single node memory. Performance improvement can be expected, but it won’t scale linearly with the number of nodes, since the data needs to be merged by a single node.


```sql
-- set session distributed_sort = 'true'
    SELECT * FROM large_table, small_table
    WHERE small_table.id = large_table.id
```

## time_partitioning_range

The time_partitioning_range session property is supported as a magic comment.


```sql
-- set session time_partitioning_range = 'value`  
  
The value can be

  * none

    * no time partitioning

  * number with unit

    * allowed units are `h, d, mo, q, y` (hour, day, month, quarter, year)

    * ex) 12h, 2d, 1mo, 1q, 1y

# Deprecated Features

## Migrate Your Distributed Joins

The **distributed_join** session property has been removed.

Use **join_distribution_type** session property instead. The **join_distribution_type** system property accepts the following values:

  * PARTITIONED

  * BROADCAST

Deprecated code:

```sql
-- set session distributed_join = 'true'
    SELECT * FROM large_table, small_table
    WHERE small_table.id = large_table.id  
  
Updated code:

```sql
-- set session join_distribution_type = 'PARTITIONED'
    SELECT * 
    FROM large_table, small_table
    WHERE small_table.id = large_table.id
```

# Updating Presto

## About Side-by-Side Environments

For your testing, Presto 0.205 and Presto 317 are available side-by-side to allow testing of the code before the upgrade.

## Running Code in Presto 317 vs. Presto 0.205

During the transition from Presto 317 and Presto 0.205 the two releases will be available side-by-side.

You can control which Presto release runs your code by using the following query hint:

| **Version** | **Query Hint** |
|  --- | --- |
| Presto 317 | `-- @TD engine_version: 317` |
| Presto 0.205 | `-- @TD engine_version: 0.205` |


# Query Syntax Fixes

Running queries that worked in the previous version might cause error messages. For each case, we show the message, a query that causes the error, and an example of how to rewrite the query.

## Having too many OR predicate. Maximum 500 OR allowed

Original code that causes the error:


```sql
SELECT path FROM  ( 
     SELECT * FROM sample_datasets.www_access ) WHERE 
    path = 'foo1' OR path = 'foo2' OR path = 'foo3'
    -- Query having a lot of "OR" clauses
    OR path = 'foo559';
```

Modified code that fixes the error:


```sql
SELECT path FROM ( 
     SELECT * FROM sample_datasets.www_access ) WHERE
    path IN (
     'foo1', 'foo2', 'foo3' ,
    -- Query having a lot of conditions
    'foo559');
```

Typically, using an **IN** clause to rewrite your queries solves the issue.

## 

Key not present in map: XXX

Original code that causes the error:


```sql
SELECT MAP(ARRAY [1, 3], ARRAY [2, 4])[5];
```

Modified code that fixes the error:


```sql
SELECT element_at(MAP(ARRAY [1, 3], ARRAY [2, 4]), 5);
```

You can use the function [element_at](https://prestosql.io/docs/317/functions/map.md) to rewrite your queries. The function [element_at](https://prestosql.io/docs/317/functions/map.md) is supported by both cluster versions, v0.205, and v317. The preceding query runs on both clusters.

## 

Array subscript out of bounds

Original code that causes the error:


```sql
SELECT
     numbers[99] -- key isn't exist
    FROM
     (VALUES (ARRAY[1,2,3]) ) AS t(numbers);
```

Modified code that fixes the error:


```sql
SELECT
      element_at(numbers, 99) -- key isn't exist
    FROM
      (VALUES (ARRAY[1,2,3]) ) AS t(numbers);
```

The function [element_at](https://prestosql.io/docs/317/functions/array.md) is supported by both cluster versions, v0.205, and v317.

The preceding query runs on both clusters.

## 

Column alias list has 1 entry but 't' has N columns available'

Original code that causes the error:


```sql
WITH
    dataset AS (
     SELECT ARRAY[
     CAST(ROW('Amy', 'devops') AS ROW(name VARCHAR, department VARCHAR))
     ] AS users
    ),
    u AS (
     SELECT person
     FROM
     dataset,
     UNNEST(dataset.users) AS t(person)
     )
    SELECT
     person.name,
     person.department
    FROM u;
```

Modified code that fixes the error:


```sql
WITH
    dataset AS (
     SELECT ARRAY[
     CAST(ROW('Amy', 'devops') AS ROW(name VARCHAR, department VARCHAR))
     ] AS users
    ),
    u AS (
     SELECT
     name, department
     FROM
     dataset,
     UNNEST(dataset.users) AS person(name, department)
     )
    SELECT
     name,
     department
    FROM u;
```

## 

Window does not support distinct

Original code that causes the error:


```sql
SELECT 
      UPPER(DISTINCT(id)) AS user_id 
    FROM (VALUES ('foo'), ('bar')) AS t(id);
```

Modified code that fixes the error:


```sql
SELECT 
      UPPER(id) AS user_id 
    FROM (
      SELECT DISTINCT id FROM (VALUES ('foo'), ('bar')
    ) AS t (id));
```

## ARRAY comparison not supported for arrays with null elements

Original code that causes the error:


```sql
SELECT id, array_agg(value)
    FROM (
     VALUES
     (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (3, null)
    ) AS t(id, value)
    GROUP BY 1 ORDER BY 2;
```

Modified code that fixes the error:


```sql
SELECT id, array_agg(value) FILTER (where value is not null)
    FROM (
     VALUES
     (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (3, null)
    ) AS t(id, value)
    GROUP BY 1 ORDER BY 2;
```

The function [element_at](https://prestosql.io/docs/317/functions/map.md) is supported by both cluster versions, v0.205, and v317. The preceding query runs on both clusters.

### Details of this error: Behavior change of array_agg function

For Presto 0.205, the array_agg function ignores NULL values.


```
presto> SELECT id, array_agg(value) FROM ( VALUES (1, null), (1, 0) ) AS t(id, value) GROUP BY 1;
     id | _col1
    ----+-------
     1 | [0]
    (1 row)
    
    presto> SELECT id, array_agg(value) FROM ( VALUES (1, null) ) AS t(id, value) GROUP BY 1;
     id | _col1
    ----+-------
     1 | NULL
    (1 row)  
  
In Presto 0.205, _deprecated_._legacyagg_ property, true is enabled in order to keep this legacy behavior of the array_agg function.

This property was removed in Presto 317 and there is no ability to roll back the legacy behavior.

For Presto 317, the array_agg function doesn't ignore NULL values and returns an array including NULL values.
```

presto> SELECT id, array_agg(value) FROM ( VALUES (1, null), (1, 0) ) AS t(id, value) GROUP BY 1;
id | _col1
----+-----------
1 | [null, 0]
(1 row)

presto> SELECT id, array_agg(value) FROM ( VALUES (1, null) ) AS t(id, value) GROUP BY 1;
id | _col1
----+--------
1 | [null]
(1 row)

As a result, when Presto compares with a column that has arrays with NULL values, Presto returns the `ARRAY comparison not supported for arrays with null elements` message.

## 'XXX' must be an aggregate expression or appear in a GROUP BY clause

Original code that causes the error:


```sql
SELECT code FROM sample_datasets.www_access 
    HAVING code IS NOT NULL;
```

Modified code that fixes the error:


```sql
SELECT code FROM sample_datasets.www_access 
    WHERE code IS NOT NULL;
```

If you use a HAVING clause as a WHERE clause, this error appears.

## 

Zero-length delimited identifier not allowed

Original code that causes the error:


```sql
SELECT 1 AS "";
```

Modified code that fixes the error:


```sql
SELECT 1 AS "A";
```

## Unexpected parameters (char(1)) for function codepoint. Expected: codepoint(varchar(1))

Original code that causes the error:


```sql
SELECT codepoint(CAST('a' AS CHAR));
```

Modified code that fixes the error:


```sql
SELECT codepoint(CAST('a' AS VARCHAR(1)));
```

## Invalid format: "XXX" is malformed / Invalid timezone: "XXX" is malformed

Error handling has been improved in v317. When an invalid argument gets passed, the current Presto returns NULL, but the new Presto returns the error.

Original code that causes the error:


```sql
SELECT td_time_range(1567890000, 'yyyy-MM-dd','JST', NULL);
```

Modified code that fixes the error:


```sql
SELECT td_time_range(1567890000, '2019-09-08', 'JST', NULL);
```