Presto Query Engine

Treasure Data is a Hadoop-based Big Data analytics platform. Treasure Data supports Presto as a low-latency query engine.

What is Presto?

Presto is an open-source parallel SQL execution engine. Unlike Hive, Presto doesn’t use the MapReduce framework for execution. Instead, it accesses the data directly through a specialized distributed query engine, very similar to those found in commercial parallel RDBMSs.


Treasure Data has customized Presto to talk directly with our distributed columnar storage layer. As a result, the end user experience is nearly identical to querying Hive.

Does Presto Replace Hive?

No. Hive is designed for batch processing, while Presto is designed for short interactive queries useful for data exploration.

Presto currently has limited fault tolerance during query execution. If a process fails partway through, the whole query must be re-run. On the other hand, Presto executes queries 10-30x faster than Hive. Thus, even if a process fails and the query must be restarted, the total runtime will often still beat Hive’s significantly.

Another caveat is that Presto’s architecture is in-memory only. If a query processes a data set that exceeds the total memory capacity available to Presto, query execution will fail.

Even with Presto as part of our ecosystem, MapReduce and Hive will continue to have many viable use cases (for example: long-running data transformation workloads).

How to Use Presto?

Web Console

Select “Presto” as the query type when using the web console’s query editor.


PostgreSQL Protocol Gateway

Presto also provides a PostgreSQL gateway, which speaks the PostgreSQL protocol. You can issue queries to Treasure Data as if it were a PostgreSQL server.

CLI

From the CLI, pass -T presto to the td query command. Client version v0.10.99 or newer is required.

$ td query -w -T presto -d testdb \
  "SELECT code, COUNT(1) FROM www_access GROUP BY code"
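
Because a failed Presto query has to be re-run in full, it can be convenient to wrap the command in a small retry helper. The following is a minimal sketch, not part of the td client: the retry function and the RETRY_DELAY variable are hypothetical names introduced here, and the sketch assumes that td query -w exits with a non-zero status when the job fails.

```shell
# Hypothetical helper: run a command up to N times until it succeeds.
# Usage: retry <max_attempts> <command> [args...]
retry() {
  max_attempts=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; RETRY_DELAY (seconds) is an assumed knob.
    sleep "${RETRY_DELAY:-5}"
  done
}

# Example (assumes td query -w returns non-zero on job failure):
# retry 3 td query -w -T presto -d testdb \
#   "SELECT code, COUNT(1) FROM www_access GROUP BY code"
```

Retrying is cheap for short interactive queries. It will not help a query that fails because its data set exceeds the memory available to Presto, since each attempt hits the same limit.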

REST API

For the REST API, the endpoint is /v3/job/issue/presto/:database, where :database is the name of the database to query.
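
As a rough illustration, a request to this endpoint could be built with curl as sketched below. Only the endpoint path comes from this article. The API host, the Authorization header scheme, the query form parameter, and the helper names presto_issue_url and issue_presto_query are assumptions made for the example, so check them against the REST API documentation before relying on them.

```shell
# Assumed API host; override TD_API_HOST if your account uses another one.
TD_API_HOST="${TD_API_HOST:-https://api.treasuredata.com}"

# Build the job-issue URL for a database (endpoint path from this article).
presto_issue_url() {
  printf '%s/v3/job/issue/presto/%s\n' "$TD_API_HOST" "$1"
}

# Hypothetical helper: $1 = database, $2 = SQL text.
# Assumes TD_API_KEY holds your API key, and that the API accepts a
# "TD1" Authorization header and a "query" form parameter.
issue_presto_query() {
  curl -sS -X POST \
    -H "Authorization: TD1 ${TD_API_KEY}" \
    --data-urlencode "query=$2" \
    "$(presto_issue_url "$1")"
}

# Example:
# issue_presto_query testdb "SELECT code, COUNT(1) FROM www_access GROUP BY code"
```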

Presto Example Query Catalog

If you’re looking for dozens of Presto SQL templates, please visit Treasure Data’s example query catalog page.

Presto Query Language Reference

Presto supports industry-standard SQL-92 syntax.

The current Presto version is v0.152.3.

Supported UDFs

Known Limitations

See Also


Last modified: Jan 09 2016 02:29:24 UTC
