Presto Resource Pools enable you to break up your available compute resources into manageable chunks. You can then organize their usage across projects, groups, or use cases.
Resource Pools are helpful for the following challenges:
You have a team of analysts who often see significant queuing during the work day because large scheduled queries consume the account’s full resources for long periods of time.
You have critical SLA queries that must always have resources available to run, and you want to ensure that other queries (scheduled or ad hoc) don’t get in the way.
Resource pools are only available for accounts with 5 or more Presto Compute Units allocated.
This feature is enabled upon request. Contact support or your primary account representative if you want to use Presto Resource Pools.
Understand Presto Resource Pool Functionality
Based on the number of Presto Compute Units your account has, you are able to use up to a specified maximum of the following:
Concurrent Query Limit
Memory Limits (per query)
Split Compute Limits
Resource Pools allow you to allocate these resources as percentages of your account’s total available amount.
Your resource pools can either be strictly partitioned or can overlap the allocation of your total available resources.
You might enable a scheduled pool with access to up to 70% of your account’s total resources, and an ad hoc pool with access to up to 70% of the account’s resources. In this way, 40% of the account’s resources are shared between the two pools, 30% are dedicated to scheduled queries, and 30% are dedicated to ad hoc queries.
You might want to set up your pools with 40% to a scheduled pool for a stricter SLA environment, and 60% to your ad hoc pool for development purposes.
For the first example given, queries are prioritized between the pools based on the following logic:
Highest priority queued queries are issued first
First come, first served
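The two ordering rules above can be sketched as a small priority queue. This is an illustration only, not the Presto scheduler itself: the function name `dispatch_order` and the convention that a larger number means higher priority are assumptions made for the example.

```python
import heapq
from itertools import count


def dispatch_order(queued):
    """Return query names in the order they would be issued.

    `queued` is a list of (name, priority) pairs in arrival order.
    Assumption for this sketch: a larger priority number means
    higher priority.
    """
    arrival = count()  # tie-breaker that preserves arrival order
    # Negate the priority so the highest priority pops first;
    # the arrival counter gives first come, first served within a priority.
    heap = [(-priority, next(arrival), name) for name, priority in queued]
    heapq.heapify(heap)

    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


if __name__ == "__main__":
    queued = [("adhoc_1", 0), ("sched_1", 2), ("adhoc_2", 0), ("sched_2", 2)]
    print(dispatch_order(queued))
```

Queries with equal priority keep their arrival order because the counter value, not the query name, breaks ties.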
Resource Limits Applied to Pools
Resource Pools divide the resources in an account as percentages of the account total. The percentages apply to:
Query & Account Max Memory
Concurrent Queries (total allowed * Pool %, rounded up)
Options for Presto Resource Pool Allocation
You can create up to 3 resource pools, allocated with percentages of your choosing. Typically, you choose one of the following configurations:
Complete separation of resource usage
Directly allocate resources to each pool so that no resources are shared between them.
Example: 30%, 70% resource split
Partial shared environment
Allow some overlap between pools. Some resources are shared between pools, while others are reserved for each pool.
Example: 70%, 70% resource split.
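The arithmetic behind the two configurations can be checked with a short sketch. It assumes exactly two pools, each with an upper bound given as a whole-number percentage; the function name `shared_and_dedicated` and the dictionary keys are hypothetical and chosen for this example.

```python
def shared_and_dedicated(pool_a_pct, pool_b_pct):
    """Break two pool upper bounds into shared and dedicated percentages.

    Assumption for this sketch: two pools only, bounds in the range 1-100.
    """
    for pct in (pool_a_pct, pool_b_pct):
        if not 0 < pct <= 100:
            raise ValueError("pool percentage must be between 1 and 100")

    # Anything above 100% in total is capacity both pools can reach.
    shared = max(0, pool_a_pct + pool_b_pct - 100)
    return {
        "shared": shared,
        "dedicated_a": pool_a_pct - shared,
        "dedicated_b": pool_b_pct - shared,
        # Capacity neither pool can reach when the bounds sum to under 100%.
        "unallocated": max(0, 100 - pool_a_pct - pool_b_pct),
    }


if __name__ == "__main__":
    print(shared_and_dedicated(30, 70))  # complete separation
    print(shared_and_dedicated(70, 70))  # partial shared environment
```

With a 30%, 70% split nothing is shared; with a 70%, 70% split, 40% is shared and each pool keeps 30% to itself.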
Understand Anytime Split or Simultaneous Splits
Splits are Presto’s way of dividing data into chunks for processing. Your Presto price plan, based on a metric called Presto Units, largely determines how many splits of data can be processed simultaneously, given the compute resources included with your plan. The total number of splits a query requires is proportional to the amount of data scanned and the complexity of the query.
Using Presto resource pools divides the simultaneous splits allocated to your plan among the groups of queries assigned to your resource pools. It also imposes other internal limits on parallel processing to roughly limit your total processing to the amount your plan would consume without resource pools.
If a query can process fewer simultaneous splits, it will take longer to process all the required splits. This can happen any time several queries are running simultaneously in an account, and resources are divided among them by the Presto scheduler. It can also happen because a resource pool's upper bound is set lower than 100%.
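As a rough illustration of why a lower upper bound lengthens a query, the sketch below counts how many "waves" of splits a query needs when it can only process a fixed number of splits at once. This is a deliberate simplification: the real scheduler assigns splits dynamically, and the function name, parameters, and the numbers in the usage example are all hypothetical.

```python
def estimated_waves(total_splits, account_simultaneous_splits, pool_pct):
    """Estimate how many batches of splits a query needs in a pool.

    Assumptions for this sketch: the pool gets a fixed share of the
    account's simultaneous splits, and the query has the pool to itself.
    """
    # Integer arithmetic avoids floating-point rounding; at least 1 split.
    pool_splits = max(1, (account_simultaneous_splits * pool_pct) // 100)
    # Ceiling division: a partial final batch still counts as a wave.
    return -(-total_splits // pool_splits)


if __name__ == "__main__":
    for pct in (100, 70, 50):
        print(pct, estimated_waves(1000, 100, pct))
```

Halving the pool's share doubles the number of waves in this model, which matches the statement that fewer simultaneous splits mean a longer run time.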
Understand Concurrent Queries (CQ) for Resource Pools
CQs allocated across resource pools
The allocation of concurrent queries is based on the specified resource pool percentages. The following examples show how concurrent queries are allocated across resource pools with various percentages. In these examples, the account has an overall CQ of 8:
If you specify an allocation of 70% ad hoc and 70% scheduled, then the CQ for each pool is 6 (8 CQ * 0.70 = 5.6, rounded up to 6).
If you specify an allocation as 60% ad hoc, 40% scheduled, the CQ on the ad hoc pool is 5 (8 CQ * 0.6 = 4.8, rounded up to 5) and the CQ on the scheduled pool is 4 (8 CQ * 0.4 = 3.2, rounded up to 4).
The CQ for the overall account still applies. So in the second example, if you use all 5 queries allowed in the ad hoc pool, the scheduled pool is limited to 3 queries until one of the ad hoc queries finishes. The account, in this example, is specified for only 8 concurrent queries.
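The rounding and the account-wide cap described above can be reproduced with a minimal sketch. The function names `pool_cq_limits` and `available_slots` are hypothetical, and whole-number percentages are assumed.

```python
def pool_cq_limits(account_cq, pool_pcts):
    """Per-pool concurrent query limit: account CQ * pool %, rounded up.

    `pool_pcts` maps a pool name to a whole-number percentage.
    """
    # (x + 99) // 100 is an integer ceiling of x / 100.
    return {
        name: (account_cq * pct + 99) // 100
        for name, pct in pool_pcts.items()
    }


def available_slots(account_cq, pool_limit, running_in_pool, running_total):
    """Queries a pool can still start, given pool and account limits."""
    pool_room = pool_limit - running_in_pool
    account_room = account_cq - running_total
    return max(0, min(pool_room, account_room))


if __name__ == "__main__":
    limits = pool_cq_limits(8, {"adhoc": 60, "scheduled": 40})
    print(limits)
    # All 5 ad hoc slots are in use, so the account cap of 8 leaves
    # the scheduled pool 3 slots even though its own limit is 4.
    print(available_slots(8, limits["scheduled"], 0, 5))
```

Taking the minimum of the pool's remaining room and the account's remaining room is what makes the account CQ the final limit.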
Select Which Query Pool Your Query Will Run On
By default, resource pools can be enabled for scheduled saved queries and ad hoc queries. If you use the default configuration, you do not need to use the following methods to select query pools. You need to select a resource pool explicitly only if you use a custom setup.
If you are using the TD Toolkit to issue queries, you can set up additional pools with custom names, as follows:
TD Console Option
Using the TD Console, you can select a specific resource pool for use by adding the following comment at the top of your query: