This article shows how to use Treasure Data with the R language by using the RPresto package.

Install the RPresto Package

  1. Open the R Console.

  2. Install the RPresto and RTD packages, along with the packages they are used with in this article:

    install.packages(c("RPresto", "devtools", "dplyr", "dbplyr", "ggplot2"))
    devtools::install_github("treasure-data/RTD")
    devtools::install_github("crowding/msgpack-r")

Your Local Endpoint

Use the endpoints for your Treasure Data site to access this feature. RPresto connects to the Presto JDBC/ODBC endpoint, and RTD may use the API endpoint. Learn more about Treasure Data Sites and Endpoints.

Issuing Queries

You can query with the following examples. They assume that a ‘flights’ table exists in the ‘test’ database and that the environment variable TD_API_KEY is set to your TD API key.

To use a different region, replace the host value with the endpoint for that region.

  • Using dplyr package 

  • Using DBI package
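
As a minimal sketch of the setup described above, the helper below checks that TD_API_KEY is set and collects the connection settings in one place, so that only the host changes between regions. The function name td_presto_args is hypothetical and is not part of RPresto or RTD. The default host, port, and catalog are the values used in the examples in this article.

```r
# Hypothetical helper, not part of RPresto or RTD.
# It returns the connection settings used in this article as a named list.
td_presto_args <- function(host = "https://api-presto.treasuredata.com",
                           schema = "test") {
  api_key <- Sys.getenv("TD_API_KEY")
  # Sys.getenv() returns "" for an unset variable, so fail early with a clear message.
  if (!nzchar(api_key)) {
    stop("TD_API_KEY is not set. Set it with Sys.setenv() or in ~/.Renviron.")
  }
  list(
    host = host,
    port = 443,
    user = api_key,
    schema = schema,
    catalog = "td-presto"
  )
}
```

With the key set, the list can be passed to dbConnect, for example with do.call(DBI::dbConnect, c(list(RPresto::Presto()), td_presto_args())). To target another region, pass that region's endpoint as the host argument.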

Example 1 (using RPresto and dplyr):

library(RPresto)
library(dplyr)

db <- src_presto(
  host="https://api-presto.treasuredata.com",
  port=443,
  user=Sys.getenv("TD_API_KEY"),
  schema='test',
  catalog='td-presto'  
)

flights_tbl <- tbl(db, 'flights')

# filter by departure delay and show result
flights_tbl %>% filter(dep_delay == 2)
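
dplyr builds queries lazily: a pipeline on a remote table is translated to SQL and sent to Presto only when the result is printed or collected. The sketch below illustrates that translation without a connection, using dbplyr's lazy_frame() with a simulated generic backend. The column values are made up for illustration, and the SQL that RPresto generates for Presto may differ in quoting and dialect.

```r
library(dplyr)
library(dbplyr)

# Stand-in for the remote 'flights' table; the values are illustrative only.
flights_sim <- lazy_frame(
  dep_delay = 2L,
  carrier = "AA",
  con = simulate_dbi()
)

# Same filter as in Example 1; nothing is executed at this point.
delayed <- flights_sim %>% filter(dep_delay == 2)

# Print the SQL that dplyr would send for this pipeline.
show_query(delayed)
```

With a real connection, calling collect() on the same kind of pipeline retrieves the rows into a local data frame.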

Example 2 (using RPresto and DBI):

library(DBI)

con <- dbConnect(
  RPresto::Presto(),
  host="https://api-presto.treasuredata.com",
  port=443,
  user=Sys.getenv("TD_API_KEY"),
  schema='test',
  catalog='td-presto'
)

# run a query with the dbGetQuery function
flights_preview <- dbGetQuery(con, 'SELECT year, month, day, dep_time, dep_delay, carrier, flight FROM flights LIMIT 10')
# show query result
flights_preview

View an example notebook.
