Because Treasure Data supports many different data sources and targets, it uses a specific set of primitive data types native to our platform.

TD processes queries using different processing engines (Presto and Hive). Each engine has its own data type system, and the TD native data types map to types available in the query engine.

The underlying storage layer in TD uses the MessagePack mpc1 format, so when data is read from or written to storage, the TD types also map to MessagePack formats.

It is important to understand how these type systems correspond. Otherwise, query results and their data types might differ from what you expect.

Schema relation

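| Treasure Data             | Presto    | Hive              |
|---------------------------|-----------|-------------------|
| int                       | bigint    | smallint          |
| int                       | bigint    | int               |
| long                      | bigint    | bigint            |
| double                    | double    | decimal           |
| float                     | double    | float             |
| double                    | double    | double            |
| Convert to string or int  | boolean   | boolean           |
| string                    | varchar   | string or varchar |
| string or Convert to long | date      | string            |
| string or Convert to long | timestamp | timestamp         |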
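The schema relation above can also be kept as a small lookup structure. The following is a minimal sketch, not part of any TD client library: the names `TypeRow`, `SCHEMA_RELATION`, and `td_types_for` are hypothetical, and the rows are copied from the table as written. It answers the question "which TD type does a given Presto or Hive type correspond to?"

```python
from typing import List, NamedTuple


class TypeRow(NamedTuple):
    """One row of the schema relation table (hypothetical helper type)."""
    td: str
    presto: str
    hive: str


# Rows copied verbatim from the schema relation table.
SCHEMA_RELATION = [
    TypeRow("int", "bigint", "smallint"),
    TypeRow("int", "bigint", "int"),
    TypeRow("long", "bigint", "bigint"),
    TypeRow("double", "double", "decimal"),
    TypeRow("float", "double", "float"),
    TypeRow("double", "double", "double"),
    TypeRow("Convert to string or int", "boolean", "boolean"),
    TypeRow("string", "varchar", "string or varchar"),
    TypeRow("string or Convert to long", "date", "string"),
    TypeRow("string or Convert to long", "timestamp", "timestamp"),
]


def td_types_for(engine: str, engine_type: str) -> List[str]:
    """Return the TD column entries listed for an engine type, in table order."""
    if engine not in ("presto", "hive"):
        raise ValueError("engine must be 'presto' or 'hive'")
    matches: List[str] = []
    for row in SCHEMA_RELATION:
        cell = getattr(row, engine)
        # A cell such as "string or varchar" names more than one engine type.
        if engine_type in cell.split(" or ") and row.td not in matches:
            matches.append(row.td)
    return matches


print(td_types_for("presto", "bigint"))
print(td_types_for("hive", "decimal"))
```

Keeping the rows as a list rather than a dictionary preserves the fact that several engine types share one TD type, and that one Presto type such as `bigint` appears on more than one row.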

Our storage layer stores data in MessagePack mpc1 format. MessagePack represents INT values in the following formats:

positive fixint, negative fixint

uint8, uint16, uint32

int8, int16, int32
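To illustrate how one integer value ends up in one of these formats, here is a minimal sketch that picks the smallest listed format for a value, which is how MessagePack packers commonly choose. The function name `msgpack_int_format` is hypothetical, and the value ranges come from the MessagePack specification rather than from TD's storage code. The MessagePack specification also defines 64-bit integer formats, which the list above does not mention, so this sketch rejects values that need them.

```python
def msgpack_int_format(value: int) -> str:
    """Name the smallest MessagePack integer format (from the list above) for value."""
    if 0 <= value <= 0x7F:
        return "positive fixint"   # stored in a single byte, 0..127
    if -32 <= value < 0:
        return "negative fixint"   # stored in a single byte, -32..-1
    if value > 0:
        # Non-negative values use the unsigned formats.
        if value <= 0xFF:
            return "uint8"
        if value <= 0xFFFF:
            return "uint16"
        if value <= 0xFFFFFFFF:
            return "uint32"
    else:
        # Negative values below -32 use the signed formats.
        if value >= -0x80:
            return "int8"
        if value >= -0x8000:
            return "int16"
        if value >= -0x80000000:
            return "int32"
    # Assumption of this sketch: 64-bit formats are out of scope.
    raise OverflowError(f"{value} does not fit the formats listed here")


for sample in (5, 200, 70000, -7, -100, -40000):
    print(sample, msgpack_int_format(sample))
```

The width of the stored value therefore depends on the value itself, not on whether the TD column is declared as int or long.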


For example, when a Hive query result is passed to ResultWorker, a float data type becomes a double data type.

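| TD primitive data types | Hive query result for ResultWorker | Presto result msgpack ValueType |
|-------------------------|------------------------------------|---------------------------------|
| int                     | int > int                          | integer                         |
| long                    | bigint > long                      | integer                         |
| float                   | float > double                     | float                           |
| double                  | double > double                    | float                           |
| string                  | string > string                    | string                          |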
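The float to double conversion is one place where results can look different than expected. The sketch below shows the general IEEE 754 effect using only Python's standard `struct` module; it is an illustration of widening a 32-bit float to a 64-bit double, not TD's or Hive's actual conversion code, and the helper name `widen_float32` is hypothetical.

```python
import struct


def widen_float32(value: float) -> float:
    """Round value to 32-bit float precision and return it as a 64-bit double."""
    packed = struct.pack("<f", value)       # store as a 4-byte IEEE 754 float
    return struct.unpack("<f", packed)[0]   # read back; Python floats are doubles


original = 0.1
widened = widen_float32(original)

# The widened value keeps the float's rounding error, so it no longer
# compares equal to the double literal 0.1.
print(widened == original)
print(abs(widened - original) < 1e-7)

# Values that are exactly representable in 32 bits survive unchanged.
print(widen_float32(0.5) == 0.5)
```

When comparing a column that was stored as float against a literal, comparing within a small tolerance is usually safer than testing for exact equality.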
