Treasure Data uses a different convention from Relational Database Management Systems (RDBMSs) for managing data sets: it allows you to store your data before you define your schema. Compare the two approaches:
- Conventional warehousing platforms: These platforms are schema-dependent, supporting an assumptive analytics model in which the data elements expected to yield insights are defined in advance, as part of the data store schema. Performance considerations also shape the initial design, and the analyst must know the underlying structure to ensure query performance. When new columns are added to a table, the schema must change.
- Schema-annotated databases: In Treasure Data, you can assign a schema after importing data into a table. Schema changes are implemented faster than changes to conventional warehouse platforms.
Big Data analysis, however, is largely non-assumptive. The analyst seeks hidden patterns, relationships, or events in the data that were not obvious at the outset. You can query the data wherever it is stored, without the burden of up-front performance considerations, and exploration can create requirements for new columns to support the analysis trail.
In this model, schema dependence adds a significant tax that can become prohibitive.
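The schema-on-read model described above can be sketched in plain Python. This is a conceptual illustration only, not Treasure Data's implementation: heterogeneous records are stored as-is, and a schema (a hypothetical `apply_schema` helper here) is applied only when the analyst decides which columns matter.

```python
# Conceptual sketch of schema-on-read (not Treasure Data's implementation):
# heterogeneous records are stored as-is; a schema is applied at read time.
raw_records = [
    {"time": 1000, "user": "alice", "amount": "12.5"},
    {"time": 1001, "user": "bob"},                                    # missing column
    {"time": 1002, "user": "carol", "amount": "7", "country": "JP"},  # new column
]

# The schema is chosen later, once the analyst knows which columns matter.
schema = {"user": str, "amount": float}

def apply_schema(record, schema):
    """Cast the columns named in the schema, defaulting missing values to None."""
    out = {"time": record["time"]}
    for name, cast in schema.items():
        value = record.get(name)
        out[name] = cast(value) if value is not None else None
    return out

typed = [apply_schema(r, schema) for r in raw_records]
print(typed[0])  # {'time': 1000, 'user': 'alice', 'amount': 12.5}
```

Note that the record with the extra `country` column and the record with a missing `amount` are both ingested without any schema change; only the read-time view decides what is typed and returned.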
Prerequisites:

- Basic knowledge of Treasure Data, including the TD Toolbelt.
- A table with some data. See Running a query and downloading results.
Understanding the TD Default Schema
When a table is created in a TD database, it is created with the following column:
- time: The time that each entry was generated, in int64 UNIX time
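As a quick illustration of what that column holds, the following plain-Python snippet (no TD dependency; the `row` dictionary is a hypothetical imported record) converts an int64 UNIX timestamp into a readable UTC datetime:

```python
import datetime

# Hypothetical imported row: the `time` column holds the entry's
# generation time as an int64 UNIX timestamp (seconds since epoch).
row = {"time": 1609459200, "event": "page_view"}

# Convert the UNIX timestamp to a human-readable UTC datetime.
generated_at = datetime.datetime.fromtimestamp(row["time"], tz=datetime.timezone.utc)
print(generated_at.isoformat())  # 2021-01-01T00:00:00+00:00
```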
Defining a custom schema is strongly recommended. You can add columns for various types of data.
Identify the various data types used in your data before defining the schema.
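One lightweight way to survey the types present in your data before writing a schema is to scan a sample of records and collect the types observed per column. The `infer_column_types` helper below is a hypothetical sketch, not part of the TD Toolbelt:

```python
def infer_column_types(records):
    """Collect the set of Python type names observed for each column."""
    observed = {}
    for record in records:
        for name, value in record.items():
            observed.setdefault(name, set()).add(type(value).__name__)
    return observed

sample = [
    {"time": 1000, "user": "alice", "amount": 12.5},
    {"time": 1001, "user": "bob", "amount": 7},
]
print(infer_column_types(sample))
# e.g. {'time': {'int'}, 'user': {'str'}, 'amount': {'int', 'float'}}
```

A column observed with mixed integer and floating-point values, such as `amount` above, would typically be declared with a floating-point type in the schema.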
For more information, see About Schema Annotation - Legacy.