Frequently Asked Questions

Here are some questions frequently asked by our users.

Table of Contents

How can I write out query results into X? (ex: How can I write query results into MySQL?)

Treasure Data provides the ability to write query results into various datastores such as Treasure Data, MySQL and PostgreSQL. The list of currently supported output options can be found here.

It looks like Treasure Data uses Hive. Does this mean I have to worry about schema changes?

Treasure Data supports HiveQL semantics, but unlike Apache Hive, we allow the user to set and modify the schema at any time. We do not require table schema to be defined upfront. For more information, please refer to this document.

How can I handle nested JSON records?

Nested JSON records can be parsed by using Hive’s native get_json_object UDF function or Presto’s native JSON functions. However, we recommend that you maintain a flat JSON structure since this command is a source of additional CPU performance overhead.

How can I use the td command through a proxy?

Please set the HTTP_PROXY variable. This is supported starting in td v0.10.41.

e.g.
Windows:
$ set HTTP_PROXY=http://proxy_host:8080
Other:
$ export HTTP_PROXY="proxy_host:8080"

You can set user/password for proxy authentication like user:passwd@proxy_host:8080

I cannot import my data to Treasure Data. Can you help?

Sorry to hear that. Here are a couple of scenarios that we can think of.

1. Make sure your network is alive and allow you to access external networks.
2. Does your data have numbers greater than 2e64? If so, please upgrade your td command/td-agent to the latest version. If you are using fluentd instead, please upgrade fluent-plugin-td to version 0.10.15 or above.

Treasure Agent sometimes gets “Failed to upload to TreasureData: Import failed” error. Is it a serious problem?

There are several reasons:

1. Your network becomes unstable
2. Treasure Data's API shutdown the connection by unexpected network / node condition

But Treasure Agent has robust retry mechanism for these cases, so continue to import the data to Treasure Data. You can see “retry succeeded” in your logs after several retries. It is no problem.

If continue to get same error and exceeds the retry limit, it is non-network problem. Please let me know.

When I query my table, I get “null” for some of my integer values. Is this a bug?

Is your integer greater than or equal to 2e64? If it is, then you need to cast it as decimal like this:

SELECT CAST(my_huge_number AS DECIMAL) FROM tbl...

Hive’s reserved keywords (Why do I get a warning about reserved keywords?)

Treasure Data uses Hive as a querying language. One of Hive’s limitations is that a table name cannot be a reserved keyword. Thus, if your table (or database) name is identical to a reserved keyword, you won’t be able to query data in that table (or database).

Hive also has non-reserved keywords which have a proscribed meaning in the language but which are also allowed as identifiers. Hive has additional keywords which allow similar unrestricted usage. In particular, these keywords are allowed as column or table names.

For example if you create an ‘items’ table and store data in it, you won’t be able to execute the following hive query, because ‘items’ is a reserved keyword in Hive.

SELECT v FROM items

The following are Keywords in Hive:

ALL,ALTER,AND,ARRAY,AS,AUTHORIZATION,BETWEEN,BIGINT,BINARY,BOOLEAN,BOTH,BY,CASE,CAST,
CHAR,COLUMN,CONF,CREATE,CROSS,CUBE,CURRENT,CURRENT_DATE,CURRENT_TIMESTAMP,CURSOR,
DATABASE,DATE,DECIMAL,DELETE,DESCRIBE,DISTINCT,DOUBLE,DROP,ELSE,END,EXCHANGE,EXISTS,
EXTENDED,EXTERNAL,FALSE,FETCH,FLOAT,FOLLOWING,FOR,FROM,FULL,FUNCTION,GRANT,GROUP,
GROUPING,HAVING,IF,IMPORT,IN,INNER,INSERT,INT,INTERSECT,INTERVAL,INTO,IS,JOIN,LATERAL,
LEFT,LESS,LIKE,LOCAL,MACRO,MAP,MORE,NONE,NOT,NULL,OF,ON,OR,ORDER,OUT,OUTER,OVER,
PARTIALSCAN,PARTITION,PERCENT,PRECEDING,PRESERVE,PROCEDURE,RANGE,READS,REDUCE,REVOKE,
RIGHT,ROLLUP,ROW,ROWS,SELECT,SET,SMALLINT,TABLE,TABLESAMPLE,THEN,TIMESTAMP,TO,TRANSFORM,
TRIGGER,TRUE,TRUNCATE,UNBOUNDED,UNION,UNIQUEJOIN,UPDATE,USER,USING,VALUES,VARCHAR,WHEN,
WHERE,WINDOW,WITH

The following are Non-reserved Keywords in Hive:

ADD,ADMIN,AFTER,ANALYZE,ARCHIVE,ASC,BEFORE,BUCKET,BUCKETS,CASCADE,CHANGE,CLUSTER,
CLUSTERED,CLUSTERSTATUS,COLLECTION,COLUMNS,COMMENT,COMPACT,COMPACTIONS,COMPUTE,
CONCATENATE,CONTINUE,DATA,DATABASES,DATETIME,DAY,DBPROPERTIES,DEFERRED,DEFINED,
DELIMITED,DEPENDENCY,DESC,DIRECTORIES,DIRECTORY,DISABLE,DISTRIBUTE,ELEM_TYPE,ENABLE,
ESCAPED,EXCLUSIVE,EXPLAIN,EXPORT,FIELDS,FILE,FILEFORMAT,FIRST,FORMAT,FORMATTED,
FUNCTIONS,HOLD_DDLTIME,HOUR,IDXPROPERTIES,IGNORE,INDEX,INDEXES,INPATH,INPUTDRIVER,
INPUTFORMAT,ITEMS,JAR,KEYS,KEY_TYPE,LIMIT,LINES,LOAD,LOCATION,LOCK,LOCKS,LOGICAL,LONG,
MAPJOIN,MATERIALIZED,MINUS,MINUTE,MONTH,MSCK,NOSCAN,NO_DROP,OFFLINE,OPTION,OUTPUTDRIVER,
OUTPUTFORMAT,OVERWRITE,OWNER,PARTITIONED,PARTITIONS,PLUS,PRETTY,PRINCIPALS,PROTECTION,
PURGE,READ,READONLY,REBUILD,RECORDREADER,RECORDWRITER,REGEXP,RELOAD,RENAME,REPAIR,
REPLACE,RESTRICT,REWRITE,RLIKE,ROLE,ROLES,SCHEMA,SCHEMAS,SECOND,SEMI,SERDE,
SERDEPROPERTIES,SERVER,SETS,SHARED,SHOW,SHOW_DATABASE,SKEWED,SORT,SORTED,SSL,STATISTICS,
STORED,STREAMTABLE,STRING,STRUCT,TABLES,TBLPROPERTIES,TEMPORARY,TERMINATED,TINYINT,TOUCH,
TRANSACTIONS,UNARCHIVE,UNDO,UNIONTYPE,UNLOCK,UNSET,UNSIGNED,URI,USE,UTC,UTCTIMESTAMP,
VALUE_TYPE,VIEW,WHILE,YEAR

The following are Reserved Keywords in Hive:

ALL,ALTER,AND,ARRAY,AS,AUTHORIZATION,BETWEEN,BIGINT,BINARY,BOOLEAN,BOTH,BY,CASE,CAST,
CHAR,COLUMN,CONF,CREATE,CROSS,CUBE,CURRENT,CURRENT_DATE,CURRENT_TIMESTAMP,CURSOR,
DATABASE,DATE,DECIMAL,DELETE,DESCRIBE,DISTINCT,DOUBLE,DROP,ELSE,END,EXCHANGE,EXISTS,
EXTENDED,EXTERNAL,FALSE,FETCH,FLOAT,FOLLOWING,FOR,FROM,FULL,FUNCTION,GRANT,GROUP,
GROUPING,HAVING,IF,IMPORT,IN,INNER,INSERT,INT,INTERSECT,INTERVAL,INTO,IS,JOIN,LATERAL,
LEFT,LESS,LIKE,LOCAL,MACRO,MAP,MORE,NONE,NOT,NULL,OF,ON,OR,ORDER,OUT,OUTER,OVER,
PARTIALSCAN,PARTITION,PERCENT,PRECEDING,PRESERVE,PROCEDURE,RANGE,READS,REDUCE,REVOKE,
RIGHT,ROLLUP,ROW,ROWS,SELECT,SET,SMALLINT,TABLE,TABLESAMPLE,THEN,TIMESTAMP,TO,TRANSFORM,
TRIGGER,TRUE,TRUNCATE,UNBOUNDED,UNION,UNIQUEJOIN,UPDATE,USER,USING,VALUES,VARCHAR,WHEN,
WHERE,WINDOW,WITH

If you encounter a reserved keyword issue in your query, please wrap the infringing keyword with ‘`’ as shown below.

SELECT `keyword_column`, COUNT(1) AS cnt FROM table_name GROUP BY `keyword_column`

What protocol is used between I and Treasure Data?

Treasure Agent and td command use HTTPS by default so that we ensure security.

If you have any questions, please feel free to contact us, or chat with us live using the window at the bottom right.

How is import command’s time format set at windows batch file?

‘%’ is recognized Environment variable,so need to set ‘%%’.

td import:prepare --format csv --column-header --time-column 'date' --time-format '%%Y-%%m-%%d' test.csv

How does TD_TIME_FORMAT handle Leap Second?

Please check the following results.

Presto

SELECT
 TD_TIME_FORMAT(1136073600, 'yyyy-MM-dd HH:mm:ss', 'JST') as st,
 TD_TIME_PARSE('2006-01-01 08:59:60', 'JST') as leap,
 TD_TIME_PARSE('2006-01-01 09:00:00', 'JST') as leap2

Presto Leap Second

Hive

SELECT
 TD_TIME_FORMAT(1136073600, 'yyyy-MM-dd HH:mm:ss', 'JST') as st,
 TD_TIME_PARSE('2006-01-01 08:59:60', 'JST') as leap,
 TD_TIME_PARSE('2006-01-01 09:00:00', 'JST') as leap2

Hive Leap Second

Why tdlog plugin warns me about the endpoint change?

You see the following message in the td-agent log.

tdlog plugin will change the API endpoint from api.treasure-data.com to api.treasuredata.com
If want to keep api.treasure-data.com, please set 'endpoint api.treasure-data.com' in tdlog configuration

This is for some users who change the control of network access to api.treasure-data.com. If you don’t set access restrictions based on our API endpoint, please ignore the warning.


Last modified: Dec 24 2016 00:46:56 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.