Handling timezone of timestamp type data by column_options

Timezone of the timestamp data (e.g. 2016/11/23 21:00:33 JST) need to be specified to treat the timestamp data as expected.

We explain how column_options option works to handle the timezone of input and output timestamp data.

Table of Contents

Input data

Let say we have the followig data in S3 and would like to import data into TD as is. time_jst is timestamp data in JST timezone, time_utc is timestamp data in UTC zone.

time_jst,time_utc
2016-11-12 14:58:10.589000,2016-11-12 05:58:10.589000
2016-11-12 14:53:34.807000,2016-11-12 05:53:34.807000
2016-11-12 17:24:52.912000,2016-11-12 08:24:52.912000
2016-11-12 16:28:19.932000,2016-11-12 07:28:19.932000
2016-11-12 00:10:08.185000,2016-11-11 15:10:08.185000
2016-11-12 23:10:03.146000,2016-11-12 14:10:03.146000
2016-11-12 20:21:22.120000,2016-11-12 11:21:22.120000
2016-11-12 17:31:13.673000,2016-11-12 08:31:13.673000
2016-11-12 13:52:53.000000,2016-11-12 04:52:53.000000

Configuration

in:
  type: s3
  access_key_id: xxx
  secret_access_key: xxx
  bucket: xxxxx
  path_prefix: datafile.csv
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    escape: '"'
    trim_if_not_quoted: false
    skip_header_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: time_jst, type: timestamp, format: '%Y-%m-%d %H:%M:%S.%N', timezone: 'Asia/Tokyo'}
    - {name: time_utc, type: timestamp, format: '%Y-%m-%d %H:%M:%S.%N', timezone: 'UTC'}
out:
  column_options:
    time_jst: {format: '%Y-%m-%d %H:%M:%S', timezone: 'Asia/Tokyo'}
    time_utc: {format: '%Y-%m-%d %H:%M:%S', timezone: 'UTC'}
Untitled-3
`colums:` option of `in:` section specifies how to recognize input data using `format` and `timezone` setting.
Untitled-3
`columns_options:` of `out:` section specifies how to output data to TD using `format` and `timezone` setting

Import

You can now import data to TD by following command. Say configuration file as load.yml, database name as db1 and table name as table1.

$ td connector:issue load.yml --database db1 --table table1 --auto-create-table

Result

Here is the result of imported data.

$ td query -d db1 -Tpresto 'select * from table1 limit 10' -w
+---------------------+---------------------+------------+
| time_jst            | time_utc            | time       |
+---------------------+---------------------+------------+
| 2016-11-12 14:58:10 | 2016-11-12 05:58:10 | 1478930290 |
| 2016-11-12 14:53:34 | 2016-11-12 05:53:34 | 1478930014 |
| 2016-11-12 17:24:52 | 2016-11-12 08:24:52 | 1478939092 |
| 2016-11-12 16:28:19 | 2016-11-12 07:28:19 | 1478935699 |
| 2016-11-12 00:10:08 | 2016-11-11 15:10:08 | 1478877008 |
| 2016-11-12 23:10:03 | 2016-11-12 14:10:03 | 1478959803 |
| 2016-11-12 20:21:22 | 2016-11-12 11:21:22 | 1478949682 |
| 2016-11-12 17:31:13 | 2016-11-12 08:31:13 | 1478939473 |
| 2016-11-12 13:52:53 | 2016-11-12 04:52:53 | 1478926373 |
| 2016-11-12 22:59:39 | 2016-11-12 13:59:39 | 1478959179 |
+---------------------+---------------------+------------+
10 rows in set

NOTE: time will be the first timestamp data shown in config if time_column option is not specified explicitly.


Last modified: Nov 24 2016 05:22:48 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.