explode_json_columns output option for Data Connector

The explode_json_columns option for Data Connector explodes columns having json into multiple columns.:

Table of Contents

Configuration

Add the explode_json_columns option with json column name to out: section, for example:

in:
...
out:
  type: td
  explode_json_columns: ["json_column"]
...
Untitled-3
Original json column name would remain in result even if explode it.
Untitled-3
If json data has a name that duplicate other column, you need to use the rename filter. Please refer to Use Case 4.

Use Case 1: Expand one json column

When you have a csv data has a json column, please add the items to y

Example of Source data

id  jsons
1   {"time2":1455829282,"ip":"93.184.216.34","name":"frsyuki"}
2   {"time2":1455829282,"ip":"172.36.8.109","name":"sadayuki"}
3   {"time2":1455829284,"ip":"example.com","name":"Treasure Data"}
4   {"time2":1455829282,"ip":"10.98.43.1","name":"MessagePack"}

Added section. In this case, Data Connector will explode jsons column.

out:
  type: td
  explode_json_columns: ["jsons"]
  mode: append

The result is below:


Use Case 2: Expand multiple json columns

When you have a csv data has multiple json columns, please set explode column name.

Example of Source data

id  jsons   jsons2
1   {"time2":1455829282,"ip":"93.184.216.34","name":"frsyuki"}  {"country":"Japan","language":"Japanese"}
2   {"time2":1455829282,"ip":"172.36.8.109","name":"sadayuki"}  {"country":"USA","language":"English"}
3   {"time2":1455829284,"ip":"example.com","name":"Treasure Data"}  {"country":"Brazil","language":"Portuguese"}
4   {"time2":1455829282,"ip":"10.98.43.1","name":"MessagePack"} {"country":"Spain","language":"Spanish"}

Added section. In this case, Data Connector will explode jsons column.

out:
  type: td
  explode_json_columns: ["jsons","jsons2"]
  mode: append

The result is below:


Use Case 3: json data file

If your source data is json type, please set explode column name as “record”.

Example of Source data

{"time2":1455829282,"ip":"93.184.216.34","name":"frsyuki"}
{"time2":1455829282,"ip":"172.36.8.109","name":"sadayuki"}
{"time2":1455829284,"ip":"example.com","name":"Treasure Data"}
{"time2":1455829282,"ip":"10.98.43.1","name":"MessagePack"}

Added section. In this case, Data Connector will explode json data.

out:
  type: td
  explode_json_columns: ["record"]
  mode: append

The result is below:


Use Case 4: Duplicated column name, exploading 1 column

If json data has a name that duplicate other column, please use rename filter option. More details at add_time filter plugin. In case of sample, jsons has name key which is same as name column in csv.

Example of Source data

name    jsons
A   {"time2":1455829282,"ip":"93.184.216.34","name":"frsyuki"}
B   {"time2":1455829282,"ip":"172.36.8.109","name":"sadayuki"}
C   {"time2":1455829284,"ip":"example.com","name":"Treasure Data"}
D   {"time2":1455829282,"ip":"10.98.43.1","name":"MessagePack"}

Added the rename option to filter section. In this case, Data Connector can explode jsons data.

in:
...
filters:
  - type: rename
    columns:
      name: code
out:
  type: td
  explode_json_columns: ["jsons"]
  mode: append

The result is below:


If you don’t care duplicated name column, explode_json_columns option is not able to explode name element as below.



Last modified: Feb 22 2017 23:40:42 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.