expand_json filter plugin for Data Connector

The expand_json filter plugin for Data Connector expands columns having json into multiple columns. This filter takes five options as explained below:

Option Description
json_column_name column name having json to be expanded (string, required)
root property to start fetching each entries, specify in JsonPath style (string, default: "$.")
expanded_columns expanded into multiple columns (array of hash, required)

name: name of the column. you can define JsonPath style.
type: type of the column (see below)
format: format of the timestamp if type is timestamp
keep_expanding_json_column Not remove the expanding json column from input schema if it’s true (false by default)
stop_on_invalid_record Stop bulk load transaction if an invalid record is included (false by default)

Type of columns

name description
boolean true or false
long 64-bit|signed integers
timestamp Date and time with nano-seconds precision
double 64-bit floating point numbers
string Strings

Table of Contents

Configuration

Add the filter section with type expand_json to your load.yml, for example:

in:
...
filters:
  - type: expand_json
    json_column_name: json_payload
    root: "$."
    expanded_columns:
      - {name: "phone_numbers", type: string}
      - {name: "app_id", type: long}
      - {name: "point", type: double}
      - {name: "created_at", type: timestamp, format: "%Y-%m-%d"}
      - {name: "profile.anniversary.et", type: string}
      - {name: "profile.anniversary.voluptatem", type: string}
      - {name: "profile.like_words[1]", type: string}
      - {name: "profile.like_words[2]", type: string}
      - {name: "profile.like_words[0]", type: string}
 out:
 ...

Use Case 1: Expand one json column

When you have a table has a json column, please add this section to your .yml.

Example of Source data

+-----------+---------------------------------------------------------------+
| code:long |                                                  jsons:string |
+-----------+---------------------------------------------------------------+
|         1 |   {"id":1,"name":"A green door","price":11.50,"tags":"green"} |
|         2 |     {"id":2,"name":"A blue door","price":12.50,"tags":"blue"} |
|         3 |       {"id":3,"name":"A red door","price":13.50,"tags":"red"} |
|         4 |     {"id":4,"name":"A pink door","price":14.50,"tags":"pink"} |
|         5 |   {"id":5,"name":"A white door","price":15.50,"tags":"white"} |
|         6 |   {"id":6,"name":"A black door","price":16.50,"tags":"black"} |
|         7 | {"id":7,"name":"A yellow door","price":17.50,"tags":"yellow"} |
|         8 | {"id":8,"name":"A purple door","price":18.50,"tags":"purple"} |
+-----------+---------------------------------------------------------------+

Added section. In this case, Data Connector will expand jsons column.

filters:
  - type: expand_json
    json_column_name: jsons
    root: "$."
    expanded_columns:
      - {name: "id", type: long}
      - {name: "name", type: string}
      - {name: "price", type: double}
      - {name: "tags", type: string}

The result is below:

+-----------+---------+---------------+--------------+-------------+
| code:long | id:long |   name:string | price:double | tags:string |
+-----------+---------+---------------+--------------+-------------+
|         1 |       1 |  A green door |         11.5 |       green |
|         2 |       2 |   A blue door |         12.5 |        blue |
|         3 |       3 |    A red door |         13.5 |         red |
|         4 |       4 |   A pink door |         14.5 |        pink |
|         5 |       5 |  A white door |         15.5 |       white |
|         6 |       6 |  A black door |         16.5 |       black |
|         7 |       7 | A yellow door |         17.5 |      yellow |
|         8 |       8 | A purple door |         18.5 |      purple |
+-----------+---------+---------------+--------------+-------------+

Use Case 2: Keep original json column

When you want to keep original json column, please add keep_expanding_json_column option to your .yml.

filters:
  - type: expand_json
    json_column_name: jsons
    root: "$."
    keep_expanding_json_column: true
    expanded_columns:
      - {name: "id", type: long}
      - {name: "name", type: string}
      - {name: "price", type: double}
      - {name: "tags", type: string}

The result is below:

+-----------+---------------------------------------------------------------+---------+---------------+--------------+-------------+
| code:long |                                                  jsons:string | id:long |   name:string | price:double | tags:string |
+-----------+---------------------------------------------------------------+---------+---------------+--------------+-------------+
|         1 |   {"id":1,"name":"A green door","price":11.50,"tags":"green"} |       1 |  A green door |         11.5 |       green |
|         2 |     {"id":2,"name":"A blue door","price":12.50,"tags":"blue"} |       2 |   A blue door |         12.5 |        blue |
|         3 |       {"id":3,"name":"A red door","price":13.50,"tags":"red"} |       3 |    A red door |         13.5 |         red |
|         4 |     {"id":4,"name":"A pink door","price":14.50,"tags":"pink"} |       4 |   A pink door |         14.5 |        pink |
|         5 |   {"id":5,"name":"A white door","price":15.50,"tags":"white"} |       5 |  A white door |         15.5 |       white |
|         6 |   {"id":6,"name":"A black door","price":16.50,"tags":"black"} |       6 |  A black door |         16.5 |       black |
|         7 | {"id":7,"name":"A yellow door","price":17.50,"tags":"yellow"} |       7 | A yellow door |         17.5 |      yellow |
|         8 | {"id":8,"name":"A purple door","price":18.50,"tags":"purple"} |       8 | A purple door |         18.5 |      purple |
+-----------+---------------------------------------------------------------+---------+---------------+--------------+-------------+

Use Case 3: Expand multiple json columns

When you have multiple json columns, please add this sections to your .yml.

Example of Source data

+-----------+---------------------------------------------------------------+-----------------------------------+
| code:long |                                                  jsons:string |                     jsons2:string |
+-----------+---------------------------------------------------------------+-----------------------------------+
|         1 |   {"id":1,"name":"A green door","price":11.50,"tags":"green"} |  {"color":"green","size":"Large"} |
|         2 |     {"id":2,"name":"A blue door","price":12.50,"tags":"blue"} |   {"color":"blue","size":"Small"} |
|         3 |       {"id":3,"name":"A red door","price":13.50,"tags":"red"} |   {"color":"red","size":"Medium"} |
|         4 |     {"id":4,"name":"A pink door","price":14.50,"tags":"pink"} |  {"color":"pink","size":"Medium"} |
|         5 |   {"id":5,"name":"A white door","price":15.50,"tags":"white"} |  {"color":"white","size":"Large"} |
|         6 |   {"id":6,"name":"A black door","price":16.50,"tags":"black"} |  {"color":"black","size":"Large"} |
|         7 | {"id":7,"name":"A yellow door","price":17.50,"tags":"yellow"} | {"color":"yellow","size":"Small"} |
|         8 | {"id":8,"name":"A purple door","price":18.50,"tags":"purple"} | {"color":"purple","size":"Large"} |
+-----------+---------------------------------------------------------------+-----------------------------------+

Added section. In this case, Data Connector will expand jsons and jsons2 column.

filters:
  - type: expand_json
    json_column_name: jsons
    root: "$."
    expanded_columns:
      - {name: "id", type: long}
      - {name: "name", type: string}
      - {name: "price", type: double}
      - {name: "tags", type: string}
  - type: expand_json
    json_column_name: jsons2
    root: "$."
    expanded_columns:
      - {name: "color", type: string}
      - {name: "size", type: string}

The result is below:

+-----------+---------+---------------+--------------+-------------+--------------+-------------+
| code:long | id:long |   name:string | price:double | tags:string | color:string | size:string |
+-----------+---------+---------------+--------------+-------------+--------------+-------------+
|         1 |       1 |  A green door |         11.5 |       green |        green |       Large |
|         2 |       2 |   A blue door |         12.5 |        blue |         blue |       Small |
|         3 |       3 |    A red door |         13.5 |         red |          red |      Medium |
|         4 |       4 |   A pink door |         14.5 |        pink |         pink |      Medium |
|         5 |       5 |  A white door |         15.5 |       white |        white |       Large |
|         6 |       6 |  A black door |         16.5 |       black |        black |       Large |
|         7 |       7 | A yellow door |         17.5 |      yellow |       yellow |       Small |
|         8 |       8 | A purple door |         18.5 |      purple |       purple |       Large |
+-----------+---------+---------------+--------------+-------------+--------------+-------------+

Last modified: May 25 2016 02:02:36 UTC

If this article is incorrect or outdated, or omits critical information, please let us know. For all other issues, please see our support channels.