Using Environment Variables With Bulk Import

In some cases, including connection details in your Embulk configuration files is not ideal. When you need to hide or mask certain details, you can embed environment variables in your configuration file.

Use of environment variables in Embulk is an experimental feature. The feature might change or be removed in future releases.

Prerequisites

Understanding Environment Variable Naming Conventions

Replace each detail you want to hide with a reference to an environment variable, following the naming convention {{ env.replaced_detail }}, where replaced_detail is the name of the environment variable. For example, if you set an environment variable for your database password and named it DB_PASSWORD, the value in your configuration file would be:

{{ env.DB_PASSWORD }}

The convention is env. followed by the name of your environment variable in double curly braces {{ }}.

Setting Environment Variables

An environment variable is a dynamic-named value that can be used by a running process to complete its task. For example, a running process can query the value of the DB_HOST environment variable to discover the IP address of the MySQL database, or the API_KEY variable to find the value of the API key to authenticate with Treasure Data. The procedure to set or change environment variables varies from platform to platform. For example, see Environment variables on Mac OS X.
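
As a minimal sketch for a POSIX-style shell (for example bash or zsh on macOS or Linux), variables can be exported for the current session before Embulk is started. The variable names follow the examples in this section; the values are placeholders, not real connection details.

```shell
# Placeholder values for illustration only; substitute your own.
export DB_HOST="127.0.0.1"
export DB_PASSWORD="example-password"
export API_KEY="example-api-key"

# Exported variables are inherited by child processes (such as Embulk).
sh -c 'echo "DB_HOST is visible to child processes: $DB_HOST"'
```

Variables set this way last only for the current shell session. Making them persistent depends on the platform and shell.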

To use variables in your configuration file

  1. Rename the .yml configuration file so that the extension ends with .yml.liquid. For example, if your configuration file was originally named config.yml, rename it to config.yml.liquid.

  2. Insert the environment variable into the configuration file by replacing the connection details, using the variable naming convention.

  3. Run Embulk in preview mode to validate your changes. For example:

embulk preview config.yml.liquid

  4. Run Embulk with the new configuration file. For example:

embulk run config.yml.liquid
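
The steps above can be sketched end to end in a scratch directory. This is an illustration under stated assumptions: the stub config.yml, the sed substitutions, and the placeholder values are hypothetical, and the last two commands are skipped unless embulk is installed (they also need a complete configuration and a reachable database to succeed).

```shell
#!/bin/sh
# Stop at the first failing command, so "run" is never reached
# if "preview" fails.
set -eu

# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for an existing config.yml.
cat > config.yml <<'EOF'
in:
  type: mysql
  host: localhost
  password: password
EOF

# Step 1: rename the file so that it ends in .yml.liquid.
mv config.yml config.yml.liquid

# Step 2: replace literal connection details with env references.
sed -e 's/host: localhost/host: {{ env.DB_HOST }}/' \
    -e 's/password: password/password: {{ env.DB_PASSWORD }}/' \
    config.yml.liquid > config.tmp
mv config.tmp config.yml.liquid

# Placeholder values; export the real ones in your own session.
export DB_HOST="127.0.0.1"
export DB_PASSWORD="example-password"

# Steps 3 and 4: preview first, then run.
if command -v embulk >/dev/null 2>&1; then
  embulk preview config.yml.liquid
  embulk run config.yml.liquid
else
  echo "embulk not found; skipping preview and run"
fi
```

Editing the file by hand works just as well as sed; the substitutions are scripted here only to keep the sketch self-contained.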

Example config.yml.liquid File

For example, if the original config.yml file was the following:

    in:
        type: mysql
        host: localhost
        port: 3306
        user: username
        password: password
        database: mysql_db
        select: "col1, col2, datecolumn"
        where: "col4 != 'a'"
    out:
        type: td
        apikey: xxxxxxxxxxxx
        endpoint: api.treasuredata.com
        database: dbname
        table: tblname
        time_column: datecolumn
        mode: replace 
        # by default mode: append is used, if not defined. 
        # Imported records are appended to the target table with this mode.
        # mode: replace, replaces existing target table
        default_timestamp_format: '%d/%m/%Y'

Suppose you want to hide the MySQL host, port, username, password, and database. In the output section, you might also want to hide your API key, endpoint, database, and table. Using the naming convention {{ env.replaced_detail }}, the file becomes the following:

    in:
        type: mysql
        host: {{ env.db_host }}
        port: {{ env.db_port }}
        user: {{ env.db_username }}
        password: {{ env.db_password }}
        database: {{ env.db_name }}
        select: "col1, col2, datecolumn"
        where: "col4 != 'a'"
    out:
        type: td
        apikey: {{ env.td_apikey }}
        endpoint: {{ env.api_endpoint }}
        database: {{ env.td_db_name }}
        table: {{ env.td_table_name }}
        time_column: datecolumn
        mode: replace 
        #by default mode: append is used, if not defined. Imported records 
        #are appended to the target table with this mode.
        #mode: replace, replaces existing target table
        default_timestamp_format: '%d/%m/%Y'
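
Before previewing, it can help to confirm that every variable the file references is actually exported. The following is a minimal sketch and not part of Embulk: check_env is a hypothetical helper that assumes references are written exactly as {{ env.NAME }} and that grep -o and printenv are available. Note that the example above uses lowercase names, and environment variable names are case-sensitive on Unix-like systems.

```shell
# check_env FILE
# Lists each variable referenced as {{ env.NAME }} in FILE and reports
# whether it is present in the environment. Returns 1 if any is missing.
check_env() {
  file=$1
  missing=0
  # Extract the unique variable names from the {{ env.NAME }} references.
  for name in $(grep -o '{{ *env\.[A-Za-z_][A-Za-z0-9_]* *}}' "$file" \
                | sed 's/.*env\.\([A-Za-z_][A-Za-z0-9_]*\).*/\1/' \
                | sort -u); do
    # printenv succeeds only for variables that are exported.
    if printenv "$name" >/dev/null; then
      echo "set:     $name"
    else
      echo "MISSING: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

With the function defined in the current shell, calling check_env config.yml.liquid before embulk preview shows at a glance which of db_host, db_port, td_apikey, and the other names still need to be exported.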