Using Environment Variables With Bulk Import

In some cases, including connection details in your Embulk configuration files is not ideal. When you need to hide or mask certain details, you can embed environment variables in your configuration file instead.

Use of environment variables in Embulk is an experimental feature. The feature might change or be removed in future releases.

Prerequisites

Basic knowledge of Treasure Data
Basic knowledge of Embulk
Environment variables set in your environment
Review the Liquid template engine documentation, on which Embulk variables are based

Understanding Environment Variable Naming Conventions

To reference an environment variable, replace the value in your configuration file with a placeholder that follows the variable naming convention: {{ env.replaced_detail }}, where replaced_detail is the name of the environment variable. For example, if you set an environment variable named DB_PASSWORD for your database password, the value in your configuration file would be:

{{ env.DB_PASSWORD }}

The convention is env. followed by the name of your environment variable in double curly braces {{ }}.

Setting Environment Variables

An environment variable is a named, dynamic value that a running process can use to complete its task. For example, a running process can query the value of the DB_HOST environment variable to discover the IP address of the MySQL database, or the API_KEY variable to find the API key used to authenticate with Treasure Data. The procedure to set or change environment variables varies from platform to platform. For example, see Environment variables on Mac OS X.
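As a minimal sketch for a POSIX-style shell (Linux or macOS), variables can be set for the current session with export. The values below are placeholders chosen for illustration, not real connection details, and the variables last only until the shell session ends.

```shell
# Set variables for the current shell session (placeholder values only).
export DB_HOST="127.0.0.1"
export DB_PASSWORD="example-password"

# Exported variables are inherited by child processes, such as Embulk.
sh -c 'echo "DB_HOST is $DB_HOST"'
```

Variable names are case-sensitive in these shells, so the name used after env. in the configuration file has to match the exported name exactly.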

To use variables in your configuration file

Rename the .yml configuration file so that its name ends with .yml.liquid. For example, if your configuration file was originally named config.yml, rename it to config.yml.liquid.
Insert the environment variable into the configuration file by replacing the connection details, using the variable naming convention.
Run Embulk in preview mode to validate your changes. For example:

embulk preview config.yml.liquid

Run Embulk with the new configuration file. For example:

embulk run config.yml.liquid
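Before running the steps above, it can help to confirm that every variable referenced in the template is actually set, because an unset variable may otherwise be rendered as an empty value. The following is a sketch only: missing_vars is a hypothetical helper, not part of Embulk, and the variable names passed to it are illustrative.

```shell
# Hypothetical helper: print the name of each listed variable that is
# unset or empty in the environment.
missing_vars() {
    for name in "$@"; do
        # printenv prints nothing when the variable is not exported.
        if [ -z "$(printenv "$name")" ]; then
            printf '%s\n' "$name"
        fi
    done
}

missing="$(missing_vars DB_HOST DB_PASSWORD)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Set these variables before running Embulk:" $missing
else
    echo "All variables are set."
fi
```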

Example config.yml.liquid File

For example, suppose the original config.yml file is the following:

    in:
        type: mysql
        host: localhost
        port: 3306
        user: username
        password: password
        database: mysql_db
        select: "col1, col2, datecolumn"
        where: "col4 != 'a'"
    out:
        type: td
        apikey: xxxxxxxxxxxx
        endpoint: api.treasuredata.com
        database: dbname
        table: tblname
        time_column: datecolumn
        mode: replace 
        # by default mode: append is used, if not defined. 
        # Imported records are appended to the target table with this mode.
        # mode: replace, replaces existing target table
        default_timestamp_format: '%d/%m/%Y'

Suppose you want to hide the MySQL host, port, username, password, and database. In the output section, you might also want to hide your API key, endpoint, database, and table. Using the naming convention {{ env.replaced_detail }}, the file becomes the following:

    in:
        type: mysql
        host: {{ env.db_host }}
        port: {{ env.db_port }}
        user: {{ env.db_username }}
        password: {{ env.db_password }}
        database: {{ env.db_name }}
        select: "col1, col2, datecolumn"
        where: "col4 != 'a'"
    out:
        type: td
        apikey: {{ env.td_apikey }}
        endpoint: {{ env.api_endpoint }}
        database: {{ env.td_db_name }}
        table: {{ env.td_table_name }}
        time_column: datecolumn
        mode: replace
        # by default mode: append is used, if not defined.
        # Imported records are appended to the target table with this mode.
        # mode: replace, replaces existing target table
        default_timestamp_format: '%d/%m/%Y'
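For the template above to resolve, each referenced variable needs to exist in the environment that launches Embulk. A sketch for a POSIX-style shell follows, reusing the placeholder values from the original config.yml; substitute your own details. The lowercase names match the template exactly, since variable names are case-sensitive.

```shell
# Placeholder values only; substitute your own connection details.
export db_host="localhost"
export db_port="3306"
export db_username="username"
export db_password="password"
export db_name="mysql_db"
export td_apikey="xxxxxxxxxxxx"
export api_endpoint="api.treasuredata.com"
export td_db_name="dbname"
export td_table_name="tblname"

# Preview the import only if embulk and the template are both available.
if command -v embulk >/dev/null 2>&1 && [ -f config.yml.liquid ]; then
    embulk preview config.yml.liquid
fi
```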
