Introduction to Custom Scripts

Custom Scripts let you run containerized Python scripts from within a Treasure Data Workflow, giving you more flexibility to implement custom logic. Typical uses include:

  • Extend the capabilities of data connectors and other integrations.

  • Create efficient data manipulation and processing logic in Python and invoke it from workflows.

  • Productionize your data science work by running Python models as part of regularly scheduled Treasure Workflows.

  • Consolidate your data management into one environment. Use Treasure Workflow to connect multiple data environments.

Custom Script Requirements

Supported Python versions:

  • Python 3.12
  • Python 3.10
  • Python 3.9

Supported Docker Images

  • treasuredata/customscript-python:3.12.11-td0 [current stable]
    • Python 3.12.11
    • pytd 2.2.0
  • digdag/digdag-python:3.10.1
    • Python 3.10
    • pytd 1.5.1
    • td-pyspark 22.7.1
  • digdag/digdag-python:3.9.22-td1
    • Python 3.9.22
    • pytd 1.5.2
    • td-pyspark 21.3.0
  • digdag/digdag-python:3.9.2 [deprecated]
    • Python 3.9.2
    • pytd 1.5.1
    • td-pyspark 21.3.0
  • digdag/digdag-python:3.9 [deprecated]
    • Python 3.9.2
    • pytd 1.4.0
    • td-pyspark 20.12.0

For more details on supported Docker images, see the Custom Scripts Docker images document.
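To confirm which of the images listed above a script is actually running in, the script can report its own Python and library versions. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the package names are taken from the list above.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pytd", "td-pyspark")):
    """Collect the Python version and the versions of the listed packages."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for package in packages:
        report[package] = installed_version(package)
    return report


def print_environment():
    """Hypothetical task entry point: print one line per entry to the task log."""
    for name, version in environment_report().items():
        print(f"{name}: {version}")
```

A package that the image does not include is reported as None rather than raising an error, so the same function can be used across images.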

Example Treasure Workflow Custom Script Syntax

The following snippet is an example from a workflow:

+py_custom_code:
  py>: tasks.printMessage
  docker:
    image: "digdag/digdag-python:3.9.22-td1"
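
The task above names tasks.printMessage, that is, a function called printMessage in a Python module called tasks. The following is a minimal sketch of such a module, assuming it is saved as tasks.py in the workflow project; the message text and the build_message helper are hypothetical and only serve as illustration.

```python
# tasks.py: assumed file name, matching `tasks` in `py>: tasks.printMessage`.


def build_message(name="Custom Scripts"):
    # Hypothetical helper that returns the text to print.
    return f"Hello from {name}"


def printMessage():
    # Function named in the workflow task; its output goes to the task log.
    print(build_message())
```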

Installing Your Own Python Libraries

The Python scripts in Treasure Workflows are managed and run by Treasure Data in isolated Docker containers. Treasure Data provides a number of base Docker images to run in the container.

In addition to the libraries provided by the Docker image, you can install third-party libraries by running pip install from within the Python script.

Choose the Docker image for your Python script based on the Python version and the libraries that the image supports.

To install a library, add code such as the following to your Python script:

import os, sys
os.system(f"{sys.executable} -m pip install asn1==3.1.0")
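
Note that os.system returns the exit status of the command without raising an error, so a failed installation can go unnoticed until a later import fails. One possible alternative, sketched here with the standard subprocess module, stops the task as soon as pip fails. The function names are hypothetical, and the pinned asn1 package is the same one used in the example above.

```python
import subprocess
import sys


def pip_install_command(requirement):
    """Build the pip command for the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", requirement]


def install(requirement):
    """Install one requirement; raises CalledProcessError if pip exits non-zero."""
    subprocess.check_call(pip_install_command(requirement))


def run():
    """Hypothetical task entry point: install first, then import the library."""
    install("asn1==3.1.0")
    import asn1  # imported only after the install has succeeded
    print(f"installed and imported {asn1.__name__}")
```

Passing the command as a list avoids shell quoting issues, and using sys.executable ensures the package is installed for the same interpreter that later imports it.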

Links to Other Articles

Article | Description
Passing parameters to Custom Scripts | Use _env to pass parameters and credentials to the Custom Script as environment variables.
Executing custom script tasks in parallel within a workflow | Multiple Python scripts can be run in parallel within a workflow by using the _parallel operator.
Python Custom Scripting | Walks through a complete Custom Scripts tutorial.
Treasure Workflow Service Limits | A running custom script is killed after 1 day.
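
The first article above covers passing parameters as environment variables. On the Python side, such values can be read with os.environ. The following is a minimal sketch; the read_setting helper and the variable names TARGET_TABLE and BATCH_SIZE are hypothetical, and the workflow-side _env syntax is described in the linked article.

```python
import os


def read_setting(name, default=None, environ=os.environ):
    """Return an environment variable, or fail clearly if a required one is missing."""
    value = environ.get(name, default)
    if value is None:
        raise KeyError(f"required environment variable {name} is not set")
    return value


def run():
    # Hypothetical task entry point; both variable names are illustrative only.
    table = read_setting("TARGET_TABLE")
    batch_size = int(read_setting("BATCH_SIZE", default="1000"))
    print(f"writing to {table} in batches of {batch_size}")
```

Environment variables are always strings, so numeric settings need an explicit conversion as shown for the batch size.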