
You can run Python scripts from a TD Workflow using the Python operator (py>:). Create the workflow definition in TD Console or with TD Workflow from the command line.

...

  1. Navigate to Data Workbench > Workflows.

  2. Select the workflow to which you would like to add the Python scripts.

  3. Select Launch Project Editor.

  4. Select Edit Files.

  5. Select Add New File.

  6. Enter the name of your .dig file.

  7. Add the py> operator and specify the Docker image that you want to use. Your workflow definition might look like this sample:

    Code Block
    +py_custom_code:
      py>: tasks.printMessage
      docker:
        image: "digdag/digdag-python:3.9"


  8. Add each Python script as a new file, or copy and paste the text of each script into the new script editor window.

  9. Select Save & Commit.   
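The sample workflow definition in step 7 refers to tasks.printMessage, which is a function named printMessage in a file named tasks.py. The source does not show that script, so the following is only a minimal sketch of what it might contain. The message text and the build_message helper are hypothetical.

```python
# tasks.py
# Hypothetical contents: the workflow sample only names tasks.printMessage,
# so everything below is an illustrative assumption.


def build_message(name="TD Workflow"):
    """Return the greeting text; kept separate so it is easy to test."""
    return f"Hello from {name}"


def printMessage():
    """Entry point referenced by `py>: tasks.printMessage` in the .dig file."""
    print(build_message())
```

Keeping the entry point free of required arguments means the py> operator can call it without any extra parameters being defined in the workflow.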

...

Using td CLI

You can add a Python script to your existing workflow using the command line. New users may first need to create a workflow using the command line.

  1. Add a workflow definition .dig file and Python script to the workflow directory.

  2. Specify the Docker image that you want to use for the py> operator in the .dig file.

  3. Add syntax similar to the following sample to your workflow .dig file to add the py> operator and specify the Docker image:

    Code Block
    +<wf_task_name>:
      py>: <script_filename>.<function_name>
      docker:
        image: "digdag/digdag-python:3.9"

  4. Push the workflow to Treasure Data using the td CLI command `td wf push <project_name>`.

Running a Workflow with a Python Custom Script 

To start an interactive shell session in the container, run:

Code Block
$ docker run -it --rm digdag/digdag-python:3.9 bash
$ whoami
> td-user

Running digdag/digdag-python:3.9 without arguments launches the Python interactive shell:

Code Block
$ docker run -it --rm digdag/digdag-python:3.9
> Python 3.9.1 (default, Jan 12 2021, 16:56:42)
>>>

Docker Images

Treasure Data manages and runs the Python scripts in TD Workflows in isolated Docker containers, and provides several base Docker images for those containers. Choose the image whose Python version and bundled libraries match what your script needs.
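To check which Python version and libraries a given image provides, a task function along the following lines could be run inside the container. This is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata) and uses only the standard library. The function names python_version, installed_packages, and report are hypothetical.

```python
import sys
from importlib import metadata


def python_version():
    """Return the interpreter version as 'major.minor.micro'."""
    return "{}.{}.{}".format(*sys.version_info[:3])


def installed_packages():
    """Return {name: version} for each distribution visible to this interpreter."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return packages


def report():
    """Print the Python version, then one 'name==version' line per package."""
    print("Python", python_version())
    for name, version in sorted(installed_packages().items()):
        print(f"{name}=={version}")
```

If this were saved as a script in the workflow project, report could be referenced from the py> operator in the same <script_filename>.<function_name> form used elsewhere on this page.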

...

The following sample uses the Python 3.9 Docker image:

Code Block
+task_name:
  py>: <script_filename>.<function_name>
  docker:
    image: "digdag/digdag-python:3.9"

...

Code Block
$ docker run -it --rm digdag/digdag-python:3.9 python --version
> Python 3.9.1

You can get a complete list of library versions using pip freeze:

...