Exploratory Data Analysis

This notebook runs Exploratory Data Analysis (EDA) on the table specified by the input_table parameter.

Supported analytics methods:

Some example visualizations from the EDA Notebook are shown below:

EDA Workflow Example

A sample workflow is available in Treasure Boxes.

+run_eda:
  ipynb>:
    notebook: EDA
    input_table: ml_datasets.bank_marketing
    eda: all
    sampling_threshold: 1000000

Parameters

| Parameter name | Parameter on Console | Description | Default Value |
|---|---|---|---|
| docker.task_mem | Docker Task Mem | Task memory size. Available values are 64g, 128g (default), 256g, 384g, or 512g, depending on your contracted tier. | 128g |
| input_table | Input Table | TD table to analyze, specified as dbname.table_name. | - |
| target_column | Target Column | Name of the column used as the label. | None |
| ignore_columns | Ignore Columns | Columns to exclude from EDA. | time |
| sampling_threshold | Sampling Threshold | Threshold used for sampling. See the executed notebook for details. | 10_000_000 |
| eda | Eda | all, or a comma-separated list of the EDA types to run. | all |
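
The parameters above can be combined in a single task. The following is a minimal sketch, not a verified configuration: it assumes that docker.task_mem is written as a nested docker block under the ipynb> operator, and it uses y as a hypothetical label column name for the sample table. Adjust both to match your environment.

```yaml
+run_eda:
  ipynb>:
    # Assumption: docker.task_mem is expressed as a nested docker block.
    docker:
      task_mem: 256g
    notebook: EDA
    input_table: ml_datasets.bank_marketing
    # Hypothetical label column; replace with a column from your table.
    target_column: y
    # Same as the documented default, shown here for clarity.
    ignore_columns: time
    eda: all
    sampling_threshold: 10000000
```

Leaving out target_column, ignore_columns, or sampling_threshold should fall back to the defaults listed in the table.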