This notebook runs exploratory data analysis (EDA) on the table specified by the input_table parameter.
Supported analytics methods:
- Basic EDA based on Pandas DataFrame
- Pandas Profiling
- EDA based on Sweetviz
- Missing data visualization based on missingno
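
As a rough illustration of what "basic EDA based on Pandas DataFrame" can cover, the sketch below collects a few common summaries. It is not the notebook's actual code: the `basic_eda` helper and the sample values are hypothetical, and the data is a small in-memory stand-in for the table named by input_table.

```python
import pandas as pd

# Tiny stand-in for the input table (values are made up for illustration).
df = pd.DataFrame(
    {
        "age": [34, 51, None, 28],
        "job": ["admin", "technician", "admin", None],
        "y": ["no", "yes", "no", "no"],
    }
)


def basic_eda(frame: pd.DataFrame) -> dict:
    """Collect a few basic summaries of a DataFrame (hypothetical helper)."""
    return {
        "shape": frame.shape,                            # (rows, columns)
        "dtypes": frame.dtypes.astype(str).to_dict(),    # column -> dtype name
        "missing_ratio": frame.isna().mean().to_dict(),  # column -> share of missing values
        "n_unique": frame.nunique().to_dict(),           # column -> distinct non-missing values
        "describe": frame.describe(include="all"),       # per-column summary statistics
    }


summary = basic_eda(df)
print(summary["shape"])
print(summary["missing_ratio"])
```

The profiling, Sweetviz, and missingno options produce richer reports on top of the same kind of DataFrame.
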
Some example visualizations from the EDA Notebook are shown below:



Find a sample workflow here in Treasure Boxes.

    +run_eda:
      ipynb>:
        notebook: EDA
        input_table: ml_datasets.bank_marketing
        eda: all
        sampling_threshold: 1000000

| Parameter name | Parameter on Console | Description | Default Value |
|---|---|---|---|
| docker.task_mem | Docker Task Mem | Task memory size. Available values are 64g, 128g (default), 256g, 384g, or 512g, depending on your contracted tier | 128g |
| input_table | Input Table | TD table to use for EDA, specified as dbname.table_name | - |
| target_column | Target Column | Name of the column used as the label | None |
| ignore_columns | Ignore Columns | Columns to exclude from EDA | time |
| sampling_threshold | Sampling Threshold | Threshold used for sampling. See the executed notebook for details | 10_000_000 |
| eda | Eda | all, or a comma-separated list of the EDA types to run | all |
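
The exact sampling logic is documented only in the executed notebook, so the following is a sketch under stated assumptions, not the notebook's implementation. It assumes that ignore_columns are dropped before analysis and that rows are randomly down-sampled to sampling_threshold only when the table has more rows than the threshold. The `prepare_for_eda` helper and its `seed` argument are hypothetical.

```python
import pandas as pd


def prepare_for_eda(frame, ignore_columns=("time",), sampling_threshold=10_000_000, seed=0):
    """Drop ignored columns, then down-sample if the frame exceeds the threshold."""
    # Drop only the ignored columns that are actually present.
    kept = frame.drop(columns=[c for c in ignore_columns if c in frame.columns])
    # Assumption: sampling applies only when the row count exceeds the threshold.
    if len(kept) > sampling_threshold:
        kept = kept.sample(n=sampling_threshold, random_state=seed)
    return kept


df = pd.DataFrame({"time": range(10), "age": range(20, 30)})
small = prepare_for_eda(df, sampling_threshold=4)
print(small.shape)  # (4, 1)
```

With the default threshold of 10_000_000, the same call returns all 10 rows and drops only the time column.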