# Gluon Train

This notebook builds a prediction model using [AutoGluon](https://auto.gluon.ai/stable/index.md), automating data processing, feature engineering, model selection, ensembling, and hyperparameter tuning.

AutoGluon combines several ML models using stack ensembling. Supported models include:

* Neural network (MXNet, FastAI)
* Gradient Boosting Models (LightGBM, CatBoost, XGBoost)
* Random Forests
* Extremely Randomized Trees
* k-Nearest Neighbors

## Workflow Example

Find a sample workflow in [Treasure Boxes](https://github.com/treasure-data/treasure-boxes/blob/automl/machine-learning-box/automl/ml_experiment.dig).

```
+gluon_train:
  ml_train>:
    notebook: gluon_train
    model_name: gluon_model
    input_table: ml_dataset.bank_marketing
    target_column: loan
    time_limit: 3 * 60 # soft time limit in seconds
```

## Parameters

| Parameter name | Console Name | Description | Default |
| --- | --- | --- | --- |
| docker.task_mem | Docker Task Mem | Task memory size. Available: 64g, 128g (default), 256g, 384g, 512g (tier dependent). | 128g |
| input_table | Input Table | TD table used for EDA as dbname.table_name. | - |
| target_column | Target Column | Column name used for the label. | - |
| model_name | Model Name | Prediction model name. | - |
| problem_type | Problem Type | One of binary, multiclass, regression, quantile. Inferred if not specified. | None |
| oversampling_threshold | Oversampling Threshold | Threshold rate of minority class for SMOTE oversampling (binary only). 0 disables. | 0.001 |
| proba_calibration | Proba Calibration | Run probability calibration after oversampling. | True |
| eval_metric | Eval Metric | Automatically selected if not specified. | None |
| ignore_columns | Ignore Columns | Columns to ignore. | time |
| time_limit | Time Limit | Soft training limit in seconds (max 24h). Hint to AutoGluon. | 60 * 60 |
| sampling_threshold | Sampling Threshold | Threshold used for sampling. See executed notebook. | 10_000_000 |
| export_leaderboard | Export Leaderboard | Export leaderboard as TD table if specified. | None |
| export_feature_importance | Export Feature Importance | Export feature importance as TD table if specified. | None |
| exclude_models | Exclude Model | Models to ignore. | KNN |
| hide_table_contents | Hide Table Contents | Suppress showing table contents. | False |
| share_model | Share Model | Share trained models in an account. | False |
| refit_full | Refit Full | Retrain models on all data. Choices: best, false, default. | default |

Accepted eval_metric values:

* Binary & Multiclass: accuracy, balanced_accuracy, f1, f1_macro, f1_micro, f1_weighted, average_precision, precision, precision_macro, precision_micro, precision_weighted, recall, recall_macro, recall_micro, recall_weighted, log_loss (multiclass default), pac_score
* Binary only: roc_auc (binary default), roc_auc_ovo_macro
* Regression: root_mean_squared_error (default), mean_squared_error, mean_absolute_error, median_absolute_error, r2
* Quantile Regression: pinball_loss (default)

For more details, see the [AutoGluon documentation](https://auto.gluon.ai/0.3.1/api/autogluon.predictor.md?highlight=eval_metric#module-0).
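
Because the accepted eval_metric values depend on problem_type, it can help to check the pair before launching a long training run. The following is a minimal sketch of such a pre-flight check, not part of the notebook itself: the metric names and defaults are copied from the list above, while the `ACCEPTED_METRICS` and `DEFAULT_METRIC` tables and the `check_eval_metric` helper are hypothetical names introduced for illustration. It also assumes problem_type is given explicitly, whereas the notebook can infer it.

```python
# Pre-flight check for the eval_metric / problem_type combination.
# Metric names and defaults come from the "Accepted eval_metric values" list;
# the table layout and helper name are illustrative assumptions.

_CLASSIFICATION = {
    "accuracy", "balanced_accuracy",
    "f1", "f1_macro", "f1_micro", "f1_weighted",
    "average_precision",
    "precision", "precision_macro", "precision_micro", "precision_weighted",
    "recall", "recall_macro", "recall_micro", "recall_weighted",
    "log_loss", "pac_score",
}
_BINARY_ONLY = {"roc_auc", "roc_auc_ovo_macro"}

ACCEPTED_METRICS = {
    "binary": _CLASSIFICATION | _BINARY_ONLY,
    "multiclass": _CLASSIFICATION,
    "regression": {
        "root_mean_squared_error", "mean_squared_error",
        "mean_absolute_error", "median_absolute_error", "r2",
    },
    "quantile": {"pinball_loss"},
}

DEFAULT_METRIC = {
    "binary": "roc_auc",
    "multiclass": "log_loss",
    "regression": "root_mean_squared_error",
    "quantile": "pinball_loss",
}


def check_eval_metric(problem_type, eval_metric=None):
    """Return the metric to use, or raise ValueError for an invalid pair."""
    if problem_type not in ACCEPTED_METRICS:
        raise ValueError(f"unknown problem_type: {problem_type!r}")
    if eval_metric is None:
        # Mirror the documented per-problem-type default.
        return DEFAULT_METRIC[problem_type]
    if eval_metric not in ACCEPTED_METRICS[problem_type]:
        allowed = ", ".join(sorted(ACCEPTED_METRICS[problem_type]))
        raise ValueError(
            f"{eval_metric!r} is not accepted for {problem_type}; "
            f"choose one of: {allowed}"
        )
    return eval_metric
```

For example, `check_eval_metric("binary")` falls back to the documented binary default, while `check_eval_metric("multiclass", "roc_auc")` raises a `ValueError` because roc_auc is listed as binary only.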