# AutoML FAQs

* [Product FAQs](#product-faqs)
* [Technical FAQs](#technical-faqs)
* [Licensing FAQs](#licensing-faqs)

# Product FAQs

* **What’s the merit of using TD AutoML when compared to other products?** Seamless integration with Treasure Data is the biggest merit of using TD AutoML. AutoML greatly reduces the time spent on ML model building and serving, so users can focus on creating business value rather than on technical tasks.
* **What are the required skills for using AutoML? Can marketers use it?** TD Console provides a UI for running AutoML so that non-technical users can work with it. The expected skillset is that of a “Marketing Analyst” or “Data Analyst”: a grasp of the basic principles of data quality, how to establish a business objective, and how to train a model using a training data set. Such users can run the basic AutoML solution on a data table that matches the requirements by specifying the target value column and accepting the default values for the other parameters. For more complex end-to-end data processing, a “Data Engineer” or “Data Scientist” typically sets up the workflows and evaluates model training. Trained models can then be shared with other users, who schedule workflows to run further predictions.
* **Is there a training program to enable users of the AutoML feature?** A training program is in development for a future release.
* **What is a Unit Hour for AutoML? How can a user monitor the unit hours consumed?** A unit hour is the amount of time (in hours) that the service is used. Unit-hour consumption varies by task type, with more demanding task types consuming more. Treasure Data provides utilization dashboards for consumed unit hours; they are updated daily and show the overall unit-hour usage for the day. The Workflow log (viewable from TD Console) shows the unit-hour consumption of individual AutoML tasks.
For details, see [Monitor AutoML Usage](/products/customer-data-platform/machine-learning/automl/advanced-topics/automl-model-sharing).
* **Does AutoML always give the correct results?** Not necessarily. Some familiarity with the basic principles of machine learning, result accuracy, and data quality, plus an understanding of how the results map onto business goals, helps non‑data scientists interpret results and get the most from AutoML.
* **How do I ensure that my data is of sufficient quality?** Use the built-in tools in the notebooks. For example, the EDA Notebook provides tools for analyzing data sources before creating a model with AutoGluon.
* **As a Data Scientist, how can I use AutoML for quick iteration to speed up deployment?** Using the built-in training operator (AutoGluon), you can perform iterative training runs with different data sets and durations, and compare the resulting accuracy scores and outputs. After finding the optimal model, use it for scheduled prediction tasks.
* **Can I run my own custom Jupyter notebook using AutoML?** Not supported. Treasure Data lets you run custom scripts from Data Workbench, but those run outside the dedicated AutoML container.
* **Can I hide the code in a generated notebook?** Yes. Each notebook has a Toggle Code button.
* **Can I use AutoML for clustering?** Not supported (planned for a future release).
* **Can I use AutoML for text analysis?** Not supported (planned for a future release).
* **Can I perform model distillation to reduce unit hours during prediction?** Not supported (planned for a future release).
* **Is Task Throttling (random delay) refreshed at the next billing period?** No. Throttling depends on the excess usage reported at the end of each month. For example, if usage exceeded 300 unit hours on May 20, throttling remains in effect until additional unit hours are purchased or the overage is resolved at the end of the following month. If no overage occurs for all of June, throttling is disabled for July.
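As a rough illustration of the month-end throttling rule in the example above, here is a hypothetical Python sketch. The 300-unit-hour cap is taken from the example only; the function name and the month-end check are illustrative assumptions, not Treasure Data's actual billing logic.

```python
# Hypothetical sketch of the monthly throttling rule described above.
# The 300-unit-hour cap comes from the FAQ's example; real billing logic
# lives on Treasure Data's side and may differ.

def throttled_next_month(monthly_usage_hours: float, contracted_hours: float = 300) -> bool:
    """Throttling for the next month depends only on whether the
    month-end usage report shows an unresolved overage."""
    return monthly_usage_hours > contracted_hours

# May ends with 320 unit hours used: June is throttled.
print(throttled_next_month(320))   # True
# June ends within the cap: throttling is lifted for July.
print(throttled_next_month(280))   # False
```

The key point the sketch captures is that throttling state is re-evaluated only at each month-end report, not at the start of a new billing period.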
* **What is the minimum contract length for AutoML?** Contact Treasure Data Sales for current terms.

# Technical FAQs

* **Is the AutoML service running on AWS SageMaker or other AWS services?** AutoML runs on AWS ECS with a custom image. It does not currently use SageMaker (it may in the future).
* **What are the technical limitations of Treasure AutoML?** Every AutoML task must finish within 25 hours; tasks that run longer time out.
* **Do some input data characteristics require longer training time?** Yes. Training time depends on the amount, distribution, and complexity of the data.
* **Is the AutoML workflow concurrency number separate from normal workflow (query) concurrency?** Yes, they are independent.
* **How long are models stored or accessible?** Models are stored permanently. Notebooks are accessible for 365 days from creation and are deleted after that; download notebooks if you need long-term access.
* **What are the user permissions for accessing AutoML features? Can I access notebooks executed by other users?** The user who launches an AutoML workflow session owns the model, and prediction models are visible only to their owners. For scheduled workflows, the last editor is the owner. Executed notebooks are visible only to the session owner because they can contain table previews.
* **Is it possible to access prediction models trained by other users?** Yes, via shared models. A shareable model reference is stored as a task parameter, and other users can pass the shared model UUID as the *model_name* option of a prediction task. See [AutoML Model Sharing](/products/customer-data-platform/machine-learning/automl/advanced-topics/automl-model-sharing).
* **Is it possible to share notebooks with other users?** Not supported. Download the notebook and share it by your own method.
* **Does AutoML perform feature enrichment or augmentation?** AutoML performs several of AutoGluon’s automatic feature engineering steps, such as datetime conversion and n‑gram features for space‑separated text (see the AutoGluon documentation).
For CJK text, preprocess the text with Hivemall tokenizer functions (on the Hive engine).
* **What are the recommended training time limits?** It depends on the task and dataset. Roughly 6 hours is recommended for large production datasets, and at least 3 hours for typical training. With a 3‑hour limit (10,800 seconds) and 20+ models, each model can use roughly 9 minutes.
* **What does Treasure AutoML do to avoid overfitting?** AutoGluon uses n‑repeated k‑fold bagging. You can also use Shapley values and permutation feature importance to check for overfitting.
* **For AutoGluon, can I specify which models to run?** You cannot force specific models, but you can exclude models with the [exclude_models option](/products/customer-data-platform/machine-learning/automl/notebook-solutions/gluon-train).
* **How should we handle multicollinearity in Treasure AutoML?** Ensembled GBDT models are robust to multicollinearity, and correlated features are often ignored. Neural networks may apply Lasso‑like regularization. AutoML also shows correlation matrices, permutation importance, and SHAP values to help you inspect feature relationships.
* **Can a user select which model to use?** Not directly; use model exclusion to shape the ensemble.
* **How does AutoML handle imbalanced data?** AutoML supports SMOTE oversampling and probability calibration, and it uses models that are relatively robust to imbalance (Random Forest, XGBoost, and others).
* **Can TD AutoML handle extrapolation in regression?** Tree-based models are not robust to extrapolation. Use [Time Series Forecasting](/products/customer-data-platform/machine-learning/automl/notebook-solutions/time-series-forecasting) for predictions about the future.
* **How can I simplify managing many input/output tables?** See [ML Experiment Tracking and Model Management](/products/customer-data-platform/machine-learning/automl/advanced-topics/ml-experiment-tracking-and-model-management).
* **For RFM, can I apply arbitrary divisions instead of quartiles?** No, quartiles are fixed.
* **For MTA, can I analyze whether a page’s effect is causal?** No personalization is applied.
The Shapley attribution model treats each channel as a player in a cooperative game.
* **Which label is used for the positive class in binary classification?** AutoGluon sorts the class labels in natural order; the first label is treated as negative and the second as positive. Examples: for 1/0, 1 is positive; for Yes/No, Yes is positive; for True/False, True is positive; for True/false, false is positive (uppercase letters sort before lowercase).

# Licensing FAQs

* **How should I decide my tier?** Evaluate the required memory resources, notebooks, data volume, desired concurrency, and number of users.
* **Can we swap out initially purchased Solution Notebooks for others?** Yes, up to 3 times per year.
* **Can I decide the enabled solution notebooks in advance?** Purchase at least one bundle. For a single purchase, pick any two notebooks; after evaluation, keep one or switch (a maximum of 3 swaps per year).
* **Does viewing an executed AutoML notebook count toward unit hours?** No. Only running tasks consumes unit hours.
* **What happens if the account concurrency limit is exceeded?** New task submissions fail once the limit is hit; upgrade your tier if needed.
* **Do unused hours roll over?** No. The allotment resets monthly.
* **Is the unit hour limit hard? What happens if we exceed the contracted hours?** It is a soft limit. TD monitors usage; if it is consistently exceeded, a hard limit may be applied that prevents new tasks for that month.
* **Are running tasks killed if the limit is exceeded?** No. Existing tasks continue to run; limits apply only to new tasks.
* **How much memory (task_mem) is recommended?** It depends on the dataset. AutoGluon recommends 384 GB; TD suggests at least 256 GB (384 GB for tiers 2/3) for training, 128 GB for prediction, and 64 GB for exploration.
* **Are there any other costs from AutoML usage?** Bulk import of prediction results counts toward bulk import usage (not AutoML usage).
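The positive-class rule described under Technical FAQs above (labels sorted in natural order, second label positive) can be checked with a short Python sketch. `sorted` reproduces the natural lexicographic ordering described in the FAQ; AutoGluon's internal implementation may differ in detail.

```python
def positive_class(labels):
    """Return the label treated as positive: with labels in natural
    (lexicographic) sort order, the first is negative, the second positive."""
    negative, positive = sorted(set(labels))
    return positive

print(positive_class([1, 0]))             # 1
print(positive_class(["Yes", "No"]))      # Yes
print(positive_class(["True", "False"]))  # True
print(positive_class(["True", "false"]))  # false (uppercase sorts before lowercase)
```

This matches each example in the FAQ, including the counterintuitive True/false case, where "True" sorts before "false" because uppercase ASCII codes precede lowercase ones.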
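Permutation feature importance, mentioned above as an overfitting check, can be illustrated with a minimal NumPy sketch. The linear "model" and synthetic data here are illustrative assumptions, not AutoML's implementation; in practice you would compute this on held-out data with the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
# y depends strongly on feature 0, weakly on feature 1, not at all on feature 2.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

def model(X):
    # Stand-in "trained model": the true coefficients, for illustration only.
    return X @ np.array([3.0, 0.5, 0.0])

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

baseline = mse(y, model(X))
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
    importance.append(mse(y, model(Xp)) - baseline)

print(importance)  # feature 0 >> feature 1 > feature 2 (exactly 0 here)
```

A feature whose permutation barely changes the error contributes little; a large gap between training-set and held-out importance profiles is one sign of overfitting.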
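SMOTE oversampling, named above as AutoML's approach to imbalanced data, synthesizes new minority-class samples by interpolating between a minority point and one of its nearest minority-class neighbors. The following NumPy sketch shows that core idea under simplifying assumptions; it is not AutoML's actual implementation, which would use a production library version of SMOTE.

```python
import numpy as np

def smote_sample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating each
    chosen point toward one of its k nearest minority-class neighbors."""
    rng = np.random.default_rng(seed)
    # Pairwise distances within the minority class (self-distance excluded).
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = neighbors[i, rng.integers(min(k, len(X_min) - 1))]
        t = rng.random()  # random point on the segment between the two samples
        out.append(X_min[i] + t * (X_min[j] - X_min[i]))
    return np.array(out)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sample(minority, n_new=6, k=3)
print(synthetic.shape)  # (6, 2)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the minority class's local region rather than being arbitrary noise.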
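The n‑repeated k‑fold bagging mentioned above (AutoGluon's overfitting countermeasure) trains k models per repeat, each on k−1 folds, and averages all n×k models' predictions. A generic sketch of that scheme, with a toy least-squares learner standing in for AutoGluon's models:

```python
import numpy as np

def repeated_kfold_bag_predict(X, y, X_test, fit, predict, n_repeats=2, k=5, seed=0):
    """n-repeated k-fold bagging: per repeat, split the data into k folds,
    train one model per fold on the other k-1 folds, then average every
    model's predictions on X_test."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(X))
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)  # everything outside this fold
            model = fit(X[train], y[train])
            preds.append(predict(model, X_test))
    return np.mean(preds, axis=0)

# Toy learner: ordinary least squares, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=100)
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda w, X: X @ w
pred = repeated_kfold_bag_predict(X, y, X[:3], fit, predict)
print(pred.shape)  # (3,)
```

Averaging many models trained on different data subsets reduces the variance of any single fit, which is how bagging pushes back against overfitting.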