Treasure Machine Learning (ML) is based on Apache Hivemall, a scalable machine learning library that runs on Apache Hive. Hivemall is designed to scale with both the number of training instances and the number of training features. For more information, see the Hivemall User Guide.


Supported Algorithms

Hivemall provides machine learning and feature engineering functionality through Hive user-defined functions (UDFs), user-defined aggregate functions (UDAFs), and user-defined table-generating functions (UDTFs).

Classification

Regression

Recommendation

k-Nearest Neighbor

Feature Engineering

Hivemall Evaluation UDFs

Evaluation UDFs are useful for evaluating the accuracy of your machine learning model.

Binary Classification Metrics

Signature

auc(probability, truth_label)
fmeasure(truth_label, predicted_label)

Description

See the Hivemall User Guide.
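
To make the two signatures above more concrete, the following is a minimal Python sketch of the textbook definitions of AUC and F-measure. It is not Hivemall's implementation. The function names `auc_sketch` and `fmeasure_sketch`, the `beta` parameter, and the convention that the label 1 marks the positive class are assumptions made for illustration. Hivemall's handling of ties, label encodings, and options may differ.

```python
from typing import Sequence


def fmeasure_sketch(truth_labels: Sequence[int],
                    predicted_labels: Sequence[int],
                    beta: float = 1.0) -> float:
    """F-beta score for binary labels (assumption: 1 is the positive class)."""
    tp = fp = fn = 0
    for truth, predicted in zip(truth_labels, predicted_labels):
        if predicted == 1 and truth == 1:
            tp += 1          # true positive
        elif predicted == 1:
            fp += 1          # false positive
        elif truth == 1:
            fn += 1          # false negative
    if tp == 0:
        return 0.0           # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def auc_sketch(probabilities: Sequence[float],
               truth_labels: Sequence[int]) -> float:
    """Pairwise AUC: the fraction of (positive, negative) pairs in which the
    positive example has the higher score. Ties count as half a win."""
    positives = [p for p, t in zip(probabilities, truth_labels) if t == 1]
    negatives = [p for p, t in zip(probabilities, truth_labels) if t != 1]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, with truth labels [1, 0, 1, 1, 0] and predicted labels [1, 0, 0, 1, 1], precision and recall are both 2/3, so `fmeasure_sketch` returns 2/3. With probabilities [0.9, 0.8, 0.7, 0.3] and truth labels [1, 0, 1, 0], three of the four positive-negative pairs are ordered correctly, so `auc_sketch` returns 0.75.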

Ranking Measures

Signature

auc(recommend_list, truth_list, k)
precision_at(recommend_list, truth_list, k)
recall_at(recommend_list, truth_list, k)
mrr(recommend_list, truth_list, k)
average_precision(recommend_list, truth_list, k)
hitrate(recommend_list, truth_list, k)

Description

See the Hivemall User Guide.
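
As a rough illustration of what some of these ranking measures compute, the following Python sketch implements commonly used definitions of precision at k, recall at k, reciprocal rank, and hit rate for a single recommendation list. It is not Hivemall's implementation, and the `_sketch` function names are hypothetical. The choice to divide precision by the number of items actually returned (at most k) is an assumption, and Hivemall's edge-case behavior may differ. The ranking form of auc and average_precision are omitted because their normalization conventions vary between libraries.

```python
from typing import Hashable, List, Sequence


def _hits(recommend_list: Sequence[Hashable],
          truth_list: Sequence[Hashable],
          k: int) -> List[bool]:
    """For each of the top-k recommended items, is it in the truth list?"""
    truth = set(truth_list)
    return [item in truth for item in recommend_list[:k]]


def precision_at_sketch(recommend_list, truth_list, k: int) -> float:
    """Share of the top-k recommended items that are relevant."""
    hits = _hits(recommend_list, truth_list, k)
    if not hits:
        return 0.0
    # Assumption: divide by the number of items actually returned (<= k).
    return sum(hits) / len(hits)


def recall_at_sketch(recommend_list, truth_list, k: int) -> float:
    """Share of the relevant items that appear in the top k."""
    truth = set(truth_list)
    if not truth:
        return 0.0
    return sum(_hits(recommend_list, truth_list, k)) / len(truth)


def reciprocal_rank_sketch(recommend_list, truth_list, k: int) -> float:
    """1 / rank of the first relevant item in the top k, or 0.0 if none."""
    for rank, hit in enumerate(_hits(recommend_list, truth_list, k), start=1):
        if hit:
            return 1.0 / rank
    return 0.0


def hitrate_sketch(recommend_list, truth_list, k: int) -> float:
    """1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(_hits(recommend_list, truth_list, k)) else 0.0
```

For example, with a recommendation list [1, 3, 2, 6], a truth list [1, 2, 4], and k = 2, only item 1 in the top two is relevant, so precision is 0.5, recall is 1/3, and both the reciprocal rank and the hit rate are 1.0. Averaging the per-list values over all users gives the aggregate figure, which for the reciprocal rank is the mean reciprocal rank (MRR).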