# Time Series Analysis for Sales Prediction With Prophet Examples

Treasure Workflow provides for the prediction of time-series values, such as sales revenue or page views, using Facebook Prophet. Machine learning algorithms can be run from a custom Python script as part of your scheduled workflows.

Use [Facebook Prophet](https://facebook.github.io/prophet) in your Python custom script for time series analysis and sales data prediction. Prophet is a procedure for forecasting time series data. It is robust to missing data and to shifts in the trend. Details of the data used in these examples can be found in [Prophet's official documentation](https://facebook.github.io/prophet/docs/non-daily_data.md#monthly-data).

![](/assets/image-20191007-173703.ef21fb3c08a9110670a4b3b5da0c2bacab34f7d6070295e67de7c1b0bb1e1ba7.2e4f387b.png)

These examples use Facebook Prophet's time series analysis to predict continuous sales values from past data. Depending on your access to S3 and Slack, pick the example to use.

## Example Workflow to Predict Future Sales with S3 and Slack

The workflow:

* Fetches past sales data from Treasure Data
* Builds a model with Prophet
* Predicts future sales and writes the results back to Treasure Data
* Uploads predicted figures to Amazon S3
* Sends a notification to Slack

### Prerequisites

* Make sure that the custom scripts feature is enabled for your TD account.
* Download and install the TD Toolbelt and the TD Toolbelt Workflow module. For more information, see [TD Workflow Quickstart](/products/customer-data-platform/data-workbench/workflows/treasure-workflow-quick-start-using-td-toolbelt-in-a-cli).
* Basic knowledge of Treasure Workflow's syntax
* An S3 bucket and associated credentials
* Your Slack [webhook](https://api.slack.com/incoming-webhooks) URL

### Running the Example Workflow
1. Download the project from this [repository](https://github.com/treasure-data/treasure-boxes/tree/master/machine-learning-box/sales-prediction).
2. In the command line Terminal window, change directory to *timeseries*. For example:

   ```bash
   cd timeseries
   ```

3. Run *data.sh* to prepare example data. It creates a database called timeseries and a table called retail_sales.

   ```bash
   ./data.sh
   ```

4. Run the example workflow as follows:

   ```bash
   td workflow push prophet
   td workflow secrets \
     --project prophet \
     --set apikey \
     --set endpoint \
     --set s3_bucket \
     --set aws_access_key_id \
     --set aws_secret_access_key \
     --set slack_webhook_url
   # Set secrets from STDIN like: apikey=x/xxxxx, endpoint=https://api.treasuredata.com, s3_bucket=$S3_BUCKET,
   # aws_access_key_id=AAAAAAAAAA, aws_secret_access_key=XXXXXXXXX, slack_webhook_url=https://hooks.slack.com/services/XXXXXXX/XXXXXX/XXXXXX
   # Where XXXXXXX/XXXXXX/XXXXXX is the value of the Slack URL where you want information to be populated automatically.
   td workflow start prophet predict_sales --session now
   ```

A notification of prediction results is sent to your Slack.

Prediction results are stored in the predicted_sales table in the timeseries database. Validate predicted sales values in TD Console with SQL as follows:

```sql
SELECT ds, yhat, yhat_lower, yhat_upper
FROM timeseries.predicted_sales
ORDER BY ds DESC
LIMIT 100
```

You can see the predicted sales values (yhat), along with their lower and upper uncertainty bounds, in these columns.

## Example Workflow to Predict Future Sales without S3 or Slack

This example is for when you do not have access to S3 or Slack.

The workflow:

* Fetches past sales data from Treasure Data
* Builds a model with Prophet
* Predicts future sales and writes the results back to Treasure Data

### Prerequisites

* Make sure the custom scripts feature is enabled for your TD account.
* Download and install the TD Toolbelt and the TD Toolbelt Workflow module.
  For more information, see [TD Workflow Quickstart](/products/customer-data-platform/data-workbench/workflows/treasure-workflow-quick-start-using-td-toolbelt-in-a-cli).
* Basic knowledge of Treasure Workflow's syntax

### Running the Example Workflow

1. Download the project from this [repository](https://github.com/treasure-data/treasure-boxes/tree/master/machine-learning-box/sales-prediction).
2. In the command line Terminal window, change directory to *timeseries*. For example:

   ```bash
   cd timeseries
   ```

3. Run *data.sh* to prepare example data. It creates a database called timeseries and a table called retail_sales.

   ```bash
   ./data.sh
   ```

4. Run the example workflow as follows:

   ```bash
   td workflow push prophet
   td workflow secrets \
     --project prophet \
     --set apikey \
     --set endpoint
   # Set secrets from STDIN like: apikey=x/xxxxx, endpoint=https://api.treasuredata.com
   td workflow start prophet predict_sales_simple --session now
   ```

Predictions are stored in Treasure Data.

### Review the Workflow Custom Python Script

Make sure you are signed into a git account that has permission to see Treasure Data workflow samples. Review the contents of the directory:

* [predict_sales.dig](https://github.com/treasure-data/workflow-examples/blob/master/machine-learning/time_series/predict_sales.dig) - Example workflow for sales prediction and notification to Slack.
* [predict.py](https://github.com/treasure-data/workflow-examples/blob/master/machine-learning/time_series/predict.py) - Custom Python script with Prophet. It also uploads figures to Amazon S3.

If you don't need to send a notification to Slack, you can remove the *+send_graph* step in predict_sales.dig.

This example uses a dataset for sales data. You can use your own data in Treasure Data after modifying the query in the read_td function. You must prepare the *ds* (datestamp) column and the *y* column. The y column represents the target numerical values for forecasting, such as sales values or website page views (PVs).
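Concretely, the two-column frame Prophet consumes can be built from any table that has a date and a numeric measure. A sketch with pandas (the `sale_date` and `amount` column names are hypothetical):

```python
import pandas as pd

# Hypothetical raw sales records; column names are illustrative
raw = pd.DataFrame({
    "sale_date": ["2021-01-01", "2021-01-01", "2021-01-02"],
    "amount": [120.0, 80.0, 95.0],
})

# Prophet expects exactly two columns: 'ds' (datestamp) and 'y' (numeric target)
df = (
    raw.assign(sale_date=pd.to_datetime(raw["sale_date"]))
       .groupby("sale_date", as_index=False)["amount"].sum()
       .rename(columns={"sale_date": "ds", "amount": "y"})
)
print(df)
```

In the example workflow this same reshaping is done inside the SQL query instead, so that the result of read_td already has the ds/y schema.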
Here is example code for page view logs:

```python
import pytd.pandas_td as td
from prophet import Prophet  # `from fbprophet import Prophet` on older versions

engine = td.create_engine('presto:sample_datasets')
start_date = '2014-01-01'
end_date = '2018-12-31'

# Aggregate page views per minute into the ds/y schema Prophet expects
df = td.read_td(f"""
select
  TD_TIME_FORMAT(TD_DATE_TRUNC('minute', time), 'yyyy-MM-dd HH:mm:ss') as ds,
  count(1) as y
from
  www_access
where
  TD_TIME_RANGE(time, '{start_date}', '{end_date}', 'PDT')
group by
  1
order by
  1
""", engine)

m = Prophet(changepoint_prior_scale=0.01).fit(df)
future = m.make_future_dataframe(periods=5, freq='M')
fcst = m.predict(future)
```
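Before writing a forecast back to Treasure Data, you may want a quick sanity check of its accuracy. One simple approach is to join predictions to held-out actuals on the ds column and compute the mean absolute error. The frames below are hypothetical stand-ins for the output of m.predict() and for a holdout window of actual values:

```python
import pandas as pd

# Hypothetical forecast rows, as m.predict() would return them
fcst = pd.DataFrame({
    "ds": pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31"]),
    "yhat": [400.0, 420.0, 410.0],
})

# Hypothetical actual values held out from training
holdout = pd.DataFrame({
    "ds": pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31"]),
    "y": [395.0, 430.0, 405.0],
})

# Join on the datestamp and compute mean absolute error
eval_df = holdout.merge(fcst, on="ds")
mae = (eval_df["y"] - eval_df["yhat"]).abs().mean()
print(round(mae, 2))  # 6.67
```

Prophet also ships a diagnostics module with built-in cross-validation utilities, which may be preferable for more systematic evaluation.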