Sentiment Analysis With TensorFlow

Treasure Data Workflow lets you run Python custom scripts that perform sentiment analysis with TensorFlow and export the trained model to Amazon S3. Machine learning algorithms can run as part of your scheduled workflows through Python custom scripts. This article describes how to run a sentiment analysis algorithm within a Treasure Data Workflow.

In this example, sentiment analysis classifies movie review texts as positive or negative using TensorFlow and TensorFlow Hub. See the official documentation for details.

Sentiment Analysis using Python Custom Scripts

This article discusses two versions of the example workflow:

Example Workflow using TensorFlow with Amazon S3
Example Workflow using TensorFlow without Amazon S3

Example Workflow using TensorFlow with Amazon S3

The workflow:

Fetches review data from Treasure Data
Builds a model with TensorFlow
Stores the model on S3
Predicts polarities for unknown review data and writes the predictions back to Treasure Data

Prerequisites

Make sure the custom scripts feature is enabled for your TD account.
Download and install the TD Toolbelt and the TD Toolbelt Workflow module.
Basic knowledge of Treasure Data Workflow syntax
An AWS S3 bucket
S3 secrets (AWS access key ID and secret access key)

Run the Example Workflow

Download the sentiment-analysis project from this repository.
In the Terminal window, change directory to sentiment-analysis.
Run data.sh to ingest training and test data on Treasure Data. About 80 million records are fetched to build the model. The script also creates a database named sentiment and tables named movie_review_train and movie_review_test to store the data. For example:

$ ./data.sh

Assume that the input table is:

rowid	sentence	sentiment	polarity
1-10531	"Bela Lugosi revels in his role as European horticulturist (sic) Dr. Lorenz in this outlandish...	2	0
1-10960	Fragmentaric movie about a couple of people in Austria during a heatwave. This kind of...	3	0
1-24370	I viewed the movie together with my arrogant, film critic friend, my wife and her female friend. So...	7	1
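
To make the column roles concrete, the sketch below parses rows shaped like the sample table and separates the text from the label. It assumes that `sentence` is the model input and `polarity` is the training label, which the column names and the `predicted_polarity` output suggest but which the example project may handle differently. `load_rows` and `split_features_labels` are hypothetical helper names, and the sentences are shortened.

```python
import csv
import io

# Rows shaped like the sample table above; sentences are shortened here.
SAMPLE_TSV = (
    "rowid\tsentence\tsentiment\tpolarity\n"
    "1-10531\tBela Lugosi revels in his role as European horticulturist...\t2\t0\n"
    "1-10960\tFragmentaric movie about a couple of people in Austria...\t3\t0\n"
    "1-24370\tI viewed the movie together with my arrogant, film critic friend...\t7\t1\n"
)


def load_rows(tsv_text):
    """Parse tab-separated text into dicts with integer sentiment and polarity."""
    # QUOTE_NONE keeps quotation marks inside review text as literal characters.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for record in reader:
        rows.append({
            "rowid": record["rowid"],
            "sentence": record["sentence"],
            "sentiment": int(record["sentiment"]),
            "polarity": int(record["polarity"]),
        })
    return rows


def split_features_labels(rows):
    """Assumption: `sentence` is the input feature and `polarity` is the label."""
    sentences = [row["sentence"] for row in rows]
    labels = [row["polarity"] for row in rows]
    return sentences, labels


rows = load_rows(SAMPLE_TSV)
sentences, labels = split_features_labels(rows)
print(labels)  # prints [0, 0, 1]
```

The `sentiment` column (the rating) is kept in each row but is not used as a label in this sketch.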

Run the example workflow as follows:

td workflow push sentiment

Set the secrets from STDIN, for example apikey=x/xxxxx, endpoint=https://api.treasuredata.com, s3_bucket=my_bucket, aws_access_key_id=AAAAAAAAAA, aws_secret_access_key=XXXXXXXXX:

td workflow secrets \
  --project sentiment \
  --set apikey \
  --set endpoint \
  --set s3_bucket \
  --set aws_access_key_id \
  --set aws_secret_access_key

Start the analysis:

td workflow start sentiment sentiment-analysis --session now

Results of the script are stored in the test_predicted_polarities table in Treasure Data.

To view the table:

Log into TD Console.
Search for the sentiment database.
Locate the test_predicted_polarities table.
The prediction results are stored in this table as shown below:

rowid	predicted_polarity
1-21643	0
1-22967	1
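
One way to check the output is to compare `predicted_polarity` with the known `polarity` for the same `rowid`. The sketch below assumes that known polarities are available for the test rows, as they are in the input table shown earlier. The rowids and values are made up for illustration and are not results from the example workflow, and `accuracy` is a hypothetical helper.

```python
def accuracy(actual, predicted):
    """Fraction of rowids present in both mappings whose polarities match."""
    shared = [rowid for rowid in predicted if rowid in actual]
    if not shared:
        raise ValueError("no rowids in common")
    correct = sum(1 for rowid in shared if actual[rowid] == predicted[rowid])
    return correct / len(shared)


# Hypothetical values keyed by rowid; not real output of the workflow.
actual = {"a-1": 0, "a-2": 1, "a-3": 1, "a-4": 0}
predicted = {"a-1": 0, "a-2": 1, "a-3": 0, "a-4": 0}

print(accuracy(actual, predicted))  # prints 0.75
```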

Example Workflow using TensorFlow without Amazon S3

The workflow:

Fetches review data from Treasure Data
Builds a model with TensorFlow
Predicts polarities for unknown review data and writes the data back to Treasure Data

Prerequisites

Make sure the custom scripts feature is enabled for your TD account.
Download and install the TD Toolbelt and the TD Toolbelt Workflow module.
Basic knowledge of Treasure Data Workflow syntax

Run the Example Workflow

Download the sentiment-analysis project from this repository.
In the Terminal window, change directory to sentiment-analysis. For example:

cd sentiment-analysis

Run data.sh to ingest training and test data on Treasure Data. About 80 million records are fetched to build the model. The script also creates a database named sentiment and tables named movie_review_train and movie_review_test to store the data. For example:

$ ./data.sh

Assume that the input table is as follows:

rowid	sentence	sentiment	polarity
1-10531	"Bela Lugosi revels in his role as European horticulturist (sic) Dr. Lorenz in this outlandish...	2	0
1-10960	Fragmentaric movie about a couple of people in Austria during a heatwave. This kind of...	3	0
1-24370	I viewed the movie together with my arrogant, film critic friend, my wife and her female friend. So...	7	1

Run the example workflow as follows:

td workflow push sentiment

Set the secrets from STDIN, for example apikey=x/xxxxx, endpoint=https://api.treasuredata.com:

td workflow secrets \
  --project sentiment \
  --set apikey \
  --set endpoint
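
The two workflows differ only in which secrets they need: the S3 version adds s3_bucket, aws_access_key_id, and aws_secret_access_key. As a minimal sketch, a script could verify that the required names are present before doing any work. The secret names come from the commands in this article; `missing_secrets` is a hypothetical helper, and how the values actually reach the Python script is defined by the workflow file and is not shown here.

```python
# Secret names taken from the `td workflow secrets` commands in this article.
BASE_SECRETS = ("apikey", "endpoint")
S3_SECRETS = ("s3_bucket", "aws_access_key_id", "aws_secret_access_key")


def missing_secrets(provided, use_s3):
    """Return the required secret names that are absent or empty in `provided`."""
    required = BASE_SECRETS + (S3_SECRETS if use_s3 else ())
    return [name for name in required if not provided.get(name)]


# Illustrative placeholder values only; `provided` can be any mapping.
provided = {"apikey": "x/xxxxx", "endpoint": "https://api.treasuredata.com"}

print(missing_secrets(provided, use_s3=False))  # prints []
print(missing_secrets(provided, use_s3=True))
# prints ['s3_bucket', 'aws_access_key_id', 'aws_secret_access_key']
```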

Start the analysis:

td workflow start sentiment sentiment-analysis-simple --session now

Results of the script are stored in the test_predicted_polarities table in Treasure Data.

To view the table:

Log into TD Console.
Search for the sentiment database.
Locate the test_predicted_polarities table.

The prediction results should be similar to the following:

rowid	predicted_polarity
1-21643	0
1-22967	1

Review the Workflow Custom Python Script

Review the contents of the sentiment-analysis directory:

sentiment-analysis.dig - The TD Workflow YAML file for sentiment analysis with TensorFlow.
sentiment.py - The custom Python script that uses TensorFlow. It builds a prediction model from existing data and predicts polarity for unknown data.

This example uses a pre-trained TensorFlow Hub model to embed English text:

import tensorflow_hub as hub

# Embed the "sentence" column with a pre-trained English NNLM module.
embedded_text_feature_column = hub.text_embedding_column(
  key="sentence",
  module_spec="https://tfhub.dev/google/nnlm-en-dim128/1"
)

To switch to another model, for example a Japanese one, modify the code as follows:

embedded_text_feature_column = hub.text_embedding_column(
  key="sentence",
  module_spec="https://tfhub.dev/google/nnlm-ja-dim128/1"
)

Before word embedding, you need to prepare tokenized sentences for Japanese.
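
As a rough illustration of what "tokenized" means here, the sketch below joins already-segmented Japanese tokens with single spaces. It assumes that the embedding module splits its input on whitespace and that the tokens come from a morphological analyzer of your choice. The token list is written by hand, and `to_tokenized_sentence` is a hypothetical helper, not part of sentiment.py.

```python
def to_tokenized_sentence(tokens):
    """Join pre-segmented tokens with single spaces, dropping empty tokens."""
    cleaned = [token.strip() for token in tokens]
    return " ".join(token for token in cleaned if token)


# Hand-segmented tokens for "この映画は面白い" ("This movie is interesting").
tokens = ["この", "映画", "は", "面白い"]
print(to_tokenized_sentence(tokens))  # prints この 映画 は 面白い
```

The resulting string would then go in the sentence column in place of the raw, unsegmented text.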

Because this custom script also saves the TensorFlow model trained on movie reviews to Amazon S3, you can build your own prediction server with TensorFlow Serving.

To change the serving_input_receiver_fn, modify the following code:

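# Derive the parsing spec for serialized tf.Example inputs from the feature column.
feature_spec = tf.feature_column.make_parse_example_spec([embedded_text_feature_column])
# Build a receiver function that parses tf.Example protos at serving time.
serving_input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
# Export the trained estimator as a SavedModel under EXPORT_DIR_BASE.
estimator.export_saved_model(EXPORT_DIR_BASE, serving_input_receiver_fn)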

See TensorFlow documentation for details.
