Amazon Kinesis is an AWS platform for streaming data. It lets you load and analyze streaming data, and build custom streaming data applications.

This article explains how to ingest data from an Amazon Kinesis stream into Treasure Data by using AWS Lambda.


Prerequisites

  • Basic knowledge of Treasure Data.

  • Basic knowledge of Amazon Kinesis.

Retrieve Master API key

You can get the master API key from TD Console.

Set up AWS Lambda function

AWS Lambda is a part of the data ingestion pipeline. By using AWS Lambda, you can execute code in response to triggers from Amazon Kinesis.

Select Blueprint

  1. Select kinesis-process-record-python.


Configure Event Sources

  1. Select Kinesis as the Event source type.

  2. Specify the name of your stream as the Kinesis Stream.


Configure Function

  1. Specify a Name and a Description.

  2. Select Python 3.x as the Runtime.


Configure Lambda_handler Python Script

Streaming events in Kinesis are processed by the Python script lambda_handler.py. Treasure Data provides an example script, available as one of the solutions in Treasure Boxes, that imports Kinesis Firehose data stream events.

Copy and paste the Python script from the Treasure Boxes link. The README file gives more details about the steps to run it.
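To make the shape of such a handler concrete, the following is a minimal sketch and not the Treasure Boxes script itself. It assumes the event layout used by the test event later in this article (a "records" list whose "data" fields are base64-encoded JSON). The helper decode_records and the upload parameter are hypothetical names introduced only for illustration; upload stands in for whatever code sends the rows to Treasure Data.

```python
import base64
import json


def decode_records(event):
    """Return the JSON payload of every record in the event (hypothetical helper)."""
    rows = []
    for record in event.get("records", []):
        # Each record carries its payload as base64-encoded JSON text.
        payload = base64.b64decode(record["data"]).decode("utf-8")
        rows.append(json.loads(payload))
    return rows


def lambda_handler(event, context, upload=None):
    """Decode incoming records and pass them to an uploader.

    `upload` is a hypothetical callable standing in for the code that
    sends rows to Treasure Data; it is an assumption of this sketch.
    """
    rows = decode_records(event)
    if upload is not None:
        upload(rows)
    # Acknowledge each record, echoing its id and original data.
    return {
        "records": [
            {"recordId": r["recordId"], "result": "Ok", "data": r["data"]}
            for r in event.get("records", [])
        ]
    }
```

Keeping the decoding separate from the upload step makes it possible to exercise the handler locally with a sample event before connecting it to a real stream.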

Review the Function

Review your configuration.

Select Create function.

Test the Function

After configuration, you can test the function with the following sample event, as described in Treasure Boxes. Use these records for one-time testing from the Lambda UI (Actions > Configure test event).

{
  "invocationId": "invocationIdExample",
  "deliveryStreamArn": "arn:aws:kinesis:EXAMPLE",
  "region": "us-west-2",
  "records": [
    {
      "recordId": "49546986683135544286507457936321625675700192471156785154",
      "approximateArrivalTimestamp": 1495072949453,
      "data": "eyJmb28iOiAiYmFyIn0="
    }
  ]
} 

The data field of the record is the base64 encoding of {"foo": "bar"}.
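The encoding can be checked locally with the Python standard library. The sketch below decodes the sample data value and shows one way to build the data value for your own test records; encode_payload is a hypothetical helper name, not part of any Treasure Data or AWS tooling.

```python
import base64
import json

# The "data" value from the sample test event.
encoded = "eyJmb28iOiAiYmFyIn0="

# Decode base64 to bytes, then parse the bytes as JSON.
decoded = json.loads(base64.b64decode(encoded))
print(decoded)  # {'foo': 'bar'}


def encode_payload(payload):
    """Serialize a dict to JSON and base64-encode it for a test record."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")
```

For example, encode_payload({"foo": "bar"}) reproduces the data string used in the sample event, so the same helper can generate additional test records.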

Confirm Data Upload

To confirm that the data imported successfully:

  1. Navigate to TD Console > Databases.

Data ingestion takes 1-3 minutes to complete.



