...

Set your master API key as an environment variable before launching Jupyter. You can retrieve the master API key from your profile in TD Console.

$ export TD_API_KEY="1234/abcde..."

...

For more information and instructions, see Installing Conda, Pandas, matplotlib, Jupyter Notebook, and pytd.

Run Jupyter and Create Your First Notebook

...

  1. Start Jupyter Notebook with the following command:

    (analysis)$ jupyter notebook


  2. Your web browser opens the Jupyter Notebook dashboard.

  3. Select New > Python 3.

  4. Copy and paste the following text into your notebook:

    %matplotlib inline
    
    import os
    import pandas as pd
    import pytd.pandas_td as td

    # Initialize the connection to Treasure Data
    con = td.connect(apikey=os.environ['TD_API_KEY'], endpoint='https://api.treasuredata.com')


  5. Your notebook should now show the code above in its first cell.

  6. Press Shift+Enter to run the cell.
    If you get a "KeyError: 'TD_API_KEY'" error, try "apikey='<your master apikey>'" instead of "apikey=os.environ['TD_API_KEY']".
    If that works, Jupyter did not pick up the TD_API_KEY variable from the OS environment.
    Check that TD_API_KEY is set correctly, and then relaunch Jupyter.

  7. Optionally, save your notebook.
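
The KeyError described in step 6 can be made easier to diagnose by reading the variable through a small helper. The following is a minimal sketch, not part of pytd: the function name get_td_api_key is hypothetical, and it only assumes the TD_API_KEY variable name used above.

```python
import os


def get_td_api_key(env_var: str = "TD_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error.

    Hypothetical helper for illustration; it is not provided by pytd.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # A missing or empty variable usually means Jupyter was launched from
        # a shell in which the variable was never exported.
        raise RuntimeError(
            f"{env_var} is not set in the environment Jupyter was launched from. "
            "Export it in your shell and relaunch Jupyter."
        )
    return key
```

With this helper in a cell, the connection line could read con = td.connect(apikey=get_td_api_key(), endpoint='https://api.treasuredata.com'), which fails with an explanatory message instead of a bare KeyError.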

...

  • You can sample data. For example, the “nasdaq” table has 8,807,278 rows. Setting a limit of 100000 retrieves 100,000 rows, which is a reasonable size to work with:


  • Write SQL to limit the data on the server side. For example, because we are interested only in data related to “AAPL”, count the number of matching records using read_td_query:

    The result is small enough that we can retrieve all the rows and start analyzing the data:
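
The two approaches above might look like the following sketch. It assumes that the table lives in a database named sample_datasets, that it has a symbol column, and that the Presto engine is used; none of these are confirmed by the text above. The helper names count_query, select_query, and load_examples are hypothetical. pytd is imported inside load_examples so that the query builders can be tried without a connection.

```python
import os


def _check_names(table: str, symbol: str) -> None:
    """Reject names that are not plain identifiers, to keep the SQL safe."""
    if not table.isidentifier() or not symbol.isalnum():
        raise ValueError("unexpected table or symbol name")


def count_query(table: str, symbol: str) -> str:
    """Build a query that counts the rows for one ticker symbol."""
    _check_names(table, symbol)
    return f"SELECT COUNT(1) AS cnt FROM {table} WHERE symbol = '{symbol}'"


def select_query(table: str, symbol: str) -> str:
    """Build a query that retrieves every row for one ticker symbol."""
    _check_names(table, symbol)
    return f"SELECT * FROM {table} WHERE symbol = '{symbol}'"


def load_examples():
    """Run both approaches against Treasure Data (needs TD_API_KEY and network)."""
    import pytd.pandas_td as td  # imported lazily; only needed for real queries

    con = td.connect(apikey=os.environ["TD_API_KEY"],
                     endpoint="https://api.treasuredata.com")
    # Assumption: the table is in the "sample_datasets" database.
    engine = td.create_engine("presto:sample_datasets", con=con)

    # Approach 1: sample the table by capping the number of rows retrieved.
    sample = td.read_td_table("nasdaq", engine, limit=100000)

    # Approach 2: filter on the server, count first, then fetch the rows.
    count = td.read_td_query(count_query("nasdaq", "AAPL"), engine)
    aapl = td.read_td_query(select_query("nasdaq", "AAPL"), engine)
    return sample, count, aapl
```

Calling sample, count, aapl = load_examples() in a notebook cell would return three pandas DataFrames. Checking the count before fetching the rows is what keeps the second approach from pulling more data than intended.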


...