# Legacy Bulk Import for AWS S3

This article explains how to import data directly from Amazon S3 to Treasure Data.

# Install Bulk Import

First, install the [Toolbelt](https://support.treasuredata.com/hc/en-us/articles/command-line), which includes the bulk loader program, on your computer.

## Downloads

- [Toolbelt Installer for Windows](https://toolbelt.treasuredata.com/win)
- [Toolbelt Installer for Mac OS X](https://toolbelt.treasuredata.com/mac)
- [Toolbelt Installer for Linux](https://toolbelt.treasuredata.com/linux)

After the installation, the `td` command will be available on your computer. Open a terminal and type `td` to execute the command. Also make sure you have `java` installed. Then run `td import:jar_update` to download the up-to-date version of our bulk loader:

```
$ td
usage: td [options] COMMAND [args]

$ java
Usage: java [-options] class [args...]

$ td import:jar_update
Installed td-import.jar 0.x.xx into /path/to/.td/java
```

# Authenticate

Log in to your Treasure Data account:

```
$ td account -f
Enter your Treasure Data credentials.
Email: xxxxx
Password (typing will be hidden):
Authenticated successfully.
Use 'td db:create db_name' to create a database.
```

# Importing data from Amazon S3

The bulk loader can read data from files stored in Amazon S3 in all three supported file formats:

- CSV (default)
- JSON
- TSV

Suppose you have a file called `data.csv` on Amazon S3 with these contents:

```
"host","log_name","date_time","method","url","res_code","bytes","referer","user_agent"
"64.242.88.10","-","2004-03-07 16:05:49","GET","/twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables",401,12846,"",""
"64.242.88.10","-","2004-03-07 16:06:51","GET","/twiki/bin/rdiff/TWiki/NewUserTemplate?rev1=1.3&rev2=1.2",200,4523,"",""
"64.242.88.10","-","2004-03-07 16:10:02","GET","/mailman/listinfo/hsdivision",200,6291,"",""
"64.242.88.10","-","2004-03-07 16:11:58","GET","/twiki/bin/view/TWiki/WikiSyntax",200,7352,"",""
```

Execute the following commands to upload the CSV file:

```
$ td db:create my_db
$ td table:create my_db my_tbl
$ td import:auto \
  --format csv --column-header \
  --time-column date_time \
  --time-format "%Y-%m-%d %H:%M:%S" \
  --auto-create my_db.my_tbl \
  "s3://s3_access_key:s3_secret_key@/my_bucket/path/to/data.csv"
```

The location of the file is expressed as an S3 path with the AWS access key ID and secret access key embedded in it.

Because `td import:auto` executes MapReduce jobs to check for invalid rows, the command takes at least **1-2 minutes** to complete.

If the column chosen for `--time-column` contains epoch (unix) timestamps, you don't need the `--time-format` flag.

In the above command, we assumed that:

- The CSV file is located on Amazon S3, within a bucket called `my_bucket`, under the path/key `/path/to/`.
- The first line of the file contains the column names, hence the `--column-header` option. If the file does not have the column names in the first row, you must specify them with the `--columns` option and, optionally, a type for each column with the `--column-types` option (see the sketch after this list).
- The time field is called `date_time` and is specified with the `--time-column` option.
- The time format is `%Y-%m-%d %H:%M:%S` and is specified with the `--time-format` option.
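For example, here is a minimal sketch of the same import for a headerless variant of the file. The file name `data_noheader.csv`, the `time` column name, and the type list are illustrative assumptions; since the time column is assumed to already hold unix (epoch) timestamps, `--time-format` is dropped:

```
# data_noheader.csv is a hypothetical file with no header row;
# its "time" column is assumed to hold unix (epoch) timestamps,
# so --time-format is not needed.
$ td import:auto \
  --format csv \
  --columns host,log_name,time,method,url,res_code,bytes,referer,user_agent \
  --column-types string,string,long,string,string,long,long,string,string \
  --time-column time \
  --auto-create my_db.my_tbl \
  "s3://s3_access_key:s3_secret_key@/my_bucket/path/to/data_noheader.csv"
```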
### Wildcards

The source files to be imported by the bulk loader can be specified as full Amazon S3 paths or by using wildcards. Here are some examples:

- `s3://my_bucket/path/to/data*`: all files under `my_bucket/path/to/` whose names start with `data`
- `s3://my_bucket/path/to/data*.csv`: all files under `my_bucket/path/to/` whose names start with `data` and have the extension `.csv`
- `s3://my_bucket/path/to/*.csv`: all files under `my_bucket/path/to/` with the extension `.csv`
- `s3://my_bucket/path/to/*`: all files under `my_bucket/path/to/`
- `s3://my_bucket/path/to/*/*.csv`: all `.csv` files in the direct subfolders of `my_bucket/path/to/`
- `s3://my_bucket/**/*.csv`: all `.csv` files in all subfolders of `my_bucket`

A short sketch using one of these patterns appears after the links below. For further details, check the following pages:

- [Bulk Import Internals](/int/legacy-bulk-import-internals)
- [Bulk Import Tips and Tricks](/int/legacy-bulk-import-tips-and-tricks)
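As a quick illustration of the wildcard syntax, here is a hedged sketch reusing the placeholder credentials and bucket from earlier; the single file path in the `td import:auto` command is simply replaced with a pattern matching every `.csv` file under the prefix:

```
# Same placeholder credentials and bucket as the earlier example;
# the wildcard matches every .csv file under my_bucket/path/to/.
$ td import:auto \
  --format csv --column-header \
  --time-column date_time \
  --time-format "%Y-%m-%d %H:%M:%S" \
  --auto-create my_db.my_tbl \
  "s3://s3_access_key:s3_secret_key@/my_bucket/path/to/*.csv"
```

Quoting the S3 path keeps the shell from expanding the `*` locally, so the pattern is passed through to the bulk loader unchanged.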