# PostgreSQL Export Integration

You can export job results from Treasure Data to your existing PostgreSQL instance.

For PostgreSQL data import, see [PostgreSQL Import Integration](http://docs.treasuredata.com/display/INT/PostgreSQL+Import+Integration).

## Prerequisites

- Basic knowledge of Treasure Data, including the [TD Toolbelt](https://toolbelt.treasuredata.com/).
- A **PostgreSQL instance**.

## Static IP Address of Treasure Data Integration

If your security policy requires an IP allowlist, you must add Treasure Data's IP addresses to it to ensure a successful connection.

The complete list of static IP addresses, organized by region, is available at:
[https://api-docs.treasuredata.com/en/overview/ip-addresses-integrations-result-workers/](https://api-docs.treasuredata.com/en/overview/ip-addresses-integrations-result-workers/)

## Use the TD Console to Create Your Connection

### Create a New Connection

1. Configure the following field values to create a new connection.

![](/assets/postgresql-export-integration-2024-07-26.55b61de5bfe82f4f7a3cac20923331fca9118744d7cff08100fc6a21535a4ade.272a20fd.png)

- **Host**: The host of the PostgreSQL database, such as an IP address.
- **User**: The username used to connect to the database.
- **Password**: The password used to connect to the database.
- **Use SSL**: Check this box to connect using SSL.
- **Require a valid SSL certificate?**: Require that a valid SSL certificate is presented on the connection.

### Configure Results Export to Your PostgreSQL Instance

Export from Treasure Data uses queries. You can create or reuse a query. In the query, you configure the data connection.

1. Complete the instructions in [Creating a Destination Integration](https://docs.treasuredata.com/display/PD/Creating+a+Destination+Integration).
2. Navigate to **Data Workbench > Queries**.
3. Select a query for which you would like to export data.
4. Run the query to validate the result set.
5. Select **Export Results**.
6. Select an existing integration authentication.
   ![](/assets/postgresql-export-integration-2024-08-13-1.976a84340b2212cab5deded4fdbd2313767ee9357fc1800030b207f9fc1a3afd.272a20fd.png)
7. Define any additional Export Results details. Review the integration parameters described in this topic. Your Export Results screen might look different from the following example, or you might not have additional details to fill out:
   ![](/assets/postgresql-export-integration-2024-08-13-2.cb87c5dd29a338e49ba6b46f2d4046dc95f4cd912a038417158d84f305510c64.272a20fd.png)
8. Select **Done**.
9. Run your query.
10. Validate that your data moved to the destination you specified.

### Set the Export Result Parameters

![](/assets/postgresql-export-integration-2024-07-26-1.169ea3b63d2f08e8e6288aff3e9c8723312703e71198ea92889ee6ea3510be19.272a20fd.png)

- **Database name**: The name of the database you are transferring data to. (Example: `your_database_name`)
- **Table**: The table to which you would like to export the data.
- **Output mode**: The method used to upload the data.
  - **Append** (default): The mode used when no mode option is provided in the URL. The query results are appended to the table. If the table does not exist, it is created. This mode is atomic.
  - **Replace**: Replaces the entire content of an existing table with the output of the query. If the table does not exist yet, a new table is created. The replace mode achieves **atomicity** (so that a consumer of the table always has consistent data) by performing the following three steps in a **single transaction**:
    1. Create a temporary table.
    2. Write to the temporary table.
    3. Replace the existing table with the temporary table using `ALTER TABLE RENAME`.
  - **Truncate**: The system first truncates the existing table, then inserts the query results. If the table does not exist yet, a new table is created.
    This mode is atomic.
  - **Update**: A row is inserted unless it would cause a duplicate value in the columns specified in the `unique` parameter, in which case an update is performed instead. The `unique` parameter is required when using the update mode. This mode is atomic.
- **Insert Method**: Controls how the data is written into the Postgres table. The default method is **copy**; it is also recommended for most situations.
  - **Copy** (default): Data is first stored in a temporary file on the server, then written to Postgres using a [COPY](http://www.postgresql.org/docs/8.1/static/sql-copy.md) transaction. This method is faster than INSERT, so it is useful when handling large amounts of data.
  - **Insert**: Data is written to Postgres using `INSERT` statements. This is the most reliable and compatible method, and it is recommended for most situations.
- **Schema**: The schema in which the target table is located. If not specified, the default schema is used. The default schema depends on the user's `search_path` setting, but it is usually `public`.
- **Foreign Data Wrapper**: Controls whether a foreign-data wrapper is used to store the data. The default is none and should work in most instances.
  - **None** (default): No foreign-data wrapper.
  - **Cstore**: Used when columnar storage is required or enabled on the destination table.

### (Optional) Schedule Query Export Jobs

You can use Scheduled Jobs with Result Export to periodically write the output result to a target destination that you specify.

Treasure Data's scheduler feature supports periodic query execution to achieve high availability.

When two fields of a cron schedule conflict, the field that requests more frequent execution is followed and the other is ignored.
For example, if the cron schedule is `'0 0 1 * 1'`, the 'day of month' and 'day of week' fields conflict: the former requires a run on the first day of each month at midnight (00:00), while the latter requires a run every Monday at midnight (00:00). The latter is followed.

#### Scheduling your Job Using TD Console

1. Navigate to **Data Workbench > Queries**.
2. Create a new query or select an existing query.
3. Next to **Schedule**, select **None**.
   ![](/assets/image2021-1-15_17-28-51.f1b242f6ecc7666a0097fdf37edd1682786ec11ef80eff68c66f091bc405c371.0f87d8d4.png)
4. In the drop-down, select one of the following schedule options:
   ![](/assets/image2021-1-15_17-29-47.45289a1c99256f125f4d887e501e204ed61f02223fde0927af5f425a89ace0c0.0f87d8d4.png)

| Drop-down Value | Description |
| --- | --- |
| Custom cron... | Review [Custom cron... details](#custom-cron-details). |
| @daily (midnight) | Run once a day at midnight (00:00) in the specified time zone. |
| @hourly (:00) | Run every hour at 00 minutes. |
| None | No schedule. |

#### Custom cron... Details

![](/assets/image2021-1-15_17-30-23.0f94a8aa5f75ea03e3fec0c25b0640cd59ee48d1804a83701e5f2372deae466c.0f87d8d4.png)

| **Cron Value** | **Description** |
| --- | --- |
| `0 * * * *` | Run once an hour. |
| `0 0 * * *` | Run once a day at midnight. |
| `0 0 1 * *` | Run once a month at midnight on the morning of the first day of the month. |
| "" | Create a job that has no scheduled run time. |

```
 *    *    *    *    *
 -    -    -    -    -
 |    |    |    |    |
 |    |    |    |    +----- day of week (0 - 6) (Sunday=0)
 |    |    |    +---------- month (1 - 12)
 |    |    +--------------- day of month (1 - 31)
 |    +-------------------- hour (0 - 23)
 +------------------------- min (0 - 59)
```

The following named entries can be used:

- Day of Week: sun, mon, tue, wed, thu, fri, sat.
- Month: jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec.

A single space is required between each field.
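
As a rough illustration of the field layout above, the following Python sketch splits a cron expression into its five fields and resolves the named day-of-week and month entries to their numbers. The function names `split_cron` and `resolve_name` and the dictionary keys are hypothetical, chosen only for this example. The sketch does not reproduce how Treasure Data parses schedules, and it resolves a name only when it is the entire field value, written in lowercase as listed above.

```python
# Illustrative sketch only; not Treasure Data's scheduler implementation.
FIELD_NAMES = ["min", "hour", "day_of_month", "month", "day_of_week"]
DAY_NAMES = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]  # Sunday=0
MONTH_NAMES = ["jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"]  # January=1


def resolve_name(value, names, first_number):
    """Return the numeric string for a named entry, or the value unchanged."""
    if value in names:
        return str(names.index(value) + first_number)
    return value


def split_cron(expression):
    """Split a cron expression into a dict keyed by field name."""
    parts = expression.split(" ")  # exactly one space between fields
    if len(parts) != 5 or "" in parts:
        raise ValueError("expected five fields separated by single spaces")
    fields = dict(zip(FIELD_NAMES, parts))
    fields["month"] = resolve_name(fields["month"], MONTH_NAMES, 1)
    fields["day_of_week"] = resolve_name(fields["day_of_week"], DAY_NAMES, 0)
    return fields


print(split_cron("0 0 1 * mon"))
```

Here `mon` resolves to `1` because the day-of-week field counts from Sunday=0, and an expression with a doubled space is rejected because it produces an empty field.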
The values for each field can be composed of:

| Field Value | Example | Example Description |
| --- | --- | --- |
| A single value, within the limits displayed above for each field. | | |
| A wildcard `'*'` to indicate no restriction based on the field. | `'0 0 1 * *'` | Configures the schedule to run at midnight (00:00) on the first day of each month. |
| A range `'2-5'`, indicating the range of accepted values for the field. | `'0 0 1-10 * *'` | Configures the schedule to run at midnight (00:00) on the first 10 days of each month. |
| A list of comma-separated values `'2,3,4,5'`, indicating the list of accepted values for the field. | `'0 0 1,11,21 * *'` | Configures the schedule to run at midnight (00:00) every 1st, 11th, and 21st day of each month. |
| A periodicity indicator `'*/5'` to express how often, based on the field's valid range of values, a schedule is allowed to run. | `'30 */2 1 * *'` | Configures the schedule to run on the 1st of every month, every 2 hours starting at 00:30. `'0 0 */5 * *'` configures the schedule to run at midnight (00:00) every 5 days starting on the 5th of each month. |
| A comma-separated list of any of the above except the `'*'` wildcard, such as `'2,*/5,8-10'`. | `'0 0 5,*/10,25 * *'` | Configures the schedule to run at midnight (00:00) every 5th, 10th, 20th, and 25th day of each month. |

5. (Optional) You can delay the start time of a query by enabling the Delay execution.

### Execute the Query

Save the query with a name and run, or just run the query. Upon successful completion of the query, the query result is automatically exported to the specified destination.

Scheduled jobs that continuously fail due to configuration errors may be disabled on the system side after several notifications.

## Activate a Segment in Audience Studio

You can also send segment data to the target platform by creating an activation in the Audience Studio.

1.
   Navigate to **Audience Studio**.
2. Select a parent segment.
3. Open the target segment, right-click, and then select **Create Activation**.
4. In the **Details** panel, enter an Activation name and configure the activation according to the previous section on Configuration Parameters.
5. Customize the activation output in the **Output Mapping** panel.

![](/assets/ouput.b2c7f1d909c4f98ed10f5300df858a4b19f71a3b0834df952f5fb24018a5ea78.8ebdf569.png)

- Attribute Columns
  - Select **Export All Columns** to export all columns without making any changes.
  - Select **+ Add Columns** to add specific columns for the export. The Output Column Name pre-populates with the same Source column name. You can update the Output Column Name. Continue to select **+ Add Columns** to add new columns for your activation output.
- String Builder
  - Select **+ Add string** to create strings for export. Select from the following values:
    - String: Choose any value; use text to create a custom value.
    - Timestamp: The date and time of the export.
    - Segment Id: The segment ID number.
    - Segment Name: The segment name.
    - Audience Id: The parent segment number.

6. Set a **Schedule**.

![](/assets/snippet-output-connector-on-audience-studio-2024-08-28.a99525173709da1eb537f839019fa7876ffae95045154c8f2941b030022f792c.8ebdf569.png)

- Select the values to define your schedule and optionally include email notifications.

7. Select **Create**.

If you need to create an activation for a batch journey, review [Creating a Batch Journey Activation](/products/customer-data-platform/journey-orchestration/batch/creating-a-batch-journey-activation).

## (Optional) Export Integration Using the CLI

If the TD Console is not available or does not meet your needs, you can use the CLI to issue queries and output results. The following instructions show you how to format the query output results using the CLI.
### td query Command Usage

To output the result of a single query to a Postgres server, add the `--result` option to the `td query` command. After the job is finished, the results are written into your database:

```bash
td query -w -d testdb \
  --result 'postgresql://user:password@host/database/table' \
  "SELECT code, COUNT(1) FROM www_access GROUP BY code"
```

To create a scheduled query whose output is systematically written to Postgres, add the `--result` option when creating the schedule through the `td sched:create` command:

```bash
td sched:create hourly_count_example "0 * * * *" \
  -d testdb \
  --result 'postgresql://user:password@host/database/table' \
  "SELECT COUNT(*) FROM www_access"
```

### Result Output URL Format

The result output target is represented by a URL with the following format:

`postgresql://username:password@hostname:port/database/table`

where:

- **postgresql** identifies the result output as going to Postgres;
- **username** and **password** are the credentials to the Postgres server;
- **hostname** is the hostname of the Postgres server;
- **port** is the port number through which the Postgres server is accessible. The `:port` part is optional; the port is assumed to be 5432 by default;
- **database** is the name of the destination database;
- **table** is the name of a table within the above-mentioned database. It may not exist at the moment the query output is executed, in which case a table with the specified name is created for the user.

### Options

Result output to Postgres supports various options that can be specified as optional URL parameters. The options are compatible with each other and can be combined. Where applicable, the default behavior is indicated.

### SSL Option

The **ssl** option determines whether to use SSL when connecting to the Postgres server.

Use SSL for the connection from Treasure Data to the Postgres server. The Postgres server must be [configured to accept an SSL connection](http://www.postgresql.org/docs/current/static/ssl-tcp.md).
`postgresql://user:password@host/database/table?ssl=true`

Do not use SSL for the connection from Treasure Data to the Postgres server.

`postgresql://user:password@host/database/table?ssl=false`

### Schema Option

Controls the schema in which the target table is located. If not specified, the default schema is used. The default schema depends on the user's `search_path` setting, but it is usually `public`.

`postgresql://user:password@host/database/table?schema=target_schema`

### Update Mode Option

Controls the various ways of modifying the database data. All 4 supported modes are **atomic** because they use a temporary table to store the incoming data before attempting to modify the destination table:

- Append
- Replace
- Truncate
- Update

#### mode=append (default)

The **append** mode is the default, used when no mode option is provided in the URL. In this mode, the query results are appended to the table. If the table does not exist, a table is created.

Because `mode=append` is the default behavior, these two URLs are equivalent:

- `postgresql://user:password@host/database/table`
- `postgresql://user:password@host/database/table?mode=append`

#### mode=replace

The **replace** mode replaces the entire content of an existing table with the result output of the query. If the table does not exist yet, a new table is created.

The replace mode achieves **atomicity** (so that a consumer of the table always has consistent data) by performing the following three steps in a **single transaction**:

1. Create a temporary table.
2. Write to the temporary table.
3. Replace the existing table with the temporary table using `ALTER TABLE RENAME`.

Example:

- `postgresql://user:password@host/database/table?mode=replace`

#### mode=truncate

With the **truncate** mode, the system first truncates the existing table, then inserts the query results. If the table does not exist yet, a new table is created.
Example:

`postgresql://user:password@host/database/table?mode=truncate`

Unlike replace, the truncate mode retains the indexes of the table.

#### mode=update

In the **update** mode, a row is inserted unless the inserted row causes a duplicate value in the columns specified in the `unique` parameter. In such cases, an update to the row is performed instead of an insert. The `unique` parameter is required when using the update mode.

Example:

1. `postgresql://...?mode=update&unique=col1` (single unique column)
2. `postgresql://...?mode=update&unique=[col1,col2]` (multiple unique columns)

### Write Method Option

The **method** option controls how the data is written into the Postgres table. You can use:

- `method=insert`
- `method=copy`

The default method is **insert**, which is the recommended method for most situations.

#### method=insert (default)

With the **insert** method, data is written to Postgres using `INSERT` statements. It is the most reliable and compatible method.

Because `method=insert` is the default behavior, these two URLs are equivalent:

1. `postgresql://user:password@host/database/table`
2. `postgresql://user:password@host/database/table?method=insert`

#### method=copy

When the **copy** method is used, the data is first stored in a temporary file on the server, then written to Postgres using a [COPY](http://www.postgresql.org/docs/8.1/static/sql-copy.md) transaction. This method is faster than INSERT and is therefore useful when handling a large amount of data.

Example:

`postgresql://user:password@host/database/table?method=copy`

## (Optional) Configure Export Results in Workflow

Within Treasure Workflow, you can specify the use of this data connector to output data.
```yaml
timezone: UTC

_export:
  td:
    database: sample_datasets

+td-result-output-postgresql:
  td>: queries/sample.sql
  result_connection: your_connections_name
  result_settings:
    database: database_name
    table: table_name
    mode: append
    set_role: new_role
```

Read about [using data connectors in a workflow to export data](http://docs.treasuredata.com/display/PD/About+Using+Workflows+to+Export+Data+with+TD+Toolbelt).

See an [example workflow](https://github.com/treasure-data/workflow-examples/tree/master/td/postgresql).
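
When the result output URL for the CLI is generated by a script rather than typed by hand, a small helper can keep the URL parameters consistent with the rules in this topic. The following Python sketch is one possible approach, not part of the TD Toolbelt: the function name `build_result_url` and its arguments are hypothetical, and percent-encoding reserved characters in the credentials is an assumption based on general URL conventions rather than documented Treasure Data behavior.

```python
from urllib.parse import quote

VALID_MODES = ("append", "replace", "truncate", "update")
VALID_METHODS = ("insert", "copy")


def build_result_url(user, password, host, database, table, port=None,
                     ssl=None, schema=None, mode=None, unique=None,
                     method=None):
    """Assemble a postgresql:// result output URL from its parts."""
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if method is not None and method not in VALID_METHODS:
        raise ValueError(f"unknown method: {method}")
    if mode == "update" and not unique:
        raise ValueError("mode=update requires the unique parameter")

    # Assumption: reserved characters in credentials are percent-encoded.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    authority = f"{credentials}@{host}"
    if port is not None:
        authority += f":{port}"  # the port may be omitted (default 5432)
    url = f"postgresql://{authority}/{database}/{table}"

    # Options are appended as URL parameters and can be combined.
    params = []
    if ssl is not None:
        params.append(f"ssl={'true' if ssl else 'false'}")
    if schema is not None:
        params.append(f"schema={schema}")
    if mode is not None:
        params.append(f"mode={mode}")
    if unique:
        columns = [unique] if isinstance(unique, str) else list(unique)
        if len(columns) == 1:
            params.append(f"unique={columns[0]}")
        else:
            params.append(f"unique=[{','.join(columns)}]")
    if method is not None:
        params.append(f"method={method}")
    return url + ("?" + "&".join(params) if params else "")


print(build_result_url("user", "password", "host", "database", "table",
                       mode="update", unique=["col1", "col2"]))
```

The returned string can be passed to the `--result` option of `td query` or `td sched:create`. Rejecting `mode=update` without a `unique` value mirrors the requirement stated for the update mode.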