# DELETE Statement Syntax A DELETE statement is used to delete rows from a table. DELETE statements are helpful for deleting unnecessary rows like bad data, duplicated data, outdated data, etc. DELETE statement is supported only by Trino query engine. ## Syntax ```sql DELETE FROM WHERE ``` Where: - is the name of the table for deletion - is the clause to specify a search condition `DELETE` statement without `WHERE` clause isn't allowed to reduce the occurrence of unintentional data deletion. ### Examples Delete all line items shipped by air: ```sql DELETE FROM lineitem WHERE shipmode = 'AIR' ``` Delete all event logs from a July 1st, 2017 UTC: ```sql DELETE FROM lineitem WHERE TD_TIME_RANGE(time, '2017-07-01','2017-07-02') ``` ## Limitations ### No DELETE Without a WHERE Clause To reduce the occurrence of unintentional data deletion, DELETE statements require you to include a WHERE clause. Error sample: ``` Does not support DELETE statement without WHERE condition ``` ### DELETE against Data Tank DELETE statements cannot be issued against data stored in Data Tank. Connect directly to the PostgreSQL Data Tank to issue DELETE statements on data stored there. ### Limit number of partitions to delete at once A single DELETE query can delete up to 500k partitions. A DELETE query fails when it exceeds the limit. #### Reduce affected rows of DELETE statement When a DELETE query fails due to the limit exceeded error or timeout error, you should rewrite the DELETE statement to reduce affected rows. The most effective way to do this is to include a WHERE clause that limits the time range for the DELETE with `TD_TIME_RANGE` or `TD_INTERVAL`. For example, if you have a year's worth of game history data, instead of ``` DELETE FROM game_history WHERE player_id=1000; ``` try ``` DELETE FROM game_history WHERE player_id=1000 AND TD_TIME_RANGE(time, '2018-01-01', '2018-02-01','PDT') ``` and then delete more time ranges until all rows are deleted. ### DELETE statement Timeout Error A DELETE query can take several hours when affected rows of the DELETE statement are very large. The query may time out and the job fail in this case. If the job fails by a timeout error, you should rewrite the DELETE statement to affect fewer rows. Refer to [Reduce affected rows of DELETE statement](#reduce-affected-rows-of-delete-statement). ### Concurrent DELETE statements on the same table may conflict When multiple DELETE statements are executed concurrently on the same table, the execution time may be longer or the query may fail. It is because of the conflict of underlying partition file deletion. The general guideline is to avoid concurrent DELETE statement execution since the possibility of conflict is unpredictable. The possibility depends on various factors, like the condition of the DELETE statement, the data volume and the data distribution of the table. When a DELETE statement fails due to conflict, please retry the query. ### DELETE Statement Resource Consumption DELETE queries issue jobs and consume resources. Because of the storage and indexing methods used in Treasure Data, a DELETE query can be resource intensive. To prevent a DELETE query from performing a full table scan, you should use: - a time expression, like TD_TIME_RANGE or TD_INTERVAL, if the table uses default time-based partitioning - an equality predicate on all partition keys, if the table uses user-defined partitioning For example, if your table is partitioned by time, and you are trying to delete a set of users that were created in June 2018, include the time restriction even if that's not a critical parameter for your DELETE statement. For example, you should: ``` DELETE FROM logtable WHERE userid IN (1234, 1235, 1236) AND TD_TIME_RANGE(time, '2017-06-01','2017-07-01') ``` Including the time restriction significantly improves the speed of your query, and reduces the resources this query consumes. ## Undo Delete Statements If you want to rollback an executed delete statement and recover deleted data, you must contact Treasure Data technical support and provide the jobID of the job that contained the executed DELETE statement. In many cases, you can undo a delete statement. However, certain situations prevent the rollback of DELETE statements. Situations that prevent rollback include: - When new partitions have been deleted by another DELETE statement. - When original partitions have been discarded, which can happen after a retention period. - When a table is dropped and a new table with the same name is created. - When new partitions have been further modified or replaced, which can happen as a result of internal storage maintenance processes.