Minimize disruption to workloads currently running successfully in Hive 0.13.
Maximize the benefits from Hive 2 in stages appropriate to the business.
Treasure Data recommends you run queries with Hive 2020.1 for testing purposes:
Modify Existing Queries: Update your existing queries to remove Hive 2 incompatible syntax. We recommend rewriting your queries for stricter Hive 2020.1-compatible syntax and then run the modified queries in Hive 0.13. For more information on syntaxes, see Hive 0.13 and Hive 2020.1.
New Queries: Make sure they are compatible withHive 2020.1 syntax and avoid the incompatible syntax from Hive 0.13.
Test Workloads: Work with your technical account management or support to test new and existing workloads for use with Hive 2020.1.
Treasure Data suggests that you use Hive 2020.1 for the following workload types:
You should also review and upgrade all other existing workloads.
Default Processing Engine
TD Console: Hive users will seeHive 2020.1 as the default Hive version.
CLI / API: If no version is specified, Hive 2/Tex is used after changing the default processing engine.
Review the following table to understand how to run Hive 0.13 and Hive 2/Tez queries.
Run Hive 0.13 Query
Run Hive 2/Tez Query
Default = Hive 0.13
Override the engine version from None to Stable.
Default = Hive 2020.1
Semantic Checking Process
There are a couple of options available for the semantic checking of Hive queries; both positive and negative impacts are explained.
Run ‘EXPLAIN <query>’
No impact on production
Check SQL compliance easily
The statement prints the query plan only; you cannot look at the result.
Run Query / Insert into Temporary Table
Make comparisons on results.
If you do not rewrite insert queries, Treasure Data will insert the results into production tables for testing.
The EXPLAIN command can help check for errors without having to execute a query.