Hive’s Array column type behavior change
This page describes a backwards-incompatible Hive change that will be included in our upcoming release for the Hadoop 2 environment.
What Is Changing
The change in behavior affects Hive users on the Hadoop 2 clusters who query a table that contains a column of type Array where the column value is NULL.
Before the change, Hive treated NULL values as empty arrays.
After the change, Hive will treat NULL values as NULL.
The release is scheduled for 2016-02-02.
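To make the difference concrete, here is a minimal Python sketch that models the two behaviors. It is an illustration only, not Hive code: `None` stands in for a NULL Array value, and the names `read_array_before`, `read_array_after`, `count_null`, and `count_empty` are hypothetical helpers introduced for this example.

```python
from typing import Callable, List, Optional

# None models a NULL value in an Array column (an assumption of this sketch).
ArrayValue = Optional[List[str]]


def read_array_before(value: ArrayValue) -> ArrayValue:
    """Model of the old behavior: NULL is surfaced as an empty array."""
    return [] if value is None else value


def read_array_after(value: ArrayValue) -> ArrayValue:
    """Model of the new behavior: NULL stays NULL."""
    return value


def count_null(rows: List[ArrayValue], reader: Callable[[ArrayValue], ArrayValue]) -> int:
    """Count rows that a check such as `col IS NULL` would match."""
    return sum(1 for row in rows if reader(row) is None)


def count_empty(rows: List[ArrayValue], reader: Callable[[ArrayValue], ArrayValue]) -> int:
    """Count rows whose value is an empty (non-NULL) array."""
    return sum(1 for row in rows if reader(row) == [])


# Three stored rows: a populated array, an empty array, and a NULL.
rows: List[ArrayValue] = [["a", "b"], [], None]

print(count_null(rows, read_array_before), count_empty(rows, read_array_before))
print(count_null(rows, read_array_after), count_empty(rows, read_array_after))
```

In this model, a query that relied on NULL rows appearing as empty arrays would match fewer rows after the change, while a NULL check that previously matched nothing would start matching them. Queries with that kind of dependency are the ones worth reviewing before the release.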
Who Is Affected
All customers who have already migrated to the Hadoop 2 clusters and use Hive will be affected by this change. However, queries that are already running when the release takes place will not be affected. Customers still assigned to the Hadoop 1 clusters will not be affected; they will observe the change only after they migrate to a Hadoop 2 cluster.
Furthermore, this change does not affect Presto queries and does not alter the data stored in existing tables in any way.
Why Are We Changing It
This fix is needed to make the behavior of our Hive engine consistent with that of standard Apache Hive. It requires a change in the integration between Hive and our proprietary storage system, ‘PlazmaDB’.
If you have any questions about this change, please contact us at firstname.lastname@example.org.
Last modified: Dec 23 2016 04:08:01 UTC