Release Note 20150120
Table of Contents
Features & Improvements
- Console: Query Index Filtering
- Console: Password Validation On Signup
- Presto: Retrying for Presto Queries
- Backend: Table Data Expiration
- Backend: Event Collector Saves Backup Files In S3
- Backend: Multipart Upload for Result Export to AWS S3 / Riak CS
- Backend: Hive Result Size Limitation
- Client Tools: First Release of the Python Client Library v0.1.2
- Bug Fixes
Features & Improvements
This is a summary of the new features and improvements introduced in this release:
Console: Query Index Filtering
We improved the filtering in the query index page to:
- default the checkboxes next to each filter to unchecked when the filter is not applied
- update the URL every time a filter is applied/removed to allow the search to be saved
Console: Password Validation On Signup
We modified the password validation logic in the Signup page to make sure it indicates clearly when the password entered does not meet the minimum length requirement of 6 characters.
Presto: Retrying for Presto Queries
We added a retrying mechanism that reruns the Presto query in the occurrence of (rare) Execution Expired exceptions and Rejected Execution exceptions caused by a worker spinning for a restart.
Backend: Table Data Expiration
We created a new feature which allows users to ‘virtually’ expire data from any of their tables after a specified amount of days.
When table expiration is enabled for a table, the table’s data is not actually removed from the table but it rather flagged as removed; as such a query scanning the table will only return records whose time field is at most X days old, where X is the number of days specified for the table’s expiration.
The advantage of ‘virtually’ expiring data from a table is that table expiration can always be revoked, thus allowing a query to fetch all records in the table, or modified, thus modifying the range of valid timestamps for the query. Also since table expiration is not dropping data from a table, it does not reduce nor benefits the number of records' for the account storage count.
For more information please see the related Database and Table Management – Expire Data From A Table documentation page.
Backend: Event Collector Saves Backup Files In S3
The Event Collector used as backend for the Treasure Data SDKs now saves the files storing the records to be imported into AWS S3.
This is done to ensure recoverability in case data is lost because of customer mishandling or non fortuitous deletions.
Backend: Multipart Upload for Result Export to AWS S3 / Riak CS
When exporting the result of a query to AWS S3 / Riak CS, the result set is divided up in smaller chunks (multiparts), downloaded into the worker one at a time and uploaded in the same fashion.
This greatly reduces the disk size requirements in the worker, thus increasing scalability and practically allowing results of any size to be exported to AWS S3 / Riak CS without limitations.
Backend: Hive Result Size Limitation
When the number of records in the result of a Hive query exceeds 232-1, Hive truncates the result thus causing an unexpected behavior, especially when exporting the result somewhere else with a Result Export.
We improved the logic to flag the query as failed when the result size is exceeded – any result export associated to the query is thus avoided since the result would not be correct.
|INSERT INTO clauses are not affected by the 232-1 size limitation and can write the entire content of the query result to a Treasure Data table without limitations.|
Client Tools: First Release of the Python Client Library v0.1.2
We released the first version of our new Python client library (v0.1.2) on Github.
This latest version exposed the global DEFAULT_CONFIG configuration for customizations – this allows customers to create a thin wrapper around the Treasure Data SDK to brand it as their own – see the FAQ for more info.
These are the most important Bug Fixes made in this release:
APIs: Query in Non UTF-8 Format
If the query text of a query contains non UTF-8 byte sequences, the API responds with a 500 error code.
Rails only tolerates UTF-8 byte sequences for strings.
We are catching Rails' ArgumentError exceptions caused by not UTF-8 byte sequences and rendering a 422 error code instead of 500.
Console: Some Users Cannot Logging In
Some users cannot login into the Console. The spinner rotates indefinitely and the page never loads.
In Release 20141216 we released a fix to allow restricted users with import only permissions for a database to list bulk import jobs they created. Because of the inefficient queries generated by Rails to satisfy this requirement, this change caused sensible database overload and caused some users not to be able to login into the Console. We had to revert the original change to mitigate the problem, thus exposing the problem again but allowing users to successfully login into the Console again.
To fix the earlier problem, we opted to not show bulk import jobs created against import-only databases to be listed in the job list but only to allow these users to pull the job status of the each bulk import job. This allow the ‘td import:auto’ command to work correctly, thus solving the original problem.
Backend: Result Export with INSERT INTO
If a Presto/Hive query is using the INSERT INTO clause, any Result Export associated to the query fails.
This is because by construction INSERT INTO queries don’t produce any result since the query result produced by the underlying SELECT query is written directly in the destination database and table specified. Because the result set ‘looks like’ containing no record, the Result Export fails because there are no records to export.
Although this use case is not common (writing the result of a query in two different locations – e.g. a MySQL database and Treasure Data) we tweaked the logic to allow INSERT INTO queries whom indeed produced a result containing one or more records, to export the result to another destination as well.
Backend: Result Export to MySQL with UJIS Japanese Character Encoding
When the character encoding of the destination MySQL database for a Result Export is Japanese UJIS, the result export fails.
We improved the Result Export feature for MySQL to handle this case as well.
Last modified: Feb 22 2017 23:40:42 UTC