Presto is not rounding decimals correctly.
We are currently moving from Hive EMR to Athena Presto. During testing and validation, we noticed penny variances on transactions. Root cause analysis shows that Presto is not rounding correctly beyond 14 digits of decimal precision.
Attempts to resolve these variances only increase the total number of variances, because each fix causes penny variances on other records. So, we solve one record’s value but create [1 + n] more problems elsewhere. Is there any way to correct for this other than moving these calculations into an ETL engine and handling it there?
Note: the input values to the ROUND() function already carry floating-point error, because we are multiplying two decimal columns together on the fly in a SELECT statement.
Expected result from the sample calculation below: 6.98
-- 15 digits of decimal precision.
SELECT ROUND(6.984999999999999, 2) -- Hive (EMR): 6.98
SELECT ROUND(6.984999999999999, 2) -- Presto (Athena): 6.99
-- Remove one level of precision (i.e., 14).
SELECT ROUND(6.98499999999999, 2) -- Presto (Athena): 6.98
Top GitHub Comments
A quirk of Athena, which is not the default behavior in most Presto clusters, is that literal values in the query string such as 1.0 have an inferred type of DOUBLE instead of DECIMAL, so ROUND(6.984999999999999, 2) rounds using double arithmetic, which loses precision. If you instead force the type to be DECIMAL, the correct result is returned.
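As a rough sketch of what "forcing the type to be DECIMAL" could look like, the queries below use a typed decimal literal and explicit casts. The table `transactions`, the columns `quantity` and `unit_price`, and the `DECIMAL(18, 6)` precision and scale are hypothetical placeholders, not taken from the original report; adjust them to your own schema.

```sql
-- Option 1: a typed DECIMAL literal, so ROUND() works on an exact
-- decimal value rather than a DOUBLE approximation.
-- Per the comment above, this should round down to 6.98.
SELECT ROUND(DECIMAL '6.984999999999999', 2);

-- Option 2: cast the operands to DECIMAL before multiplying, so the
-- product is computed with decimal arithmetic.
-- `transactions`, `quantity`, and `unit_price` are hypothetical names;
-- DECIMAL(18, 6) is an assumed precision and scale.
SELECT ROUND(
         CAST(quantity AS DECIMAL(18, 6)) * CAST(unit_price AS DECIMAL(18, 6)),
         2
       ) AS amount
FROM transactions;
```

Casting only helps if the values have not already passed through DOUBLE arithmetic; a value that was computed as a DOUBLE and then cast keeps whatever error it picked up earlier.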
I did not know that about Athena. Thank you for raising that awareness with me and explaining how Athena is operating under the hood - I greatly appreciate it!
I will go ahead and close the ticket. Take care. Best regards.
I see the expected behavior on Presto master (0.273). It is possible that this issue has been fixed since version 0.217.