Floats can lose precision when loading to BigQuery
The float precision is set here: https://github.com/pydata/pandas-gbq/blob/d251db03b159447331ac9ae63e13d295d75bad70/pandas_gbq/load.py#L22
This is insufficient to represent all 64-bit floats without losing precision. For example, 26/59 should be represented as 0.4406779661016949, but under this format it comes out as 0.440677966101695.
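The loss is easy to reproduce directly (a sketch; `%.17g` is shown for comparison and is not what `load.py` currently uses):

```python
x = 26 / 59

# %.15g is the precision used in load.py today; it drops the final digit here.
print("%.15g" % x)   # 0.440677966101695
print(repr(x))       # 0.4406779661016949

# Round-tripping makes the information loss explicit: %.17g always recovers
# the original double, while %.15g fails to for this value.
assert float("%.15g" % x) != x
assert float("%.17g" % x) == x
```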
This was added intentionally here to fix a different issue, but it causes problems for us because we need exact reconciliation between systems. It seems like it should be possible to get the best of both worlds and emit the correct number of digits in all cases.
The original suggestion was to use %g, but this was changed to %.15g – it’s not clear to me what the rationale for that is. %g seems strictly better, but I’m sure I’m missing something.
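For context, here is how the candidate formats behave on the same value (a sketch, not from the thread; note that a bare `%g` defaults to six significant digits in both C and Python, which may be relevant to the `%g`-vs-`%.15g` question):

```python
x = 26 / 59

# A bare %g defaults to 6 significant digits, so on its own it loses far
# more precision than %.15g does.
print("%g" % x)      # 0.440678
print("%.15g" % x)   # 0.440677966101695
print("%.17g" % x)   # 17 digits: enough to round-trip any binary64 value
```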
Issue Analytics
- Created 3 years ago
- Comments:6 (1 by maintainers)

@danielchatfield
Not sure if this is helpful, but I think one of the issues, as explained here, is that a conservative choice was made for the number of significant digits.
A possible solution, if you do need higher precision, is to use the `.parquet` format instead, as suggested here?

According to https://en.wikipedia.org/wiki/IEEE_754#Character_representation, 17 digits of precision are required to preserve the original binary value. 16 digits was not enough in my testing of https://github.com/pydata/pandas-gbq/pull/336
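The 17-digit claim can be checked empirically (a quick sketch, independent of pandas-gbq): `%.17g` round-trips every sampled double, while `%.16g` misses some of them.

```python
import random

random.seed(0)
missed_16 = 0
for _ in range(100_000):
    x = random.random()
    # 17 significant digits uniquely identify any binary64 value.
    assert float("%.17g" % x) == x
    # 16 digits are not always enough.
    if float("%.16g" % x) != x:
        missed_16 += 1

print("values needing the 17th digit:", missed_16)
```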