fingerprint64 results inconsistent with Google BigQuery

Google BigQuery provides a 64-bit FarmHash fingerprint as the built-in function FARM_FINGERPRINT. https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#farm_fingerprint

BigQuery:

SELECT FARM_FINGERPRINT("1footrue");

Returns -1541654101129638711

lovell/farmhash:

farmhash.fingerprint64("1footrue")

yields 16905089972579912905

Shouldn’t they produce the same value? For any string for which BQ produces a positive hash, e.g. “2applefalse”, both this package and BigQuery agree on 2794438866806483259.
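
A quick arithmetic check, assuming the mismatch comes from BigQuery reporting the hash as a signed INT64 while the package reports it as an unsigned 64-bit value: the two numbers quoted above differ by exactly 2^64, so they would be the same 64 bits read two ways. The sketch below uses only the values from this issue and built-in BigInt, with no farmhash call.

```javascript
// Values copied from the issue text above.
const unsigned = 16905089972579912905n; // farmhash.fingerprint64("1footrue")
const signed = -1541654101129638711n;   // BigQuery FARM_FINGERPRINT("1footrue")

// The two results differ by exactly 2^64.
const sameBits = unsigned - signed === 2n ** 64n;

// Reinterpreting the unsigned value as a signed 64-bit integer gives BigQuery's number.
const reinterpreted = BigInt.asIntN(64, unsigned);

// Values below 2^63 are unchanged, which is why "positive" hashes already agree.
const positive = BigInt.asIntN(64, 2794438866806483259n);

console.log(sameBits);                 // true
console.log(reinterpreted.toString()); // -1541654101129638711
console.log(positive.toString());      // 2794438866806483259
```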

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Reactions: 2
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

2 reactions
jakelowen commented, Oct 15, 2019

This also works, no?

const fingerprint64signed = input => {
  return BigInt.asIntN(64, farmhash.fingerprint64(input)).toString();
};

fingerprint64signed("1footrue") yields “-1541654101129638711”

1 reaction
kechan commented, Aug 24, 2019

Thanks @lovell - your analysis makes sense. Feel free to close the issue since it seems to be an issue on BQ side and not your library.

On that note: can you think of a helper function that would take the result of your farmhash.fingerprint64 and transform it in the same way that BQ does? In essence, to recreate their error reliably?

Thanks for the clarification on this int type issue. I was taken by surprise as well. In Python:

import farmhash
import numpy as np
np.uint64(farmhash.fingerprint64(x)).astype('int64')

This gives the same result as BigQuery for x.
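
For Node.js, the same reinterpretation can go in either direction, for example when a signed value exported from BigQuery needs to be compared against the package's unsigned output. This is a minimal sketch: `toSigned64` and `toUnsigned64` are hypothetical helper names, not part of farmhash, and they assume the hash is passed in as a decimal string or a BigInt.

```javascript
// Hypothetical helpers; the names are illustrative and not part of farmhash.
// Each accepts a decimal string or BigInt and returns a decimal string.

// Unsigned 64-bit value -> signed 64-bit value (the form BigQuery reports).
const toSigned64 = (value) => BigInt.asIntN(64, BigInt(value)).toString();

// Signed 64-bit value -> unsigned 64-bit value (the form reported in this issue).
const toUnsigned64 = (value) => BigInt.asUintN(64, BigInt(value)).toString();

console.log(toSigned64("16905089972579912905"));   // -1541654101129638711
console.log(toUnsigned64("-1541654101129638711")); // 16905089972579912905
```

Keeping the values as strings or BigInt matters here, because a 64-bit hash can exceed Number.MAX_SAFE_INTEGER and would lose precision as a plain JavaScript number.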
