RecursionError when pickling BigQuery table object
Description
When running my flow against a DaskKubernetes environment I get the following RecursionError:
Unexpected error: RecursionError('maximum recursion depth exceeded')
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/flow_runner.py", line 491, in get_flow_run_state
    upstream_states = executor.wait(
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/executors/dask.py", line 375, in wait
    return self.client.gather(futures)
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 1982, in gather
    return self.sync(
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 832, in sync
    return sync(
  File "/usr/local/lib/python3.8/site-packages/distributed/utils.py", line 339, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.8/site-packages/distributed/utils.py", line 323, in f
    result[0] = yield future
  File "/usr/local/lib/python3.8/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 1876, in _gather
    response = await future
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 1927, in _gather_remote
    response = await retry_operation(self.scheduler.gather, keys=keys)
  File "/usr/local/lib/python3.8/site-packages/distributed/utils_comm.py", line 385, in retry_operation
    return await retry(
  File "/usr/local/lib/python3.8/site-packages/distributed/utils_comm.py", line 370, in retry
    return await coro()
  File "/usr/local/lib/python3.8/site-packages/distributed/core.py", line 861, in send_recv_from_rpc
    result = await send_recv(comm=comm, op=key, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/distributed/core.py", line 644, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/usr/local/lib/python3.8/site-packages/distributed/comm/tcp.py", line 202, in read
    msg = await from_frames(
  File "/usr/local/lib/python3.8/site-packages/distributed/comm/utils.py", line 87, in from_frames
    res = _from_frames()
  File "/usr/local/lib/python3.8/site-packages/distributed/comm/utils.py", line 65, in _from_frames
    return protocol.loads(
  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/core.py", line 130, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 302, in deserialize
    return loads(header, frames)
  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 64, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
    return pickle.loads(x)
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/table.py", line 1264, in __getattr__
    value = self._xxx_field_to_index.get(name)
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/table.py", line 1264, in __getattr__
    value = self._xxx_field_to_index.get(name)
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/table.py", line 1264, in __getattr__
    value = self._xxx_field_to_index.get(name)
  [Previous line repeated 974 more times]
RecursionError: maximum recursion depth exceeded
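For context, the repeated __getattr__ frames are the usual signature of a class whose __getattr__ reads an instance attribute: when pickle rebuilds an object it probes it for __setstate__ before the instance state has been restored, so the attribute __getattr__ relies on does not exist yet and the method keeps re-entering itself. A minimal, generic sketch of that failure mode (a hypothetical Record class, not the actual google-cloud-bigquery code):

```python
import pickle


class Record:
    """Hypothetical row-like class; a stand-in, not google.cloud.bigquery.table.Row."""

    def __init__(self, values, field_to_index):
        self._values = values
        self._field_to_index = field_to_index

    def __getattr__(self, name):
        # Called only for attributes normal lookup cannot find. During
        # unpickling, pickle looks for __setstate__ before restoring the
        # instance __dict__, so self._field_to_index is missing here and
        # this method re-enters itself until the recursion limit is hit.
        index = self._field_to_index.get(name)
        if index is None:
            raise AttributeError(name)
        return self._values[index]


record = Record(("abc-123",), {"id": 0})
print(record.id)                    # "abc-123": the live object works fine
pickle.loads(pickle.dumps(record))  # RecursionError: maximum recursion depth exceeded
```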
Following that error, I get output that looks like this:
[2020-07-29 17:07:14] DEBUG - prefect.CloudFlowRunner | Flow 'customers': Handling state change from Running to Failed
distributed.scheduler - INFO - Scheduler closing...
distributed.scheduler - INFO - Scheduler closing all comms
distributed.scheduler - INFO - Remove worker <Worker 'tcp://10.60.6.2:43271', name: 0, memory: 0, processing: 0>
distributed.core - INFO - Removing comms to tcp://10.60.6.2:43271
distributed.scheduler - INFO - Lost all workers
[2020-07-29 17:07:19] DEBUG - kubernetes.client.rest | response body: {"kind":"PodList","apiVersion":"v1","metadata":{"selfLink":"/api/v1/namespaces/prefect-production/pods","resourceVersion":"158839033"},"items":[{"metadata":{"name":
[2020-07-29 17:07:20] DEBUG - kubernetes.client.rest | response body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"dask-root-11ce59a5-btl5w9","generateName":"dask-root-11ce59a5-b","namespace":"prefect-production","selfLink":"/a
[2020-07-29 17:07:20] INFO - dask_kubernetes.core | Deleted pod: dask-root-11ce59a5-btl5w9
[2020-07-29 17:07:20] DEBUG - kubernetes.client.rest | response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services is forbidden: User \"system:serviceaccount:prefect-production:default\" cann
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/weakref.py", line 642, in _exitfunc
    f()
  File "/usr/local/lib/python3.8/weakref.py", line 566, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/usr/local/lib/python3.8/site-packages/dask_kubernetes/core.py", line 707, in _cleanup_resources
    services = core_api.list_namespaced_service(
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 13463, in list_namespaced_service
    (data) = self.list_namespaced_service_with_http_info(namespace, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 13551, in list_namespaced_service_with_http_info
    return self.api_client.call_api(
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 330, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 162, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 352, in request
    return self.rest_client.GET(url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 237, in GET
    return self.request("GET", url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 231, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'ea92e80a-d7cd-4fb4-b731-6b695e45e945', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Wed, 29 Jul 2020 17:07:20 GMT', 'Content-Length': '316'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services is forbidden: User \"system:serviceaccount:prefect-production:default\" cannot list resource \"services\" in API group \"\" in
stream closed
Expected Behavior
When running locally with flow.run() there are no errors, but I get the above error when running with the DaskKubernetesEnvironment.
Reproduction
The code I'm running looks something like the following:
from random import randrange

import pandas as pd

import prefect
from prefect import Flow, task
from prefect.tasks.gcp.bigquery import BigQueryTask

get_customer_ids = BigQueryTask(
    name="Get unique customer names",
    query="""
        select distinct id from cloud_sql.customers
        where redacted_id is not null limit 100
    """,
)


@task
def get_customer_balance(customer_record):
    logger = prefect.context.get("logger")
    customer_id = customer_record.get("id")
    logger.info(f"Requesting balance for customer {customer_id}")
    # For now we just generate a random number, later this will be an API call
    balance = randrange(0, 100)
    return (customer_id, balance)


@task
def prepare_balances(balances):
    logger = prefect.context.get("logger")
    # convert the list of tuples into a dataframe
    df = pd.DataFrame(balances, columns=["customer_id", "balance"])
    logger.info(df)
    return df


@task
def save_balances(balances):
    logger = prefect.context.get("logger")
    logger.info("Storing balances in BigQuery")
    # convert the df to sql or a file and store in bigquery
    return True


with Flow("customers") as flow:
    customer_records = get_customer_ids()
    synapse_balances = get_customer_balance.map(customer_records)
    balances = prepare_balances(synapse_balances)
    save_balances(balances)


if __name__ == "__main__":
    flow.run()
Additionally, when running on my cluster, the CI/CD process executes another Python file that looks like:
from os import environ, path

import docker

from customers.flow import flow as customers
from prefect.environments import DaskKubernetesEnvironment
from prefect.environments.storage import Docker

# The following TLS config is required for CircleCI as it uses a "docker in docker" approach
tls_config = docker.tls.TLSConfig(
    client_cert=(
        path.join(environ.get("DOCKER_CERT_PATH"), "cert.pem"),
        path.join(environ.get("DOCKER_CERT_PATH"), "key.pem"),
    )
)

customers.storage = Docker(
    registry_url="gcr.io/redacted/redacted",
    base_url=environ.get("DOCKER_HOST"),  # required for CircleCI
    tls_config=tls_config,  # required for CircleCI
    python_dependencies=["pandas", "prefect[google,kubernetes]"],
)

customers.environment = DaskKubernetesEnvironment(min_workers=1, max_workers=3)

customers.register(project_name="prefect-test-1")
Environment
On my local machine:
{
  "config_overrides": {},
  "env_vars": [],
  "system_information": {
    "platform": "macOS-10.15.5-x86_64-i386-64bit",
    "prefect_version": "0.12.6",
    "python_version": "3.8.4"
  }
}
From inside the prefect agent container on my Kubernetes cluster:
{
  "config_overrides": {},
  "env_vars": [
    "PREFECT__CLOUD__API",
    "PREFECT__CLOUD__AGENT__AUTH_TOKEN",
    "PREFECT__CLOUD__AGENT__LABELS",
    "PREFECT__CLOUD__AGENT__AGENT_ADDRESS",
    "PREFECT__BACKEND"
  ],
  "system_information": {
    "platform": "Linux-4.19.104+-x86_64-with-debian-10.4",
    "prefect_version": "0.12.5",
    "python_version": "3.6.11"
  }
}
I've noticed the Python versions are different.
That indeed looks to be due to pickling issues. Two things I'd check (in order):
1. That the Python version on the Dask workers is the same as the one used by the flow runner (the environment dumps above show 3.8.4 locally and 3.6.11 in the agent container).
2. That the version of the bigquery library is the same in both places.
If you ran the above with DaskExecutor() (with no parameters) and got that error then this isn't the issue, since in that case the worker and flow runner environments are the same. Running cloudpickle.loads(cloudpickle.dumps(table)) as a check should be good enough; see the sketch below.
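A minimal sketch of that round-trip check, assuming google-cloud-bigquery and cloudpickle are installed; the query here is just a placeholder for whatever get_customer_ids runs:

```python
# Hedged sketch: round-trip the BigQuery results through cloudpickle to see
# whether the RecursionError reproduces outside of Dask entirely.
import cloudpickle
from google.cloud import bigquery

client = bigquery.Client()
rows = list(
    client.query("select distinct id from cloud_sql.customers limit 10").result()
)

# If this raises, the Row/Table objects themselves are what can't be (un)pickled.
restored = cloudpickle.loads(cloudpickle.dumps(rows))
print(restored[:3])
```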
This might be crazy, but is it possible that pickling fails when there is code that:

vs.

I'm not sure what the requirements are for pickling, but that seems to solve some cases.
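As a quick way to exercise the same serialization path without a Kubernetes cluster (per the DaskExecutor() suggestion above), the plain flow.run() at the bottom of the reproduction script could be swapped for a run against a local Dask cluster; a minimal sketch, assuming the Prefect 0.12.x import layout shown in the traceback:

```python
# Hypothetical local check: run the reproduction flow on a local Dask cluster
# so task results are pickled between worker processes, much as they are on
# DaskKubernetes. `flow` is the Flow("customers") object defined above.
from prefect.engine.executors import DaskExecutor  # Prefect 0.12.x layout

if __name__ == "__main__":
    state = flow.run(executor=DaskExecutor())  # spins up a temporary local cluster
    print(state)
```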