Fit on array of size >= 4096 raises _TracedPicklingError: Failed in nopython mode pipeline (step: nopython mode backend)
See original GitHub issue.

A simple fit of UMAP on a random NumPy array with 4096 or more rows fails with the traceback detailed below. If the array has fewer than 4096 rows, everything works fine. What's going wrong? A very similar issue: https://github.com/lmcinnes/umap/issues/477
Framework: Spark-based environment with umap==0.5.0, numba==0.52.0, pynndescent==0.5.1, scipy==1.4.1.
Minimal example:
import umap
import numpy as np
reducer = umap.UMAP()
np.random.seed(0)
reducer.fit(np.random.rand(4096,16))
This gives the following error:
---------------------------------------------------------------------------
PicklingError Traceback (most recent call last)
/databricks/python/lib/python3.7/site-packages/numba/core/serialize.py in save(self, obj)
304 try:
--> 305 return super().save(obj)
306 except _TracedPicklingError:
/databricks/python/lib/python3.7/pickle.py in save(self, obj, save_persistent_id)
503 if f is not None:
--> 504 f(self, obj) # Call unbound method with explicit self
505 return
/databricks/python/lib/python3.7/pickle.py in save_type(self, obj)
1015 return self.save_reduce(type, (...,), obj=obj)
-> 1016 return self.save_global(obj)
1017
/databricks/python/lib/python3.7/pickle.py in save_global(self, obj, name)
959 "Can't pickle %r: it's not found as %s.%s" %
--> 960 (obj, module_name, name)) from None
961 else:
PicklingError: Can't pickle <class 'collections.FlatTree'>: it's not found as collections.FlatTree
During handling of the above exception, another exception occurred:
_TracedPicklingError Traceback (most recent call last)
<command-103842> in <module>
3 reducer = umap.UMAP()
4 np.random.seed(0)
----> 5 reducer.fit(np.random.rand(4096,16))
/databricks/python/lib/python3.7/site-packages/umap/umap_.py in fit(self, X, y)
2387 use_pynndescent=True,
2388 n_jobs=self.n_jobs,
-> 2389 verbose=self.verbose,
2390 )
2391
/databricks/python/lib/python3.7/site-packages/umap/umap_.py in nearest_neighbors(X, n_neighbors, metric, metric_kwds, angular, random_state, low_memory, use_pynndescent, n_jobs, verbose)
337 low_memory=low_memory,
338 n_jobs=n_jobs,
--> 339 verbose=verbose,
340 )
341 knn_indices, knn_dists = knn_search_index.neighbor_graph
/databricks/python/lib/python3.7/site-packages/pynndescent/pynndescent_.py in __init__(self, data, metric, metric_kwds, n_neighbors, n_trees, leaf_size, pruning_degree_multiplier, diversify_prob, n_search_trees, tree_init, init_graph, random_state, low_memory, max_candidates, n_iters, delta, n_jobs, compressed, verbose)
789 current_random_state,
790 self.n_jobs,
--> 791 self._angular_trees,
792 )
793 leaf_array = rptree_leaf_array(self._rp_forest)
/databricks/python/lib/python3.7/site-packages/pynndescent/rp_trees.py in make_forest(data, n_neighbors, n_trees, leaf_size, rng_state, random_state, n_jobs, angular)
999 result = joblib.Parallel(n_jobs=n_jobs, prefer="threads")(
1000 joblib.delayed(make_dense_tree)(data, rng_states[i], leaf_size, angular)
-> 1001 for i in range(n_trees)
1002 )
1003 except (RuntimeError, RecursionError, SystemError):
/databricks/python/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
1015
1016 with self._backend.retrieval_context():
-> 1017 self.retrieve()
1018 # Make sure that we get a last message telling us we are done
1019 elapsed_time = time.time() - self._start_time
/databricks/python/lib/python3.7/site-packages/joblib/parallel.py in retrieve(self)
907 try:
908 if getattr(self._backend, 'supports_timeout', False):
--> 909 self._output.extend(job.get(timeout=self.timeout))
910 else:
911 self._output.extend(job.get())
/databricks/python/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):
/databricks/python/lib/python3.7/multiprocessing/pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
119 job, i, func, args, kwds = task
120 try:
--> 121 result = (True, func(*args, **kwds))
122 except Exception as e:
123 if wrap_exception and func is not _helper_reraises_exception:
/databricks/python/lib/python3.7/site-packages/joblib/_parallel_backends.py in __call__(self, *args, **kwargs)
606 def __call__(self, *args, **kwargs):
607 try:
--> 608 return self.func(*args, **kwargs)
609 except KeyboardInterrupt:
610 # We capture the KeyboardInterrupt and reraise it as
/databricks/python/lib/python3.7/site-packages/joblib/parallel.py in __call__(self)
254 with parallel_backend(self._backend, n_jobs=self._n_jobs):
255 return [func(*args, **kwargs)
--> 256 for func, args, kwargs in self.items]
257
258 def __len__(self):
/databricks/python/lib/python3.7/site-packages/joblib/parallel.py in <listcomp>(.0)
254 with parallel_backend(self._backend, n_jobs=self._n_jobs):
255 return [func(*args, **kwargs)
--> 256 for func, args, kwargs in self.items]
257
258 def __len__(self):
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
431 e.patch_message('\n'.join((str(e).rstrip(), help_msg)))
432 # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 433 raise e
434
435 def inspect_llvm(self, signature=None):
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
364 argtypes.append(self.typeof_pyval(a))
365 try:
--> 366 return self.compile(tuple(argtypes))
367 except errors.ForceLiteralArg as e:
368 # Received request for compiler re-entry with the list of arguments
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
30 def _acquire_compile_lock(*args, **kwargs):
31 with self:
---> 32 return func(*args, **kwargs)
33 return _acquire_compile_lock
34
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in compile(self, sig)
855 self._cache_misses[sig] += 1
856 try:
--> 857 cres = self._compiler.compile(args, return_type)
858 except errors.ForceLiteralArg as e:
859 def folded(args, kws):
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in compile(self, args, return_type)
75
76 def compile(self, args, return_type):
---> 77 status, retval = self._compile_cached(args, return_type)
78 if status:
79 return retval
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in _compile_cached(self, args, return_type)
89
90 try:
---> 91 retval = self._compile_core(args, return_type)
92 except errors.TypingError as e:
93 self._failed_cache[key] = e
/databricks/python/lib/python3.7/site-packages/numba/core/dispatcher.py in _compile_core(self, args, return_type)
107 args=args, return_type=return_type,
108 flags=flags, locals=self.locals,
--> 109 pipeline_class=self.pipeline_class)
110 # Check typing error if object mode is used
111 if cres.typing_error is not None and not flags.enable_pyobject:
/databricks/python/lib/python3.7/site-packages/numba/core/compiler.py in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
600 pipeline = pipeline_class(typingctx, targetctx, library,
601 args, return_type, flags, locals)
--> 602 return pipeline.compile_extra(func)
603
604
/databricks/python/lib/python3.7/site-packages/numba/core/compiler.py in compile_extra(self, func)
350 self.state.lifted = ()
351 self.state.lifted_from = None
--> 352 return self._compile_bytecode()
353
354 def compile_ir(self, func_ir, lifted=(), lifted_from=None):
/databricks/python/lib/python3.7/site-packages/numba/core/compiler.py in _compile_bytecode(self)
412 """
413 assert self.state.func_ir is None
--> 414 return self._compile_core()
415
416 def _compile_ir(self):
/databricks/python/lib/python3.7/site-packages/numba/core/compiler.py in _compile_core(self)
392 self.state.status.fail_reason = e
393 if is_final_pipeline:
--> 394 raise e
395 else:
396 raise CompilerError("All available pipelines exhausted")
/databricks/python/lib/python3.7/site-packages/numba/core/compiler.py in _compile_core(self)
383 res = None
384 try:
--> 385 pm.run(self.state)
386 if self.state.cr is not None:
387 break
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_machinery.py in run(self, state)
337 (self.pipeline_name, pass_desc)
338 patched_exception = self._patch_error(msg, e)
--> 339 raise patched_exception
340
341 def dependency_analysis(self):
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_machinery.py in run(self, state)
328 pass_inst = _pass_registry.get(pss).pass_inst
329 if isinstance(pass_inst, CompilerPass):
--> 330 self._runPass(idx, pass_inst, state)
331 else:
332 raise BaseException("Legacy pass in use")
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
30 def _acquire_compile_lock(*args, **kwargs):
31 with self:
---> 32 return func(*args, **kwargs)
33 return _acquire_compile_lock
34
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_machinery.py in _runPass(self, index, pss, internal_state)
287 mutated |= check(pss.run_initialization, internal_state)
288 with SimpleTimer() as pass_time:
--> 289 mutated |= check(pss.run_pass, internal_state)
290 with SimpleTimer() as finalize_time:
291 mutated |= check(pss.run_finalizer, internal_state)
/databricks/python/lib/python3.7/site-packages/numba/core/compiler_machinery.py in check(func, compiler_state)
260
261 def check(func, compiler_state):
--> 262 mangled = func(compiler_state)
263 if mangled not in (True, False):
264 msg = ("CompilerPass implementations should return True/False. "
/databricks/python/lib/python3.7/site-packages/numba/core/typed_passes.py in run_pass(self, state)
447
448 # TODO: Pull this out into the pipeline
--> 449 NativeLowering().run_pass(state)
450 lowered = state['cr']
451 signature = typing.signature(state.return_type, *state.args)
/databricks/python/lib/python3.7/site-packages/numba/core/typed_passes.py in run_pass(self, state)
373 lower.lower()
374 if not flags.no_cpython_wrapper:
--> 375 lower.create_cpython_wrapper(flags.release_gil)
376
377 if not flags.no_cfunc_wrapper:
/databricks/python/lib/python3.7/site-packages/numba/core/lowering.py in create_cpython_wrapper(self, release_gil)
242 self.context.create_cpython_wrapper(self.library, self.fndesc,
243 self.env, self.call_helper,
--> 244 release_gil=release_gil)
245
246 def create_cfunc_wrapper(self):
/databricks/python/lib/python3.7/site-packages/numba/core/cpu.py in create_cpython_wrapper(self, library, fndesc, env, call_helper, release_gil)
160 fndesc, env, call_helper=call_helper,
161 release_gil=release_gil)
--> 162 builder.build()
163 library.add_ir_module(wrapper_module)
164
/databricks/python/lib/python3.7/site-packages/numba/core/callwrapper.py in build(self)
120
121 api = self.context.get_python_api(builder)
--> 122 self.build_wrapper(api, builder, closure, args, kws)
123
124 return wrapper, api
/databricks/python/lib/python3.7/site-packages/numba/core/callwrapper.py in build_wrapper(self, api, builder, closure, args, kws)
185
186 retty = self._simplified_return_type()
--> 187 obj = api.from_native_return(retty, retval, env_manager)
188 builder.ret(obj)
189
/databricks/python/lib/python3.7/site-packages/numba/core/pythonapi.py in from_native_return(self, typ, val, env_manager)
1387 "prevented the return of " \
1388 "optional value"
-> 1389 out = self.from_native_value(typ, val, env_manager)
1390 return out
1391
/databricks/python/lib/python3.7/site-packages/numba/core/pythonapi.py in from_native_value(self, typ, val, env_manager)
1401
1402 c = _BoxContext(self.context, self.builder, self, env_manager)
-> 1403 return impl(typ, val, c)
1404
1405 def reflect_native_value(self, typ, val, env_manager=None):
/databricks/python/lib/python3.7/site-packages/numba/core/boxing.py in box_namedtuple(typ, val, c)
502 Convert native array or structure *val* to a namedtuple object.
503 """
--> 504 cls_obj = c.pyapi.unserialize(c.pyapi.serialize_object(typ.instance_class))
505 tuple_obj = box_tuple(typ, val, c)
506 obj = c.pyapi.call(cls_obj, tuple_obj)
/databricks/python/lib/python3.7/site-packages/numba/core/pythonapi.py in serialize_object(self, obj)
1362 gv = self.module.__serialized[obj]
1363 except KeyError:
-> 1364 struct = self.serialize_uncached(obj)
1365 name = ".const.picklebuf.%s" % (id(obj) if config.DIFF_IR == 0 else "DIFF_IR")
1366 gv = self.context.insert_unique_const(self.module, name, struct)
/databricks/python/lib/python3.7/site-packages/numba/core/pythonapi.py in serialize_uncached(self, obj)
1333 """
1334 # First make the array constant
-> 1335 data = serialize.dumps(obj)
1336 assert len(data) < 2**31
1337 name = ".const.pickledata.%s" % (id(obj) if config.DIFF_IR == 0 else "DIFF_IR")
/databricks/python/lib/python3.7/site-packages/numba/core/serialize.py in dumps(obj)
166 with io.BytesIO() as buf:
167 p = pickler(buf)
--> 168 p.dump(obj)
169 pickled = buf.getvalue()
170
/databricks/python/lib/python3.7/pickle.py in dump(self, obj)
435 if self.proto >= 4:
436 self.framer.start_framing()
--> 437 self.save(obj)
438 self.write(STOP)
439 self.framer.end_framing()
/databricks/python/lib/python3.7/site-packages/numba/core/serialize.py in save(self, obj)
312 m = (f"Failed to pickle because of\n {type(e).__name__}: {e}"
313 f"\ntracing... \n{perline(self.__trace)}")
--> 314 raise _TracedPicklingError(m)
315 finally:
316 self.__trace.pop()
_TracedPicklingError: Failed in nopython mode pipeline (step: nopython mode backend)
Failed to pickle because of
PicklingError: Can't pickle <class 'collections.FlatTree'>: it's not found as collections.FlatTree
tracing...
[0]: <class 'type'>: 94412180539632
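What seems to be happening here, in brief: pickle serializes a class by reference, and save_global locates it as getattr(sys.modules[cls.__module__], cls.__qualname__). FlatTree is a namedtuple defined in pynndescent.rp_trees, but in this environment its __module__ attribute evidently reads "collections" (plausibly because the platform patches collections.namedtuple, so the factory's frame-based module detection resolves to the wrapper rather than the calling module), and the lookup fails. The 4096 threshold fits this reading: umap 0.5 computes exact nearest neighbours for datasets with fewer than 4096 rows and only hands off to pynndescent, whose numba-compiled tree construction has to box and pickle the FlatTree class, at 4096 rows and above. A minimal sketch reproducing the pickling failure under that assumption (the field names below are illustrative):

import pickle
from collections import namedtuple

# Recreate a FlatTree-like namedtuple; pynndescent defines FlatTree the same way.
FlatTree = namedtuple("FlatTree", ["hyperplanes", "offsets", "children", "indices", "leaf_size"])

# Simulate the broken environment: the class claims to live in `collections`,
# where no attribute named FlatTree actually exists.
FlatTree.__module__ = "collections"

try:
    pickle.dumps(FlatTree)  # pickling the *class* forces a save_global lookup by module + name
except pickle.PicklingError as e:
    print(e)  # Can't pickle <class 'collections.FlatTree'>: it's not found as collections.FlatTree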
Issue Analytics
- State:
- Created: 3 years ago
- Reactions: 1
- Comments: 8
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hello 🙋‍♂️
I finally found a hack. It seems there is some weird internal overriding that makes the collections.FlatTree class not picklable. I hope this could help you!
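The hack itself isn't quoted in this excerpt. A sketch consistent with the error message, assuming the fix is simply to repoint the namedtuple's __module__ at the module that actually defines it so that pickle can find it again:

import pynndescent.rp_trees

# Hypothetical workaround: restore the module attribute that pickle's
# save_global uses to locate the class; run this before calling UMAP().fit().
pynndescent.rp_trees.FlatTree.__module__ = "pynndescent.rp_trees"

Upgrading pynndescent, or removing whatever is patching collections.namedtuple in the environment, may be the more durable fix.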
It is working like a charm now. 😁 Thank you so much @oscarorti! Tried it on 5k PBMC for now; will keep the community updated for 68k.