
Solving environment: failed with initial frozen solve. Retrying with flexible solve. Solving environment: | failed with repodata from current_repodata.json, will retry with next repodata source.


System information

  • **OS**: Linux Ubuntu 18.04.5 LTS
  • **Modin version** (`modin.__version__`):
  • **Python version**: 3.8.13
  • Code we can use to reproduce:

Describe the problem

The install freezes when running:

conda install modin modin-all modin-core modin-dask modin-omnisci modin-ray

Source code / logs

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: | failed with repodata from current_repodata.json, will retry with next repodata source.

If I install with pip instead, the behaviour is inconsistent: on the first run the import raises an error, and on later runs it imports without any error message, but then it doesn't work.

pip install modin[all]

CODE on 1st run

'''import pandas as pd'''
import modin.pandas as pd
#from distributed import Client
#client = Client()

print(pd.__version__)

ERROR on 1st run

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-26dafc1a8b6a> in <module>
      1 '''import pandas as pd'''
----> 2 import modin.pandas as pd
      3 #from distributed import Client
      4 #client = Client()
      5 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/pandas/__init__.py in <module>
    170 
    171 from .. import __version__
--> 172 from .dataframe import DataFrame
    173 from .io import (
    174     read_csv,

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/pandas/dataframe.py in <module>
     46 from .series import Series
     47 from .base import BasePandasDataset, _ATTRS_NO_LOOKUP
---> 48 from .groupby import DataFrameGroupBy
     49 from .accessor import CachedAccessor, SparseFrameAccessor
     50 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/pandas/groupby.py in <module>
     32     wrap_into_list,
     33 )
---> 34 from modin.backends.base.query_compiler import BaseQueryCompiler
     35 from modin.data_management.functions.default_methods.groupby_default import GroupBy
     36 from modin.config import IsExperimental

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/__init__.py in <module>
     17 __all__ = ["BaseQueryCompiler", "PandasQueryCompiler"]
     18 try:
---> 19     from .pyarrow import PyarrowQueryCompiler  # noqa: F401
     20 except ImportError:
     21     pass

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/pyarrow/__init__.py in <module>
     14 """The module represents the query compiler level for the PyArrow backend."""
     15 
---> 16 from .query_compiler import PyarrowQueryCompiler
     17 
     18 __all__ = ["PyarrowQueryCompiler"]

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/pyarrow/query_compiler.py in <module>
     26 from pandas.core.computation.ops import UnaryOp, BinOp, Term, MathCall, Constant
     27 
---> 28 import pyarrow as pa
     29 import pyarrow.gandiva as gandiva
     30 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/pyarrow/__init__.py in <module>
     52 
     53 
---> 54 from pyarrow.lib import cpu_count, set_cpu_count
     55 from pyarrow.lib import (null, bool_,
     56                          int8, int16, int32, int64,

~/anaconda3/envs/tfall/lib/python3.7/site-packages/pyarrow/ipc.pxi in init pyarrow.lib()

AttributeError: type object 'pyarrow.lib.Message' has no attribute '__reduce_cython__'
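As an aside (not part of the original report): the pair of `RuntimeError`s above ("module compiled against API version 0xe but this version of numpy is 0xd") typically means a compiled extension (here pyarrow) was built against a newer NumPy C-API than the NumPy actually installed in the environment. A quick, illustrative way to see which versions ended up installed side by side:

```python
# List the installed versions of the packages involved in the mismatch.
# Uses only the standard library, so it works even in a broken environment.
import importlib.metadata as md

for pkg in ("numpy", "pyarrow", "modin"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "is not installed")
```

If the reported numpy version is older than what pyarrow's wheel was built against, reinstalling both together (so pip resolves them jointly) is a common fix.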

Then:

CODE 2nd run

'''import pandas as pd'''
import modin.pandas as pd
from distributed import Client
client = Client()

print(pd.__version__)

OUTPUT 2nd run

0.11.3

CODE load in csv

df1 = pd.read_csv("my_data.csv")
df1

ERROR - load in csv

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-771e5138b146> in <module>
      3 DATA_URL1 = "normalized_correct_appid.csv"
      4 
----> 5 df1 = pd.read_csv(DATA_URL1)
      6 #df1
      7 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/pandas/io.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    133     _, _, _, f_locals = inspect.getargvalues(inspect.currentframe())
    134     kwargs = {k: v for k, v in f_locals.items() if k in _pd_read_csv_signature}
--> 135     return _read(**kwargs)
    136 
    137 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/pandas/io.py in _read(**kwargs)
     58     Engine.subscribe(_update_engine)
     59     squeeze = kwargs.pop("squeeze", False)
---> 60     pd_obj = FactoryDispatcher.read_csv(**kwargs)
     61     # This happens when `read_csv` returns a TextFileReader object for iterating through
     62     if isinstance(pd_obj, pandas.io.parsers.TextFileReader):

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/data_management/factories/dispatcher.py in read_csv(cls, **kwargs)
    176     @_inherit_docstrings(factories.BaseFactory._read_csv)
    177     def read_csv(cls, **kwargs):
--> 178         return cls.__factory._read_csv(**kwargs)
    179 
    180     @classmethod

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/data_management/factories/factories.py in _read_csv(cls, **kwargs)
    204     )
    205     def _read_csv(cls, **kwargs):
--> 206         return cls.io_cls.read_csv(**kwargs)
    207 
    208     @classmethod

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/engines/base/io/file_dispatcher.py in read(cls, *args, **kwargs)
     66         postprocessing work on the resulting query_compiler object.
     67         """
---> 68         query_compiler = cls._read(*args, **kwargs)
     69         # TODO (devin-petersohn): Make this section more general for non-pandas kernel
     70         # implementations.

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/engines/base/io/text/csv_dispatcher.py in _read(cls, filepath_or_buffer, **kwargs)
    167             skipfooter=kwargs.get("skipfooter", None),
    168             parse_dates=kwargs.get("parse_dates", False),
--> 169             nrows=kwargs.get("nrows", None) if should_handle_skiprows else None,
    170         )
    171         return new_query_compiler

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/engines/base/io/text/csv_dispatcher.py in _get_new_qc(cls, partition_ids, index_ids, dtypes_ids, index_col, index_name, column_widths, column_names, skiprows_md, header_size, **kwargs)
    298             New query compiler, created from `new_frame`.
    299         """
--> 300         new_index, row_lengths = cls._define_index(index_ids, index_name)
    301         # Compute dtypes by getting collecting and combining all of the partitions. The
    302         # reported dtypes from differing rows can be different based on the inference in

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/engines/base/io/text/csv_dispatcher.py in _define_index(cls, index_ids, index_name)
    241             Partitions rows lengths.
    242         """
--> 243         index_objs = cls.materialize(index_ids)
    244         if len(index_objs) == 0 or isinstance(index_objs[0], int):
    245             row_lengths = index_objs

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/engines/dask/task_wrapper.py in materialize(cls, future)
     62         """
     63         client = default_client()
---> 64         return client.gather(future)

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1986                 direct=direct,
   1987                 local_worker=local_worker,
-> 1988                 asynchronous=asynchronous,
   1989             )
   1990 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    852         else:
    853             return sync(
--> 854                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    855             )
    856 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    352     if error[0]:
    353         typ, exc, tb = error[0]
--> 354         raise exc.with_traceback(tb)
    355     else:
    356         return result[0]

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/utils.py in f()
    335             if callback_timeout is not None:
    336                 future = asyncio.wait_for(future, callback_timeout)
--> 337             result[0] = yield future
    338         except Exception as exc:
    339             error[0] = sys.exc_info()

~/.local/lib/python3.7/site-packages/tornado/gen.py in run(self)
    760 
    761                     try:
--> 762                         value = future.result()
    763                     except Exception:
    764                         exc_info = sys.exc_info()

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1845                             exc = CancelledError(key)
   1846                         else:
-> 1847                             raise exception.with_traceback(traceback)
   1848                         raise exc
   1849                     if errors == "skip":

~/anaconda3/envs/tfall/lib/python3.7/site-packages/distributed/protocol/pickle.py in loads()
     73             return pickle.loads(x, buffers=buffers)
     74         else:
---> 75             return pickle.loads(x)
     76     except Exception as e:
     77         logger.info("Failed to deserialize %s", x[:10000], exc_info=True)

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/__init__.py in <module>
     17 __all__ = ["BaseQueryCompiler", "PandasQueryCompiler"]
     18 try:
---> 19     from .pyarrow import PyarrowQueryCompiler  # noqa: F401
     20 except ImportError:
     21     pass

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/pyarrow/__init__.py in <module>
     14 """The module represents the query compiler level for the PyArrow backend."""
     15 
---> 16 from .query_compiler import PyarrowQueryCompiler
     17 
     18 __all__ = ["PyarrowQueryCompiler"]

~/anaconda3/envs/tfall/lib/python3.7/site-packages/modin/backends/pyarrow/query_compiler.py in <module>
     26 from pandas.core.computation.ops import UnaryOp, BinOp, Term, MathCall, Constant
     27 
---> 28 import pyarrow as pa
     29 import pyarrow.gandiva as gandiva
     30 

~/anaconda3/envs/tfall/lib/python3.7/site-packages/pyarrow/__init__.py in <module>
     52 
     53 
---> 54 from pyarrow.lib import cpu_count, set_cpu_count
     55 from pyarrow.lib import (null, bool_,
     56                          int8, int16, int32, int64,

~/anaconda3/envs/tfall/lib/python3.7/site-packages/pyarrow/ipc.pxi in init pyarrow.lib()

AttributeError: type object 'pyarrow.lib.Message' has no attribute '__reduce_cython__'

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

5 reactions
mvashishtha commented, Jul 26, 2022

@stromal I am trying to reproduce this issue on an Amazon EC2 instance with Ubuntu 20.04.3 LTS (Focal Fossa).

For conda-forge I did:

  1. Create a new conda environment: conda create --name py37-install-4719 python=3.7
  2. Activate the new environment: conda activate py37-install-4719
  3. conda install everything you listed: conda install modin modin-all modin-core modin-dask modin-omnisci modin-ray
  4. I can successfully run the script:
import modin.pandas as pd
print(pd.__version__)

For pip I did:

  1. Create a new conda environment: conda create --name py37-install-4719-pip python=3.7
  2. Activate the environment: conda activate py37-install-4719-pip
  3. install modin: pip install modin[all]
  4. I can successfully run the script:
import modin.pandas as pd
print(pd.__version__)

Could you please try following the steps I gave to start from scratch in a new environment?

If that doesn’t work, could you try:

conda update --all

That seemed to work for some people here

cc @modin-project/modin-core in case anyone can tell what’s going wrong here.

0 reactions
stromal commented, Jul 28, 2022

@mvashishtha

This did not work

I tried what you mentioned in your first comment, but it gives me the following error:

CODE

conda create --name py37-install-4719 python=3.7
conda activate py37-install-4719
conda install modin modin-all modin-core modin-dask modin-omnisci modin-ray

OUTPUT

It runs for hours on an AWS EC2 g4dn.4xlarge with no other load (I monitored it with htop):

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: |

This has worked

The earlier pip install modin[all] attempts may have given me an error because of one of the following:

  • I may have used a different Python 3 version (I think 3.9) in the earlier trials that did not work.
  • I also had the actual Dask package installed in other environments.
  • Some other library incompatibility.

Currently fully working solution

conda create --name py37-install-4719-pip python=3.7
conda activate py37-install-4719-pip
pip install modin[all]

Example usage

import modin.pandas as pd
from distributed import Client
client = Client()
df = pd.read_csv('my_single_csv_name.csv')
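A side note not from the original thread: since this setup relies on the Dask engine, Modin's documented MODIN_ENGINE environment variable can pin the engine explicitly instead of relying on auto-detection. It must be set before the first `import modin.pandas`:

```python
import os

# Modin reads MODIN_ENGINE when modin.pandas is first imported,
# so this must run before that import. Supported values include
# "ray" and "dask".
os.environ["MODIN_ENGINE"] = "dask"
```

With this set, a subsequent `import modin.pandas as pd` will use the Dask engine rather than whichever engine Modin detects first.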

