
Move PySpark backend to the backends/ directory

See original GitHub issue

Move the pyspark/ backend to backends/pyspark.

The current Ibis directory structure is confusing because backends are mixed with core modules. For example, it is not obvious whether pandas/ is a backend or core functionality. Many backends live in the root directory, while others are grouped under a subdirectory such as sql/, which makes the codebase harder to navigate and maintain. Also, Ibis exposes backends as attributes (e.g. ibis.spark.connect), so ibis.spark looks in code like it refers to the ibis/spark/ module, but it does not; this is confusing and even causes problems with pytest.

We’re moving backends to a standardized structure under the backends/ directory, one backend at a time (each backend in a separate pull request). We started with the OmniSciDB backend, and its PR can serve as a reference for the changes that need to be made here: #2392

The exact tasks to perform are:

  • Move the backend from its original location to backends/ (if the backend is in a subdirectory, like sql/, ignore the subdirectory and just have the backend inside backends/, so sql/sqlite/ -> backends/sqlite)
  • Move the content of backends/<your-backend>/api.py to backends/<your-backend>/__init__.py. Backends are using api.py for historical reasons, but the public backend API should be in the standard __init__.py file of the module.
  • Update ibis/__init__.py so the backend is imported from the new module (from ibis.<your-backend> import api -> from ibis import <your-backend>)
  • Look for places in the code where your backend is imported, and update those imports. For imports inside the same backend, please use relative imports (e.g. import ibis.<your-backend>.whatever -> from . import whatever). See the reference PR for examples.
  • Update the documentation (in doc/source/backends/<your-backend>.rst) so the API is generated from the right location.
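The layout and import changes described in the steps above can be sketched with a tiny throwaway package. All names here are hypothetical stand-ins, not Ibis's real modules; the point is that a submodule inside a moved backend is reached with a relative import, while outside code imports the backend package itself:

```python
# Sketch (hypothetical names): a backend package under backends/ whose
# __init__.py re-exports an internal module via a relative import.
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "backends" / "pyspark"
pkg.mkdir(parents=True)
(root / "backends" / "__init__.py").write_text("")

# An internal module of the backend.
(pkg / "compiler.py").write_text(
    "def compile_expr(expr):\n    return f'compiled({expr})'\n"
)
# Inside the backend, a relative import replaces the old absolute form
# (e.g. `import ibis.pyspark.compiler`):
(pkg / "__init__.py").write_text("from .compiler import compile_expr\n")

sys.path.insert(0, str(root))
sys.modules.pop("backends", None)  # avoid a stale cached 'backends' package
from backends import pyspark       # outside code imports the package itself

print(pyspark.compile_expr("t.col"))  # -> compiled(t.col)
```

Because the public surface lives in the package's `__init__.py`, callers never need to know which internal module a function comes from.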

After performing the changes, the CI should be green, except for the packaging build (which requires changes in the conda-forge feedstock). You can ignore that error, those changes will be performed later for all moved backends.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
datapythonista commented on Oct 23, 2020

No, this was closed by #2479.

All the moving has been done now; the only thing pending is a couple of files left over from #2466. If you want to move those, that would be great.

Something else that would be very valuable, and that I think shouldn’t be too complex, is #2389. There is a PR open, but the approach ended up not being great, so working on the original approach of making the scripts work on Windows would be better. The CI has also been giving us a lot of trouble lately, so being able to move everything to GitHub Actions would be really great.

1 reaction
datapythonista commented on Oct 1, 2020

The content of api.py needs to be moved to __init__.py, and api.py needs to be removed.
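A minimal sketch of what that move means in practice, using a hypothetical backend name and function: once the former api.py contents live in `__init__.py`, the backend's public API is reachable by importing the package directly, with no api.py in between.

```python
# Sketch (hypothetical names): the public API lives in __init__.py,
# so `from backends import sqlite` exposes it with no api.py module.
import sys
import tempfile
import textwrap
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "backends" / "sqlite"
pkg.mkdir(parents=True)
(root / "backends" / "__init__.py").write_text("")

# The former api.py contents now sit directly in __init__.py; api.py is gone.
(pkg / "__init__.py").write_text(textwrap.dedent("""\
    def connect(path):
        return f"connected to {path}"
"""))

sys.path.insert(0, str(root))
sys.modules.pop("backends", None)  # avoid a stale cached 'backends' package
from backends import sqlite

print(sqlite.connect("db.sqlite"))  # -> connected to db.sqlite
```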
