SkyCoord is very slow with pandas dataframe


Description

Constructing a SkyCoord object from pandas DataFrame columns is hundreds of times slower than constructing it from plain NumPy arrays.

Expected behavior

Same performance

Steps to Reproduce


import time

import numpy as np
import pandas as pd
import astropy.units as u
from astropy.coordinates import SkyCoord

N = 100000
ra = np.random.uniform(1, 200, N)
de = np.random.uniform(-70, 70, N)

df = pd.DataFrame({'ra': ra, 'dec': de})

# Very slow: multiply the pandas Series by the unit
s = time.perf_counter()
gc = SkyCoord(ra=df['ra'] * u.degree, dec=df['dec'] * u.degree)
print(time.perf_counter() - s)

# Not slow: multiply the underlying NumPy arrays by the unit
s = time.perf_counter()
gc = SkyCoord(ra=df['ra'].values * u.degree, dec=df['dec'].values * u.degree)
print(time.perf_counter() - s)

System Details

Linux-5.5.0-050500-generic-x86_64-with-glibc2.17
Python 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]
Numpy 1.21.2
pyerfa 2.0.0
astropy 5.0
Scipy 1.7.3
Matplotlib 3.5.0

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 14 (13 by maintainers)

Top GitHub Comments

taldcroft commented, May 26, 2022 (2 reactions)

I removed the Close? and wont-fix labels. This is something that is worth addressing albeit at moderate priority.

eerovaher commented, May 25, 2022 (1 reaction)

The underlying issue is described in more detail in #11247, and it has nothing to do with SkyCoord.

A second solution would be to use << instead of *. pandas does not think it can handle that operation, so astropy gets the chance to create a Quantity.


Top Results From Across the Web

Efficiently export Astropy SkyCoord array to array/DataFrame/file
My question is about writing a SkyCoord array to a text file, possibly using a Numpy array or Pandas DataFrame as an intermediary....

Astronomical Coordinate Systems (astropy.coordinates)
In addition, looping over a SkyCoord object can be slow. If you need to transform the coordinates to a different frame, it is...

astropy, numpy: applying function over coordinates is very slow
DataFrame () df['ra'], df['dec'] = ra, de # Very slow s = t.time() gc = SkyCoord(ra = df['ra'] * u.degree, dec = df['dec']...

Is Pandas really that slow? - Medium
Therefore treating Pandas dataframes as a usual python data structure disregards the advantages of its optimized C code.

Basic Queries - Foundations of Astronomical Data Science
Astropy provides a SkyCoord object that represents sky coordinates relative to a ... Make a Pandas DataFrame and use a Boolean Series to...
