Asymmetric behavior between index and columns when getting incomplete label


Code Sample, a copy-pastable example if possible

In [2]: df = pd.DataFrame([[1,2], [3,4]], index=pd.MultiIndex.from_tuples([['a', 'b'], ['c', '']]))

In [3]: df.loc['c'].shape
Out[3]: (1, 2)

In [4]: df.transpose().loc[:, 'c'].shape
Out[4]: (2,)

Problem description

Maybe the “fill an incomplete key with empty string(s)” rule is not implemented at all for rows? (See also #17024.) If this is the case, then I think it should be.

Expected Output

The same as Out[4] but reversed.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: 9e7666dae3b3b10d987ce154a51c78bcee6e0728
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.21.0.dev+265.g9e7666dae
pytest: 3.0.6
pip: 9.0.1
setuptools: None
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0.dev
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.2.1

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

2 reactions
chris-b1 commented, Jul 19, 2017

Sure, our basic behavior is that “slice-like” indexing operations on a MultiIndex (e.g. selecting an entire level) return a DataFrame. A couple of examples:

In [4]: idx = pd.MultiIndex.from_tuples([('a', ''), ('b', '1'), ('c', '1'), ('c', '2')])

In [5]: df = pd.DataFrame(np.arange(16).reshape(4,4), index=idx, columns=idx)

In [6]: df
Out[6]: 
      a   b   c    
          1   1   2
a     0   1   2   3
b 1   4   5   6   7
c 1   8   9  10  11
  2  12  13  14  15

In [7]: type(df.loc['b', :])
Out[7]: pandas.core.frame.DataFrame

In [8]: type(df.loc['c', :])
Out[8]: pandas.core.frame.DataFrame

In [9]: type(df.loc[:, 'b'])
Out[9]: pandas.core.frame.DataFrame

In [10]: type(df.loc[:, 'c'])
Out[10]: pandas.core.frame.DataFrame

But, as an undocumented “convenience” feature (linked issue), if the selection is on the columns and all deeper levels are labeled with empty strings, the selection collapses into a Series. This collapsing doesn’t happen with a row selection (this issue):

In [12]: df.loc[:, 'b']
Out[12]: 
      1
a     1
b 1   5
c 1   9
  2  13

In [13]: df.loc[:, 'a']
Out[13]: 
a        0
b  1     4
c  1     8
   2    12
Name: a, dtype: int32

In [16]: type(df.loc[:, 'a'])
Out[16]: pandas.core.series.Series

In [17]: df.loc['a', :]
Out[17]: 
  a  b  c   
     1  1  2
  0  1  2  3

In [18]: type(df.loc['a', :])
Out[18]: pandas.core.frame.DataFrame
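
For readers who need the same result type on both axes, one possible approach (a sketch, not something proposed in the thread) is to select with a boolean mask over the first level instead of a partial label. The helper name `select_top_level` is hypothetical; it only uses `get_level_values` and `.loc`, and it keeps all levels of the MultiIndex in the result rather than dropping the matched one.

```python
import numpy as np
import pandas as pd

# Same frame as in the comment above: MultiIndex on both rows and columns.
idx = pd.MultiIndex.from_tuples([("a", ""), ("b", "1"), ("c", "1"), ("c", "2")])
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=idx, columns=idx)


def select_top_level(frame, label, axis):
    """Hypothetical helper: select by first-level label using a boolean mask.

    A boolean mask never collapses the result, so rows and columns
    are treated the same way.
    """
    labels = frame.axes[axis].get_level_values(0)
    mask = labels == label
    return frame.loc[mask, :] if axis == 0 else frame.loc[:, mask]


rows = select_top_level(df, "a", axis=0)
cols = select_top_level(df, "a", axis=1)

print(type(rows).__name__, rows.shape)
print(type(cols).__name__, cols.shape)
```

With this selection both results are DataFrames, and the column selection has the transposed shape of the row selection.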

0 reactions
toobaz commented, Jul 20, 2017

The expected shape is just the dimensions reversed (it’s a transposition).

My example was maybe a bit cryptic, sorry. The thing is that a shape (1,2) when transposed gives (2,1), not (2,).
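
The shape argument can be checked on a plain frame, independent of any MultiIndex behavior. This is only an illustration with made-up data: transposing a one-row DataFrame keeps two dimensions, while pulling the row out as a Series drops one.

```python
import pandas as pd

frame = pd.DataFrame([[3, 4]])   # one row, two columns
transposed = frame.T             # transposition keeps both dimensions
row_as_series = frame.iloc[0]    # selecting the row drops a dimension

print(frame.shape, transposed.shape, row_as_series.shape)
```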

