best practice for combining partial string indexing + regular mask?

Consider the following simple example:

import pandas as pd

dataframe = pd.DataFrame({'time' : [pd.to_datetime('2016-06-06'),
                                    pd.to_datetime('2016-06-06'),
                                    pd.to_datetime('2016-06-07'),
                                    pd.to_datetime('2016-06-08')],
                            'value' : [1,2,3,4],
                            'group' : ['A','B','A','B']})

dataframe.set_index('time', inplace = True)

dataframe
Out[13]: 
           group  value
time                   
2016-06-06     A      1
2016-06-06     B      2
2016-06-07     A      3
2016-06-08     B      4

Now I want to use all the cool partial string indexing features, but also be able to filter on other variables. The solution I came up with is:

dataframe.loc[dataframe['group'] == 'A'].loc['2016-06-06']

which looks really horrible with the two loc calls chained together. Is that the correct, idiomatic pandas way to do it (while keeping the index)? I have read and reread the documentation, but I cannot find the answer.

Thanks!

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Comments: 15 (9 by maintainers)

Top GitHub Comments

gfyoung commented, Jul 19, 2017 (2 reactions)

Right, but you get a Series returned from that boolean test. The indices in that Series won’t match those of the dataframe.

jreback commented, Jul 19, 2017 (1 reaction)

We don’t directly have an exposed partial string indexer. Instead .get_loc will do this, but it can return various things (e.g. a single index, array of booleans, or slice), so it is not directly user facing.

In [15]: dfi = dataframe.index

In [16]: dataframe.loc[(dataframe['group'] == 'A') & dfi.isin(dfi[dfi.get_loc('2016-06-06')])]
Out[16]: 
           group  value
time                   
2016-06-06     A      1
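
Since .get_loc can return an integer, a slice, or an array, one possible way to combine it with a regular mask is to normalize whatever it returns into a boolean array first. The following is only a sketch under that assumption: the helper name partial_string_mask is hypothetical and not part of pandas, and the frame is rebuilt from the example in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
dataframe = pd.DataFrame(
    {
        "time": pd.to_datetime(["2016-06-06", "2016-06-06", "2016-06-07", "2016-06-08"]),
        "value": [1, 2, 3, 4],
        "group": ["A", "B", "A", "B"],
    }
).set_index("time")


def partial_string_mask(index, key):
    """Hypothetical helper (not a pandas API): boolean mask for a date string."""
    loc = index.get_loc(key)  # may be an int, a slice, or an array
    mask = np.zeros(len(index), dtype=bool)
    mask[loc] = True  # NumPy assignment accepts each of those forms
    return mask


# Combine the date mask with an ordinary column mask in a single .loc call.
group_mask = (dataframe["group"] == "A").to_numpy()
date_mask = partial_string_mask(dataframe.index, "2016-06-06")
result = dataframe.loc[group_mask & date_mask]
print(result)
```

Both masks are plain NumPy arrays here, so the combination is purely positional and does not depend on index alignment, which matters when the index contains duplicate timestamps. Note that .get_loc raises a KeyError when nothing matches the string, so a caller that expects empty results would need to handle that case.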