Clean way to make pandas DataFrame from search results?

I would like to create a DataFrame from Elasticsearch Query DSL results. Try as I might, I cannot figure out how to create a DataFrame just from the results. E.g. I have tried:

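import pandas as pd
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

# Create a basic ES client
client = Elasticsearch(['url'])

# Search
search = Search(using=client)
results = search.execute()

# Raw hits; each one nests the document under "_source"
search_dict = results.hits.hits
results_df = pd.DataFrame(search_dict)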

However, my DataFrame looks like this: [screenshot: screenshot_20170915_153533]

How can I get the values of the _source column to populate a pandas DataFrame?

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

3 reactions
honzakral commented, Oct 27, 2017
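import pandas as pd
from elasticsearch_dsl import Search

s = Search()
# add any filters/queries....
s = s.query(...)

# scan() iterates over every matching document, not just the first page
results_df = pd.DataFrame((d.to_dict() for d in s.scan()))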

This should be exactly what you need. Alternatively, omit the .scan() call if you only want the top 10 results (useful for testing).

Hope this helps
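
As a rough sketch of the approach above, the helper below builds a DataFrame from any iterable of hit-like objects and keeps each document's ID as the index. The names hits_to_frame and FakeHit are hypothetical and exist only so the example runs without a cluster; with a real Search you would pass s.scan() instead of the made-up list. It assumes each hit exposes to_dict() and meta.id, as elasticsearch_dsl hits do.

```python
from types import SimpleNamespace

import pandas as pd


def hits_to_frame(hits):
    """Build a DataFrame from hit-like objects (hypothetical helper)."""
    rows = []
    ids = []
    for hit in hits:
        rows.append(hit.to_dict())  # the document's source fields
        ids.append(hit.meta.id)     # the document's ID
    return pd.DataFrame(rows, index=pd.Index(ids, name="_id"))


class FakeHit:
    """Stand-in for an elasticsearch_dsl hit, for illustration only."""

    def __init__(self, doc_id, source):
        self.meta = SimpleNamespace(id=doc_id)
        self._source = source

    def to_dict(self):
        return dict(self._source)


# Made-up documents; with a live cluster, pass s.scan() instead.
fake_hits = [
    FakeHit("a1", {"title": "first", "views": 3}),
    FakeHit("b2", {"title": "second", "views": 7}),
]
results_df = hits_to_frame(fake_hits)
```

Collecting the IDs separately keeps the document fields untouched, so a source field that happens to be called "id" would not be overwritten.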

0 reactions
honzakral commented, Aug 29, 2018

@abhimanyu3 I am sorry, but I cannot help you with details related to pandas or Python in general. This issue tracker is for the elasticsearch_dsl library. You did get all of the data out of Elasticsearch as dicts; to transform them into a DataFrame, just pass them into pd.DataFrame as seen in the previous examples on this very same ticket. Otherwise, please seek help in a forum dedicated to pandas on how to transform an iterator of dictionaries into a DataFrame.

Thank you!
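
For the raw hit dictionaries from the original question, one option is to pull out each hit's _source before calling pd.DataFrame; pd.json_normalize additionally flattens nested fields into dotted column names. This is only a sketch: the sample hits below are invented to mimic the shape of an Elasticsearch response, and the field names are placeholders.

```python
import pandas as pd

# Invented hits shaped like the raw "hits" list of a search response.
raw_hits = [
    {"_index": "docs", "_id": "1", "_score": 1.0,
     "_source": {"user": {"name": "ann"}, "views": 3}},
    {"_index": "docs", "_id": "2", "_score": 0.5,
     "_source": {"user": {"name": "bob"}, "views": 7}},
]

# One row per hit, one column per top-level _source field.
sources = (hit["_source"] for hit in raw_hits)
flat_df = pd.DataFrame(sources)

# json_normalize expands nested dicts into columns such as "user.name".
nested_df = pd.json_normalize([hit["_source"] for hit in raw_hits])
```

In flat_df the nested "user" value stays a dict inside a single column, which is why json_normalize may be the more convenient choice for documents with nested objects.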
