how to make custom image scraper

I’d like to make a custom image scraper for vtki with a goal of eventually contributing it here. I have a feeling this should be pretty similar to what’s going on in the Mayavi scraper but I’m unsure on how to get started making a custom scraper.

Could someone help me understand what the scraper is doing and outline some pseudocode on how to make my own?

Application

I’d love to use sphinx-gallery in the vtki documentation but I’m struggling to figure out how to have sphinx-gallery recognize the figures generated by vtki.

Current implementation

Currently, we might have a vtki example like the following on an rst doc page.

import vtki
from vtki import examples

# Load St Helens DEM and warp the topography
mesh = examples.download_st_helens().warp_by_scalar()

mesh.plot(opacity='linear', screenshot='opacity-linear.png')

I then use make doctest to run all the code snippets, which saves the screenshots to PNG files, and then include the image in the rst with a directive:

.. image:: ../../../images/opacity-linear.png

Goal

I’m thinking I’ll have to implement something that keeps track of all active plotting windows in vtki and then, within the scraper, access those plotting windows and save the figures, much like what’s done with the Mayavi scraper. This would avoid the need to save the PNG figures generated with make doctest.
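The idea above can be sketched roughly as follows. This is only a minimal illustration, not the vtki or Mayavi implementation: `OPEN_PLOTTERS` and `FakePlotter` are hypothetical stand-ins for whatever registry and plotter objects the library ends up exposing. The sketch assumes sphinx-gallery's custom scraper convention, a callable taking `(block, block_vars, gallery_conf)` that pulls output paths from `block_vars['image_path_iterator']` and returns rST. The rST is built by hand here to keep the example self-contained; sphinx-gallery also ships a `figure_rst` helper in `sphinx_gallery.scrapers` for this step.

```python
import os

# Hypothetical registry: the plotting library would append every open
# plotting window here. vtki does not necessarily expose such a list.
OPEN_PLOTTERS = []


class FakePlotter:
    """Stand-in for a real plotting window, for illustration only."""

    def screenshot(self, filename):
        # A real plotter would render its window to an image file.
        with open(filename, "wb") as f:
            f.write(b"placeholder image bytes")

    def close(self):
        pass


class Scraper:
    """Callable with the (block, block_vars, gallery_conf) signature."""

    def __call__(self, block, block_vars, gallery_conf):
        image_path_iterator = block_vars["image_path_iterator"]
        rst = ""
        # Save one image per open plotter, then close it.
        for plotter in list(OPEN_PLOTTERS):
            path = next(image_path_iterator)
            plotter.screenshot(path)
            plotter.close()
            # Image paths in the rST are relative to the Sphinx source dir.
            rel = os.path.relpath(path, gallery_conf["src_dir"])
            rel = rel.replace(os.sep, "/")
            rst += "\n.. image:: /{}\n    :class: sphx-glr-single-img\n".format(rel)
        # Empty the registry so the next code block starts clean.
        OPEN_PLOTTERS.clear()
        return rst
```

A scraper like this would then be listed under the `image_scrapers` key of `sphinx_gallery_conf` in `conf.py`, alongside the default `'matplotlib'` entry.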

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 23 (19 by maintainers)

Top GitHub Comments

1 reaction
stefanv commented, Apr 4, 2019

I got GIFs working this week; it wasn’t too bad, apart from the annoying png-extension-gif issue mentioned above. But with movies, since you need a video tag, it’s a bit trickier. GIFs really aren’t feasible unless you don’t mind massive documentation: they easily come in at many tens of megabytes each, even for short movies.

For now, my solution is to generate the movie, write out the video tag by hand, and to make sure the output file gets copied to the right place. There’s no gif preview needed, because the browser video control already shows one frame.
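As a rough sketch of the "write out the video tag by hand" step, a helper could emit an rST `raw` directive wrapping an HTML5 `<video>` element. The function name `video_rst`, the default width, and the MP4 MIME type are assumptions made for illustration; copying the movie file to the location the `src` attribute points at is still up to the build.

```python
def video_rst(video_relpath, width=640):
    """Return an rST ``raw`` directive that embeds an HTML5 video tag."""
    html = (
        '<video controls width="{w}">\n'
        '  <source src="{src}" type="video/mp4">\n'
        "</video>"
    ).format(w=width, src=video_relpath)
    # Content of a directive must be indented under it.
    indented = "\n".join("    " + line for line in html.splitlines())
    return "\n.. raw:: html\n\n" + indented + "\n"
```

The returned string can be appended to whatever rST a custom scraper produces for the code block that generated the movie.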

1 reaction
larsoner commented, Apr 4, 2019

It would be great to have an example that embeds a movie, because GIF is a terrible format from a bandwidth standpoint.

FWIW, I think you can already embed arbitrary RST, so if you extract the movie, add the RST/HTML to embed it, and write out the expected PNG (which will only really be used to create a thumbnail), you might already be able to do it.

this gif has a .png extension and relies on browser smarts to auto-detect that it is in fact a gif:

This PNG reliance is relaxed in #471 (which only adds SVG), and makes it clear where we need to change things to improve support.

Perhaps maintainers would prefer a new issue to be raised?

There is an issue about animations:

https://github.com/sphinx-gallery/sphinx-gallery/issues/150

I suggest we first get GIF working, because it’s simpler, then see what needs to be done to generalize to movie embed + GIF thumbnail/preview of the movie in the gallery page after that. I guess a movie-embed-plus-gif-preview is a bit beyond this, so feel free to open an issue if you want. (Or see what @choldgraf comes up with in the GIF PR first.)
