Proper crawler setup

If I want to crawl only *.example.com (both http and https) and want to exclude images, js files, css files - what should my crawler setup look like?

I tried many combinations, but I either get external sites crawled or only http and not https. It also looks like excluded URLs are overridden by included ones: I managed to keep the crawler on the domain, more or less, but I can't make it ignore the unwanted files.

My setup looks like this:

Thank you!

Issue Analytics

Created 5 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

marevol commented, Feb 1, 2019

Exclude the unwanted filetypes - these files still get indexed

1 line is 1 regex.

marevol commented, Feb 2, 2019

https?://.*\.example\.com/.*

It’s a Java regex.
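
To try the pattern outside the crawler, here is a minimal sketch in plain Java. The include pattern is the one from the comment above. Everything else is an assumption made for illustration: the `CrawlFilter` class, the `shouldCrawl` helper, the exclude patterns for images, js and css files, the use of full-string matching, and the rule that an exclude match wins over an include match. The crawler itself may evaluate its lists differently.

```java
import java.util.List;
import java.util.regex.Pattern;

public class CrawlFilter {
    // Include pattern taken from the comment above: http or https,
    // any host ending in ".example.com".
    static final List<Pattern> INCLUDE = List.of(
        Pattern.compile("https?://.*\\.example\\.com/.*"));

    // Hypothetical exclude patterns (not from the issue), one regex per entry,
    // mirroring the "1 line is 1 regex" rule. (?i) makes them case-insensitive.
    static final List<Pattern> EXCLUDE = List.of(
        Pattern.compile("(?i).*\\.(png|jpe?g|gif|svg|ico)(\\?.*)?"),
        Pattern.compile("(?i).*\\.js(\\?.*)?"),
        Pattern.compile("(?i).*\\.css(\\?.*)?"));

    // Assumption: a URL is crawled only if some include pattern matches the
    // whole URL and no exclude pattern does.
    static boolean shouldCrawl(String url) {
        boolean included = INCLUDE.stream().anyMatch(p -> p.matcher(url).matches());
        boolean excluded = EXCLUDE.stream().anyMatch(p -> p.matcher(url).matches());
        return included && !excluded;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
            "https://www.example.com/index.html",
            "http://blog.example.com/post",
            "https://www.example.com/logo.png",
            "https://www.example.com/app.js?v=2",
            "https://www.other.com/page");
        for (String url : urls) {
            System.out.println(url + " -> " + shouldCrawl(url));
        }
    }
}
```

Two things to keep in mind with this pattern: it requires a dot before `example.com`, so the bare host `https://example.com/` does not match, and because `.*` can span slashes, a URL on another host whose path contains `.example.com/` would still match.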
