Error attempting to claim book from newsletter

~ $ python script/spider.py --config config/prod.cfg --notify ifttt --claimOnly

                      __   __              __                                   __
    ____  ____ ______/ /__/ /_____  __  __/ /_        ______________ __      __/ /__  _____
   / __ \/ __ `/ ___/ //_/ __/ __ \/ / / / __ \______/ ___/ ___/ __ `/ | /| / / / _ \/ ___/
  / /_/ / /_/ / /__/ ,< / /_/ /_/ / /_/ / /_/ /_____/ /__/ /  / /_/ /| |/ |/ / /  __/ /
 / .___/\__,_/\___/_/|_|\__/ .___/\__,_/_.___/      \___/_/   \__,_/ |__/|__/_/\___/_/
/_/                       /_/

Download FREE eBook every day from www.packtpub.com
@see github.com/niqdev/packtpub-crawler

[*] 2017-01-31 10:30 - fetching today's eBooks
[*] configuration file: /app/config/prod.cfg
[*] getting daily free eBook
[*] fetching url... 200 | https://www.packtpub.com/packt/offers/free-learning
[*] fetching url... 200 | https://www.packtpub.com/packt/offers/free-learning
[*] fetching url... 200 | https://www.packtpub.com/account/my-ebooks
[+] book successfully claimed
[+] notification sent to IFTTT
[*] getting free eBook from newsletter
[*] fetching url... 200 | https://www.packtpub.com/packt/free-ebook/practical-data-analysis
[-] <type 'exceptions.IndexError'> list index out of range | spider.py@123
Traceback (most recent call last):
  File "script/spider.py", line 123, in main
    packtpub.runNewsletter(currentNewsletterUrl)
  File "/app/script/packtpub.py", line 160, in runNewsletter
    self.__parseNewsletterBookInfo(soup)
  File "/app/script/packtpub.py", line 98, in __parseNewsletterBookInfo
    title = urlWithTitle.split('/')[4].replace('-', ' ').title()
IndexError: list index out of range
[+] error notification sent to IFTTT
[*] done
~ $

It already claimed the book from the newsletter successfully, but on every subsequent day I get the error above.

And it sends an IFTTT notification for the second one 😦
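
For anyone hitting the same traceback: the failing line indexes a fixed position (`[4]`) in the split URL, so any URL with fewer than five `/`-separated parts raises `IndexError`. Below is a minimal sketch of a more tolerant parse, written for Python 3 (the crawler itself appears to run on Python 2). `title_from_url` is a hypothetical helper, not a function from packtpub-crawler, and taking the last path segment is an assumption about where the title slug sits.

```python
from typing import Optional
from urllib.parse import urlsplit


def title_from_url(url: str) -> Optional[str]:
    """Guess a book title from the last path segment of a URL.

    Hypothetical helper: returns None instead of raising when the URL
    has no usable path segment.
    """
    path = urlsplit(url).path
    # Drop empty segments produced by leading, trailing or doubled slashes.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        return None
    # Same slug-to-title conversion as the line in the traceback.
    return segments[-1].replace("-", " ").title()


if __name__ == "__main__":
    print(title_from_url(
        "https://dz13w8afd47il.cloudfront.net/networking-and-servers/mastering-aws-development"
    ))
    print(title_from_url("https://www.packtpub.com"))
```

Returning `None` lets the caller decide whether to skip the newsletter step quietly or report it, rather than crashing the whole run.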

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 39 (30 by maintainers)

Top GitHub Comments

CrazySerGo commented, Apr 9, 2017

Hi guys, I’m writing a Google script that parses PacktPub tweets (it is based on @juzim’s Google script). I’m not sure, but there is a chance that all books from the newsletters will also be published on their Twitter, so there would be no need to fix this 😃 (joking). It’s not finished: it should still exclude duplicates and check whether a link is still available. If you have time, please look at the output and tell me whether it’s fine for the crawler: https://goo.gl/AXtAC8

mkarpiarz commented, Apr 9, 2017

Looks like some of the divs have been renamed on the newsletter’s landing page. I compared the page for an older book:

    <div class="book-top-block-wrapper cf">
        <div class="cf section-inner">
            <div class="float-left promo-landing-book-picture">
                <div itemprop="image" itemtype="http://schema.org/URL" itemscope>
                    <a href="/web/20170113204509/https://dz13w8afd47il.cloudfront.net/networking-and-servers/mastering-aws-development">
                        <img src="/web/20170113204509im_/https://d1ldz4te4covpm.cloudfront.net/sites/default/files/3632EN_Mastering%20AWS%20Development.jpg" class="bookimage" />
                    </a>
                </div>
            <div class="float-left promo-landing-book-info">
                <div class="promo-landing-book-body-title">
                                    </div>
                <div class="promo-landing-book-body">
                    <div><h1>Claim your free 416 page Amazon Web Services eBook!</h1>
<p>This book is a practical guide to developing, administering, and managing applications and infrastructures with AWS. With this, you'll be able to create, design, and manage an entire application life cycle on AWS by using the AWS SDKs, APIs, and the AWS Management Console.</p>
</div>
                </div>
                            </div>

with the current one:

<div id="main-book" class="cf nano" itemscope itemtype="http://schema.org/Book">
    <div class="book-top-block-wrapper cf">
        <div class="cf section-inner">
            <div class="float-left nano-book-main-image">
                <div itemprop="image" itemtype="http://schema.org/URL" itemscope>
                    <a class="fancybox" href="///d1ldz4te4covpm.cloudfront.net/sites/default/files/imagecache/nano_main_image/5612_WYNTKAngular_eBook_500x617.jpg">
                        <img src="//d1ldz4te4covpm.cloudfront.net/sites/default/files/imagecache/nano_main_image/5612_WYNTKAngular_eBook_500x617.jpg" class="bookimage" />
                    </a>
                </div>
            <div class="float-left nano-book-text">
                <h1>What you need to know about Angular 2</h1>
                <div><strong>Get to grips with the ins and outs of one of the biggest web dev revolutions of this decade with the aid of this free eGuide! From setting up the very basics of Angular to making the most of Directives and Components you’ll discover everything you need to get started building your own web apps today.</strong></div>
                <div id="nano-learn">
                    <div id="nano-learn-title">
                        <div id="nano-learn-title-text">
                            <span id="nano-learn-title-text-inner">
                                What You Will Learn                            </span>
                        </div>
                    </div>

and came up with this hotfix: https://github.com/niqdev/packtpub-crawler/compare/master...mkarpiarz:fix_newsletter_divs I haven’t tested email notifications yet, so I’m not sure what the description will look like, but claiming a newsletter ebook seems to work now. Happy to submit a PR if @juzim hasn’t started working on this yet.
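
To illustrate the idea of tolerating both layouts (this is not the code from the hotfix branch), here is a small Python 3 sketch that accepts either container class seen in the two snippets above and returns the first `<h1>` that follows it. It uses only the standard library's `html.parser` to stay self-contained, whereas the crawler appears to use a `soup` object; `HeadingFinder` and `find_heading` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List, Optional

# Container classes taken from the two snippets above (old and new layout).
KNOWN_CONTAINERS = ("promo-landing-book-body", "nano-book-text")


class HeadingFinder(HTMLParser):
    """Collect the text of the first <h1> that follows a known container div."""

    def __init__(self) -> None:
        super().__init__()
        self.seen_container = False
        self.in_h1 = False
        self.heading: Optional[str] = None
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Compare whole class tokens, so "promo-landing-book-body-title"
        # does not match "promo-landing-book-body".
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and any(c in KNOWN_CONTAINERS for c in classes):
            self.seen_container = True
        elif tag == "h1" and self.seen_container and self.heading is None:
            self.in_h1 = True

    def handle_data(self, data):
        if self.in_h1:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.heading = "".join(self._chunks).strip()


def find_heading(html: str) -> Optional[str]:
    """Return the heading text, or None if no known container was found."""
    finder = HeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.heading
```

Listing the accepted class names in one tuple means the next rename on the landing page needs a one-line change, and a `None` result signals an unknown layout instead of an exception deep in the parser.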
