Link parsing: Pinboard private feeds don't seem to get parsed properly


I would love to have the cron job that monitors my Pocket feed also monitor my private Pinboard feed. However, whichever method from the instructions I use to pass the feed to bookmark-archiver, each one fails in its own way.

If I pass a public feed, like http://feeds.pinboard.in/rss/u:username/, it works fine. But if I pass a private feed, like https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/, it errors out. I have tried the RSS, JSON, and Text feeds, and none work.

Examples here (I’ve simply replaced the actual feed I used to test with the demo URL Pinboard provides):

./archive "https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/"

[*] [2018-10-18 21:14:03] Downloading https://feeds.pinboard.in/rss/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897243.txt
[X] No links found :(

./archive "https://feeds.pinboard.in/json/secret:xxxx/u:username/private/"

[*] [2018-10-18 21:13:46] Downloading https://feeds.pinboard.in/json/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897226.txt
Traceback (most recent call last):
  File "./archive", line 161, in <module>
    links = merge_links(archive_path=out_dir, import_path=source)
  File "./archive", line 53, in merge_links
    raw_links = parse_links(import_path)
  File "/home/USERNAME/datahoarding/bookmark-archiver/archiver/parse.py", line 54, in parse_links
    links += list(parser_func(file))
  File "/home/USERNAME/bookmark-archiver/archiver/parse.py", line 108, in parse_json_export
    url = erg['url']
KeyError: 'url'

./archive "https://feeds.pinboard.in/text/secret:xxxx/u:username/private/"

[*] [2018-10-18 21:17:57] Downloading https://feeds.pinboard.in/text/secret:xxxx/u:username/private/ > output/sources/feeds.pinboard.in-1539897477.txt
[X] No links found :(

Even though the script says that links are not found, they are definitely there, and simply pasting the URL into a browser outputs the feed in the proper format. I used this script successfully with other methods, like the Pinboard manual export, Pocket manual export AND RSS feed, and browser export. Is this just not a supported method for importing/monitoring?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 19 (16 by maintainers)

Top GitHub Comments

2 reactions
f0086 commented, Nov 19, 2018

From the settings->backup page:

Legacy HTML (seems to be broken HTML/XML?)

<!DOCTYPE NETSCAPE-Bookmark-file-1>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Pinboard Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL>
<p>

<DT><A HREF="https://github.com/trailofbits/algo" ADD_DATE="1542616733" PRIVATE="1" TOREAD="1" TAGS="vpn,scripts,toread">Algo VPN scripts</A>
<DT><A HREF="http://www.ulisp.com/" ADD_DATE="1542374412" PRIVATE="1" TOREAD="1" TAGS="arduino,avr,embedded,lisp,toread">uLisp</A>

</DL>
</p>

XML

<?xml version="1.0" encoding="UTF-8"?>
	<posts user="aaronmueller">
<post href="https://github.com/trailofbits/algo" time="2018-11-19T08:38:53Z" description="Algo VPN scripts" extended="" tag="vpn scripts" hash="18d708f67bb26d843b1cac4530bb52aa"  shared="no" toread="yes" />
<post href="http://www.ulisp.com/" time="2018-11-16T13:20:12Z" description="uLisp" extended="" tag="arduino avr embedded lisp" hash="2a17ae95925a03a5b9bb38cf7f6c6f9b"  shared="no" toread="yes" />
</posts>
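
For reference, the XML export above keeps each bookmark in the attributes of a `<post>` element. Below is a minimal sketch of reading it with Python's standard library. The function name `parse_pinboard_xml` and the output keys are hypothetical and not part of bookmark-archiver.

```python
import xml.etree.ElementTree as ET

def parse_pinboard_xml(text):
    """Yield one dict per <post> element of a Pinboard-style XML export.

    Hypothetical helper for illustration; not bookmark-archiver's parser.
    """
    root = ET.fromstring(text)
    for post in root.iter("post"):
        url = post.get("href")
        if not url:
            continue  # skip posts without a link
        yield {
            "url": url,
            "title": post.get("description") or url,
            "tags": post.get("tag", ""),
        }

sample_xml = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<posts user="example">'
    '<post href="http://www.ulisp.com/" description="uLisp" '
    'tag="arduino avr embedded lisp" shared="no" toread="yes" />'
    '</posts>'
)
xml_links = list(parse_pinboard_xml(sample_xml))
print(xml_links)
```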

JSON

[{"href":"https:\/\/github.com\/trailofbits\/algo","description":"Algo VPN scripts","extended":"","meta":"62325ba3b577683aee854d7f191034dc","hash":"18d708f67bb26d843b1cac4530bb52aa","time":"2018-11-19T08:38:53Z","shared":"no","toread":"yes","tags":"vpn scripts"},
{"href":"http:\/\/www.ulisp.com\/","description":"uLisp","extended":"","meta":"7bd0c0ef31f69d1459e3d37366e742b3","hash":"2a17ae95925a03a5b9bb38cf7f6c6f9b","time":"2018-11-16T13:20:12Z","shared":"no","toread":"yes","tags":"arduino avr embedded lisp"}]
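
The traceback in the original report fails on `erg['url']`, while the JSON export above stores the link under `href`. Below is a minimal sketch of a parser that tolerates either key, assuming the entries look like the sample above. The function name `parse_pinboard_json` and the output keys are hypothetical and not bookmark-archiver's actual code. This thread does not show whether the `feeds.pinboard.in` JSON feed uses the same keys.

```python
import json
from datetime import datetime, timezone

def parse_pinboard_json(text):
    """Yield one dict per bookmark from a Pinboard-style JSON export.

    Hypothetical helper for illustration; not bookmark-archiver's parser.
    """
    for entry in json.loads(text):
        # Accept either key: the export above uses "href", not "url".
        url = entry.get("href") or entry.get("url")
        if not url:
            continue  # skip entries without a usable link
        # The export's "time" field is ISO 8601 in UTC, e.g. 2018-11-19T08:38:53Z.
        added = datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%SZ")
        added = added.replace(tzinfo=timezone.utc)
        yield {
            "url": url,
            "title": entry.get("description") or url,
            "timestamp": str(int(added.timestamp())),
            "tags": entry.get("tags", ""),
        }

sample_json = (
    r'[{"href":"https:\/\/github.com\/trailofbits\/algo",'
    r'"description":"Algo VPN scripts",'
    r'"time":"2018-11-19T08:38:53Z","tags":"vpn scripts"}]'
)
json_links = list(parse_pinboard_json(sample_json))
print(json_links[0]["url"], json_links[0]["timestamp"])
```

The timestamp computed for this entry matches the `ADD_DATE` value of the same bookmark in the legacy HTML export.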

Private RSS feed:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:cc="http://web.resource.org/cc/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://pinboard.in">
    <title>Pinboard (private aaronmueller)</title>
    <link>https://pinboard.in/u:aaronmueller/private/</link>
    <description></description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="https://mehkee.com/"/>
        <rdf:li rdf:resource="https://qmk.fm/"/>
      </rdf:Seq>
    </items>
  </channel>

  <item rdf:about="https://mehkee.com/">
    <title>Mehkee - Mechanical Keyboard Parts &amp; Accessories</title>
    <dc:date>2018-11-08T21:29:32+00:00</dc:date>
    <link>https://mehkee.com/</link>
    <dc:creator>aaronmueller</dc:creator>
    <dc:subject>keyboard gadget diy</dc:subject>
    <dc:source>http://pinboard.in/</dc:source>
    <dc:identifier>http://pinboard.in/u:aaronmueller/b:xxx/</dc:identifier>
    <taxo:topics>
      <rdf:Bag>
        <rdf:li rdf:resource="http://pinboard.in/u:aaronmueller/t:keyboard"/>
        <rdf:li rdf:resource="http://pinboard.in/u:aaronmueller/t:gadget"/>
        <rdf:li rdf:resource="http://pinboard.in/u:aaronmueller/t:diy"/>
      </rdf:Bag>
    </taxo:topics>
  </item>
  <item rdf:about="https://qmk.fm/">
    <title>QMK Firmware - An open source firmware for AVR and ARM based keyboards</title>
    <dc:date>2018-11-06T22:36:21+00:00</dc:date>
    <link>https://qmk.fm/</link>
    <dc:creator>aaronmueller</dc:creator>
    <dc:subject>firmware keyboard</dc:subject>
    <dc:source>http://pinboard.in/</dc:source>
    <dc:identifier>http://pinboard.in/u:aaronmueller/b:xxx/</dc:identifier>
    <taxo:topics>
      <rdf:Bag>
        <rdf:li rdf:resource="http://pinboard.in/u:aaronmueller/t:firmware"/>
        <rdf:li rdf:resource="http://pinboard.in/u:aaronmueller/t:keyboard"/>
      </rdf:Bag>
    </taxo:topics>
  </item>
</rdf:RDF>
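
The private feed above is RSS 1.0 (RDF), so every `<item>`, `<link>` and `<title>` sits in the `http://purl.org/rss/1.0/` namespace, and a lookup for a bare `item` tag will not match. Below is a minimal sketch of namespace-aware extraction with Python's standard library. The function name `parse_pinboard_rss` and the output keys are hypothetical, and this is not a claim about how bookmark-archiver parses feeds.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the feed's root element.
NS = {
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def parse_pinboard_rss(text):
    """Yield one dict per <item> of an RSS 1.0 (RDF) feed.

    Hypothetical helper for illustration; not bookmark-archiver's parser.
    """
    root = ET.fromstring(text)
    for item in root.findall("rss:item", NS):
        url = item.findtext("rss:link", default="", namespaces=NS).strip()
        if not url:
            continue  # skip items without a link
        yield {
            "url": url,
            "title": item.findtext("rss:title", default=url, namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
            "tags": item.findtext("dc:subject", default="", namespaces=NS),
        }

sample_rss = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="https://qmk.fm/">
    <title>QMK Firmware</title>
    <dc:date>2018-11-06T22:36:21+00:00</dc:date>
    <link>https://qmk.fm/</link>
    <dc:subject>firmware keyboard</dc:subject>
  </item>
</rdf:RDF>"""
rss_links = list(parse_pinboard_rss(sample_rss))
print(rss_links)
```
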
2 reactions
f0086 commented, Oct 19, 2018

I ran into the same problem. I solved it with a little Go program that logs in to Pinboard and clicks the actual “backup my bookmarks in legacy Netscape format” button, which works fine for me.

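package main

import (
  "flag"
  "os"

  "gopkg.in/headzoo/surf.v1"
)

// Pinboard credentials are passed on the command line.
var username = flag.String("username", "", "pinboard username")
var password = flag.String("password", "", "pinboard password")

func main() {
  flag.Parse()

  // Open the Pinboard front page, which contains the login form.
  bow := surf.NewBrowser()
  if err := bow.Open("https://pinboard.in/"); err != nil {
    panic(err)
  }

  form, err := bow.Form("form[name=login]")
  if err != nil {
    panic(err)
  }

  // Fill in and submit the login form, checking each step for errors.
  if err := form.Input("username", *username); err != nil {
    panic(err)
  }
  if err := form.Input("password", *password); err != nil {
    panic(err)
  }
  if err := form.Submit(); err != nil {
    panic(err)
  }

  // Request the legacy Netscape HTML export and write it to stdout.
  if err := bow.Open("https://pinboard.in/export/format:html/"); err != nil {
    panic(err)
  }
  if _, err := bow.Download(os.Stdout); err != nil {
    panic(err)
  }
}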

$ export GOPATH=$(pwd)
$ go get gopkg.in/headzoo/surf.v1
$ go build -o fuPin src/aaron-fischer.net/fupin/main.go
$ ./fuPin -username=[USERNAME] -password=[PASSWORD] > bookmarks.html