synergies with libraries.io
@arfon pointed me to http://libraries.io, a project very much in the spirit of codemeta: it crawls standard package repositories and aggregates their data, such as dependencies and versions.
For instance, here’s their parser for crawling R-package DESCRIPTION files on CRAN and extracting metadata: https://github.com/librariesio/librarian-parsers/blob/master/lib/parsers/cran.js
The metadata model seems somewhat minimal. For instance, that script seems to focus only on dependencies; I don’t see any author / title metadata parsing in the cran.js parser. (The CRAN dependency parser was only just implemented, so this data is not yet up on the landing pages.)
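To make the gap concrete, here is a minimal Python sketch of pulling both dependencies and author / title metadata out of a DESCRIPTION file. It is not the libraries.io parser (that one is JavaScript); the sample DESCRIPTION text and the function names `parse_dcf`, `parse_dependencies`, and `extract_metadata` are made up for illustration, and it assumes only the usual `Field: value` layout with indented continuation lines.

```python
import re

# Hypothetical DESCRIPTION content, invented for this example.
SAMPLE_DESCRIPTION = """\
Package: examplepkg
Title: An Illustrative Package
Version: 0.1.0
Author: Jane Doe
Depends: R (>= 3.1.0)
Imports: methods,
    digest (>= 0.6.0),
    R6
Suggests: knitr
"""


def parse_dcf(text):
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            # Continuation line: append to the field we are currently reading.
            fields[current] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    return fields


def parse_dependencies(value):
    """Split 'pkg (>= 1.0), other' into (name, constraint-or-None) pairs."""
    deps = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        match = re.match(r"^([A-Za-z0-9.]+)\s*(?:\(([^)]*)\))?$", item)
        if match:
            deps.append((match.group(1), match.group(2)))
    return deps


def extract_metadata(text):
    """Collect descriptive fields alongside the dependency lists."""
    fields = parse_dcf(text)
    return {
        "name": fields.get("Package"),
        "title": fields.get("Title"),
        "author": fields.get("Author"),
        "version": fields.get("Version"),
        "dependencies": {
            kind: parse_dependencies(fields.get(kind, ""))
            for kind in ("Depends", "Imports", "Suggests")
        },
    }


if __name__ == "__main__":
    print(extract_metadata(SAMPLE_DESCRIPTION))
```

The point is just that the title and author fields sit in the same file as the dependency fields, so a crawler that already reads DESCRIPTION for dependencies could pick them up in the same pass.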
They seem to be parsing GitHub data separately, and the landing pages for the packages show mostly data extracted from that (e.g. authors, title, etc., as reported by GitHub rather than by the DESCRIPTION file).
I think they are also parsing the CRAN HTML records separately. For instance, compare the records of a package that is on both CRAN and GitHub (testthat: https://libraries.io/cran/testthat) to one that is on CRAN but not GitHub (Matrix: https://libraries.io/cran/Matrix).
Issue Analytics
- Created 7 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
👋 @andrew. This sounds like a great idea. I suspect Matt and Carl are pretty focussed on finalizing the schema right now but adding something along these lines to the Libraries API would be ✨
Forgot to mention that there’s a JSON API for Libraries.io here: https://libraries.io/api
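As a rough illustration of using that API, the Python sketch below builds a per-project request URL and fetches the JSON. The `/api/<platform>/<name>` path and the `api_key` query parameter are assumptions to check against the documentation at https://libraries.io/api, and `project_url` and `fetch_project` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://libraries.io/api"


def project_url(platform, name, api_key):
    """Build the request URL for one project (assumed endpoint layout)."""
    path = "/".join(urllib.parse.quote(part, safe="") for part in (platform, name))
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{API_ROOT}/{path}?{query}"


def fetch_project(platform, name, api_key):
    """Fetch and decode the JSON record for one project (needs network access)."""
    with urllib.request.urlopen(project_url(platform, name, api_key)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; replace YOUR_API_KEY and call fetch_project to query.
    print(project_url("cran", "testthat", "YOUR_API_KEY"))
```

Fetching the records for testthat and Matrix this way and comparing which fields are populated would be one way to check how much of the metadata comes from GitHub rather than from CRAN.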
I’m also going to be applying for a grant from MOSS (https://wiki.mozilla.org/MOSS/Mission_Partners), which feels like a really good alignment with this project.