Add corpus for Corpus Cyrillo-Methodianum Helsingiense (OCS)

@for15pounds has been made the inaugural member of the @cltk/slavic team, which will maintain Old Church Slavonic, Old East Slavic, and others. Valerie, once you accept, I’ll assign this task to you.

All that is necessary for this first step is to copy or crawl the ASCII texts. I’ll then show you how to add this repo to the CLTK core software.

In another ticket, I would like to work with you to create a character transliteration map (as here). Then we’ll have two versions of the docs, one in a dir called ascii and another in unicode.
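The two-directory layout could be produced along these lines. This is only a sketch under stated assumptions: the function name `convert_tree`, the `.txt` extension, and the idea that the source files are plain ASCII are all assumptions, not something the corpus or CLTK specifies. The transliteration itself is passed in as a function, so any character map can be plugged in.

```python
from pathlib import Path
from typing import Callable


def convert_tree(src_dir: Path, dst_dir: Path, convert: Callable[[str], str]) -> int:
    """Write a converted copy of every .txt file under src_dir into dst_dir.

    The relative layout of src_dir is mirrored in dst_dir.
    Returns the number of files written.
    """
    count = 0
    for src in sorted(src_dir.rglob("*.txt")):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Assumption: the source texts are plain ASCII; this raises if they are not.
        text = src.read_text(encoding="ascii")
        dst.write_text(convert(text), encoding="utf-8")
        count += 1
    return count
```

For example, `convert_tree(Path("ascii"), Path("unicode"), some_mapper)` would leave the `ascii` dir untouched and fill `unicode` with the converted files, where `some_mapper` is whatever transliteration function is eventually agreed on.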

Thank you!

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
michaalbert commented, Oct 17, 2016

I haven’t seen any licensing information on their site, and yes, I have already started working on a mapper for the Cyrillic texts (Codex Suprasliensis, Savvina kniga and the Vitae). The others are written in the Glagolitic alphabet and will require a different mapper. That seems to be all for OCS from that site.

0 reactions
kylepjohnson commented, Oct 17, 2016

Micah, do you see any licensing info on this site? And are these all the OCS texts from this site?

For another project, I recommend making a mapper to convert their odd encoding to Unicode.
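A mapper of this kind might look roughly like the sketch below. The table `EXAMPLE_MAP` is a made-up placeholder, not the site's actual encoding, and `build_mapper` is a hypothetical helper name. The real correspondences would have to be taken from the corpus documentation. The one design point worth showing is that multi-character codes are matched before their single-character prefixes.

```python
import re

# Hypothetical placeholder table, NOT the real encoding used by the corpus.
EXAMPLE_MAP = {
    "a": "\u0430",   # Cyrillic small letter a
    "b": "\u0431",   # Cyrillic small letter be
    "zh": "\u0436",  # Cyrillic small letter zhe
    "z": "\u0437",   # Cyrillic small letter ze
}


def build_mapper(table):
    """Return a function that replaces every key of `table` with its value.

    Longer keys are tried first, so "zh" is converted as one unit
    instead of as "z" followed by an unmapped "h".
    Characters without an entry pass through unchanged.
    """
    if not table:
        return lambda text: text
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))

    def convert(text):
        return pattern.sub(lambda match: table[match.group(0)], text)

    return convert
```

Texts in a different script would get their own table passed to the same `build_mapper`, which keeps one mapper per alphabet without duplicating the matching logic.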

Top Results From Across the Web

Corpus of Old Church Slavonic Texts - Kielipankki
The Corpus Cyrillo-Methodianum Helsingiense (CCMH) is a corpus of Old Church Slavonic (OCS) texts. It was collected at the University of Helsinki from...
An Electronic Corpus of Old Church Slavonic Texts
The Corpus Cyrillo-Methodianum Helsingiense (CCMH) is an electronic corpus of the most important Old Church Slavonic (OCS) texts.
Corpus Cyrillo-Methodianum Helsingiense - Korp
The Corpus Cyrillo-Methodianum Helsingiense (CCMH) is a corpus of Old Church Slavonic (OCS) texts. It was collected at the University of Helsinki from...
