StorageLoader: add support for Spark-based Shredder's directory structure

Due to the changes introduced by #3034, the directory structure will now look like:

shredded/
    good/
        run=2016-11-26-21-48-42/
            atomic-events/
                part-00000
                part-00001
                ...
            shredded-types/
                vendor=com.acme/
                    name=event/
                        format=jsonschema/
                            version=1-0-0/
                                part-00001-00010
                    name=context/
                        format=jsonschema/
                            version=1-0-0/
                                part-00001-00010

as opposed to the previous structure:

shredded/
    good/
        run=2016-11-26-21-48-42/
            atomic-events/
                part-00000
                part-00001
                ...
            com.acme/
                event/
                    jsonschema/
                        1-0-0/
                            part-00001-00010
                context/
                    jsonschema/
                        1-0-0/
                            part-00001-00010

There are effectively two changes:

  • The field names of SchemaKey (vendor, name, format, version) are now part of the path: what was com.acme/event/jsonschema/1-0-0 is now vendor=com.acme/name=event/format=jsonschema/version=1-0-0
  • A shredded-types layer has been added
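
The two changes above can be illustrated with a minimal Python sketch. This is not StorageLoader's actual code; the names SchemaKey (as a Python tuple), parse_schema_key, and new_layout_dir are hypothetical, and the regular expressions assume paths shaped exactly like the two trees shown in this issue.

```python
import re
from typing import NamedTuple, Optional


class SchemaKey(NamedTuple):
    """Illustrative stand-in for the four SchemaKey fields named in the issue."""
    vendor: str
    name: str
    format: str
    version: str


# New (Spark-based) layout:
#   .../shredded-types/vendor=V/name=N/format=F/version=X/part-...
NEW_LAYOUT = re.compile(
    r"shredded-types/vendor=(?P<vendor>[^/]+)/name=(?P<name>[^/]+)"
    r"/format=(?P<format>[^/]+)/version=(?P<version>[^/]+)/"
)

# Previous layout: the four values sit directly under the run folder:
#   .../run=.../V/N/F/X/part-...
OLD_LAYOUT = re.compile(
    r"run=[^/]+/(?P<vendor>[^/]+)/(?P<name>[^/]+)"
    r"/(?P<format>[^/]+)/(?P<version>[^/]+)/"
)


def parse_schema_key(path: str) -> Optional[SchemaKey]:
    """Extract the schema key from a shredded-type path in either layout.

    Returns None for paths that carry no schema key, such as atomic-events.
    The new layout is tried first so that "shredded-types" is never
    mistaken for a vendor.
    """
    match = NEW_LAYOUT.search(path) or OLD_LAYOUT.search(path)
    return SchemaKey(**match.groupdict()) if match else None


def new_layout_dir(key: SchemaKey) -> str:
    """Build the new-style directory (relative to the run folder) for a key."""
    return (
        f"shredded-types/vendor={key.vendor}/name={key.name}"
        f"/format={key.format}/version={key.version}"
    )
```

For example, parsing the old-style path run=2016-11-26-21-48-42/com.acme/event/jsonschema/1-0-0/part-00001-00010 and passing the result to new_layout_dir gives shredded-types/vendor=com.acme/name=event/format=jsonschema/version=1-0-0.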

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
alexanderdean commented, Mar 23, 2017

Updated title, please update commit!

0 reactions
alexanderdean commented, Jun 12, 2017

StorageLoader no space

Top Results From Across the Web

Snowplow 89 Plain of Jars released, porting Snowplow to Spark
Spark Enrich and RDB Shredder; Under the hood ... StorageLoader has been updated to read Spark's output directory structure (#3044) ...
