Table Schema v1: Add preface


There are some basic concepts which are used throughout the spec but are never explained. I think we should add a preface to the spec explaining its purpose, some basic assumptions and concepts that a reader should understand before diving deep into the details.

It should go along the lines of these bullet points:

Tabular data has a logical representation: rows, columns, fields, etc. Each field has a name and a data type (string, number, etc.). When the data is loaded from a physical representation (e.g. a file on disk), it may carry some data type information (e.g. JSON) or none (e.g. CSV, where all data is represented in string form).
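To make the difference concrete, here is a minimal sketch using only the Python standard library. The sample table and the names `CSV_TEXT` and `JSON_TEXT` are made up for illustration and are not taken from the spec; the sketch only shows which types arrive after loading each physical form.

```python
import csv
import io
import json

# Hypothetical sample data: the same two-row table in two physical forms.
CSV_TEXT = "id,price\n1,9.5\n2,12\n"
JSON_TEXT = '[{"id": 1, "price": 9.5}, {"id": 2, "price": 12}]'

csv_rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
json_rows = json.loads(JSON_TEXT)

# CSV carries no type information: every value arrives as a string.
csv_types = {name: type(value).__name__ for name, value in csv_rows[0].items()}
# JSON carries its primitive types: numbers arrive as int or float.
json_types = {name: type(value).__name__ for name, value in json_rows[0].items()}

print(csv_types)   # {'id': 'str', 'price': 'str'}
print(json_types)  # {'id': 'int', 'price': 'float'}
```

In both cases the logical table is meant to be the same; only the amount of work needed to get from the physical form to it differs.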

The spec is about the logical representation of tabular data, as well as about how to convert a physical representation into the logical one:

  • format (and accompanying properties) contains rules for converting from the physical to the logical representation
  • constraints contain rules for validating the physical representation of the data.

By reusing these concepts (e.g. “missingValues works on the physical representation”), the spec should be much clearer than the current wording.

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
pwalsh commented, Jun 19, 2017

@rufuspollock @akariv @roll

A text for the preface. I suggest adding this to the existing “concepts” section in the introduction. The current content of concepts could be titled “Tabular data” and this section titled “Physical and logical representation of data”.


Physical and logical representation of data

In order to talk about the representation and processing of tabular data from text-based sources, it is useful to introduce the concepts of the physical and the logical representation of data.

The physical representation of data refers to the representation of data as text on disk, for example, in a CSV or JSON file. This representation may have some type information (JSON, where the primitive types that JSON supports can be used) or not (CSV, where all data is represented in string form).

The logical representation of data refers to the “ideal” representation of the data in terms of primitive types, data structures, and relations, all as defined by the specification. We could say that the specification is about the logical representation of data, as well as about ways in which to handle the conversion of a physical representation to a logical one.

In this document, we’ll explicitly refer to either the physical or logical representation in places where it prevents ambiguity for those engaging with the specification, especially implementors. For example, constraints should be tested on the logical representation of data, whereas a property like missingValues applies to the physical representation of the data.
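The distinction in the last example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular Table Schema library works: the `FIELD` descriptor, the `MISSING_VALUES` list, and the helpers `to_logical` and `satisfies_constraints` are hypothetical names loosely modelled on the spec's properties.

```python
from typing import Optional

# Hypothetical field descriptor, loosely modelled on Table Schema properties.
FIELD = {"name": "age", "type": "integer", "constraints": {"minimum": 0}}
MISSING_VALUES = ["", "NA"]  # compared against the raw (physical) strings


def to_logical(raw: str) -> Optional[int]:
    """Convert one physical CSV cell to its logical value."""
    # missingValues is matched against the physical string, before any casting.
    if raw in MISSING_VALUES:
        return None
    return int(raw)


def satisfies_constraints(value: Optional[int]) -> bool:
    """Check constraints on the logical (already cast) value."""
    if value is None:
        return True  # assumption: no "required" constraint in this sketch
    return value >= FIELD["constraints"]["minimum"]


physical = ["42", "NA", "-3"]
logical = [to_logical(cell) for cell in physical]
valid = [satisfies_constraints(value) for value in logical]

print(logical)  # [42, None, -3]
print(valid)    # [True, True, False]
```

Here "NA" is recognised while the cell is still a string, whereas the minimum check only makes sense once "-3" has become the integer -3.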

0 reactions
rufuspollock commented, Jun 21, 2017

@pwalsh looks great, and it could be either a direct addition to the concepts section (maybe a subheading) or a separate subsection of the intro.

Assigning you for the glory of the PR 😉
