Intake Integration

Intake is a “lightweight package for finding, investigating, loading and disseminating data.” It would be nice to figure out how the JupyterLab data registry could integrate with this package.

Catalogs

Making JupyterLab aware of Intake’s “data catalogs” is probably a good place to start. They “provide an abstraction that allows you to externally define, and optionally share, descriptions of datasets, called catalog entries.”

Local

For example, if you have a catalog on disk in a catalog.yml file, we might want to see the datasets it defines in the data registry. This is similar to how, if you have a .ipynb file, you can currently view the datasets in its cell outputs. To do this, we would have to parse its YAML format in JavaScript and map the different entries to URLs.

For example, this catalog.yml file:

metadata:
  version: 1
sources:
  example:
    description: test
    driver: random
    args: {}

  entry1_full:
    description: entry1 full
    metadata:
      foo: 'bar'
      bar: [1, 2, 3]
    driver: csv
    args: # passed to the open() method
      urlpath: '{{ CATALOG_DIR }}/entry1_*.csv'

  entry1_part:
    description: entry1 part
    parameters: # User parameters
      part:
        description: section of the data
        type: str
        default: "stable"
        allowed: ["latest", "stable"]
    driver: csv
    args:
      urlpath: '{{ CATALOG_DIR }}/entry1_{{ part }}.csv'

This catalog might map to a number of nested URLs:

./catalog.yml#/sources/example
./catalog.yml#/sources/entry1_full
./catalog.yml#/sources/entry1_part

The entries that point to CSV files would in turn point to further nested URLs. For example, ./catalog.yml#/sources/entry1_part would point to:

./entry1_latest.csv
./entry1_stable.csv

This basically requires re-implementing the logic of all the drivers so that they can work client side.
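
The mapping described above could be sketched roughly as follows, for the csv driver only. This is a minimal illustration, not an existing JupyterLab or Intake API: the type names and the functions `sourceURLs` and `csvFileURLs` are hypothetical, and the catalog is assumed to have already been parsed from YAML into a plain object by some YAML library. Glob patterns such as `entry1_*.csv` are left unexpanded, since resolving them would need a directory listing.

```typescript
// Hypothetical types: only the catalog fields used in this sketch are modeled.
interface UserParameter {
  allowed?: string[];
  default?: string;
}

interface CatalogSource {
  driver: string;
  args?: { urlpath?: string };
  parameters?: Record<string, UserParameter>;
}

interface Catalog {
  sources: Record<string, CatalogSource>;
}

// One nested URL per catalog entry, e.g. "./catalog.yml#/sources/example".
function sourceURLs(catalogPath: string, catalog: Catalog): string[] {
  return Object.keys(catalog.sources).map(
    (entryName) => `${catalogPath}#/sources/${entryName}`
  );
}

// File URLs that a csv entry could resolve to: one per combination of
// allowed parameter values. Returns [] for other drivers, and for entries
// whose parameters declare neither "allowed" nor "default".
function csvFileURLs(catalogDir: string, source: CatalogSource): string[] {
  const template = source.args?.urlpath;
  if (source.driver !== "csv" || template === undefined) {
    return [];
  }
  // Substitute CATALOG_DIR first, then expand each user parameter in turn.
  let urls = [template.replace(/\{\{\s*CATALOG_DIR\s*\}\}/g, () => catalogDir)];
  const parameters = source.parameters ?? {};
  for (const paramName of Object.keys(parameters)) {
    const param = parameters[paramName];
    const values =
      param.allowed ?? (param.default !== undefined ? [param.default] : []);
    // Assumes parameter names are plain identifiers (no regex metacharacters).
    const pattern = new RegExp(`\\{\\{\\s*${paramName}\\s*\\}\\}`, "g");
    const expanded: string[] = [];
    for (const url of urls) {
      for (const value of values) {
        expanded.push(url.replace(pattern, () => value));
      }
    }
    urls = expanded;
  }
  return urls;
}

// The catalog from the example above, reduced to the modeled fields.
const exampleCatalog: Catalog = {
  sources: {
    example: { driver: "random", args: {} },
    entry1_full: {
      driver: "csv",
      args: { urlpath: "{{ CATALOG_DIR }}/entry1_*.csv" },
    },
    entry1_part: {
      driver: "csv",
      parameters: { part: { default: "stable", allowed: ["latest", "stable"] } },
      args: { urlpath: "{{ CATALOG_DIR }}/entry1_{{ part }}.csv" },
    },
  },
};

const entryURLs = sourceURLs("./catalog.yml", exampleCatalog);
const partFileURLs = csvFileURLs(".", exampleCatalog.sources["entry1_part"]);
```

Here `entryURLs` holds the three `#/sources/...` URLs and `partFileURLs` holds the two CSV paths for `entry1_part`. Every other driver would need its own equivalent of `csvFileURLs`, which is where the re-implementation cost comes from.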

Remote

We could also support loading a remote Intake data catalog. If you loaded a URL like intake://catalog1:5000 in the data registry you would want to be able to see the datasets available. Here, the proxy mode might be useful:

Proxied access: In this mode, the catalog server uses its local drivers to open the data source and stream the data over the network to the client. The client does not need any special drivers to read the data, and can read data from files and data servers that it cannot access, as long as the catalog server has the required access.

If we implement a client API for this server protocol, we can let the server handle all the data parsing and just expose the results it returns to the user. We would have to look at its specification in more depth.
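
As a small first step toward such a client, the sketch below turns an intake:// URL into an HTTP base URL that requests could be sent to. It assumes that intake://host:port refers to an HTTP server at the same host and port; the function name `intakeToHttpBase` and the fallback port of 5000 are assumptions made for illustration and would need to be checked against the Intake server specification. The protocol's actual endpoints and message encoding are deliberately left out.

```typescript
// Hypothetical helper: convert "intake://host[:port]" to an HTTP base URL.
// The default port is an assumption, not taken from the specification.
function intakeToHttpBase(url: string, defaultPort: number = 5000): string {
  const match = /^intake:\/\/([^\/:]+)(?::(\d+))?\/?$/.exec(url);
  if (match === null) {
    throw new Error(`Not an intake:// URL: ${url}`);
  }
  const host = match[1];
  // Fall back to the default when the URL carries no explicit port.
  const port = match[2] !== undefined ? Number(match[2]) : defaultPort;
  return `http://${host}:${port}`;
}

const catalogBase = intakeToHttpBase("intake://catalog1:5000");
```

With the URL from the example above, `catalogBase` is `http://catalog1:5000`; a data registry converter could use it as the root for whatever requests the server protocol defines.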

Issue Analytics

Created 4 years ago
Reactions: 2
Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions

martindurant commented, Oct 15, 2019

Create new repo for this work

I have no preference where this lives. On jupyterlab or other related org or in Intake, all are fine.

0 reactions

martindurant commented, Dec 3, 2020

@saulshanabrook - this dropped off the table at some point. Are you still interested?
