RFC: User-defined colors

This is a proposed use case for public comment. We are considering adding this in a future release. If you have comments, please leave them, or just give us a thumbs up/down reaction. Thanks!

Authors: @mweiden, @bkmartinjr
Document status: Open for Review
Last date for comments: EOD 2020-04-06

Need

Researchers publishing datasets with cellxgene need the ability to ensure colors match their paper publication colors. Some example deployments include:

Note that these data publisher user stories are distinct from use cases of users consuming data through the cellxgene deployment. Further, we are not prioritizing custom colors for continuous variables at this time.

For more, see the user stories below.

In-scope user stories:

  • As a cellxgene user hosting datasets accompanying my paper, I want to specify custom colors for data with a category label per dataset, so that I can make my cellxgene deployments and/or images match my publication.

Not-in-scope user stories:

  • As a cellxgene user using a cellxgene deployment to explore data, I want to override colors in the UI with my own custom colors, so that I can better interpret the data and generate images for new publications.
  • As a cellxgene user hosting datasets accompanying my paper, I want to select from a set of colormaps for continuous variables per dataset, so that I can make my cellxgene deployments and/or images match my publication.

Sources:

Definitions:

  • Category-label pair: Currently in annotations, you create a category, then labels for that category. So colors will be assigned per combination of 1) category 2) label for that category. In CS-speak this is a 2-tuple of (category_name, label_name). Example: (organ, spleen)
  • .cxg: .cxg is cellxgene’s native file format.

Approach

The implementation would follow this user flow:

  1. Upon launch, cellxgene would inspect the data file loaded for color information and use it if its usage is clear. Each data file type (cxg, h5ad, and others in the future) will require its own format-specific standard for recovering this data.
  2. If no colors are specified in the data file, cellxgene uses its own default palettes and colormaps. If the user provides colors for some but not all of the category-label pairs in a category, cellxgene uses a best-effort strategy: it applies the user-specified colors first, then falls back to default colors for the remaining labels.
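The best-effort fallback in step 2 could be sketched as below. This is only an illustration under stated assumptions: `DEFAULT_PALETTE` and `resolve_colors` are hypothetical names, the palette values are placeholders rather than cellxgene's actual defaults, and cycling through the palette by label position is one possible fallback rule, not one the RFC prescribes.

```python
from itertools import cycle

# Hypothetical default palette; the RFC does not specify cellxgene's real defaults.
DEFAULT_PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def resolve_colors(labels, user_colors, default_palette=DEFAULT_PALETTE):
    """Return a label -> color mapping for a single category.

    Colors supplied by the user win. Any label without a user color gets
    the palette entry at its position, cycling if labels outnumber entries.
    """
    defaults = cycle(default_palette)
    resolved = {}
    for label in labels:
        fallback = next(defaults)  # advance once per label, used or not
        resolved[label] = user_colors.get(label, fallback)
    return resolved


# Only "liver" has a user-defined color; the other labels use defaults.
print(resolve_colors(["spleen", "liver", "lung"], {"liver": "#000000"}))
```

Advancing the palette once per label keeps a label's default color the same whether or not its neighbors were customized, which may or may not be the behavior the authors intend.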

Note that if the user starts with a .loom file and would like to add custom colors, they will have to use cellxgene prepare to convert their .loom file into an .h5ad and add custom colors to that .h5ad file using the scanpy standard. With this model, we imagine the following user stories:

  • Sally has an H5AD file and is using ScanPy.plotting to explore various visualizations. As part of this, Sally has set the ScanPy color map, i.e., .uns['{var}_colors'], to preferred colors. When Sally loads this dataset in cellxgene, it will display the categories using the same colors.
  • Jane has a Loom file, and has converted it to an H5AD using cellxgene prepare. Jane would like to prepare the dataset so that the categorical colors match her collaborator’s preference. Jane uses the ScanPy/AnnData package to set the colormap in the anndata object, and saves it. She loads the resulting H5AD in cellxgene, and verifies that the colors match her expectations.

Supported file formats

This section enumerates the file formats that cellxgene can draw color information from and the heuristics it uses to do so.

Heuristics for other common data formats may be possible. We plan to add these incrementally. As always, it is important to keep these heuristics independent of cellxgene’s core data model.

.h5ad

We propose to adapt color information from scanpy objects. This involves pulling categorical color information from .uns["{category}_colors"] and zipping it together, in order, with category labels to form a mapping from category-label pairs to colors.
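A minimal sketch of that zipping step, written against plain dictionaries and lists so it runs without scanpy installed. The function name `colors_from_scanpy_uns` and its arguments are hypothetical. With a real AnnData object, the ordered labels would presumably come from something like `adata.obs[name].cat.categories` and the colors from `adata.uns`. Skipping a category when the two lengths disagree is an assumption about what "clear usage" means, not something the RFC defines.

```python
def colors_from_scanpy_uns(uns, categories_by_name):
    """Build {category_name: {label_name: color}} from scanpy-style entries.

    `uns` maps "<category>_colors" to an ordered sequence of colors.
    `categories_by_name` maps each category name to its ordered labels.
    """
    result = {}
    for name, labels in categories_by_name.items():
        colors = uns.get(f"{name}_colors")
        if colors is None:
            continue  # no colors stored for this category
        if len(colors) != len(labels):
            continue  # ambiguous pairing: leave it to the default colors
        # Zip in order: the i-th color belongs to the i-th label.
        result[name] = {label: str(color) for label, color in zip(labels, colors)}
    return result


uns = {"louvain_colors": ["#1f77b4", "#ff7f0e"]}
categories = {"louvain": ["Dendritic cells", "FCGR3A+ Monocytes"]}
print(colors_from_scanpy_uns(uns, categories))
```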

.cxg

In .cxg, we propose that color information can be stored as JSON in the cxg_group_metadata. The dictionary should have the following format:

{
	"<category_name>": {
		"<label_name>": "<color_hex_code>",
		...
	},
	...
}

Ellipses are included for brevity.

For example, see the JSON below:

{
	"louvain": {
		"Dendritic cells": "#1f77b4",
		"FCGR3A+ Monocytes": "#ff7f0e",
		"CD14+ Monocytes": "#2ca02c",
		"NK cells": "#d62728",
		...
	},
	...
}
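For illustration, a loader could check that stored JSON has this two-level shape before using it. The sketch below is hypothetical: `parse_color_metadata` is not a cellxgene function, and restricting colors to six-digit hex codes is an assumption, since the RFC only says `<color_hex_code>`.

```python
import json
import re

# Assumption: colors are six-digit hex codes such as "#1f77b4".
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def parse_color_metadata(text):
    """Parse and validate {category_name: {label_name: color_hex_code}}."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top level must be an object keyed by category name")
    for category, labels in data.items():
        if not isinstance(labels, dict):
            raise ValueError(f"{category!r} must map label names to colors")
        for label, color in labels.items():
            if not isinstance(color, str) or not HEX_COLOR.fullmatch(color):
                raise ValueError(
                    f"invalid color for ({category!r}, {label!r}): {color!r}"
                )
    return data


stored = '{"louvain": {"Dendritic cells": "#1f77b4", "NK cells": "#d62728"}}'
print(parse_color_metadata(stored)["louvain"]["NK cells"])
```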

To support loading color information into .cxg files, the cxgtool.py conversion tool will draw color information from the scanpy standard color format.

.loom

To our understanding, .loom matrix files do not support a color data structure, nor is storing colors part of common .loom conventions. Please call it out if we’re wrong here!

Server-client interaction

After the cellxgene server extracts color information from either the base data file or from user configuration, that color configuration will be encoded in JSON and exposed via the /api/v0.2/colors API endpoint. The JSON data structure returned by the API will match the format stored in .cxg files.
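As a rough sketch of that interaction, the handler below serves an already-extracted color mapping as JSON at the path named above. The RFC does not say which web framework the server uses, so this uses a bare WSGI callable as a stand-in. `make_colors_app` is a hypothetical name, and the 404 handling for other paths and methods is an assumption.

```python
import json

COLORS_PATH = "/api/v0.2/colors"  # endpoint path as given in the RFC


def make_colors_app(colors):
    """Return a WSGI app that serves `colors` as JSON at COLORS_PATH."""
    # Same structure as the .cxg format: {category_name: {label_name: color}}.
    body = json.dumps(colors).encode("utf-8")

    def app(environ, start_response):
        is_colors_get = (
            environ.get("PATH_INFO") == COLORS_PATH
            and environ.get("REQUEST_METHOD") == "GET"
        )
        if not is_colors_get:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app
```

To try it locally, the callable can be passed to `wsgiref.simple_server.make_server` from the Python standard library.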

Benefits

Principles motivating the approach:

  • Simplicity, both in terms of system design and usability
  • Extensibility across data formats supported by cellxgene in the future
  • Isolation of internal cellxgene data model from external data formats
  • Plug-n-play (PnP): interoperable with .h5ad files created by scanpy users
  • Uses existing data formats

Known drawbacks:

  • We’ll have to maintain small modules for loading color information from common color annotation patterns in each data format we support.

Alternatives

  1. Do not implement this feature
    1. Pros
      1. Keeps the system simpler
    2. Cons
      1. High demand for this feature
  2. Just take color information from the CLI
    1. Pros
      1. Keeps the system simpler
      2. Avoids building in more external format-specific standards into our code
    2. Cons
      1. Decreases ease of use. Newcomers to cellxgene should be able to PnP (Plug-n-play) their data into the CLI and see their data as they expect it to look.
  3. Specify color in the UI
    1. Pros
      1. WYSIWYG editing of colors
      2. No configuration files or code required
    2. Cons
      1. Does not fulfill the core user stories, which are for data publishers, not data consumers. Data publishers need to be able to deploy the data preconfigured and read-only (RO).

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
mweiden commented, Mar 30, 2020

@dburkhardt The rationale behind the publisher design is that you only need to set the colors once. After that point, all access to color information can be read-only (RO). Setting colors in the front-end (FE) means making colors read-write (RW), configurable more than once. For this reason, you could consider FE color configuration to be an added feature layer on top of the publisher case.

Also, there are a bunch of questions that arise out of a FE color configuration feature when cellxgene is deployed to the web that we haven’t sorted through yet. If a user changes the colors of category-label pairs, are those automatically reflected for other users? If so, do we need authentication and authorization such that only approved users can do this?

We may take this on in the future, but, for now, we’re proposing keeping the scope tight on the publisher’s problem so that we can quickly deliver the core of the needed functionality.

0 reactions
mweiden commented, Apr 7, 2020

@nh3 note that this RFC scopes user-defined colors to category-label pairs. We may address colormaps, but in a future feature set.

The comment period is now closed. Thanks everyone for your thoughtful input!
