[Proposal] Meta-behaviors

TL;DR: This issue outlines a proposal for how we can apply system functionality across metadata instances through system-defined meta-properties. It does so via a concrete example of how defaulting behaviors might be implemented.

NOTE: This proposal probably gets into implementation too much, but that is only to illustrate the kinds of considerations that must be taken into account. It does not intend to impose a specific implementation.

Introduction

On occasion, there is a need to be able to apply functionality or behavior across all instances (or their properties) of a given schema or namespace. This kind of functionality is referred to as Meta-behaviors. Some examples of meta-behaviors might be:

  1. Defaulting, where a single instance among several is considered the default. Only one instance, from the set of instances, can be considered the default.
  2. Data sensitivity, where a given property is considered to hold sensitive data. That property’s value should be encrypted or masked to prevent exposure of sensitive information.
  3. Uniqueness, where a given property should be unique across the same property of other instances. Today we “manually” impose uniqueness constraints on display_name, and let the filesystem impose this constraint on the name property since that property is used as the base name of the persisted instance.

This issue proposes how we might go about introducing meta-behaviors and uses Defaulting as a concrete example of how this might be implemented, illustrating the items that must be considered along the way.

Meta-behaviors

Meta-behaviors will be associated with system-owned schema properties. As a result, we will identify such properties by a common prefix. For the sake of this proposal, we’ll specify a prefix of schema_, but it could just as well be something like elyra_. The reason schema_ is being proposed is that meta-behaviors are essentially schema-wide, i.e., they span the instances of a given schema (or of a namespace, for that matter).

Because these special schema properties are system-owned, we can associate functionality (i.e., behaviors) to each such property. As noted above, defaulting behaviors would then be associated with a boolean-valued instance property named schema_default, meaning that instances marked with schema_default = True are considered the default instance for that schema.

Other schema properties indicating meta-behaviors won’t necessarily be instance properties, but rather properties of the schema itself. For example, if we wanted to associate encryption with instance properties that contain sensitive data, we could indicate such behavior via a list-valued schema property named schema_encrypted, which would enumerate the properties whose values should be encrypted. For instance, the kfp schema might define schema_encrypted as:

{
  "schema_encrypted": ["cos_password"]
}

indicating that instances of schema ‘kfp’ need special behavior when reading/writing instances that contain a value for cos_password. The airflow schema, on the other hand, might need additional properties handled in a similar fashion:

{
  "schema_encrypted": ["cos_password", "airflow_password"]
}
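
As a rough illustration of how such a list might be consumed, the sketch below picks out the properties of an instance that would need special handling. The function name sensitive_properties and the sample metadata values are hypothetical and not part of Elyra; only the schema_encrypted fragments come from the examples above.

```python
# Schema fragments copied from the examples above.
KFP_SCHEMA = {"schema_encrypted": ["cos_password"]}
AIRFLOW_SCHEMA = {"schema_encrypted": ["cos_password", "airflow_password"]}


def sensitive_properties(schema: dict, instance_metadata: dict) -> list:
    """Return the names of properties in this instance that need special handling.

    Only properties that are both listed in schema_encrypted and actually
    present in the instance are returned. (Hypothetical helper.)
    """
    listed = schema.get("schema_encrypted", [])
    return [name for name in listed if name in instance_metadata]


# Made-up instance metadata for illustration only.
metadata = {"cos_endpoint": "http://example.invalid", "cos_password": "secret"}
print(sensitive_properties(KFP_SCHEMA, metadata))      # ['cos_password']
print(sensitive_properties(AIRFLOW_SCHEMA, metadata))  # ['cos_password']
```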

Concrete example: schema_default

Because we’re faced with determining how to address defaulting in general (see #462), let’s use that behavior as a concrete example.

Use case

Both the front-end and server need the notion of a default runtime-image to use. The front-end needs this default so it can enhance the user experience by pre-selecting a runtime-image from a set of images. The server needs this default because the underlying runtime might require an image and, if one was not provided by the caller, the default should be used.

Implementation

The following sections address how an implementation might be done to support a schema-wide defaulting mechanism.

Schema definition

In this case, we’d add a new property to the runtime-image.json file that defines the schema for runtime-images:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Runtime Image metadata",
  "name": "runtime-image",
  "namespace": "runtime-images",
  "properties": {
    "schema_name": {
      "type": "string",
      "pattern": "^[a-z][a-z0-9-_]*[a-z0-9]$",
      "minLength": 1
    },
    "display_name": {
      "description": "The display name of the Runtime Image",
      "type": "string",
      "pattern": "^[a-zA-Z][a-zA-Z0-9-_. (){}]*[a-zA-Z0-9)}]$"
    },
    "schema_default": {
      "description": "Indicates that the Runtime Image instance is the default",
      "type": "boolean",
      "default": false
    },
    "metadata": {
      "description": "Additional data specific to this Runtime Image",
      "type": "object",
      "properties": {
        "image_name": {
          "description": "Runtime Image description",
          "type": "string"
        }
      },
      "required": ["image_name"]
    }
  },
  "required": ["schema_name", "display_name"]
}

Note that it will never be the case that a system-owned property that connotes meta-behaviors will be defined in the metadata: stanza of a given schema. However, a meta-behavior property may indeed reference a property defined in the metadata: stanza (see the schema_encrypted example from above).

Instance creation

When creating an instance corresponding to a schema that has enabled defaulting, only the default instance needs to include --schema_default = True, since all others will default schema_default to False. When such an instance is created, it will first be validated against its schema. This step only validates that the values adhere to their corresponding constraints. No schema-wide (cross-instance) constraints are checked at this point.
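
JSON Schema treats the default keyword as an annotation, so a validator is not required to fill in missing values; something in the metadata layer would have to do that. A minimal sketch of that step, assuming a hypothetical helper named apply_property_defaults and a trimmed-down copy of the schema above:

```python
def apply_property_defaults(schema: dict, instance: dict) -> dict:
    """Return a copy of instance with missing top-level properties defaulted.

    Hypothetical helper: copies each schema-declared "default" into the
    instance when the instance does not supply that property.
    """
    result = dict(instance)
    for name, definition in schema.get("properties", {}).items():
        if name not in result and "default" in definition:
            result[name] = definition["default"]
    return result


# Trimmed-down version of the runtime-image schema shown above.
schema = {
    "properties": {
        "schema_name": {"type": "string"},
        "display_name": {"type": "string"},
        "schema_default": {"type": "boolean", "default": False},
    }
}

instance = {"schema_name": "runtime-image", "display_name": "My image"}
print(apply_property_defaults(schema, instance)["schema_default"])  # False
```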

Once the instance values have been validated against the schema, the validation logic will then check for system-owned schema_-prefixed properties and apply the corresponding meta-behavior. For schema_default that check is: does this instance have a value of True for schema_default? If so, it needs to check that other instances are NOT marked as the default, since this meta-behavior says that only one instance in the result set can be considered the default.

PR #462 introduced the notion of bringing your own images. In doing so, it also introduced the notion of a hierarchical persistence structure where an ordered set of directories is used to locate instances. The order of those directories is very important: it starts local to the current user and moves outward to more system-specific directories. What the directories are doesn’t matter as much as the fact that there’s a specific and consistent order in which they are searched for instances. When a user “brings their own image”, they are writing an instance of runtime-image to their most local directory - always. Only an administrator can populate the non-user-specific directories. (Note, installation of elyra populates a non-user-specific directory with the factory-defined set of runtime-image instances.)

Before talking about creating an instance marked with schema_default = True, it’s probably worthwhile describing how retrieval of runtime-image instances would work relative to schema_default.

Retrieval

When the instances of a given schema (runtime-image in this example) are requested, the ordered list of directories must be searched to gather instances. Since these directories are based on the schema’s namespace, each file must be loaded to check that its schema_name matches the schema of the instances being requested. (Although the runtime-images namespace only has the single runtime-image schema at the moment, the software must still make this check since other namespaces support multiple schemas.)

There are two options to take here:

  1. Check each directory in the ordered list in reverse order. Using this approach, if duplicates are encountered, they will unconditionally overwrite the previous in the result set. This will occur if a user wishes to modify a system-defined default.
  2. Check each directory in the prescribed order - local to outer. Using this approach, instances are only loaded into the result set if they do not already exist in the result set.
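
The two options can be sketched side by side. Here each directory is modeled as a plain dict of instance name to instance, ordered most local first; the function names are hypothetical, and real code would read JSON files from disk instead.

```python
def merge_reverse_overwrite(directories: list) -> dict:
    """Option 1: walk outer-to-local; more local entries overwrite earlier ones."""
    result = {}
    for instances in reversed(directories):
        for name, instance in instances.items():
            result[name] = instance
    return result


def merge_first_wins(directories: list) -> dict:
    """Option 2: walk local-to-outer; keep only the first occurrence of a name."""
    result = {}
    for instances in directories:
        for name, instance in instances.items():
            result.setdefault(name, instance)
    return result


# Made-up instances: "a" exists at both levels, "b" only at the system level.
directories = [
    {"a": {"origin": "local"}},
    {"a": {"origin": "system"}, "b": {"origin": "system"}},
]
print(merge_reverse_overwrite(directories) == merge_first_wins(directories))  # True
```

Both options produce the same set of instances. They differ in when each instance is finalized: with option 2, an instance that enters the result set is never replaced later, which makes it easier to track a single default while loading.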

For the sake of imposing the defaulting constraint, I suspect option 2 will be better. Note that all user-created (or user-modified) instances are always created in the “most local” location for that user. So, using option 2, we start at the “most local” directory (i.e., the first in the list) and load each file. If the file corresponds to the target schema and does not already exist in the result set (which it won’t in the first directory), it’s inserted into the result set. We also check whether its schema_default property is True and, if so, note that instance’s name as the current default. If we encounter a second instance with schema_default = True in this same directory, we raise a constraint violation (or we could mark that instance as non-default in the result set with a warning (TBD)). Note that this condition should not occur given the rules in the Persistence discussion.

As we move up the directory hierarchy, we continue applying the same algorithm: check schema_name, then check whether schema_default = True. However, when encountering an instance marked with schema_default = True, we check:

  1. Is this instance already in the result set? If so, skip the instance. Note, the existing instance may NOT be marked as a default.
  2. If not, add the instance to the result set and note the instance as the current default if a default is not currently marked. If a default is currently marked, we still add the instance to the result but unmark it as the default only in the result set.

This continues until all instances have been checked. Upon completion, if there are instances and none of them is considered the default, we should probably raise a ValidationError since this violates the invariant that one instance is always the default. (See Persistence)
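
The retrieval rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not Elyra's implementation: directories are modeled as dicts ordered most local first, ValidationError is a stand-in class, load_instances and its silent flag are hypothetical names, and the instance names in the usage example are made up.

```python
import copy


class ValidationError(Exception):
    """Stand-in for the error type the metadata service would raise."""


def load_instances(directories, schema_name, silent=False):
    """Gather instances local-to-outer and resolve a single default.

    directories: list of {instance_name: instance_dict}, most local first.
    Returns (result_set, default_name).
    """
    result = {}
    default_name = None
    default_level = None

    for level, instances in enumerate(directories):
        for name, instance in instances.items():
            if instance.get("schema_name") != schema_name:
                continue  # file belongs to another schema in this namespace
            if name in result:
                continue  # masked by a more local instance of the same name

            entry = copy.deepcopy(instance)
            if entry.get("schema_default", False):
                if default_name is None:
                    default_name, default_level = name, level
                elif default_level == level:
                    raise ValidationError(
                        f"'{name}' and '{default_name}' are both marked as "
                        "the default in the same directory"
                    )
                else:
                    # A more local default wins; unmark in the result set only.
                    entry["schema_default"] = False
            result[name] = entry

    if result and default_name is None and not silent:
        raise ValidationError(f"no default instance for '{schema_name}'")
    return result, default_name


local = {"my-image": {"schema_name": "runtime-image", "schema_default": True}}
factory = {
    "tensorflow": {"schema_name": "runtime-image", "schema_default": True},
    "pandas": {"schema_name": "runtime-image"},
}
instances, default = load_instances([local, factory], "runtime-image")
print(default)                                    # my-image
print(instances["tensorflow"]["schema_default"])  # False
```

The copy ensures the higher-level instance is un-defaulted only in the result set; the persisted factory instance keeps its own marking.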

Persistence

Now that we’ve defined how load occurs, we can finish looking at instance creation (persistence).

After the instance has been validated against its schema, a check will be made for each of the meta-behavior-related properties. For instances in which schema_default = True, we’ll check for other defaults only in the directory being written to. If another instance there is marked as a default, we raise a ValidationError indicating the name of the instance already marked as the default. Otherwise, we create this instance with it marked as the default.

Since we know this instance is the only instance marked as the default in this directory, we know the schema-wide default constraint is satisfied: based on the previously described retrieval behavior, any defaults in the other directories will be un-defaulted when added to the result set.
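
A minimal sketch of that persistence-time check, assuming the target directory is modeled as a dict and that schema validation has already happened. The names persist_instance and ValidationError are placeholders, as are the instance names.

```python
class ValidationError(Exception):
    """Stand-in for the error type the metadata service would raise."""


def persist_instance(directory: dict, name: str, instance: dict) -> None:
    """Write instance into directory, enforcing one default per directory.

    Only the directory being written to is inspected; defaults in other
    directories are resolved at retrieval time.
    """
    if instance.get("schema_default", False):
        for other_name, other in directory.items():
            if other_name != name and other.get("schema_default", False):
                raise ValidationError(
                    f"instance '{other_name}' is already marked as the default"
                )
    directory[name] = dict(instance)


user_dir = {}
persist_instance(user_dir, "my-image", {"schema_default": True})
try:
    persist_instance(user_dir, "other-image", {"schema_default": True})
except ValidationError as err:
    print(err)  # instance 'my-image' is already marked as the default
```

Re-saving an instance that is already the default is allowed here, since the check skips the entry with the same name.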

Deletion

Deleting non-default instances is trivial, since the retrieval rules take care of them. Deleting a default instance is more difficult. Because we know this level will have no default once the instance is gone, we must check whether a default at a higher level will be unmasked. To do this, perform a silent load (no ValidationError is raised if no defaults are found). If the load still yields a default, the deleted instance was masking another default at a higher level and nothing more needs to be done. However, if the silent load finds NO defaults, that means a remaining instance overrides a higher-level instance and turns off its default marking. In this case, we need to take the non-defaulted instance that is masking its higher-level defaulted counterpart and persist it as the default. As a result, one instance is deleted and a second instance is updated (and persisted in the “most local” directory).
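
One way to picture the deletion flow is the sketch below. It assumes directories are dicts ordered most local first, that every instance belongs to the same schema, and that deletions only happen in the most local directory. find_default plays the role of the silent load; both function names and all instance names are hypothetical.

```python
def find_default(directories):
    """Silent load: return the name of the effective default, or None."""
    seen = set()
    for instances in directories:
        for name, instance in instances.items():
            if name in seen:
                continue  # masked by a more local instance
            seen.add(name)
            if instance.get("schema_default", False):
                return name
    return None


def delete_instance(directories, name):
    """Delete name from the most local directory and repair the default.

    Returns the name of an instance promoted to default, or None when no
    repair was needed (or possible).
    """
    local = directories[0]
    was_default = local[name].get("schema_default", False)
    del local[name]
    if not was_default or find_default(directories) is not None:
        return None

    # No default is visible: look for a higher-level default that is masked.
    for level in range(1, len(directories)):
        for other, instance in directories[level].items():
            if not instance.get("schema_default", False):
                continue
            for nearer in directories[:level]:
                if other in nearer:
                    promoted = dict(nearer[other])
                    promoted["schema_default"] = True
                    local[other] = promoted  # persisted in the most local directory
                    return other
    return None


directories = [
    {"mine": {"schema_default": True}, "factory-a": {"schema_default": False}},
    {"factory-a": {"schema_default": True}, "factory-b": {}},
]
print(delete_instance(directories, "mine"))  # factory-a
print(find_default(directories))             # factory-a
```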

Other meta-behaviors

Each meta-behavior will result in different algorithms. Here are some examples for how we might handle some previously mentioned meta-behaviors.

schema_unique

This meta-behavior is imposed solely at the time of persistence. When an instance contains properties that are included in the schema’s schema_unique list, we must check whether any previously loaded instances have the same values for those properties. If any are encountered, a ValidationError is raised, with one exception: if the other occurrence resides in a same-named instance at a higher level, which this instance will be masking. In that case, the constraint is not violated since all instances in the retrieved result set will still meet the constraint requirements.
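
A sketch of that uniqueness check, under the assumption that the previously loaded result set is a dict keyed by instance name and that the unique properties are top-level ones such as display_name. check_unique and ValidationError are hypothetical names.

```python
class ValidationError(Exception):
    """Stand-in for the error type the metadata service would raise."""


def check_unique(schema: dict, name: str, instance: dict, loaded: dict) -> None:
    """Raise if a schema_unique property collides with another instance.

    loaded maps instance names to already-retrieved instances. A same-named
    entry is skipped because the new instance will mask it.
    """
    for prop in schema.get("schema_unique", []):
        if prop not in instance:
            continue
        for other_name, other in loaded.items():
            if other_name == name:
                continue  # will be masked by the instance being persisted
            if other.get(prop) == instance[prop]:
                raise ValidationError(
                    f"'{prop}' value already used by instance '{other_name}'"
                )


schema = {"schema_unique": ["display_name"]}
loaded = {"tensorflow": {"display_name": "TensorFlow"}}

# Overriding the same-named instance is fine, even with the same display_name.
check_unique(schema, "tensorflow", {"display_name": "TensorFlow"}, loaded)

try:
    check_unique(schema, "my-image", {"display_name": "TensorFlow"}, loaded)
except ValidationError as err:
    print(err)  # 'display_name' value already used by instance 'tensorflow'
```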

schema_encrypted

This meta-behavior will be applied at persistence and also at “time of use”. At persistence, the schema will be checked to see if any of the instance’s properties are contained in the list-valued schema_encrypted property. If so, the instance’s corresponding property value will be encrypted. What is used as the key and other encryption details are TBD, but they would be configured somewhere.

Retrieval will not decrypt the value. This means we’d need to determine if a given value is or is not encrypted during things like updates.

“Time of use” means that when one of these properties is conveyed to its “service”, it will likely need to be decrypted (or not, perhaps depending on the service).
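
Since retrieval would return values still encrypted, updates need a way to tell whether a value has already been processed. One possible approach, sketched below, is to tag stored values with a marker prefix. Everything here is an assumption: the "{enc}" marker, the function names, and especially the cipher, which is replaced by base64 purely as a reversible placeholder. Base64 is NOT encryption and must not be used to protect real secrets.

```python
import base64

ENCRYPTED_PREFIX = "{enc}"  # hypothetical marker for already-processed values


def _placeholder_encrypt(value: str) -> str:
    # Placeholder only: base64 is an encoding, not encryption.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def _placeholder_decrypt(value: str) -> str:
    return base64.b64decode(value.encode("ascii")).decode("utf-8")


def encrypt_for_persistence(schema: dict, metadata: dict) -> dict:
    """Return a copy of metadata with listed properties encrypted once."""
    result = dict(metadata)
    for prop in schema.get("schema_encrypted", []):
        value = result.get(prop)
        if isinstance(value, str) and not value.startswith(ENCRYPTED_PREFIX):
            result[prop] = ENCRYPTED_PREFIX + _placeholder_encrypt(value)
    return result


def decrypt_for_use(value: str) -> str:
    """Decrypt a value at time of use; unmarked values pass through."""
    if value.startswith(ENCRYPTED_PREFIX):
        return _placeholder_decrypt(value[len(ENCRYPTED_PREFIX):])
    return value


schema = {"schema_encrypted": ["cos_password"]}
stored = encrypt_for_persistence(schema, {"cos_password": "secret"})
# Saving again (e.g. during an update) does not double-encrypt.
print(encrypt_for_persistence(schema, stored) == stored)  # True
print(decrypt_for_use(stored["cos_password"]))            # secret
```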

This meta-behavior is more hypothetical and is largely TBD (but should be addressed somehow).

Conclusion

I believe we’re going to need some kind of system-defined behaviors to address these use cases, as well as others. For some, we may want to choose a different approach. However, we should strongly consider solutions that are metadata-driven in order to retain deterministic and objective behaviors.

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 1
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
kevin-bates commented, Dec 6, 2020

Another example of a meta-behavior would be to support Transient Properties. A transient property is a property that is not persisted and really only comes from the client. We could probably associate default values with such properties so that they are “received” by the user with a value set, but that value, even if changed, would not be persisted.

An example of this might occur with credentials, where say, they always come from the client application and only exist for the lifetime of that particular “session”. We could also tie these properties to other meta-properties that, say, load their value from a different location or something more custom.
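
To make the idea concrete, here is a small sketch that separates transient values from the rest before persistence. The schema property name schema_transient and the function split_transient are invented for illustration; the comment above does not name either.

```python
def split_transient(schema: dict, instance: dict):
    """Split an instance into (persisted, transient) parts.

    Properties listed under the hypothetical schema_transient key are kept
    out of the persisted copy and returned separately for session use.
    """
    transient_names = set(schema.get("schema_transient", []))
    persisted = {k: v for k, v in instance.items() if k not in transient_names}
    transient = {k: v for k, v in instance.items() if k in transient_names}
    return persisted, transient


schema = {"schema_transient": ["api_password"]}
incoming = {"api_endpoint": "http://example.invalid", "api_password": "secret"}
persisted, transient = split_transient(schema, incoming)
print(sorted(persisted))  # ['api_endpoint']
print(sorted(transient))  # ['api_password']
```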

We could also have a meta-behavior that simply indicates “this has custom behavior” and, with these kinds of meta-behaviors, the metadata_class_name is assumed to have a value in the schema definition and it’s this custom metadata class that ensures such “custom properties” are persisted (by virtue of still residing in the JSON prior to its persistence), validated (perhaps a password comparison), and/or not persisted (providing the transient behavior described above).

0 reactions
ptitzler commented, Aug 24, 2020
  1. Ah yes, sorry I didn’t realize some of the nuances and therefore assumed incorrectly that the proposal extended to all properties, not just children of properties.metadata.