Prototype loading privacy declarations directly from source code

The separate system declarations are a potential burden for users. A good middle-ground between code analysis and what we have now is to co-locate the declarations and the code.

The two implementation methods I can think of for the POC are as follows:

1. A very Python-specific implementation where we ingest the Python code, extract the docstrings, and then extract the system declarations from them.
   - A major issue here is that this does not generalize to other languages.
2. A more general approach where we treat each source code file as plain text, use regex to find matching blocks, and attempt to load each one into a system declaration.
   - Because we would still expect the declaration to be YAML-like, this would only work in languages with multi-line comments.

Option 1: Declaration inside of the docstring

def some_func(some_parameter: str) -> None:
    """
    Do something important with user data.

    system:
      - fides_key: demo_analytics_system
        name: Demo Analytics System
        description: A system used for analyzing customer behaviour.
        system_type: Service
        privacy_declarations:
          - name: Analyze customer behaviour for improvements.
            data_categories:
              - user.provided.identifiable.contact
              - user.derived.identifiable.device.cookie_id
            data_use: improve.system
            data_subjects:
              - customer
            data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
            dataset_references:
              - demo_users_dataset
    """

    user_data = get_user_data(some_parameter)
    advertise_to(user_data)

Option 2: Declaration as a multi-line comment

"""
system:
  - fides_key: demo_analytics_system
    name: Demo Analytics System
    description: A system used for analyzing customer behaviour.
    system_type: Service
    privacy_declarations:
      - name: Analyze customer behaviour for improvements.
        data_categories:
          - user.provided.identifiable.contact
          - user.derived.identifiable.device.cookie_id
        data_use: improve.system
        data_subjects:
          - customer
        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
        dataset_references:
          - demo_users_dataset
"""

def some_func(some_parameter: str) -> None:
    """
    Do something important with user data.
    """

    user_data = get_user_data(some_parameter)
    advertise_to(user_data)

An additional caveat here is that it would be extremely difficult if not impossible for a plugin to help with these annotations, as they’re embedded in other source code.
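
For Option 2, the regex-based scan could be sketched roughly as below. The pattern, the function name `find_system_declarations`, and the choice of comment delimiters (`"""` and `/* */`) are all assumptions for illustration. The sketch only collects comment bodies that begin with `system:` and does not parse the YAML.

```python
import re
import textwrap
from typing import List

# Hypothetical pattern: triple-quoted blocks or C-style block comments.
BLOCK_COMMENT = re.compile(r'"""(.*?)"""|/\*(.*?)\*/', re.DOTALL)


def find_system_declarations(text: str) -> List[str]:
    """Return the dedented body of every block comment that starts with `system:`."""
    declarations: List[str] = []
    for match in BLOCK_COMMENT.finditer(text):
        # Exactly one of the two groups is set, depending on the delimiter matched.
        body = match.group(1) if match.group(1) is not None else match.group(2)
        body = textwrap.dedent(body).strip()
        if body.startswith("system:"):
            declarations.append(body)
    return declarations
```

Since the file is treated as plain text, delimiters that appear inside string literals or are left unbalanced would confuse the matching, so a real implementation would likely need stricter markers.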

Additional questions to think about:

- Do we have the user define a system in a system.yaml file, and then attribute all of the nearby code declarations to it?
- Or do they need to define a system per declaration? That seems odd, so the option above seems better.
- How should this be handled during evaluations? Should it be done at apply/evaluate time, or should there be a separate command that generates a full system.yaml file from the source code declarations?

Issue Analytics

State:
Created 2 years ago
Comments: 20 (19 by maintainers)

Top GitHub Comments

ThomasLaPiana commented, Dec 14, 2021

@edthedev With the feature I proposed, the coverage report would show how many of your system declarations have associated code files. The new system declaration would look like this:

system:
  - fides_key: fidesctl_system
    name: Fidesctl System
    code_paths:
      - src/some_code_file.py
      - src/another_code_file.py

This method has the benefit of being language-agnostic, and we can then raise errors when the code_paths section is empty. We could also move it down into the declarations section.
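
A rough sketch of the check described in this comment, assuming the manifest has already been loaded into a dict with the shape shown above. The function name `check_code_paths` and the extra "file exists" check are illustrative assumptions, not part of fidesctl.

```python
from pathlib import Path
from typing import Dict, List


def check_code_paths(manifest: dict, root: Path) -> Dict[str, List[str]]:
    """Return problems per fides_key: an empty code_paths list or files missing under root."""
    problems: Dict[str, List[str]] = {}
    for system in manifest.get("system", []):
        key = system.get("fides_key", "<missing fides_key>")
        code_paths = system.get("code_paths") or []
        if not code_paths:
            problems[key] = ["code_paths is empty"]
            continue
        # Assumption: paths are relative to the repository root.
        missing = [path for path in code_paths if not (root / path).is_file()]
        if missing:
            problems[key] = [f"file not found: {path}" for path in missing]
    return problems
```

An empty result would mean every declared system points at at least one existing file, which is the number a coverage report could be built from.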

iamkelllly commented, Dec 7, 2021

Most specifically relevant to item (2) above, opened issues for additional documentation:

Ideate how we can add anonymous analytics to fidesctl

Dataset section in Fides resource types is out of sync with the current implementation.