Prototype loading privacy declarations directly from source code

Keeping system declarations in separate files is a potential burden for users. A good middle ground between full code analysis and what we have now is to co-locate the declarations with the code they describe.

The two implementation methods I can think of for the POC are as follows:

  1. A very Python-specific implementation where we ingest the Python code, extract the docstrings, and then extract the system declarations from them
    • A major issue here is that this does not generalize to other languages
  2. A more general approach where we treat each source code file as plain text, use regex to look for matching blocks, and attempt to load each match into a system declaration
    • Because we would still expect the declaration to be YAML-like, this would only work in languages with multi-line comments

Option 1: Declaration inside of the docstring

def some_func(some_parameter: str) -> None:
    """
    Do something important with user data.

    system:
      - fides_key: demo_analytics_system
        name: Demo Analytics System
        description: A system used for analyzing customer behaviour.
        system_type: Service
        privacy_declarations:
          - name: Analyze customer behaviour for improvements.
            data_categories:
              - user.provided.identifiable.contact
              - user.derived.identifiable.device.cookie_id
            data_use: improve.system
            data_subjects:
              - customer
            data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
            dataset_references:
              - demo_users_dataset
    """

    user_data = get_user_data(some_parameter)
    advertise_to(user_data)

Option 2: Declaration as a multi-line comment:

"""
system:
  - fides_key: demo_analytics_system
    name: Demo Analytics System
    description: A system used for analyzing customer behaviour.
    system_type: Service
    privacy_declarations:
      - name: Analyze customer behaviour for improvements.
        data_categories:
          - user.provided.identifiable.contact
          - user.derived.identifiable.device.cookie_id
        data_use: improve.system
        data_subjects:
          - customer
        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
        dataset_references:
          - demo_users_dataset
"""

def some_func(some_parameter: str) -> None:
    """
    Do something important with user data.
    """

    user_data = get_user_data(some_parameter)
    advertise_to(user_data)
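
A rough sketch of the regex approach for Option 2 follows. It treats the file as plain text and keeps only multi-line comment bodies that begin with `system:`. The function name `find_declarations` and the two delimiter styles (Python triple double quotes and C-style `/* ... */`) are assumptions chosen for illustration; a real implementation would need per-language delimiters and YAML validation of each match.

```python
import re
import textwrap
from typing import List

# Assumed delimiters: Python triple double quotes and C-style block comments.
BLOCK_COMMENT = re.compile(r'"""(.*?)"""|/\*(.*?)\*/', re.DOTALL)


def find_declarations(text: str) -> List[str]:
    """Return the bodies of block comments that start with `system:`."""
    found: List[str] = []
    for match in BLOCK_COMMENT.finditer(text):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        body = textwrap.dedent(body).strip()
        # Ordinary docstrings and comments are skipped; only blocks that
        # open with the declaration key are kept.
        if body.startswith("system:"):
            found.append(body)
    return found
```

Note that this sketch only recognizes standalone declaration blocks. A declaration embedded partway through a docstring, as in Option 1, would not match.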

An additional caveat here is that it would be extremely difficult, if not impossible, for an editor plugin to help with these annotations, because they are embedded in other source code.

Additional questions to think about:

  • Do we have the user define a system in a system.yaml file, and then attribute all of the nearby code declarations to that?
  • Or do they need to define a system per declaration? That seems weird, so the option above seems better
  • How should this be handled during evaluations? Should it be done at apply/evaluate time, or should there be a separate command that generates a full system.yaml file from the source code declarations?

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 20 (19 by maintainers)

Top GitHub Comments

2 reactions
ThomasLaPiana commented, Dec 14, 2021

@edthedev With the feature I proposed, the coverage report would show how many of your system declarations have associated code files. The new system declaration would look like this:

system:
  - fides_key: fidesctl_system
    name: Fidesctl System
    code_paths:
      - src/some_code_file.py
      - src/another_code_file.py

This method has the benefit of being language-agnostic, and we can then throw errors when the code_paths section is empty. We could also move it down into the declarations section.
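
As a minimal sketch of the coverage idea described in this comment, the check below counts how many already-parsed system declarations list at least one code path. The function name `code_path_coverage` and the plain-dict input shape are assumptions for illustration; only the `fides_key` and `code_paths` keys come from the example above.

```python
from typing import List, Tuple


def code_path_coverage(systems: List[dict]) -> Tuple[int, int, List[str]]:
    """Return (covered, total, keys of systems with no code_paths)."""
    missing = [
        system.get("fides_key", "<unknown>")
        for system in systems
        if not system.get("code_paths")  # absent, None, or empty list
    ]
    return len(systems) - len(missing), len(systems), missing


systems = [
    {
        "fides_key": "fidesctl_system",
        "name": "Fidesctl System",
        "code_paths": ["src/some_code_file.py", "src/another_code_file.py"],
    },
    {"fides_key": "undocumented_system", "code_paths": []},
]
covered, total, missing = code_path_coverage(systems)
print(f"{covered}/{total} systems have code paths; missing: {missing}")
```

A caller could raise an error whenever `missing` is non-empty, which matches the suggestion to fail on an empty code_paths section.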

1 reaction
iamkellllly commented, Dec 7, 2021

Most specifically relevant to item (2) above, opened issues for additional documentation:
