Linking source code to software applications (entrypoints and service endpoints aka SaaS) with regard for interface type

The aim of this proposal is to:

  • explicitly specify the interface type(s) provided by software
  • make an explicit distinction and explicit link between software and software as a service
  • allow linking to software instances (services) from the source code metadata
  • relate SoftwareSourceCode and SoftwareApplication in both directions
  • make ‘entry points’ and ‘service endpoints’ explicit
  • settle some ambiguous terms

Linking source code to application entrypoint and service endpoints

In #198, #229 and #246 it was discussed and subsequently decided to add hasSourceCode to schema.org and codemeta, which is a good idea. I propose we also add a property that is the exact and unambiguous reverse of this. I suggest providesApplication = @reverse hasSourceCode… There is also targetProduct (#267), which has the same domain and range, but there seems to be a lot of confusion about what targetProduct means exactly. Schema.org defines it as: “Target Operating System / Product to which the code applies. If applies to several versions, just the product name can be used.” That definition is too vague, and there is conflicting information in #267, #246 and #198.

Various aspects of what I propose here affect schema.org directly but I thought it better to pass this through the codemeta community first.

The providesApplication property would allow explicitly linking from the source code metadata to software applications. This makes two things possible:

  1. This would provide a better means of expressing entry points for software, as I proposed earlier in #183 back in 2018. An entry point here is simply defined as an executable provided by a source code, each of which can be considered a schema:SoftwareApplication in their own right.
  2. Linking source code to service instances where the application is running (service endpoints), each typically associated with a URL. Here the range would be:
       • schema:WebAPI (as proposed in schemaorg/schemaorg#1423 and worked out in schemaorg/schemaorg#2635) - Emphasis here is on the machine interface. The existing schema:EntryPoint also has a place in what they proposed here. Their proposal also covers linking to formal specifications (like OpenAPI/swagger).
       • schema:WebApplication - Emphasis here is on the human interface (web UI).
       • schema:WebPage - Emphasis here is on the human interface.
     The domain of schema:hasSourceCode would also need to be extended to include all three of these.

I think we can use providesApplication to cover both cases, but alternatively we could envision two properties (providesApplication vs providesService?)
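The reverse relation described in this section can be illustrated with a minimal Python sketch. It assumes the metadata is held as plain dictionaries, that applications point at their source via hasSourceCode, and that providesApplication is the proposed (not yet existing) reverse property. The function name and the @id values are hypothetical.

```python
from collections import defaultdict


def invert_has_source_code(applications):
    """Group applications by the source code they point to via hasSourceCode.

    Returns a dict mapping each source code identifier to the list of
    application identifiers, i.e. the proposed providesApplication view.
    """
    provides = defaultdict(list)
    for app in applications:
        source = app.get("hasSourceCode")
        if source is None:
            continue  # application without a declared source code link
        # hasSourceCode may hold a single value or a list of values
        sources = source if isinstance(source, list) else [source]
        for src in sources:
            provides[src].append(app["@id"])
    return dict(provides)


# Hypothetical identifiers, for illustration only
apps = [
    {"@id": "app:frog-cli", "@type": "SoftwareApplication", "hasSourceCode": "src:frog"},
    {"@id": "app:frog-web", "@type": "WebApplication", "hasSourceCode": "src:frog"},
    {"@id": "app:other", "@type": "SoftwareApplication"},
]
provides_application = invert_has_source_code(apps)
```

In a JSON-LD setting the same effect would more likely be achieved declaratively in the context, so this is only meant to show that the two properties carry the same information in opposite directions.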

Interface type

A software application offers one or more interfaces through which users or machines can interact with it. I’d like to make this information explicit. When using providesApplication with schema:WebAPI/WebApplication/WebPage it is already implied. For the more generic schema:SoftwareApplication it is not. The specific types schema:MobileApplication, schema:VideoGame and aforementioned schema:WebApplication already exist, but other interface types are not covered yet. We could extend these with:

  • CommandLineApplication (command line interfaces)
  • DesktopApplication (Desktop GUIs)
  • TerminalApplication (Text UIs, think of vim, mutt and ncurses-based tools etc.)
  • SoftwareDaemon (Software running as a daemon providing some kind of service over a network or local socket, think e.g. of ntpd, crond), this would be more generic than WebApplication (or WebAPI).
  • SoftwareLibrary (APIs, think of libraries, either in the form of shared-objects/dll/dylib or in the form of modules for interpreted languages like Python)

More specific types can be envisioned (relates to #256):

  • NotebookApplication (more specific form of WebApplication) - For Jupyter Notebooks and comparable technologies. Characterised by a mixture of text and code, often used in data science. May or may not be tied to a specific URL where an interactive instance is available (e.g. a link to Binder/Colab).
  • SoftwareImage - A software application in some kind of image form (such as an OCI container (e.g. Docker)), that typically ships the software with all its immediate dependency context. May or may not be tied to a specific url where the image is obtained (e.g. Docker Hub). Here the provided interface is relevant for operators (in a DevOps context) seeking to deploy the software in an infrastructure.
  • SoftwarePackage - The software in some packaged form (e.g. for a particular Linux distribution, Homebrew, a Python wheel, etc.). The difference between this and SoftwareImage is that this packages only the software and not its dependency context. The dependencies are assumed to be explicitly expressed in the package, but are obtained from other packages within the same packaging context (whatever package distribution method that may be).

Alternatively, we could have an interfaceType property like I suggested in #183, but there already seems to be precedent in schema.org for doing it with types, so that might be the best way to follow.

An important point to consider is that a software application, even implemented in a single executable, may provide multiple types. But assigning multiple types is not an obstacle, correct me if I’m wrong, so that should be covered already.

Executable Name

In order to express entry points explicitly, it’s important to list the exact executable names, which are not necessarily identical to the name. Alternatively, one may argue that schema:identifier suffices for this.

There is already a schema:executableLibraryName property (used in a documentation context on APIReference). That could be reused for the proposed SoftwareLibrary. But a more generic executableName would need to be introduced for the others, and there’s no real reason not to use that for libraries as well. The executableName would be defined as the name of what is runnable (within a certain runtimePlatform context). It should not contain platform-specific extensions like .exe, .so, .dylib or .dll, but just the name portion. For software libraries on platforms like Python, it would correspond to the top-level module name that can be imported.
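As a rough illustration of the naming rule above, the following Python sketch derives an executableName from a file name by dropping the four platform-specific extensions mentioned in the text. The function name and the constant are hypothetical, and executableName itself is only a proposed property.

```python
# Extensions named in the proposal; the tuple itself is an illustrative assumption
PLATFORM_EXTENSIONS = (".exe", ".so", ".dylib", ".dll")


def executable_name(filename):
    """Return the name portion of a file name, without a platform-specific extension."""
    lowered = filename.lower()  # compare case-insensitively, e.g. "Frog.DLL"
    for ext in PLATFORM_EXTENSIONS:
        if lowered.endswith(ext):
            return filename[: -len(ext)]
    return filename  # no known extension: the name is used as-is


names = [executable_name(f) for f in ("frog.exe", "libfrog.so", "frog")]
```

Versioned library names such as libfrog.so.1 are deliberately not handled here; how those should map to an executableName is left open by the proposal.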

Such a property may also make sense directly on SoftwareSourceCode, allowing for a more succinct expression rather than needing to go via providesApplication and the corresponding SoftwareApplication subtypes.

Example

Consider the following example of a SoftwareSourceCode instance where the codebase provides various interface types. (This software actually exists though in reality it’s not a single codebase that provides all these interfaces, it’s split into multiple repositories, but it would be conceivable someone does it like this):

{
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "codeRepository": "https://github.com/LanguageMachines/frog",
    ...,
    "providesApplication": [
        {
            "type": "CommandLineApplication",
            "executableName": "frog",
            "name": "Frog",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "libfrog",
            "name": "Frog Library",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "frog",
            "name": "Frog Python Binding",
            "runtimePlatform": "Python"
        },
        {
            "type": "WebAPI",
            "provider": "Radboud Universiteit Nijmegen",
            "endpointUrl": "https://webservices.cls.ru.nl/frog",
            "endpointDescription": "https://webservices.cls.ru.nl/frog",
            "conformsTo": "https://clam.readthedocs.io/en/stable/",
            "documentation": "https://webservices.cls.ru.nl/frog/info",
            "contentType": "application/xml"
        },
        {
            "type": "WebApplication",
            "executableName": "frog-service",
            "provider": "Radboud Universiteit Nijmegen",
            "url": "https://webservices.cls.ru.nl/frog"
        }
    ]
}
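To show how such metadata might be consumed, here is a minimal Python sketch that separates locally runnable entry points from service instances. It assumes the structure of the example above (the proposed providesApplication and executableName properties, and the "type" key as written there). The function name is hypothetical, and the data is an abridged copy of the example.

```python
def split_applications(source_code):
    """Separate entries that carry a service URL from locally runnable entry points."""
    entry_points, services = [], []
    for app in source_code.get("providesApplication", []):
        # A URL marks a running service instance (service endpoint)
        url = app.get("url") or app.get("endpointUrl")
        if url:
            services.append((app["type"], url))
        else:
            entry_points.append((app["type"], app.get("executableName")))
    return entry_points, services


# Abridged copy of the example metadata
frog = {
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "providesApplication": [
        {"type": "CommandLineApplication", "executableName": "frog"},
        {"type": "SoftwareLibrary", "executableName": "libfrog"},
        {"type": "WebAPI", "endpointUrl": "https://webservices.cls.ru.nl/frog"},
    ],
}
entry_points, services = split_applications(frog)
```

Treating the presence of a URL as the distinguishing criterion is an assumption of this sketch; with the proposed types one could equally dispatch on the type value instead.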

Conclusion

I’ve tried to tie together some existing loose ends in this proposal, reusing as much of the existing codemeta/schema vocabulary as possible and linking with other existing proposals, keeping the amount of newly introduced vocabulary to a minimum.

What this subsequently allows is expressing software metadata from multiple perspectives. One may start with a codemeta.json and the source code as a basis and produce a complete tree of software applications and service instances that are provided by the source code. In a research context, there’s often a single institute bringing a web demo of certain research software online, possibly for demo purposes. It makes sense to be able to accommodate this metadata directly in the codemeta.json in the source code root.

Moreover, this enables conversion of entrypoint metadata already present in e.g. Python setup.py, to codemeta/schema.
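A conversion of this kind could look roughly like the Python sketch below. It assumes the common setuptools console_scripts form "name = package.module:function" and maps each entry to the proposed CommandLineApplication type with the proposed executableName property. The function name and the sample entry point are hypothetical, and this is not how any existing tool is known to do it.

```python
def console_scripts_to_applications(console_scripts):
    """Map setuptools-style console_scripts specs to proposed application entries."""
    applications = []
    for spec in console_scripts:
        # Each spec has the form "executable = package.module:function";
        # only the executable name on the left-hand side is needed here.
        executable, _, _target = spec.partition("=")
        applications.append(
            {
                "type": "CommandLineApplication",
                "executableName": executable.strip(),
                "runtimePlatform": "Python",
            }
        )
    return applications


# Hypothetical entry point, for illustration only
provided = console_scripts_to_applications(["frog-cli = frog.cli:main"])
```

The resulting list could then be placed under providesApplication in the generated codemeta.json.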

For those who take the other perspective and express metadata as WebAPI or WebPage or WebApplication first and foremost, this provides the means to explicitly link it to the source code.

Apologies for the long post, but I wanted to make sure to sketch a complete picture. I’d be appreciative of any feedback. Most of this is probably more for schema.org than codemeta, but I wanted to discuss it here first and see what you suggest. I’d also like to poke @dgarijo on this, because I see he’s been doing some excellent work on formalizing things in the Software Description Ontology and we have some overlap there (this touches upon #229 and #256).

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 13

Top GitHub Comments

proycon commented, May 17, 2022

I have implemented the ideas from this issue in codemetapy and released a major new version today (2.0). It uses targetProduct to link source code to application instances and uses the extra types from https://github.com/SoftwareUnderstanding/software_types to describe their types.

Additionally, I just released two new tools based on codemetapy for which this functionality was needed. (I also opened a pull request for inclusion on the website: codemeta/codemeta.github.io#39.)

  • codemeta-harvester - A wrapper around codemetapy and other tools, provides a full automatic conversion pipeline to codemeta
  • codemeta-server - A webservice/webapplication to search and browse codemeta (offers a SPARQL endpoint etc)…

A live demo of this ensemble of codemeta software can currently be found here: https://tools.dev.clariah.nl/ (mind that it’s still in development).

proycon commented, Feb 3, 2022

@dgarijo Thanks! I have committed the initial proposal as it stands, we can continue from there next week!
