Linking source code to software applications (entrypoints and service endpoints aka SaaS) with regard for interface type
The aim of this proposal is to:
- explicitly specify the interface type(s) provided by software
- make an explicit distinction and explicit link between software and software as a service
- allow linking to software instances (services) from the source code metadata
- relate SoftwareSourceCode and SoftwareApplication in both directions
- make ‘entry points’ and ‘service endpoints’ explicit
- settle some ambiguous terms
Linking source code to application entrypoint and service endpoints
In #198, #229 and #246 it was discussed and subsequently decided to add hasSourceCode to schema.org and codemeta; a good idea. I propose we also add a property that is the exact and unambiguous reverse of this. I suggest providesApplication = @reverse hasSourceCode … There is also targetProduct (#267), which has the same domain and range, but there seems to be a lot of confusion about what targetProduct means exactly. Schema.org defines it as: “Target Operating System / Product to which the code applies. If applies to several versions, just the product name can be used.” That is too vaguely defined, and there is conflicting information in #267, #246 and #198.
Various aspects of what I propose here affect schema.org directly but I thought it better to pass this through the codemeta community first.
The providesApplication property would allow explicitly linking from the source code metadata to software applications. This makes two things possible:
- Expressing entry points for software in a better way, as I proposed earlier in #183 back in 2018. An entry point here is simply defined as an executable provided by the source code, each of which can be considered a schema:SoftwareApplication in its own right.
- Linking source code to service instances where the application is running (service endpoints), each typically associated with a URL. Here the range would be:
  * schema:WebAPI (as proposed in schemaorg/schemaorg#1423 and worked out in schemaorg/schemaorg#2635) - Emphasis here is on the machine interface. The existing schema:EntryPoint also has a place in what they proposed here. Their proposal also covers linking to formal specifications (like OpenAPI/swagger).
  * schema:WebApplication - Emphasis here is on the human interface (web UI).
  * schema:WebPage - Emphasis here is on the human interface.
  The domain of schema:hasSourceCode would also need to be extended to include all three of these.
I think we can use providesApplication to cover both cases, but alternatively we could envision two properties (providesApplication vs providesService?)
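To make the proposed inverse relationship concrete, here is a minimal Python sketch that turns source code metadata using providesApplication into the application-centric view, where each application points back through hasSourceCode. Note that providesApplication is only the property proposed in this issue, and the function name invert_provides_application and the trimmed sample data are made up for illustration.

```python
import copy


def invert_provides_application(source_code: dict) -> list[dict]:
    """Return one application-centric record per provided application.

    Assumes the proposed (not standardised) `providesApplication` property;
    each result links back to the source code through `hasSourceCode`.
    """
    # Everything except the forward links describes the source code itself.
    source_only = {
        key: value
        for key, value in source_code.items()
        if key != "providesApplication"
    }
    provided = source_code.get("providesApplication", [])
    if isinstance(provided, dict):
        # JSON-LD allows a single value instead of a list.
        provided = [provided]

    applications = []
    for application in provided:
        record = copy.deepcopy(application)
        record["hasSourceCode"] = copy.deepcopy(source_only)
        applications.append(record)
    return applications


# Trimmed, illustrative sample data.
frog = {
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "codeRepository": "https://github.com/LanguageMachines/frog",
    "providesApplication": [
        {"type": "CommandLineApplication", "executableName": "frog"},
        {"type": "WebApplication", "url": "https://webservices.cls.ru.nl/frog"},
    ],
}

for app in invert_provides_application(frog):
    print(app["type"], "->", app["hasSourceCode"]["codeRepository"])
```

The input is left unmodified, so the same metadata can be published from either perspective.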
Interface type
A software application offers one or more interfaces through which users or machines can interact with it. I’d like to make this information explicit. When using providesApplication with schema:WebAPI/WebApplication/WebPage it is already implied. For the more generic schema:SoftwareApplication it is not. The specific types schema:MobileApplication, schema:VideoGame and the aforementioned schema:WebApplication already exist, but other interface types are not covered yet. We could extend these with:
- CommandLineApplication (command line interfaces)
- DesktopApplication (Desktop GUIs)
- TerminalApplication (Text UIs, think of vim, mutt and ncurses-based tools etc.)
- SoftwareDaemon (Software running as a daemon providing some kind of service over a network or local socket, think e.g. of ntpd, crond); this would be more generic than WebApplication (or WebAPI).
- SoftwareLibrary (APIs, think of libraries, either in the form of shared-objects/dll/dylib or in the form of modules for interpreted languages like Python)
More specific types can be envisioned (relates to #256):
- NotebookApplication (more specific form of WebApplication) - For Jupyter Notebooks and comparable technologies. Characterised by a mixture of text and code, often used in data science. May or may not be tied to a specific URL where an interactive instance is available (e.g. a link to Binder/Colab).
- SoftwareImage - A software application in some kind of image form (such as an OCI container, e.g. Docker) that typically ships the software with all its immediate dependency context. May or may not be tied to a specific URL where the image is obtained (e.g. Docker Hub). Here the provided interface is relevant for operators (in a DevOps context) seeking to deploy the software in an infrastructure.
- SoftwarePackage - The software in some packaged form (e.g. for a particular Linux distribution, Homebrew, a Python wheel, etc.). The difference between this and SoftwareImage is that this packages only the software and not its dependency context; the dependency context is assumed to be explicitly expressed in the package but is obtained from other packages within the same packaging context (whatever package distribution method that may be).
Alternatively, we could have an interfaceType property like I suggested in #183, but there already seems to be precedent in schema.org for doing it with types, so that might be the best way to follow.
An important point to consider is that a software application, even implemented in a single executable, may provide multiple types. But assigning multiple types is not an obstacle, correct me if I’m wrong, so that should be covered already.
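As a small sketch of the multiple-types point: JSON-LD permits a list as the value of @type, so a single executable could carry several interface types at once. The type names used here are the ones proposed in this issue rather than existing schema.org types, and the helper has_type and the sample application are hypothetical.

```python
import json


def has_type(node: dict, type_name: str) -> bool:
    """Check whether a JSON-LD node carries a given type.

    Handles both a single type string and a list of types.
    """
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return type_name in types


# Hypothetical application offering both a command line and a daemon mode.
application = {
    "@type": ["CommandLineApplication", "SoftwareDaemon"],
    "name": "Example Tool",
    "executableName": "example-tool",
}

print(json.dumps(application, indent=2))
print(has_type(application, "SoftwareDaemon"))
```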
Executable Name
In order to express entry points explicitly, it’s important to list the exact executable names, which are not necessarily identical to the name. Alternatively, one may argue that schema:identifier suffices for this.
There is already a schema:executableLibraryName property (used in a documentation context on APIReference). That could be reused for the proposed SoftwareLibrary. But a more generic executableName would need to be introduced for the others, and there’s no real reason not to use that for libraries as well. The executableName would be defined as that which is runnable (within a certain runtimePlatform context); it should not contain platform-specific extensions like .exe, .so, .dylib, .dll but just the name portion. For software libraries on platforms like Python it would correspond to the top-level module name that can be imported.
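The extension rule could be sketched roughly as follows in Python. This is only an illustration under the assumption that the four extensions named above are the complete set to strip; the function name executable_name and the constant are hypothetical, and versioned names such as libfoo.so.1 are deliberately not handled.

```python
from pathlib import PurePosixPath

# Platform-specific extensions that should not appear in executableName
# (assumed to be exactly the ones listed in the proposal).
PLATFORM_EXTENSIONS = {".exe", ".so", ".dylib", ".dll"}


def executable_name(filename: str) -> str:
    """Derive a candidate executableName from a file name or POSIX path."""
    path = PurePosixPath(filename)
    if path.suffix.lower() in PLATFORM_EXTENSIONS:
        # Drop only the platform-specific extension, keep the name portion.
        return path.stem
    return path.name


for candidate in ["frog", "frog.exe", "libfrog.so", "/usr/lib/libfrog.dylib"]:
    print(candidate, "->", executable_name(candidate))
```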
Such a property may also make sense directly on SoftwareSourceCode, allowing for a more succinct expression rather than needing to go via providesApplication and the corresponding SoftwareApplication subtypes.
Example
Consider the following example of a SoftwareSourceCode instance where the codebase provides various interface types. (This software actually exists, though in reality it’s not a single codebase that provides all these interfaces; it’s split into multiple repositories, but it would be conceivable that someone does it like this):
{
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "codeRepository": "https://github.com/LanguageMachines/frog",
    ...,
    "providesApplication": [
        {
            "type": "CommandLineApplication",
            "executableName": "frog",
            "name": "Frog",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "libfrog",
            "name": "Frog Library",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "frog",
            "name": "Frog Python Binding",
            "runtimePlatform": "Python"
        },
        {
            "type": "WebAPI",
            "provider": "Radboud Universiteit Nijmegen",
            "endpointUrl": "https://webservices.cls.ru.nl/frog",
            "endpointDescription": "https://webservices.cls.ru.nl/frog",
            "conformsTo": "https://clam.readthedocs.io/en/stable/",
            "documentation": "https://webservices.cls.ru.nl/frog/info",
            "contentType": "application/xml"
        },
        {
            "type": "WebApplication",
            "executableName": "frog-service",
            "provider": "Radboud Universiteit Nijmegen",
            "url": "https://webservices.cls.ru.nl/frog"
        }
    ]
}
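A consumer of such metadata might want to separate local entry points from running service instances. The following Python sketch shows one way to do that on a trimmed copy of the example; the helper split_applications and the SERVICE_TYPES set are assumptions for illustration, and the type and property names are the ones proposed in this issue.

```python
# Types that denote a running service instance rather than a local executable.
SERVICE_TYPES = {"WebAPI", "WebApplication", "WebPage"}


def split_applications(source_code: dict) -> tuple[list[dict], list[dict]]:
    """Split provided applications into (entry points, service instances)."""
    entry_points, services = [], []
    for application in source_code.get("providesApplication", []):
        if application.get("type") in SERVICE_TYPES:
            services.append(application)
        else:
            entry_points.append(application)
    return entry_points, services


# Trimmed copy of the example metadata.
frog_example = {
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "providesApplication": [
        {"type": "CommandLineApplication", "executableName": "frog"},
        {"type": "SoftwareLibrary", "executableName": "libfrog"},
        {"type": "WebAPI", "endpointUrl": "https://webservices.cls.ru.nl/frog"},
        {"type": "WebApplication", "url": "https://webservices.cls.ru.nl/frog"},
    ],
}

entry_points, services = split_applications(frog_example)
print("entry points:", [app["executableName"] for app in entry_points])
print("services:", [app.get("endpointUrl") or app.get("url") for app in services])
```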
Conclusion
I’ve tried to tie together some existing loose ends in this proposal, reusing as much of the existing codemeta/schema vocabulary as possible and linking with other existing proposals, keeping the amount of newly introduced vocabulary to a minimum.
What this subsequently allows is expressing software metadata from multiple perspectives: one may start with a codemeta.json and the source code as a basis and produce a complete tree of software applications and service instances that are provided by the source code. In a research context, there’s often a single institute bringing a web demo of certain research software online, possibly for demo purposes. It makes sense to be able to accommodate this metadata directly from the codemeta.json in the source code root.
Moreover, this enables conversion of entrypoint metadata already present in e.g. Python setup.py to codemeta/schema.
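As a rough illustration of such a conversion, the sketch below maps setuptools-style console_scripts declarations onto the proposed CommandLineApplication type. It assumes the entry points are available as the dictionary passed to setup(entry_points=...); the function name, the package mytool and its entry point are hypothetical.

```python
def console_scripts_to_applications(entry_points: dict, name: str) -> list[dict]:
    """Map console_scripts declarations to proposed CommandLineApplication nodes."""
    applications = []
    for spec in entry_points.get("console_scripts", []):
        # Each spec looks like "command = package.module:function".
        command, _, _target = spec.partition("=")
        applications.append(
            {
                "type": "CommandLineApplication",
                "name": name,
                "executableName": command.strip(),
            }
        )
    return applications


# Hypothetical entry points as they might appear in a setup.py.
entry_points = {
    "console_scripts": [
        "mytool = mytool.cli:main",
        "mytool-server = mytool.server:main",
    ]
}

for app in console_scripts_to_applications(entry_points, name="My Tool"):
    print(app)
```

The resulting list could then be placed under providesApplication in the generated codemeta.json.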
For those who take the other perspective and express metadata as WebAPI or WebPage or WebApplication first and foremost, this provides the means to explicitly link it to the source code.
Apologies for the long post but I wanted to make sure to sketch a complete picture, I’d be appreciative of any feedback. Most of this is probably more for schema.org than codemeta but I wanted to discuss it here first and see what you suggest. I’d also like to poke @dgarijo in this because I see he’s been doing some excellent work on formalizing things in the Software Description Ontology and we have some overlap there (this touches upon #229 and #256).
Top GitHub Comments
I have implemented the ideas from this issue in codemetapy and released a major new version today (2.0). It uses targetProduct to link source code to application instances and uses the extra types from https://github.com/SoftwareUnderstanding/software_types to describe their types.
Additionally, I just released two new tools based on codemetapy for which this functionality was needed. (I opened a pull request for inclusion on the website (codemeta/codemeta.github.io#39) as well):
A live demo of this ensemble of codemeta software can currently be found here: https://tools.dev.clariah.nl/ (mind that it’s still in development).
@dgarijo Thanks! I have committed the initial proposal as it stands, we can continue from there next week!