OpenAPI vocabulary or dialect for code generation
Code generation tools often have special requirements or restrictions on the structure of an OpenAPI definition (document?) that improve the generated code. Here are some examples of restrictions from the IBM OpenAPI SDK generator:
- Parameters must be unique by name only, irrespective of “in”.
  Rationale: Operation parameters are often rendered as the parameters of a function or method in the target language of the code generator. Since most languages require parameters to have unique names, the code generator would need to incorporate the `in` of a parameter into its name to prevent name collisions. This is undesirable, since it exposes the mechanics of the API without adding any value.
- There should be at most one success response with a response body. A 204 alongside one other 2XX response is okay, but no other combination of two or more 2XX responses.
  Rationale: In statically-typed languages like Java, the return value of a method must have a single static type. This makes it difficult to represent an operation with two different response schemas as a single method returning a single response type.
- Property names and parameter names must be “case-insensitive” unique.
  Rationale: Code generators often reformat the names of parameters, properties, and schemas to use idiomatic case formatting for the target language: lower_snake_case for Python, lowerCamelCase for Java, etc. But this reformatting could introduce naming conflicts if two parameters, e.g. “foo_bar” and “fooBar”, are not “case-insensitive” unique.
- Arrays must contain items of a single type.
  Rationale: Many languages require an array to contain only values of a single type.
- Schema type must specify a single type; no type arrays.
  Rationale: Some widely-used statically-typed languages (e.g. Java and Go) have no provision for “union” types, making it impossible to define a `type: [ integer, string ]` typed property or parameter.
- Don’t use “nullable”.
  Rationale: It’s deprecated, and is just an alternate way of expressing type arrays.
- Don’t use JSON schema “not”.
  Rationale: There’s no obvious way to represent this in many widely used programming languages.
- No “if-then-else” in JSON schema.
  Rationale: There’s no obvious way to represent this in many widely used programming languages.
- The API document should be “self contained” (no external “$refs”).
  Rationale: External refs can easily create multiple namespaces for schemas, parameters, security schemes, etc. These are unnecessary complications for code generators.
- All “$refs” must be to elements in the “components” section of the document.
  Rationale: “$ref” targets outside of “components” are unnecessary complications for code generators.
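The two naming rules in the list above both reduce to detecting name collisions. Here is a minimal Python sketch of such a check. The helper names (`normalize`, `find_name_collisions`, `check_operation_parameters`) are hypothetical, and the normalization (lowercase, drop anything that is not a letter or digit) is only one assumed reading of “case-insensitive” unique; a real generator may normalize differently.

```python
import re


def normalize(name: str) -> str:
    """Reduce a name to a case- and separator-insensitive key (assumed rule).

    "foo_bar", "fooBar" and "Foo-Bar" all map to "foobar".
    """
    return re.sub(r"[^0-9a-z]", "", name.lower())


def find_name_collisions(names):
    """Return groups of names that become identical after normalization."""
    groups = {}
    for name in names:
        groups.setdefault(normalize(name), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


def check_operation_parameters(parameters):
    """Check parameter names for clashes, deliberately ignoring `in`."""
    return find_name_collisions([p["name"] for p in parameters])


params = [
    {"name": "foo_bar", "in": "query"},
    {"name": "fooBar", "in": "header"},
    {"name": "limit", "in": "query"},
]
print(check_operation_parameters(params))  # [['foo_bar', 'fooBar']]
```

Because `in` is never consulted, two parameters that share a name but differ in location (for example `id` in `path` and `id` in `query`) are reported as a collision as well, which matches the first rule.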
It would be nice to have a common set of rules like this that could be codified into a “Code generation” vocabulary or dialect for OpenAPI.
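Several of these rules are mechanical enough that a small tree walk can check them. The sketch below is illustrative only and is not taken from the IBM generator: `find_violations` is a hypothetical helper, the reported paths are informal rather than strict JSON Pointers, and the walk is a heuristic that skips `properties` maps (so a property merely named "not" is not flagged) but does not otherwise distinguish schema objects from other objects.

```python
def find_violations(node, path="#", is_name_map=False):
    """Walk a parsed OpenAPI document and yield (path, message) pairs."""
    if isinstance(node, dict):
        if not is_name_map:
            ref = node.get("$ref")
            if isinstance(ref, str):
                if not ref.startswith("#"):
                    yield path, f"external $ref: {ref}"
                elif not ref.startswith("#/components/"):
                    yield path, f"$ref outside #/components: {ref}"
            if isinstance(node.get("type"), list):
                yield path, "type array (union type)"
            for keyword in ("nullable", "not", "if"):
                if keyword in node:
                    yield path, f"unsupported keyword: {keyword}"
        for key, value in node.items():
            # The keys of a `properties` map are property names, not keywords.
            child_is_name_map = not is_name_map and key == "properties"
            yield from find_violations(value, f"{path}/{key}", child_is_name_map)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_violations(item, f"{path}/{index}")


doc = {
    "components": {
        "schemas": {
            "Pet": {
                "type": "object",
                "properties": {
                    "id": {"type": ["integer", "string"]},
                    "not": {"type": "string"},  # a property *named* "not" is fine
                    "owner": {"$ref": "common.yaml#/components/schemas/Owner"},
                    "tag": {"$ref": "#/paths/~1pets/get/parameters/0"},
                },
            }
        }
    }
}

for where, message in find_violations(doc):
    print(where, "->", message)
```

On this sample the walk reports three problems: the type array on `id`, the external reference on `owner`, and the reference outside `components` on `tag`.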
Top GitHub Comments
@landrito it’s not really a good practice to `$ref` things that are interior to a usable schema. It doesn’t matter (to me) whether it’s OAS’s `#/components/schemas` or JSON Schema’s `#/$defs`, but if you’re going to re-use a schema, put it somewhere re-usable.
Of course nothing bad automatically happens if you don’t. But it’s like abusing the leading underscore convention in Python, which usually indicates a private method. You can call it like a public one, but you’re doing something that most people reading or maintaining the code wouldn’t expect. Most people will assume that a random property schema is not being re-used elsewhere and will feel free to change it without looking for `$ref`s. But in a re-usable location, most people know that they should take re-use into account when making changes.
@MikeRalphson good idea on TypeScript. Which may be the only time I’ve ever said something positive about TypeScript but that’s just my preference for loose/dynamic typing speaking 😝
To me, an ideal code gen vocabulary is one that allows me to use the full power of JSON Schema for validation while also allowing me to use the same schemas for code gen. That would mean that tooling would have to ignore some things that only relate to validation. It also means tooling can’t make assumptions about how a pattern is to be interpreted in OO. For example, `if`/`then` can be used to express the same thing (and more) as `discriminator`, but if tooling doesn’t recognize `if`/`then`, the expressive power for both code gen and validation is limited. Another example is tooling that assumes that `allOf` means an intersection type and `anyOf`/`oneOf` means a union type. That’s not always true. The OpenAPI schema includes an example where `anyOf` is used to express that at least one of “paths”, “components”, or “webhooks” is required. A third example is `anyOf`/`oneOf` being used to emulate `enum` when you want to give each option a description.
I believe that the way forward is a vocabulary of annotation keywords that allows you to be explicit about how you expect a schema to be used for code gen without it having an effect on validation. Here are a couple examples of the kind of thing I’m thinking of.
This is just off the top of my head. It’s probably not the best approach and the names certainly will need some workshopping, but hopefully this gets the general idea across. I think it would be useful to get some details about the reason for each of the restrictions in the original proposal. That way we can work backwards to try to solve those problems in ways that don’t require restrictions for JSON Schema validation.
One more thing I want to point out is that the current proposal is coupled to the OpenAPI document. I believe that we should be solving the general case. OpenAPI users aren’t the only JSON Schema users that are interested in code gen and it would be great if we could solve for their needs as well.
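To make the point about `anyOf`/`oneOf` concrete, here is a small Python sketch, not part of the proposal, showing why tooling cannot treat every `oneOf`/`anyOf` as a union type. The helper `as_described_enum` and both sample schemas are hypothetical illustrations: the first emulates an `enum` with per-option descriptions using `const`, and the second mirrors the “at least one of paths, components, or webhooks” pattern mentioned above.

```python
def as_described_enum(schema):
    """Return [(value, description), ...] if every oneOf/anyOf branch pins a
    single `const`; return None when the pattern does not apply."""
    branches = schema.get("oneOf") or schema.get("anyOf")
    if not branches:
        return None
    if not all(isinstance(b, dict) and "const" in b for b in branches):
        return None
    return [(b["const"], b.get("description", "")) for b in branches]


# An enum in disguise: each option carries its own description.
status = {
    "oneOf": [
        {"const": "open", "description": "Accepting new comments"},
        {"const": "closed", "description": "Read-only"},
    ]
}

# Not a union of types at all: only a constraint on which keys are present.
at_least_one = {
    "anyOf": [
        {"required": ["paths"]},
        {"required": ["components"]},
        {"required": ["webhooks"]},
    ]
}

print(as_described_enum(status))
print(as_described_enum(at_least_one))  # None
```

A generator that recognized the first shape could emit a documented enum, while the second shape would have to be left to validation; which of these a schema author intends is exactly what an explicit annotation keyword could state.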