[Co-presented with Mike Kistler, Architect for SDK Generation for the Watson Client Libraries]
The OpenAPI Specification is emerging as the leading standard for describing REST APIs. A key factor in its popularity is the broad array of open source tools that create, manipulate, and publish documentation and code from OpenAPI descriptions. In this talk, we describe a configurable and extensible open source linter for OpenAPI that we are using to solve API code generation problems at IBM and Google. Our linter is based on Gnostic, an open source framework for working with API descriptions that was developed at Google and is available on GitHub.
OpenAPI itself is language-agnostic and is being used to generate code in a large set of popular programming languages. This generated code includes both server-side "stubs" and client libraries that are sometimes called software development kits (SDKs). IBM has begun to employ code generation for the Watson Developer Cloud SDKs and other companies are doing similar things, including Google, which generates client libraries from Google-specific API description formats. These teams have found that the quality of SDKs generated from API descriptions depends heavily on the quality of the descriptions. This goes far beyond mere syntactic compliance with a specification -- it involves proper API design, naming, and adherence to organization-wide design patterns. To address this, many companies have created API design guides. Some companies, such as Google and Microsoft, have published their API design guides externally, while others like IBM have kept theirs as internal documents. But to this point, verifying compliance with an API design guide has largely been a manual task. What is needed, we believe, is a configurable and extensible linter to check OpenAPI descriptions for conformance with rules derived from API design guides.
5. OpenAPI Tooling
OpenAPI was originally designed for documentation, but tools are now available to automate or assist
with many common tasks:
● API Authoring
● Validation
● Documentation
● Testing
● Mocking
● Management
● Code Generation
6. OpenAPI-based Tooling at IBM
● SDK Generation for Watson Developer Cloud
○ Java
○ Node
○ Python
○ Swift
○ .NET
● Benefits
○ Faster time to Market
○ Improved Quality
○ Enforce Compliance
○ Enforce Consistency
○ Increase Adoption
https://github.com/watson-developer-cloud
7. Some challenges for OpenAPI-based tools
● Some elements of an OAS API definition are "optional" or "flexible"
○ Parameter and property type and format
○ OperationId
○ Response schema
● Some styles of definitions are problematic for tools
○ Inline responses
○ Content-type, accept-type
● Focus implementation effort on main use cases
8. Parameter and property type and format
● OpenAPI v2 does not restrict the values for type and format
○ except they must be strings
● But some values are "defined"
● Many tools can have unexpected results when type and format are not one of these combinations
"properties": {
  "level": {
    "type": "number",
    "format": "integer"
  },
  ...
}
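A check for this problem can be sketched in a few lines of Python. The table of formats below follows the data types section of the OpenAPI v2 specification; the function name `undefined_type_formats` and the shape of its result are assumptions made for illustration and do not come from any existing tool.

```python
# Formats that OpenAPI v2 explicitly defines for each primitive type.
DEFINED_FORMATS = {
    "integer": {"int32", "int64"},
    "number": {"float", "double"},
    "string": {"byte", "binary", "date", "date-time", "password"},
}


def undefined_type_formats(properties):
    """Return (name, type, format) for each property whose format is not defined for its type."""
    problems = []
    for name, schema in properties.items():
        type_, format_ = schema.get("type"), schema.get("format")
        # A missing format is allowed; only an explicit, undefined pair is reported.
        if format_ is not None and format_ not in DEFINED_FORMATS.get(type_, set()):
            problems.append((name, type_, format_))
    return problems


properties = {
    "level": {"type": "number", "format": "integer"},
    "count": {"type": "integer", "format": "int32"},
}
print(undefined_type_formats(properties))  # [('level', 'number', 'integer')]
```

Only the `level` property is reported, because `integer` is not one of the formats defined for `number`.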
9. API Design Guides
To promote a consistent style for APIs, and also to address the requirements and
constraints of tooling, many companies have developed "design guides" for their APIs:
● Google
○ https://cloud.google.com/apis/design
● Microsoft
○ https://github.com/Microsoft/api-guidelines
○ https://docs.microsoft.com/en-us/azure/architecture/best-practices/api-design
● IBM (in development)
○ http://watson-developer-cloud.github.io/api-guidelines/
○ http://watson-developer-cloud.github.io/api-guidelines/swagger-coding-style
10. Google API Design Guide
● Collection IDs
○ valid C/C++ identifiers
○ clear and concise English terms
○ avoid overly general terms
○ plural form with lowerCamel case
● Naming conventions
○ Field definitions (property names) must use lower_case_underscore_separated_names.
● Standard field names and types
● Error Model
○ code: integer
○ message: string
○ details: array of object
11. Microsoft API Design Guide
● All APIs MUST support explicit versioning.
● Services SHOULD provide JSON as the default encoding.
● JSON property names SHOULD be camelCased.
● Error Model
○ error:
■ code: string
■ message: string
■ target: string
■ details: array of object
■ innererror: object
12. IBM API Design Guide
● Naming conventions
○ Parameter and property names must be lower snake-case
● Data types
○ Parameters and properties should use well-defined data types (type and format)
● Operations
○ Every operation should have a unique operationId
○ List all required parameters before any optional parameters
● Error Model
○ code: integer
○ error: string
○ help: string
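Rules like these are mechanical enough to check automatically. The following is a minimal Python sketch of three of them (lower snake-case parameter names, unique operationId, required parameters before optional ones). The function name `check_operation`, the regular expression, and the message wording are illustrative assumptions, not the actual IBM tooling.

```python
import re

# One plausible reading of "lower snake-case": lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_operation(path, method, operation, seen_ids):
    """Return a list of rule violations for one OpenAPI v2 operation object."""
    problems = []
    where = f"{method.upper()} {path}"
    op_id = operation.get("operationId")
    if op_id is None:
        problems.append(f"{where}: missing operationId")
    elif op_id in seen_ids:
        problems.append(f"{where}: duplicate operationId {op_id!r}")
    else:
        seen_ids.add(op_id)
    seen_optional = False
    for param in operation.get("parameters", []):
        if "$ref" in param:
            continue  # referenced parameters are not resolved in this sketch
        name = param.get("name", "")
        if not SNAKE_CASE.match(name):
            problems.append(f"{where}: parameter {name!r} is not lower snake-case")
        if param.get("required", False):
            if seen_optional:
                problems.append(f"{where}: required parameter {name!r} follows an optional one")
        else:
            seen_optional = True
    return problems


operation = {
    "operationId": "listPets",
    "parameters": [
        {"name": "pageSize", "in": "query"},
        {"name": "pet_id", "in": "path", "required": True},
    ],
}
for problem in check_operation("/pets/{pet_id}", "get", operation, set()):
    print(problem)
```

The `seen_ids` set is shared across calls so that duplicate operationIds are caught across the whole description, not just within one path.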
13. API Design -- the devil is in the details
In a perfect world, everyone would agree on the design guidelines for APIs.
Even if everyone agreed on the aesthetics, the disparity of tools and methodologies
used across today's Cloud companies drives each company to establish specific
requirements to match their tooling.
● Approach to versioning
● Naming conventions
● Error models
14. Validating Compliance
● Trust, but verify.
● There are validators available to flag deviations from the OpenAPI spec.
○ swagger-editor
○ swagger-parser
○ swagger-tools
○ go-swagger
● But the design guidelines we're talking about generally go beyond simple
compliance with the OpenAPI spec.
● And as we've said, each company -- maybe even different divisions within the
same company -- has different API guidelines.
15. Solution: Linters
Let’s use a solution borrowed from programming languages: linters
● Java: maven checkstyle
● Python: pylint
● JavaScript: eslint
● Go: go-tools
The best of these tools are:
● Configurable
○ Let users choose which rules to enforce and which rules to ignore
● Extensible
○ Let users implement their own rules
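The two properties can be illustrated with a small in-process sketch: a registry that new rules join through a decorator (extensible), and a `lint` function that runs only the rule codes it is given (configurable). All names here (`RULES`, `rule`, `lint`, and the two rule codes) are hypothetical and are not taken from any of the linters listed above.

```python
RULES = {}


def rule(code):
    """Register a rule function under a unique code (the extension point)."""
    def register(func):
        RULES[code] = func
        return func
    return register


def operations(document):
    """Yield (path, method, operation) for dict-valued entries of each path item."""
    for path, item in document.get("paths", {}).items():
        for method, operation in item.items():
            if isinstance(operation, dict):
                yield path, method, operation


@rule("NO_DESCRIPTION")
def no_description(document):
    for path, method, operation in operations(document):
        if "description" not in operation:
            yield f"{method} {path} has no description"


@rule("NO_OPERATION_ID")
def no_operation_id(document):
    for path, method, operation in operations(document):
        if "operationId" not in operation:
            yield f"{method} {path} has no operationId"


def lint(document, enabled):
    """Run only the enabled rules (the configuration point)."""
    return [(code, text) for code in enabled for text in RULES[code](document)]


document = {"paths": {"/pets": {"get": {"operationId": "listPets"}}}}
print(lint(document, enabled=["NO_DESCRIPTION"]))
# [('NO_DESCRIPTION', 'get /pets has no description')]
```

A user who disagrees with a rule leaves its code out of `enabled`; a user who needs a new rule adds one more decorated function.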
16. A Configurable and Extensible Linter for OpenAPI
● All the common linters are written in the same language that they consume.
● So what language should we use for the OpenAPI linter?
● Answer (borrowing from microservices): A Polyglot Linter
○ Linter extensions (plugins) in any language can detect and report violations
18. JSON/YAML: in-between language and data structure
Humans can write it easily
but find it tedious and need validators to get it right.
Dynamically-typed languages can read it easily
but can crash on unexpected inputs.
Statically-typed languages can read it easily(?)
but require lots of casting (ugly!) or explicit models.
19. How to write OpenAPI tooling in static languages
Step 1: Read the OpenAPI spec.
Step 2: Start handwriting data structures and a reader.
Step 3: Notice two things:
1. This is tedious.
2. There’s a JSON schema for OpenAPI.
20. The OpenAPI JSON Schema
{
"title": "Schema for Swagger 2.0 API.",
"id": "http://swagger.io/v2/schema.json#",
"type": "object",
"required": [
"swagger",
"info",
"paths"
],
"additionalProperties": false,
"patternProperties": {
"^x-": {
"$ref": "#/definitions/vendorExtension"
}
},
"properties": {
"swagger": {
"type": "string",
"enum": [
"2.0"
],
"description": "The version of this document."
},
"info": {
"$ref": "#/definitions/info"
},
"host": {
"type": "string",
"pattern": "^[^{}/ :\\\\]+(?::\\d+)?$",
"description": "The host of the API."
},
"definitions": {
"info": {
"type": "object",
"description": "General information",
"required": [
"version",
"title"
],
"additionalProperties": false,
"patternProperties": {
"^x-": {
"$ref": "#/definitions/vendorExtension"
}
},
"properties": {
"title": {
"type": "string",
"description": "A unique and precise title."
},
"version": {
"type": "string",
"description": "A semantic version number."
},
21. Can we use JSON Schema to build a data structure factory?
● Inputs: JSON Schema for OpenAPI 2.0, OpenAPI 3.0, OpenAPI 3.1, OpenAPI 3.2
● Outputs: OpenAPI.go, OpenAPI.swift, OpenAPI.rust, OpenAPI.dart, OpenAPI.ex, OpenAPI.kt
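The idea of a data structure factory can be shown in miniature. The sketch below builds a Python dataclass from the `required` and `properties` entries of a JSON Schema object, using the standard library's `make_dataclass`. It is only an illustration of the concept: the function `dataclass_from_schema` and the type mapping are assumptions, the `description` property is added to show an optional field, and the sketch ignores `$ref`, `patternProperties`, and nested objects.

```python
from dataclasses import field, make_dataclass
from typing import Optional

# Assumed mapping from JSON Schema primitive types to Python types.
JSON_TO_PYTHON = {"string": str, "integer": int, "number": float, "boolean": bool}


def dataclass_from_schema(name, schema):
    """Build a dataclass whose fields mirror the properties of a JSON Schema object."""
    required = set(schema.get("required", []))
    fields = []
    for prop, spec in schema.get("properties", {}).items():
        py_type = JSON_TO_PYTHON.get(spec.get("type"), object)
        if prop in required:
            fields.append((prop, py_type))
        else:
            fields.append((prop, Optional[py_type], field(default=None)))
    # Fields with defaults must come last; the sort is stable, so order is otherwise kept.
    fields.sort(key=lambda f: len(f) == 3)
    return make_dataclass(name, fields)


info_schema = {
    "type": "object",
    "required": ["version", "title"],
    "properties": {
        "title": {"type": "string"},
        "version": {"type": "string"},
        "description": {"type": "string"},
    },
}
Info = dataclass_from_schema("Info", info_schema)
print(Info(title="Pet Store", version="1.0.0"))
# Info(title='Pet Store', version='1.0.0', description=None)
```

A generator for a statically typed language would emit source code instead of building classes at run time, but the traversal of the schema is the same.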
26. gnostic, the OpenAPI Compiler
● gnostic-generator: OpenAPI JSON schema → OpenAPI .proto and support code
● protoc + plugins: OpenAPI .proto → reusable data structures and reader for protobuf OpenAPI descriptions
● gnostic: OpenAPI description → parsed and verified binary protobuf of the OpenAPI description → gnostic apps and plugins
27. Kubernetes OpenAPI: .json vs .pb
Format                   Size     Deserialization time  Download time (at 80 Mbps)
JSON                     1653 KB  >500 ms               165.3 ms
Proto binary             914 KB   9.3 ms                91.4 ms
Proto binary compressed  96 KB    13.5 ms               1.3 ms
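The download-time column can be checked with simple arithmetic: size in bits divided by bandwidth. The sketch below assumes 1 KB = 1000 bytes and 1 Mbps = 1,000,000 bits per second; the function name `download_ms` is made up for this example.

```python
def download_ms(size_kb, mbps=80):
    """Transfer time in milliseconds for size_kb kilobytes at mbps megabits per second."""
    bits = size_kb * 1000 * 8
    return bits / (mbps * 1_000_000) * 1000


for label, size_kb in [("JSON", 1653), ("Proto binary", 914), ("Proto binary compressed", 96)]:
    print(f"{label}: {download_ms(size_kb):.1f} ms")
# JSON: 165.3 ms
# Proto binary: 91.4 ms
# Proto binary compressed: 9.6 ms
```

Under these assumptions the first two rows reproduce the table exactly, while the compressed row works out to 9.6 ms rather than 1.3 ms, so that figure may rest on a different assumption or measurement.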
28. Inside OpenAPI.proto: it starts with a Document
From OpenAPIv2/OpenAPIv2.proto:
message Document {
  string swagger = 1;
  Info info = 2;
  string host = 3;
  string base_path = 4;
  repeated string schemes = 5;
  // A list of MIME types accepted by the API.
  repeated string consumes = 6;
  // A list of MIME types the API can produce.
  repeated string produces = 7;
  Paths paths = 8;
  Definitions definitions = 9;
  ParameterDefinitions parameters = 10;
  ResponseDefinitions responses = 11;
  repeated SecurityRequirement security = 12;
  SecurityDefinitions security_definitions = 13;
  repeated Tag tags = 14;
  ExternalDocs external_docs = 15;
  repeated NamedAny vendor_extension = 16;
}
29. Inside OpenAPI.proto: preserving ordering in maps
From OpenAPIv2/OpenAPIv2.proto:
// One or more JSON representations for parameters
message ParameterDefinitions {
  repeated NamedParameter additional_properties = 1;
}
// Automatically-generated message used to represent maps of Parameter as ordered (name,value) pairs.
message NamedParameter {
  // Map key
  string name = 1;
  // Mapped value
  Parameter value = 2;
}
message Parameter {
  oneof oneof {
    BodyParameter body_parameter = 1;
    NonBodyParameter non_body_parameter = 2;
  }
}
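Protocol buffer map fields do not guarantee an iteration order, which is why the messages above model maps as repeated (name, value) pairs. The same representation can be sketched in Python; the names `NamedValue`, `to_named_pairs`, and `lookup` are illustrative only and do not appear in gnostic.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class NamedValue:
    name: str   # map key
    value: Any  # mapped value


def to_named_pairs(mapping) -> List[NamedValue]:
    """Convert a mapping into a list of pairs, keeping the order of the source document."""
    return [NamedValue(name, value) for name, value in mapping.items()]


def lookup(pairs: List[NamedValue], name: str) -> Optional[Any]:
    """Linear search by key; returns None if the key is absent."""
    for pair in pairs:
        if pair.name == name:
            return pair.value
    return None


parameters = {"limit": {"in": "query"}, "offset": {"in": "query"}}
pairs = to_named_pairs(parameters)
print([pair.name for pair in pairs])  # ['limit', 'offset']
print(lookup(pairs, "offset"))        # {'in': 'query'}
```

The cost of this design is that lookups by key become linear scans, which is usually acceptable for the small maps found in API descriptions.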
30. Inside OpenAPI.proto: representing paths
From OpenAPIv2/OpenAPIv2.proto:
// Relative paths to the individual endpoints. They must be relative to the 'basePath'.
message Paths {
  repeated NamedAny vendor_extension = 1;
  repeated NamedPathItem path = 2;
}
// Automatically-generated message used to represent maps of PathItem as ordered (name,value) pairs.
message NamedPathItem {
  // Map key
  string name = 1;
  // Mapped value
  PathItem value = 2;
}
message PathItem {
  string _ref = 1;
  Operation get = 2;
  Operation put = 3;
  Operation post = 4;
  ...
}
31. gnostic writes a “Request” message to plugins
From plugins/plugin.proto:
// A parameter passed to the plugin from (or through) gnostic.
message Parameter {
  // The name of the parameter as specified in the option string
  string name = 1;
  // The parameter value as specified in the option string
  string value = 2;
}
// An encoded Request is written to the plugin's stdin.
message Request {
  // filename or URL of the original source document
  string source_name = 1;
  // Output path specified in the plugin invocation.
  string output_path = 2;
  // Plugin parameters parsed from the invocation string.
  repeated Parameter parameters = 3;
  // The version number of gnostic.
  Version compiler_version = 4;
  // API models
  repeated google.protobuf.Any models = 5;
}
From google/protobuf/any.proto:
message Any {
  // A URL/resource name that uniquely
  // identifies the type of the serialized
  // protocol buffer message.
  string type_url = 1;
  // Must be a valid serialized protocol
  // buffer of the above specified type.
  bytes value = 2;
}
32. plugins write “Response” messages back to gnostic
From plugins/plugin.proto:
// The plugin writes an encoded Response to stdout.
message Response {
  // error messages. If non-empty, the plugin failed.
  repeated string errors = 1;
  // file output, each file will be written by gnostic to an appropriate location.
  repeated File files = 2;
  // informational messages to be collected and reported by gnostic.
  repeated Message messages = 3;
}
// File describes a file generated by a plugin.
message File {
  // name of the file
  string name = 1;
  // data to be written to the file
  bytes data = 2;
}
33. New for linters: message responses
From plugins/plugin.proto:
// Plugins can return messages to be collated and reported by gnostic.
message Message {
  enum Level {
    UNKNOWN = 0;
    INFO = 1;
    WARNING = 2;
    ERROR = 3;
    FATAL = 4;
  }
  // message severity
  Level level = 1;
  // a unique message identifier
  string code = 2;
  // message text
  string text = 3;
  // an associated key path in an API description
  repeated string keys = 4;
}
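To make the shape of this message concrete, here is a plain-Python mirror of it together with a small formatter. This is not the generated protobuf class: `Level`, `Message`, and `report` are hand-written stand-ins, and the one-line output layout (level, code, text, bracketed key path) is an assumption about how a reporting tool might print such messages.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Level(IntEnum):
    UNKNOWN = 0
    INFO = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


@dataclass
class Message:
    level: Level   # message severity
    code: str      # unique message identifier
    text: str      # message text
    keys: List[str] = field(default_factory=list)  # key path in the API description


def report(messages, minimum=Level.INFO):
    """Format messages at or above a minimum severity, one line each."""
    return [
        f"{m.level.name} {m.code} {m.text} [{' '.join(m.keys)}]"
        for m in messages
        if m.level >= minimum
    ]


messages = [
    Message(Level.ERROR, "NO_ARRAY_RESPONSES",
            "Arrays MUST NOT be returned as the top-level structure in a response body.",
            ["paths", "/pets", "get", "responses", "200", "schema"]),
    Message(Level.INFO, "PATH", "/pets", ["paths", "/pets"]),
]
for line in report(messages, minimum=Level.WARNING):
    print(line)
```

Because the levels are ordered integers, filtering by severity is a single comparison, and the key path lets a tool point back to the exact location in the API description.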
34. Linter examples
● Go
○ gnostic-lint-descriptions
○ gnostic-lint-paths
● Node.js
○ gnostic-lint-operations
○ gnostic-lint-responses
● Swift
○ gnostic-lint-responses-swift
35. Let’s run one.
$ gnostic examples/v2.0/yaml/petstore.yaml --lint-responses
level:ERROR code:"NO_ARRAY_RESPONSES" text:"Arrays MUST NOT be returned as the
top-level structure in a response body." keys:"paths" keys:"/pets" keys:"get"
keys:"responses" keys:"200" keys:"schema"
level:ERROR code:"NO_ARRAY_RESPONSES" text:"Arrays MUST NOT be returned as the
top-level structure in a response body." keys:"paths" keys:"/pets/{petId}"
keys:"get" keys:"responses" keys:"200" keys:"schema"
$ gnostic examples/v2.0/yaml/petstore.yaml --lint-responses --messages-out=.
$ report-messages petstore.messages.pb
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level
structure in a response body. [paths /pets get responses 200 schema]
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level
structure in a response body. [paths /pets/{petId} get responses 200 schema]
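The rule behind these messages can be sketched over a plain dictionary form of an OpenAPI v2 description. This is an independent illustration, not the code of the gnostic-lint-responses plugin: the function names `resolve` and `lint_array_responses` are assumptions, and only local `#/definitions/...` references are followed.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}
PREFIX = "#/definitions/"


def resolve(document, schema):
    """Follow local '#/definitions/...' references; anything else is returned unchanged."""
    seen = set()
    while isinstance(schema, dict) and "$ref" in schema:
        ref = schema["$ref"]
        if not ref.startswith(PREFIX) or ref in seen:
            break  # external reference or reference cycle
        seen.add(ref)
        schema = document.get("definitions", {}).get(ref[len(PREFIX):], {})
    return schema


def lint_array_responses(document):
    """Return the key path of every response whose top-level schema is an array."""
    findings = []
    for path, item in document.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue
            for status, response in operation.get("responses", {}).items():
                schema = resolve(document, response.get("schema", {}))
                if isinstance(schema, dict) and schema.get("type") == "array":
                    findings.append(["paths", path, method, "responses", status, "schema"])
    return findings


document = {
    "paths": {"/pets": {"get": {"responses": {
        "200": {"schema": {"$ref": "#/definitions/Pets"}},
        "default": {"schema": {"$ref": "#/definitions/Error"}},
    }}}},
    "definitions": {
        "Pets": {"type": "array", "items": {"$ref": "#/definitions/Pet"}},
        "Pet": {"type": "object"},
        "Error": {"type": "object"},
    },
}
print(lint_array_responses(document))
# [['paths', '/pets', 'get', 'responses', '200', 'schema']]
```

Resolving the reference matters here: the response schema itself is only a `$ref`, and the array type is visible only in the referenced definition.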
36. Let’s run ALL of them. (1 / 2)
$ gnostic examples/v2.0/yaml/petstore.yaml \
    --lint-responses \
    --lint-descriptions \
    --lint-paths \
    --lint-operations \
    --lint-responses-swift \
    --messages-out=. \
    --time-plugins
37. Let’s run ALL of them. (2 / 2)
$ report-messages petstore.messages.pb
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level structure in a response body. [paths ...
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level structure in a response body. [paths ...
WARNING NODESCRIPTION Operation has no description. [paths /pets get]
WARNING NODESCRIPTION Response has no description. [paths /pets get responses 200]
WARNING NODESCRIPTION Response has no description. [paths /pets get responses default]
WARNING NODESCRIPTION Operation has no description. [paths /pets post]
WARNING NODESCRIPTION Response has no description. [paths /pets post responses default]
WARNING NODESCRIPTION Operation has no description. [paths /pets/{petId} get]
WARNING NODESCRIPTION Response has no description. [paths /pets/{petId} get responses 200]
WARNING NODESCRIPTION Response has no description. [paths /pets/{petId} get responses default]
WARNING NODESCRIPTION Definition has no description. [definitions Pet]
WARNING NODESCRIPTION Property has no description. [definitions Pet properties id]
WARNING NODESCRIPTION Property has no description. [definitions Pet properties name]
WARNING NODESCRIPTION Property has no description. [definitions Pet properties tag]
WARNING NODESCRIPTION Definition has no description. [definitions Pets]
WARNING NODESCRIPTION Definition has no description. [definitions Error]
WARNING NODESCRIPTION Property has no description. [definitions Error properties code]
WARNING NODESCRIPTION Property has no description. [definitions Error properties message]
INFO PATH /pets [paths /pets]
INFO PATH /pets/{petId} [paths /pets/{petId}]
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level structure in a response body. [paths ...
ERROR NO_ARRAY_RESPONSES Arrays MUST NOT be returned as the top-level structure in a response body. [paths ...
38. Runtime comparisons
Plugin                        Language  Plugin Run Time
gnostic-lint-operations       Node.js   195 ms
gnostic-lint-responses        Node.js   221 ms
gnostic-lint-responses-swift  Swift     17.8 ms
gnostic-lint-descriptions     Go        18.5 ms
gnostic-lint-paths            Go        18.6 ms
39. Where is this going?
● Organization-specific linters, e.g.
○ gnostic-lint-ibm
○ gnostic-lint-google
○ gnostic-lint-supercodegen
● Per-language linter-helper libraries, e.g.
○ Messaging helpers
○ Reference resolution
● More API tools: code generators, API management, smart clients, …
Better API experiences!