Don’t Write a (builtin) API, Write a CRD!
Who am I?
Solly Ross (@directxman12 / metamagical.dev)
KubeBuilder Maintainer and Software Engineer on GKE
My mission is to make writing Kubernetes extensions less arcane
So… adding APIs to Kubernetes
a (snarky) Odyssey
How does one add a (built-in) API to Kubernetes?
Make a folder in vendor/k8s.io/api/<group>/v<version>, and write your public types, with JSON and proto tags.
Copy those types to pkg/apis/<group> to make your internal types, which are probably exactly the same as your external types, except without JSON tags.
Make sure you’ve engraved all the magic runes (er, added all the xyz-gen markers) to your types
Write defaulting code in defaults.go
Generate automatic conversions between the internal type and the external type (hopefully conversion-gen doesn’t explode or fail to run because the file doesn’t exist)
Write manual conversions (if your internal and external types don’t match)
Write validation code in validation.go
Start code generation (go-to-protobuf, client-gen, lister-gen, informer-gen) and go make some tea 🍵
Create a registry & storage implementation in pkg/registry by copying somebody else’s registry cause it’s probably the same. You’ll also add your print columns and such here.
Make sure you’ve got an install package in your group directories, and add your API version to the API server’s list, scheme, etc.
Edit a plethora of bookkeeping files, including (but not limited to): hack/lib/init.sh, hack/update-generated-protobuf-dockerized.sh, various linter files, and various linter exception files.
Update the fuzzer if you have things like tagged unions.
Start the non-code generators (generated-docs, generated-swagger-docs, openapi-spec) and go make another cup of tea 🍵
Edit the cmd tests to make sure kubectl gives proper output.
Edit kubectl to implement a new describe command for your type.
Optional, but suggested: realize that you messed something up, make some changes, forget to run an update, have some test break subtly. Run ./hack/update-all.sh in anger, go bake a cake 🍰, and come back to find that your laptop has overheated.
So, what are the issues here?
high barrier to entry/long iteration cycle
tightly coupled to Kubernetes releases
lots of bespoke code, little declarative information (things like validation and defaulting are raw Go code, and don’t show up in the OpenAPI)
Enter CRDs!
Make API folder anywhere
Add external types with JSON tags
Add markers to your fields for declarative defaulting, validation, print columns
Run code generation if you really must (i.e. you want generated clients, informers, etc.)
Run controller-gen to generate your CRDs
Add some tests to make sure your examples work and give the right output
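To make the steps above concrete, here is a minimal sketch of what such a types file might look like. The Widget type, its fields, and the marshalWidget helper are hypothetical and exist only for illustration; the markers themselves (validation, default, printcolumn, subresource) are the controller-gen ones. A real type would embed metav1.TypeMeta and metav1.ObjectMeta from k8s.io/apimachinery; plain fields stand in for them here so the sketch needs only the standard library.

```go
package main

import "encoding/json"

// WidgetSpec is the desired state of a Widget (a made-up example type).
type WidgetSpec struct {
	// Replicas is how many copies to run. The markers below become
	// `minimum` and `default` entries in the generated CRD schema.
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:default=1
	// +optional
	Replicas int32 `json:"replicas,omitempty"`

	// Size must be one of a fixed set of values.
	// +kubebuilder:validation:Enum=small;medium;large
	Size string `json:"size"`
}

// WidgetStatus is the observed state of a Widget.
type WidgetStatus struct {
	ReadyReplicas int32 `json:"readyReplicas,omitempty"`
}

// Widget is the root API type. APIVersion and Kind are simplified
// stand-ins (an assumption of this sketch) for metav1.TypeMeta.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Size",type=string,JSONPath=".spec.size"
// +kubebuilder:printcolumn:name="Ready",type=integer,JSONPath=".status.readyReplicas"
type Widget struct {
	APIVersion string       `json:"apiVersion,omitempty"`
	Kind       string       `json:"kind,omitempty"`
	Spec       WidgetSpec   `json:"spec,omitempty"`
	Status     WidgetStatus `json:"status,omitempty"`
}

// marshalWidget renders a Widget as the JSON a client would send.
// It is a hypothetical helper, handy for the "make sure your examples
// work" tests mentioned above.
func marshalWidget(w Widget) (string, error) {
	b, err := json.Marshal(w)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Note that the validation and defaulting live entirely in the comments: the Go code never enforces them. The API server does, once the generated CRD is installed.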
What does this bring us?
Faster iteration/experimentation: it’s easier to rapidly iterate on changes, share experiments, etc, since features are contained within dynamically loadable YAML, as opposed to compiling an entire API server.
Decoupled releases: Hypothetically, types can be released both with Kubernetes releases and independently of them.
Declarative Features (better OpenAPI spec): since things like validation and defaulting are done through the OpenAPI spec, you expose a better experience to clients: all of your validation information transfers to the exposed OpenAPI spec for external tooling to consume.
(project-selfish) Dogfood: We want to make CRDs a good, full-featured experience for users. If we can’t develop our own features with CRDs, there’s a decent chance that users will have issues too.
But I don’t want to write OpenAPI by hand!
Enter controller-gen!
Part of the KubeBuilder SIG API Machinery subproject
Already in use in Cluster API, Volume Snapshots, and other Kubernetes projects
Generates your CRDs (including print columns, validating, defaulting, etc) from your Go code (quickly, too)
But what if I need …
complex validation?
declarative validation is generally sufficient.
OpenAPI’s validation is fairly robust (regexes, OneOf, format for things like date).
really complex cross-field validation?
Use a validating webhook (another extensibility feature).
If we have problems with the existing validation, users will have them too! We should extend the format values for hard-to-validate common types.
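For the cross-field case, the core of a validating webhook can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete server: the spec type and its min/max rule are hypothetical, and the review structs mirror only the handful of admission.k8s.io/v1 AdmissionReview fields used here, where a real webhook would import the upstream types and serve this logic over HTTPS.

```go
package main

import (
	"encoding/json"
	"errors"
)

// windowSpec is a hypothetical spec whose two fields must agree.
type windowSpec struct {
	MinReplicas int32 `json:"minReplicas"`
	MaxReplicas int32 `json:"maxReplicas"`
}

// validateWindow holds the cross-field rule, which is awkward to
// express purely declaratively.
func validateWindow(s windowSpec) error {
	if s.MinReplicas > s.MaxReplicas {
		return errors.New("spec.minReplicas must not exceed spec.maxReplicas")
	}
	return nil
}

// The types below are a trimmed-down local mirror of AdmissionReview.
type admissionReview struct {
	APIVersion string             `json:"apiVersion"`
	Kind       string             `json:"kind"`
	Request    *admissionRequest  `json:"request,omitempty"`
	Response   *admissionResponse `json:"response,omitempty"`
}

type admissionRequest struct {
	UID    string          `json:"uid"`
	Object json.RawMessage `json:"object"`
}

type admissionResponse struct {
	UID     string         `json:"uid"`
	Allowed bool           `json:"allowed"`
	Status  *reviewMessage `json:"status,omitempty"`
}

type reviewMessage struct {
	Message string `json:"message,omitempty"`
}

// review decodes an incoming AdmissionReview, applies validateWindow to
// the object's spec, and returns the review to send back.
func review(body []byte) (admissionReview, error) {
	var in admissionReview
	if err := json.Unmarshal(body, &in); err != nil {
		return admissionReview{}, err
	}
	if in.Request == nil {
		return admissionReview{}, errors.New("admission review has no request")
	}
	var obj struct {
		Spec windowSpec `json:"spec"`
	}
	if err := json.Unmarshal(in.Request.Object, &obj); err != nil {
		return admissionReview{}, err
	}
	// The response must echo the request UID.
	resp := &admissionResponse{UID: in.Request.UID, Allowed: true}
	if err := validateWindow(obj.Spec); err != nil {
		resp.Allowed = false
		resp.Status = &reviewMessage{Message: err.Error()}
	}
	return admissionReview{APIVersion: in.APIVersion, Kind: in.Kind, Response: resp}, nil
}
```

Keeping the rule in a plain function like validateWindow means it can be unit-tested without standing up a webhook at all.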
But what if I need …
defaulting? Use declarative defaulting, which lets you apply defaults to arbitrary fields (including structs and lists)
special apply behavior? Use the server-side apply merge configuration markers (CRDs don’t support strategic merge patch, but new consumers should be using server-side apply anyway).
(At this point, @directxman12 looks around the room to SIG API Machinery for confirmation.)
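As a rough sketch of both answers, the type below combines a declarative default with server-side apply list markers. GadgetSpec, Port, and decodeGadgetSpec are hypothetical names used only for illustration; the +kubebuilder:default, +listType, and +listMapKey markers are the real ones.

```go
package main

import "encoding/json"

// Port is a hypothetical list element, identified by its name.
type Port struct {
	Name string `json:"name"`

	// Protocol is filled in by the API server when omitted.
	// +kubebuilder:default=TCP
	// +optional
	Protocol string `json:"protocol,omitempty"`

	Number int32 `json:"number"`
}

// GadgetSpec is a made-up spec showing merge configuration markers.
type GadgetSpec struct {
	// Ports is merged element-by-element under server-side apply,
	// matching elements on their "name" field, so different appliers
	// can own different entries.
	// +listType=map
	// +listMapKey=name
	// +optional
	Ports []Port `json:"ports,omitempty"`

	// Tags is treated as a set of unique scalar values.
	// +listType=set
	// +optional
	Tags []string `json:"tags,omitempty"`
}

// decodeGadgetSpec parses a spec the way a client library would.
// It is a hypothetical helper for illustration and tests.
func decodeGadgetSpec(data []byte) (GadgetSpec, error) {
	var s GadgetSpec
	err := json.Unmarshal(data, &s)
	return s, err
}
```

The default is applied by the API server from the generated schema, not by this Go code: decoding a manifest locally leaves Protocol empty, while the same object read back from the cluster would carry TCP.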
But what if I need conversion ?
Conversion webhooks
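A conversion webhook receives a ConversionReview from the API server and returns the objects converted to the requested version; the interesting part is the pair of conversion functions in the middle. The sketch below shows only that part, for two hypothetical versions of a spec in which a seconds field became a duration string. All type and function names are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// SpecV1alpha1 is the older, hypothetical shape: a timeout in seconds.
type SpecV1alpha1 struct {
	TimeoutSeconds int32 `json:"timeoutSeconds"`
}

// SpecV1 is the newer, hypothetical shape: a duration string like "30s".
type SpecV1 struct {
	Timeout string `json:"timeout"`
}

// toV1 converts up. This direction cannot fail.
func toV1(in SpecV1alpha1) SpecV1 {
	d := time.Duration(in.TimeoutSeconds) * time.Second
	return SpecV1{Timeout: d.String()}
}

// toV1alpha1 converts down. Sub-second precision is truncated, which
// is the kind of lossy round trip worth catching in tests.
func toV1alpha1(in SpecV1) (SpecV1alpha1, error) {
	d, err := time.ParseDuration(in.Timeout)
	if err != nil {
		return SpecV1alpha1{}, fmt.Errorf("invalid timeout %q: %w", in.Timeout, err)
	}
	return SpecV1alpha1{TimeoutSeconds: int32(d / time.Second)}, nil
}
```

Because stored objects may be read back at any served version, it is worth testing that converting up and then down returns the original value.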
But what if I need cross-field defaulting ?
Please don’t
But what if I need to embed a built-in API type ?
controller-gen will generate a basic validation schema for those types, but you won’t get full validation until pod create time. Sometimes, this is desirable.
Otherwise, you’ll need to use a validating webhook.
We could improve the situation by any of:
adding declarative validation information to built-in types,
adding schema references (cue concerned glances from SIG API Machinery)
But what if I need subresources ?
Scale and Status have built-in support in CRDs
No, I mean <insert bespoke subresource here>
Did you check with API review 🤨?
Yes!
You’ll need a built-in API
If we get too many of these, we should eventually consider supporting these in CRDs as well.
But what if I need high read-write performance ?
You’ll need a built-in API
We don’t have proto support yet for CRDs.
We’ll want to add proto support to CRDs eventually
But what if I need field selectors ?
Are you Pod, coming from the glorious future where everything is CRDs and we wear shiny eyemasks while we drive our hovercars?
No, but…
Use an informer with an indexer.
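The idea behind an indexer is to filter on the client side: the informer's local cache keeps a secondary map from a field value to the objects that carry it, so lookups never hit the API server. In client-go this is done with cache.Indexers and ByIndex on the informer's indexer. The sketch below is a standard-library stand-in for that mechanism, not client-go itself, and the widget type and its Owner field are hypothetical.

```go
package main

import "sort"

// widget is a hypothetical custom resource, reduced to the fields
// this sketch needs.
type widget struct {
	Name  string
	Owner string
}

// ownerIndex maps an owner to the names of the widgets it owns,
// playing the role an indexer plays inside an informer cache.
type ownerIndex struct {
	byOwner map[string]map[string]struct{}
}

func newOwnerIndex() *ownerIndex {
	return &ownerIndex{byOwner: make(map[string]map[string]struct{})}
}

// add records w under its owner. Updates and deletes, which a real
// cache must also handle, are left out for brevity.
func (ix *ownerIndex) add(w widget) {
	names, ok := ix.byOwner[w.Owner]
	if !ok {
		names = make(map[string]struct{})
		ix.byOwner[w.Owner] = names
	}
	names[w.Name] = struct{}{}
}

// namesFor returns the sorted names of all widgets with the given
// owner, the local equivalent of a field-selector query.
func (ix *ownerIndex) namesFor(owner string) []string {
	names := make([]string, 0, len(ix.byOwner[owner]))
	for n := range ix.byOwner[owner] {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}
```

The trade-off is that the client still watches every object of the type; the index only makes local lookups cheap.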
But what if I need my types to be present for the APIServer to boot ?
You’ll want a built-in API
It’s technically possible to overcome this bootstrap problem in some cases, but you’ll need to be extra careful.
But what if I need to actually install my types on cluster boot ?
🤷
RuntimeClass and CSIDriver had issues since they had in-tree controllers depending on their CRDs, and we never quite solved them.
Talk to SIG Cluster Lifecycle…
So… have people actually done this?
RuntimeClass: alpha with CRDs, migrated to a built-in API for beta
CSIDriver: alpha with CRDs, migrated to a built-in API for beta
TL;DW
CRDs have fast iteration time and good declarative features
We need to dogfood CRDs to ensure that they’re a good experience for users
There are still a few advanced use cases relating to performance, bootstrapping, and uncommon features that necessitate built-in APIs
All the problems (with the possible exception of bootstrapping) are solvable if we believe in ourselves 🌈 (er, write some KEPs)
Any Questions?
- controller-gen: book.kubebuilder.io/reference/generating-crd.html