Public release of katbox

Renán Del Valle 2021-04-12 12:00:01 -07:00
commit 4c764e09a4
46 changed files with 4646 additions and 0 deletions

docs/README.md (new file)
# Katbox Documentation
* [High level overview](overview.md)
* [Volume creation and deletion flow](create-delete-flow.md)
* [How to deploy Katbox](deploy-1.18-and-later.md)
* [Running a sample application leveraging katbox](example-ephemeral.md)

docs/create-delete-flow.md (new file)
# Creation and Deletion of Volumes
## Volume Creation Flow
![Katbox Create Flow](images/katboxCreate.png)
A request is made to the Kubernetes API server to create a pod, and Kubernetes schedules the pod on a worker node in the cluster.
The kubelet sees that the pod requests an ephemeral inline volume from the driver `katbox.csi.paypal.com`, so it looks through its registered plugins and finds Katbox. It then issues a `NodePublishVolume` API request through the Unix domain socket registered for Katbox by the CSI node driver registrar.
Katbox receives the request and creates a folder in its working directory. By default, this directory is located at `/var/lib/csi-katbox`.
Using the information provided by the request, a new folder is created in the working directory using the volume ID received:
`$WORKDIR/<Volume ID>`
This newly created folder is then bind mounted to the mount location expected by the kubelet, called the `targetPath`. By default, this folder is located at:
`/var/lib/kubelet/pods/<pod ID>/volumes/kubernetes.io~csi/<volume name>/mount`
The pod UUID is generated by Kubernetes when the pod is created, while the volume name is the name given to the mount in the spec.
Once the bind mount is successful, Katbox replies to the request as having succeeded.
If for any reason katbox is unable to fulfill the request, an error is returned to the kubelet and the pod creation process fails.
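The paths involved in this flow can be sketched in Go. The helper names below are illustrative, not katbox's actual API; the default directories are taken from the text above:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sourceDir is the folder katbox creates inside its working directory
// for a new volume: $WORKDIR/<Volume ID>.
func sourceDir(workDir, volumeID string) string {
	return filepath.Join(workDir, volumeID)
}

// kubeletTargetPath is where the kubelet expects the volume to be
// bind-mounted for the given pod UUID and volume name.
func kubeletTargetPath(podUID, volumeName string) string {
	return filepath.Join("/var/lib/kubelet/pods", podUID,
		"volumes/kubernetes.io~csi", volumeName, "mount")
}

func main() {
	// The driver would bind-mount src onto dst and reply to the
	// NodePublishVolume request with success, or return an error so
	// that pod creation fails.
	src := sourceDir("/var/lib/csi-katbox", "my-volume-id")
	dst := kubeletTargetPath("my-pod-uid", "my-csi-volume")
	fmt.Println(src) // /var/lib/csi-katbox/my-volume-id
	fmt.Println(dst) // /var/lib/kubelet/pods/my-pod-uid/volumes/kubernetes.io~csi/my-csi-volume/mount
}
```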
## Volume Deletion Flow
![Katbox Delete Flow](images/katboxDelete.png)
When a pod which uses a Katbox-allocated volume is deleted, the kubelet sends a `NodeUnpublishVolume` API request to katbox via the previously registered Unix domain socket.
Katbox removes the bind mount between the folder in katbox's working directory and the `targetPath`. Next, katbox adds the volume to its deletion queue and returns a successful deletion message to the kubelet.
If for some reason katbox is unable to unmount the given folder, katbox instead returns an error to the kubelet, which then retries the deletion.
The information given to the deletion queue includes the path of the folder used for the volume inside katbox's working directory, the time at which it was deleted, as well as a lifespan.
An example of the information given to the queue looks like this:
```
{
    Path:     "/var/lib/kubelet/csi-katbox/<pod ID>/<volume ID>",
    Time:     time.Now(),
    Lifespan: <time.Duration>,
}
```
The lifespan is set by the `--afterLifespan` flag passed to the katbox plugin. It determines how long after the deletion event katbox holds on to the information stored inside the folder whose volume was deleted.
A goroutine is then responsible for looking through the deletion queue and determining whether a deletion candidate has aged past its afterlifespan. If it has, the folder inside katbox's working directory is deleted and the candidate is removed from the deletion queue. If the candidate is _not_ yet eligible for deletion, it remains in the queue.
The afterlifespan is also influenced by a `pressureFactor`, which is derived from the `--headroom` flag passed to the katbox plugin. The `pressureFactor` may decrease the age needed for an eviction when the underlying storage used by katbox is experiencing high utilization.
High utilization is defined relative to the headroom flag, a value between 0.0 and 1.0 inclusive. The default value for headroom is `0.1`, so the age required for eviction decreases once the underlying storage uses more than 90% of its total disk space.

docs/deploy-1.18-and-later.md (new file)
## Cluster setup
Kubernetes 1.16+ is required because the volume context entry `csi.storage.k8s.io/ephemeral` does not exist in earlier versions.
### Create CSIDriver object
Using kubectl, create a CSIDriver object for katbox
```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: katbox.csi.paypal.com
spec:
  # Supports ephemeral inline volumes only.
  volumeLifecycleModes:
    - Ephemeral
  # To determine at runtime which mode a volume uses, pod info and its
  # "csi.storage.k8s.io/ephemeral" entry are needed.
  podInfoOnMount: true
  attachRequired: false
```
### Deploy DaemonSet on to cluster
#### Create a namespace (optional)
Running all katbox pods in a dedicated namespace makes them easier to manage.
A dedicated namespace can be created by using kubectl to apply the following configuration:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: csi-plugins
```
#### Creating the DaemonSet
Deploy the DaemonSet to run katbox (preferably in a namespace that is not used by default)
```shell
$ kubectl apply --namespace csi-plugins -f csi-katbox-plugin.yaml
```
```yaml
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-katboxplugin
spec:
  selector:
    matchLabels:
      app: csi-katboxplugin
  template:
    metadata:
      labels:
        app: csi-katboxplugin
    spec:
      hostNetwork: true
      tolerations:
        - operator: "Exists"
      containers:
        - name: node-driver-registrar
          image: quay.io/k8scsi/csi-node-driver-registrar:v1.3.0
          args:
            - --v=5
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path=/var/lib/kubelet/plugins/csi-katbox/csi.sock
          securityContext:
            # This is necessary only for systems with SELinux, where
            # non-privileged sidecar containers cannot access the unix domain
            # socket created by the privileged CSI driver container.
            privileged: true
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
            - mountPath: /registration
              name: registration-dir
            - mountPath: /csi-data-dir
              name: csi-data-dir
        - name: katbox
          image: quay.io/katbox/katboxplugin:latest
          args:
            - "--drivername=katbox.csi.paypal.com"
            - "--v=1"
            - "--endpoint=$(CSI_ENDPOINT)"
            - "--nodeid=$(KUBE_NODE_NAME)"
            - "--afterlifespan=3h"
          env:
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
          ports:
            - containerPort: 9898
              name: healthz
              protocol: TCP
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 2
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
            - mountPath: /var/lib/kubelet/pods
              mountPropagation: Bidirectional
              name: mountpoint-dir
            - mountPath: /var/lib/kubelet/plugins
              mountPropagation: Bidirectional
              name: plugins-dir
            - mountPath: /csi-data-dir
              name: csi-data-dir
        - name: liveness-probe
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
          image: quay.io/k8scsi/livenessprobe:v1.1.0
          args:
            - --csi-address=/csi/csi.sock
            - --health-port=9898
      volumes:
        - hostPath:
            path: /var/lib/kubelet/plugins/csi-katbox
            type: DirectoryOrCreate
          name: socket-dir
        - hostPath:
            path: /var/lib/kubelet/pods
            type: DirectoryOrCreate
          name: mountpoint-dir
        - hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
          name: registration-dir
        - hostPath:
            path: /var/lib/kubelet/plugins
            type: Directory
          name: plugins-dir
        - hostPath:
            path: /var/lib/csi-katbox-data/
            type: DirectoryOrCreate
          name: csi-data-dir
```
### Run example application and validate
Next, validate the deployment. First, ensure all expected pods are running properly:
```shell
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
csi-katboxplugin-298f5 3/3 Running 0 43h
csi-katboxplugin-2qd8d 3/3 Running 0 43h
csi-katboxplugin-hvkjf 3/3 Running 0 43h
csi-katboxplugin-n62fm 3/3 Running 0 43h
csi-katboxplugin-x824j 3/3 Running 3 43h
csi-katboxplugin-zjr6g 3/3 Running 0 43h
```
There should be exactly one katbox pod on each node that is able to schedule work.
From the [examples directory](../examples), apply `csi-app-inline.yaml`:
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app-inline
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
        - mountPath: "/data"
          name: my-csi-volume
      command: ["sh", "-c", "while true; do echo hello >> /data/test; sleep 100; done"]
  volumes:
    - name: my-csi-volume
      csi:
        driver: katbox.csi.paypal.com
```
Finally, inspect the application pod `my-csi-app-inline`, which mounts a katbox volume:
```shell
$ kubectl describe pods/my-csi-app-inline
Name:         my-csi-app-inline
Namespace:    default
Priority:     0
Node:         k8s-test-node-4/10.180.73.244
Start Time:   Thu, 16 Jul 2020 13:44:53 -0700
Labels:       <none>
Annotations:
Status:       Running
IP:           10.180.96.189
IPs:
  IP:  10.180.96.189
Containers:
  my-frontend:
    Container ID:  docker://f777b8c44d0d146241d73bbc2663b85274dca2e954c19d23ff504e81ffc0e875
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:9ddee63a712cea977267342e8750ecbc60d3aab25f04ceacfa795e6fce341793
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      while true; do echo hello >> /data/test; sleep 100; done
    State:          Running
      Started:      Thu, 16 Jul 2020 13:44:59 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from my-csi-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wrfhf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  my-csi-volume:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            katbox.csi.paypal.com
    FSType:
    ReadOnly:          false
    VolumeAttributes:  <none>
  default-token-wrfhf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wrfhf
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age        From                      Message
  ----    ------     ---        ----                      -------
  Normal  Scheduled  <unknown>  default-scheduler         Successfully assigned default/my-csi-app-inline to k8s-test-node-4
  Normal  Pulling    32s        kubelet, k8s-test-node-4  Pulling image "busybox"
  Normal  Pulled     28s        kubelet, k8s-test-node-4  Successfully pulled image "busybox"
  Normal  Created    28s        kubelet, k8s-test-node-4  Created container my-frontend
  Normal  Started    27s        kubelet, k8s-test-node-4  Started container my-frontend
```
## Confirm the katbox driver works
The katbox driver is configured to create new volumes under `/csi-data-dir` inside the katbox container, as specified in the plugin DaemonSet deployed previously.
A file written to a properly mounted katbox volume inside an application should show up inside the katbox container. The following steps confirm that katbox is working properly. First, create a file from the application pod as shown:
```shell
$ kubectl exec -it my-csi-app-inline -- /bin/sh
/ # touch /data/hello-world
/ # exit
```
Find the node on which the sample app is running:
```shell
$ kubectl get pods my-csi-app-inline -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-csi-app-inline 1/1 Running 0 7m57s 10.180.96.189 k8s-test-node-4 <none> <none>
```
Next, find the katbox driver pod for the node on which the sample application is running:
```shell
$ kubectl get pods -n csi-plugins -o wide --field-selector spec.nodeName=k8s-test-node-4
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-katboxplugin-n62fm 3/3 Running 0 43h 10.180.73.244 k8s-test-node-4 <none> <none>
```
Next, exec into the katbox container and verify that the file shows up there:
```shell
$ kubectl exec -it csi-katboxplugin-n62fm -n csi-plugins -c katbox -- /bin/sh
```
Then, use the following command to locate the file. If everything works OK you should get a result similar to the following:
```shell
/ # find / -name hello-world
/csi-data-dir/csi-69121cc2ba7624a259442664bc942c00811cf4495faefccdd11efc2e79d1127c/hello-world
/var/lib/kubelet/pods/32a784c5-88a3-4585-8827-989d2c79dbfe/volumes/kubernetes.io~csi/my-csi-volume/mount/hello-world
/ #
```

docs/images: four binary image files added (not shown).

docs/overview.md (new file)
# Overview of how Katbox works
The Katbox pod consists of a CSI node driver, a liveness probe, and a [CSI node driver registrar](https://github.com/kubernetes-csi/node-driver-registrar/).
Before bringing up the Katbox pod, we must register some information with Kubernetes so that Katbox is able to serve requests for ephemeral-inline volumes.
The `kubectl` command is used to tell Kubernetes about our CSI Driver's name, what type of volume lifecycles it supports (ephemeral only), and whether or not we need extra info provided to the CSI when a volume is being mounted (we do).
An example can be seen in [csi-katbox-driverinfo.yaml](../deploy/latest/katbox/csi-katbox-driverinfo.yaml).
Since we want to be able to create Katbox containers in all worker nodes, we can deploy Katbox as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/).
![Katbox DaemonSet](images/katboxDaemonSet.png)
An example configuration can be seen in [csi-katbox-plugin.yaml](../deploy/latest/katbox/csi-katbox-plugin.yaml)
![Katbox DaemonSet](images/katboxDeploy.png)
The CSI node driver registrar comes up and registers Katbox by creating a Unix domain socket located at `/var/lib/kubelet/plugins_registry/katbox.csi.paypal.com-reg.sock`. The registrar is also responsible for querying the Katbox plugin, via a socket located at `/var/lib/kubelet/plugins/csi-katbox/csi.sock`, for information about the plugin using the `GetPluginInfo()` gRPC call. It uses the information provided by the CSI plugin when registering the node with the kubelet.
If the registration is successful, the Katbox plugin will now be able to receive communication via `/var/lib/kubelet/plugins/csi-katbox/csi.sock` whenever an ephemeral volume from the driver `katbox.csi.paypal.com` is requested. This request will come in the form of a `NodePublishVolume` API call. When an ephemeral-inline volume previously allocated by Katbox needs to be deleted, a `NodeUnpublishVolume` API call will come through the same socket.
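The shape of that `GetPluginInfo()` answer can be sketched as follows. `pluginInfo` is a minimal stand-in for the generated CSI `GetPluginInfoResponse` type, and the version string is illustrative:

```go
package main

import "fmt"

// pluginInfo is a minimal stand-in for the CSI spec's generated
// GetPluginInfoResponse type.
type pluginInfo struct {
	Name          string
	VendorVersion string
}

// getPluginInfo sketches the answer the node driver registrar receives
// over /var/lib/kubelet/plugins/csi-katbox/csi.sock before it registers
// the plugin with the kubelet.
func getPluginInfo() pluginInfo {
	return pluginInfo{
		Name:          "katbox.csi.paypal.com", // must match the CSIDriver object
		VendorVersion: "dev",                   // illustrative value
	}
}

func main() {
	info := getPluginInfo()
	fmt.Println(info.Name) // katbox.csi.paypal.com
}
```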
The [Liveness probe container](https://github.com/kubernetes-csi/livenessprobe), as its name implies, is responsible for sending HTTP health checks to the Katbox plugin. If the plugin does not reply to the heartbeat requests, the Liveness probe container signals the failure to the kubelet and the kubelet restarts the Katbox plugin container.