Cloud Run services can now configure startup and liveness probes for a running container.
The startup probe determines when a container has started cleanly and is ready to take traffic. A liveness probe kicks in once the container has started, to ensure that it remains functional; Cloud Run restarts the container if the liveness probe fails.
Implementing Health Check Probes
A Cloud Run service can be described using a manifest file. A sample manifest looks like this:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    run.googleapis.com/ingress: all
  name: health-cloudrun-sample
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: '5'
        autoscaling.knative.dev/minScale: '1'
    spec:
      containers:
      - image: us-west1-docker.pkg.dev/sample-proj/sample-repo/health-app-image:latest
        startupProbe:
          httpGet:
            httpHeaders:
            - name: HOST
              value: localhost:8080
            path: /actuator/health/readiness
          initialDelaySeconds: 15
          timeoutSeconds: 1
          failureThreshold: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            httpHeaders:
            - name: HOST
              value: localhost:8080
            path: /actuator/health/liveness
          timeoutSeconds: 1
          periodSeconds: 10
          failureThreshold: 5
        ports:
        - containerPort: 8080
          name: http1
        resources:
          limits:
            cpu: 1000m
            memory: 512Mi
This manifest can then be deployed to Cloud Run as follows:
gcloud run services replace sample-manifest.yaml --region=us-west1
Now, coming back to the manifest, the startup probe is defined this way:
startupProbe:
  httpGet:
    httpHeaders:
    - name: HOST
      value: localhost:8080
    path: /actuator/health/readiness
  initialDelaySeconds: 15
  timeoutSeconds: 1
  failureThreshold: 5
  periodSeconds: 10
The probe makes an HTTP request to the /actuator/health/readiness path. An explicit HOST header is also provided. This is a temporary workaround, as Cloud Run health checks currently have a bug where this header is missing from the health check requests.
The rest of the properties indicate the following:
- initialDelaySeconds: the delay before the first probe is performed
- timeoutSeconds: the timeout for each health check request
- failureThreshold: the number of failed tries before the container is marked as not ready
- periodSeconds: the delay between probes
Once the startup probe succeeds, Cloud Run marks the container as available to handle traffic.
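To make the timing concrete, the fragment below repeats the startup probe's timing fields, annotated with the rough schedule they imply. The arithmetic assumes probes fire exactly on schedule, so treat the numbers as an approximation rather than a statement of how Cloud Run actually schedules probes.

```yaml
startupProbe:
  # httpGet section omitted here; see the startup probe definition above
  initialDelaySeconds: 15  # first probe roughly 15s after the container starts
  periodSeconds: 10        # later probes roughly at 25s, 35s, 45s and 55s
  timeoutSeconds: 1        # each probe waits up to 1s for a response
  failureThreshold: 5      # the fifth failed probe lands near the 55s mark
```

Under these assumptions, an application that needs much longer than about a minute to start would likely need a larger failureThreshold or periodSeconds.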
A livenessProbe follows a similar pattern:
livenessProbe:
  httpGet:
    httpHeaders:
    - name: HOST
      value: localhost:8080
    path: /actuator/health/liveness
  timeoutSeconds: 1
  periodSeconds: 10
  failureThreshold: 5
From the Spring Boot application's perspective, all that needs to be done is to enable the health check endpoints as described here.
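As a rough sketch of what that can look like, the following configuration fragment turns on the Actuator liveness and readiness health groups behind the /actuator/health/liveness and /actuator/health/readiness paths used in the manifest. It assumes spring-boot-starter-actuator is on the classpath and a Spring Boot version that supports these properties (2.3 or later). The file location and the choice of YAML over a properties file are assumptions, not something the original setup prescribes.

```yaml
# src/main/resources/application.yaml (assumed location)
management:
  endpoint:
    health:
      probes:
        enabled: true   # exposes the liveness and readiness health groups
  health:
    livenessstate:
      enabled: true     # backs /actuator/health/liveness
    readinessstate:
      enabled: true     # backs /actuator/health/readiness
```

With the default server port of 8080, these paths line up with the containerPort and probe paths in the sample manifest.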
Conclusion
A startup probe ensures that a container receives traffic only when it is ready, and a liveness probe ensures that the container remains healthy during operation; if it does not, the infrastructure restarts it. These health probes are a welcome addition to the already excellent feature set of Cloud Run.