Kubernetes Probes
Alokai Cloud customers' middleware and frontend apps are deployed in Kubernetes. This document explains the Kubernetes mechanisms used by Alokai Cloud to ensure that customers' applications are starting, running and exiting correctly. Implementing your application so that it works in accordance with those mechanisms ensures better detectability of issues and lower error rates of your deployment.
In order for some of those mechanisms to work, you - as the application developer - need to ensure certain REST endpoints exist in your application and that they respond with the correct status code. The sections below explain how to implement those endpoints.
Liveness probes
The purpose of liveness probes
Without any kind of probes set up for your application, the only failure scenario where your app will be considered as not working correctly is if the process exits on startup or any time during the application runtime. For example, if you run npm start
but the application secrets are missing from the environment, most applications' processes will immediately exit. The operating system could also kill the process due to lack of memory.
Liveness probes can help you recover the application from a broken state when only restart can solve the problem. The application gets probed every few seconds to determine if the server can respond. If a timeout or error response is received instead of a success status code, the app is considered to be in a dead state (e.g. caught in an infinite loop due to an edge-case and developer error) and gets restarted.
This helps your application keep handling traffic despite an issue that causes it to lock, which gives you time to fix the underlying issue without impacting company operations (as much).
Implementation of liveness probes in your application
In order for your application to be covered by liveness probes, you need to ensure that it responds with a HTTP 200 OK
status code to a GET request on a [your app URL]/healthz
endpoint.
In the case of the Alokai middleware - as of version 3.0.0 of the @vue-storefront/middleware
package, a liveness probe is enabled by default. You can launch the middleware locally and send a GET request to http://localhost:4000/healthz
endpoint. The response will be a HTTP 200 OK containing the body "ok
".
In the case of the frontend apps, /healthz
endpoints are also present in Nuxt and Next templates generated from the Alokai CLI. If you will be deploying your own custom fronted app to Alokai Cloud, and your app is not generated from Alokai CLI, you will need to add the /healthz
endpoint manually.
The below instructions help you create a simple /healthz
endpoint that responds with a HTTP 200 OK status code and the text ok
in the response body. You may be tempted to instead make /healthz
a real application route in your app, which if queried will respond with the HTML of your actual application. At first glance it may seem more robust, as in theory it tests a larger part of your application stack.
In reality, requesting a full app route as part of a liveness check - especially during periods of heavy traffic - can lead to new connection being opened that will never be resolved. A full page app can take hundreds of miliseconds to respond and contain a few kilobytes of payload. The simple route described below will respond in a few miliseconds with a two byte body size.
Do not make /healthz
or /readyz
a full app route. Instead keep it as simple as possible, by following the instructions below. Do not consider liveness probes as a fully fledged application smoke test, but as a simple check if the app can serve a basic HTTP request.
Nuxt
In Nuxt, to create a /healthz
endpoint, use server endpoints. Create a [Nuxt fronted app folder]/server/routes/healthz.ts
export default defineEventHandler(() => 'ok');
Next: App router
If using Next.js with app router, use route handlers:
1
Create a [Next app directory]/app/healthz/route.ts
file
2
Paste the following content inside
import { NextResponse } from 'next/server';
export function GET() {
return NextResponse.json({ status: 'ok' }, { status: 200 });
}
Next: Pages router
If using Next.js with pages router, use rewrites together with API routes:
1
Add the following code to your next.config.js
:
module.exports = {
// ...
async rewrites() {
return [
{
source: '/healthz',
destination: '/api/healthz',
},
];
},
}
This is necessary because by default Next's API routes have an /api
subpath, but we need it to be /healthz
/api/healthz
2
Create a [Next app directory]/pages/api/healthz.ts
file
3
Paste the following content inside:
import { NextApiRequest, NextApiResponse } from 'next';
export default function handler(_req: NextApiRequest, res: NextApiResponse) {
res.status(200).send('ok');
}
Readiness probes
The purpose of readiness probes
Applications in Kubernetes are often hosted in such a way that multiple duplicate instances of an application exist side-by-side simultaneously. Such a duplicate instance is called a replica. Readiness probes allow an application replica to temporarily mark itself as unable to serve requests in a Kubernetes cluster. A liveness probe can pass while a readiness probe fails - meaning that in general, the application is up, but is still waiting for something to happen so that it can serve requests (e.g. waiting for some secondary, dependent service to become online, like Redis cache).
The /readyz
endpoint of your application is queried automatically by Alokai Cloud every few seconds to check whether requests should be routed to the queried application replica. One such case - where traffic will should stop directed to a replica - is if an application instance is being killed (if it receives a SIGTERM
signal).
You can read more about Kubernetes readiness probes in the official documentation.
Built-in middleware readiness probes
As of version 5.0.0 of the @vue-storefront/middleware
package, you can launch the middleware locally and send a GET request to the http://localhost:4000/readyz
endpoint. The response will contain either a success message or a list of errors describing why the readiness probe failed.
To add custom readiness probes to the built-in @vue-storefront/middleware
readiness probe feature, pass them to the readinessProbes
property when calling createServer
.
const customReadinessProbe = async () => {
const dependentServiceRunning = await axios.get('http://someservice:3000/healthz');
if(dependentServiceRunning.status !== 200) {
throw new Error('Service that the middleware depends on is offline. The middleware is temporarily not ready to accept connections.')
}
}
const app = await createServer(config, { readinessProbes: [customReadinessProbe]});
In order for custom readiness probes to be implemented correctly, they need to do two things:
- they must all be async or return a promise (the return value is not checked, it's expected to be void/undefined)
- they must all throw an exception when you want a readiness probe to fail
Implementation of readiness probes in your own application
Readiness probes are more difficult to implement than liveness probes. In liveness probes, a simple stateless REST endpoint handler was sufficient. Readiness probes, on the other hand, need to monitor the signals that the application process received. In addition to that, you can write your own readiness conditions, such as checking if an external service that your application depends on is online.
If your application uses Node's http
module to serve HTTP requests, you can use the @godaddy/terminus
NPM package to implement readiness checks.