Retries

A transient failure - a 502 Bad Gateway, a dropped connection, an upstream timeout - is often gone by the time you try again a moment later. Without retries, that momentary blip surfaces to the storefront as a hard error on the very first attempt.

The Alokai middleware can retry these transient failures automatically for any integration. It's opt-in and works the same way for every integration regardless of transport (Axios, GraphQL, or a native SDK), because it operates on the normalized error after the request fails.

Quick Start

Add a retry field to the integration's config - the same file where its location and extensions already live (middleware.config.ts just composes these configs together):

integrations/sapcc/config.ts

export const config = {
  configuration: { /* ... */ },
  extensions: (extensions) => [...extensions /* ... */],
  location: '@vsf-enterprise/sapcc-api/server',
  retry: true, // opt in to the default retry policy
} satisfies Integration<Config>;

That's it. retry: true applies the default policy: up to 2 retries (3 attempts total) on transient errors, with exponential backoff between attempts. Integrations without a retry config are never retried.
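To make the default policy concrete, here is a minimal sketch of a retry loop with exponential backoff. It is not the middleware's implementation: `withRetries`, `shouldRetry`, and the 100 ms base delay are hypothetical names and values chosen for illustration, and the real default backoff is randomized.

```typescript
// Illustrative sketch only; not the middleware's actual implementation.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRetries<T>(
  call: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  retries = 2, // documented default: 2 retries, 3 attempts in total
  baseDelayMs = 100, // assumed value; the real base delay is not stated here
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      const budgetExhausted = attempt > retries;
      // Non-retryable errors and an exhausted budget surface immediately.
      if (budgetExhausted || !shouldRetry(error)) throw error;
      // Exponential backoff: 100ms before retry 1, 200ms before retry 2, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

With the defaults, a call that fails twice and succeeds on the third attempt resolves normally; a call that fails three times rejects with the last error.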

What gets retried

By default, a failed call is retried only when the error is transient - something a retry could plausibly fix. That means one of these status codes (resolved from the normalized error):

Status	Meaning
`408`	Request Timeout
`429`	Too Many Requests
`502`	Bad Gateway
`503`	Service Unavailable
`504`	Gateway Timeout

Network and timeout failures (ETIMEDOUT, ECONNRESET, ECONNREFUSED, ENOTFOUND, …) are normalized to these status codes as well, so they're retried too.

Everything else fails immediately: 4xx business errors like 400 or 404, a deterministic 500, and the error raised by an open circuit breaker (it surfaces as non-retryable so a struggling backend isn't hammered further).

Retries are not method-aware by default. The default policy retries any method on a transient error, including mutations like placeOrder or addCartLineItem. If a write times out after the backend already committed it, a retry can submit it twice. If that matters for your integration, restrict retries with a custom retryCondition (see below).

Customizing the policy

Instead of true, pass an object to override any part of the default policy. Every field is optional.

integrations/sapcc/config.ts

export const config = {
  configuration: { /* ... */ },
  extensions: (extensions) => [...extensions /* ... */],
  location: '@vsf-enterprise/sapcc-api/server',
  retry: {
    retries: 3,
    retryCondition: (error, context) => { /* ... */ },
    retryDelay: (attempt, error) => { /* ... */ },
  },
} satisfies Integration<Config>;

Field	Default	Description
`retries`	`2`	Number of retries after the initial attempt. `0` disables retrying.
`retryCondition`	transient errors	`(error, context) => boolean` - return `true` to retry.
`retryDelay`	randomized exponential backoff	`(attempt, error) => number` - delay in milliseconds before the given retry.

The context passed to retryCondition carries the functionName, integrationName, the optional extensionName, the 1-based attempt number, the configured retries budget, and a request-scoped logger - so you can tailor the decision per method and log why a call was or wasn't retried.
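As a sketch of how those context fields might be used together, the condition below retries gateway errors for the whole budget, retries a 429 only once, and logs each decision. The `RetryContext` interface and the `logger.debug(message)` signature are assumptions modeled on the field list above, not the library's published types, and the sketch assumes `attempt` refers to the attempt that just failed.

```typescript
// Assumed shapes, modeled on the fields described above; the real types may differ.
interface RetryContext {
  functionName: string;
  integrationName: string;
  extensionName?: string;
  attempt: number; // 1-based
  retries: number;
  logger: { debug: (message: string) => void }; // assumed logger method
}

// Reads a numeric statusCode from an unknown error without trusting its shape.
function statusOf(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const value = (error as { statusCode?: unknown }).statusCode;
  return typeof value === "number" ? value : undefined;
}

function retryCondition(error: unknown, context: RetryContext): boolean {
  const status = statusOf(error);
  // Hypothetical policy: gateway errors use the whole budget, 429 is retried once.
  const retry =
    status === 502 ||
    status === 503 ||
    status === 504 ||
    (status === 429 && context.attempt === 1);
  context.logger.debug(
    `${context.integrationName}.${context.functionName}: status=${status} ` +
      `attempt=${context.attempt}/${context.retries + 1} retry=${retry}`,
  );
  return retry;
}
```

Limiting 429 to a single retry is a design choice for this example: a rate-limited backend rarely benefits from several quick follow-up calls.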

Retry reads only

To avoid retrying mutations, opt in by method name. The error passed to retryCondition has already been normalized to an HttpError, so you can read its statusCode directly. This example only retries transient failures on read-shaped methods:

integrations/sapcc/config.ts

import { HttpError } from '@alokai/connect/middleware';

const TRANSIENT = new Set([408, 429, 502, 503, 504]);

// inside the integration config:
retry: {
  retryCondition: (error, context) =>
    /^(get|search|list)/.test(context.functionName) &&
    error instanceof HttpError &&
    TRANSIENT.has(error.statusCode),
},

Custom backoff

retryDelay receives the attempt number (starting at 1) and returns milliseconds to wait. For a linear delay:

integrations/sapcc/config.ts

retry: {
  retryDelay: (attempt) => attempt * 500, // 500ms, 1000ms, ...
},
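If a linear delay is too regular, for example when many storefront requests fail at the same moment and would all retry in lockstep, a capped exponential backoff with full jitter is a common alternative. The sketch below is one way to write it; `fullJitterDelay`, the 200 ms base, and the 5000 ms cap are hypothetical values, not library defaults.

```typescript
const BASE_DELAY_MS = 200; // assumed base delay
const MAX_DELAY_MS = 5000; // assumed upper bound for a single wait

// Picks a random delay between 0 and the capped exponential ceiling.
// `random` is injectable so the function can be tested deterministically.
function fullJitterDelay(
  attempt: number, // 1-based, as passed to retryDelay
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}
```

It would be wired in as `retryDelay: (attempt) => fullJitterDelay(attempt)`. The ceiling doubles with each attempt (200 ms, 400 ms, 800 ms, ...) until it reaches the cap.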

Disabling retries

Retries are off unless you opt in, so removing the retry field is enough. You can also set it explicitly to make the intent clear:

integrations/sapcc/config.ts

retry: false,

Relationship with the circuit breaker

Retries and the circuit breaker work together. Retries handle short-lived blips; the circuit breaker handles sustained outages. The breaker wraps the retry loop, which gives two guarantees:

A call that recovers within its retry budget counts as a single success - transient failures a retry papers over do not trip the breaker. Only a fully exhausted retry counts as one failure.
When the breaker is open, it short-circuits immediately, so the request fails fast without retrying.
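The two guarantees can be pictured with a toy breaker that wraps a retry loop. This is only a sketch of the ordering, not the middleware's breaker: `SketchBreaker` and its consecutive-failure threshold are hypothetical, and backoff, timeouts, and half-open recovery are left out for brevity.

```typescript
class SketchBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  get isOpen(): boolean {
    return this.failures >= this.threshold;
  }

  async run<T>(attemptOnce: () => Promise<T>, retries: number): Promise<T> {
    // An open breaker fails fast: no attempt and no retry is made.
    if (this.isOpen) throw new Error("circuit open");
    let lastError: unknown;
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        const result = await attemptOnce();
        this.failures = 0; // recovered within the budget: one success
        return result;
      } catch (error) {
        lastError = error; // remembered, then the next attempt runs
      }
    }
    this.failures += 1; // the whole retry sequence failed: one failure
    throw lastError;
  }
}
```

Note that the breaker's failure counter moves only after the loop ends, which is why a blip that a retry recovers from never counts against it.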

Because the breaker wraps the whole retry sequence, its timeout bounds all attempts plus their backoff delays. If you raise retries or the delay substantially, make sure the breaker's timeout leaves room for them.
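A rough way to size that timeout is to add up the worst case: every attempt running to its own limit, plus every backoff delay. The helper below is a back-of-the-envelope sketch; `worstCaseDurationMs` and the per-attempt timeout of 2000 ms are assumptions for illustration, not middleware settings.

```typescript
// Upper bound on how long one call can take when every attempt times out.
function worstCaseDurationMs(
  retries: number,
  perAttemptTimeoutMs: number, // assumed limit for a single attempt
  retryDelay: (attempt: number) => number,
): number {
  let total = (retries + 1) * perAttemptTimeoutMs;
  for (let attempt = 1; attempt <= retries; attempt++) {
    total += retryDelay(attempt);
  }
  return total;
}

// 3 retries with the linear delay from "Custom backoff":
// 4 attempts * 2000ms + (500 + 1000 + 1500)ms of backoff = 11000ms
const budget = worstCaseDurationMs(3, 2000, (attempt) => attempt * 500);
```

In this example the breaker's timeout would need to exceed 11 seconds for the last retry to have a chance to complete.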

Observability

Every retry emits a debug log line naming the integration, method, attempt number, and delay, so you can confirm retries are happening and tune the policy from real traffic.

Summary

Opt-in per integration, off by default
Retries transient failures (502/503/504/408/429 and network/timeout errors)
Sensible defaults (2 retries, exponential backoff) with full control over count, condition, and delay
Works for every integration through one shared mechanism
Plays nicely with the circuit breaker - an open breaker fails fast
