Rate Limiting

The stemp API enforces rate limits to ensure fair usage and platform stability. This page explains how rate limiting works and how to handle it in your integration.

How It Works

Rate limits are applied per organization and per authentication method. When you exceed the limit, the API returns a 429 Too Many Requests response.

Response Headers

Every API response includes rate limit information:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp when the window resets
`Retry-After`	Seconds to wait before retrying (only on 429)
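
As an illustration, the headers in the table above can be read into a typed object. This is a minimal sketch rather than an official client: `HeaderSource`, `RateLimitInfo`, and `parseRateLimitHeaders` are hypothetical names, and the sketch assumes every header value is a plain integer, as in the example on this page.

```typescript
// Structural type so the helper accepts fetch's Headers or any similar object.
interface HeaderSource {
  get(name: string): string | null
}

// Hypothetical shape for the parsed values (not part of the API itself).
interface RateLimitInfo {
  limit: number                    // X-RateLimit-Limit
  remaining: number                // X-RateLimit-Remaining
  resetAt: Date                    // X-RateLimit-Reset, converted from Unix seconds
  retryAfterSeconds: number | null // Retry-After, only present on 429
}

// Returns null when any X-RateLimit-* header is missing or not numeric.
function parseRateLimitHeaders(headers: HeaderSource): RateLimitInfo | null {
  const toInt = (name: string): number | null => {
    const raw = headers.get(name)
    if (raw === null) return null
    const value = Number.parseInt(raw, 10)
    return Number.isNaN(value) ? null : value
  }

  const limit = toInt('X-RateLimit-Limit')
  const remaining = toInt('X-RateLimit-Remaining')
  const reset = toInt('X-RateLimit-Reset')
  if (limit === null || remaining === null || reset === null) return null

  return {
    limit,
    remaining,
    resetAt: new Date(reset * 1000),
    retryAfterSeconds: toInt('Retry-After'),
  }
}
```

With `fetch`, this would be called as `parseRateLimitHeaders(response.headers)`.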

Example 429 Response

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705312800
Content-Type: application/json

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Retry after 30 seconds."
  }
}
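
To tell a rate limit error apart from other failures, a client can check both the status code and the `error.code` field. The sketch below assumes error bodies always have the shape shown in the example above; `ApiErrorBody`, `isApiErrorBody`, and `isRateLimitError` are hypothetical names.

```typescript
// Assumed error body shape, taken from the example 429 response.
interface ApiErrorBody {
  error: { code: string; message: string }
}

// Type guard: true only when the value has string error.code and error.message fields.
function isApiErrorBody(value: unknown): value is ApiErrorBody {
  if (typeof value !== 'object' || value === null) return false
  const error = (value as { error?: unknown }).error
  if (typeof error !== 'object' || error === null) return false
  const { code, message } = error as { code?: unknown; message?: unknown }
  return typeof code === 'string' && typeof message === 'string'
}

// True when the response is a 429 carrying the rate_limit_exceeded code.
function isRateLimitError(status: number, body: unknown): boolean {
  return (
    status === 429 &&
    isApiErrorBody(body) &&
    body.error.code === 'rate_limit_exceeded'
  )
}
```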

Handling Rate Limits

Exponential Backoff

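The recommended approach is to wait for the duration given in Retry-After when the header is present, fall back to exponential backoff when it is not, and add jitter in both cases:

async function withRateLimit<T>(
  fn: () => Promise<Response>,
  maxRetries = 5
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fn()

    if (response.ok) {
      return (await response.json()) as T
    }

    // Only 429 responses are retried; any other error fails immediately
    if (response.status !== 429) {
      throw new Error(`API error: ${response.status}`)
    }

    // Out of retries: fail now instead of sleeping one more time
    if (attempt === maxRetries) break

    // Prefer the server's Retry-After; otherwise back off exponentially (1s, 2s, 4s, ...)
    const retryAfter = Number.parseInt(response.headers.get('Retry-After') ?? '', 10)
    const baseMs = Number.isNaN(retryAfter) ? 1000 * 2 ** attempt : retryAfter * 1000
    const jitter = Math.random() * 1000
    await new Promise(r => setTimeout(r, baseMs + jitter))
  }

  throw new Error('Max retries exceeded')
}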
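
The delay calculation can also be isolated into a pure function, which makes the retry schedule easy to test. The following is a sketch under assumptions: `backoffDelayMs` is a hypothetical helper, and the 1 second base, 60 second cap, and 1 second jitter range are illustrative values, not limits published by the API.

```typescript
// Hypothetical helper: how long to wait before retry number `attempt` (0-based).
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds: number | null,
  random: () => number = Math.random, // injectable so tests can be deterministic
  baseMs = 1000,                      // assumed base delay
  maxMs = 60_000                      // assumed cap on the exponential part
): number {
  // Exponential growth: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt)
  // Never wait less than the server asked for via Retry-After
  const serverFloor = retryAfterSeconds !== null ? retryAfterSeconds * 1000 : 0
  // Jitter: up to one extra second so that clients do not retry in lockstep
  const jitter = random() * 1000
  return Math.max(exponential, serverFloor) + jitter
}
```

Taking the larger of the exponential delay and `Retry-After` means the client never retries earlier than the server requested, while still slowing down further if repeated attempts keep failing.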

Proactive Rate Limit Monitoring

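Inspect the rate limit headers on each response to see how much quota remains before you send more requests:

function checkRateLimit(response: Response) {
  const remainingHeader = response.headers.get('X-RateLimit-Remaining')
  const resetHeader = response.headers.get('X-RateLimit-Reset')
  // Skip the check when the headers are missing, rather than treating them as 0
  if (remainingHeader === null || resetHeader === null) return

  const remaining = Number.parseInt(remainingHeader, 10)
  const reset = Number.parseInt(resetHeader, 10)

  if (remaining < 10) {
    // X-RateLimit-Reset is a Unix timestamp in seconds; clamp so the wait is never negative
    const waitMs = Math.max(0, reset * 1000 - Date.now())
    console.warn(`Rate limit low: ${remaining} remaining. Resets in ${waitMs}ms`)
  }
}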
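
Beyond logging a warning, a client can slow itself down as the quota runs low. One possible approach, sketched here with a hypothetical `pacingDelayMs` helper, spreads the remaining requests evenly over the time left in the window. The inputs are the `X-RateLimit-Remaining` and `X-RateLimit-Reset` values; the pacing strategy itself is an assumption, not something the API prescribes.

```typescript
// Hypothetical helper: milliseconds to pause before the next request so the
// remaining quota is spread evenly across the rest of the current window.
function pacingDelayMs(
  remaining: number,
  resetUnixSeconds: number,
  nowMs: number = Date.now() // injectable for tests
): number {
  const msUntilReset = Math.max(0, resetUnixSeconds * 1000 - nowMs)
  if (msUntilReset === 0) return 0        // window has already reset
  if (remaining <= 0) return msUntilReset // quota exhausted: wait for the reset
  return Math.floor(msUntilReset / remaining)
}
```

A caller might apply this delay only once `remaining` drops below a threshold, so that normal traffic is not slowed unnecessarily.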

Best Practices

Respect Retry-After — Always wait the specified duration before retrying.
Use exponential backoff — Don't hammer the API with immediate retries.
Add jitter — Randomize retry delays to avoid thundering herd effects.
Cache responses — Cache read-heavy data like user profiles and pass states.
Batch where possible — Combine multiple operations where the API supports it.
Monitor your usage — Track X-RateLimit-Remaining to detect issues before hitting limits.
Use webhooks — Instead of polling for changes, subscribe to webhook events to receive real-time notifications.
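
As one way to apply the caching advice above, a small in-memory cache with a time-to-live avoids repeating identical reads. This is a sketch only: `TtlCache` is a hypothetical class, and a suitable TTL depends on how stale your data may be.

```typescript
// Hypothetical in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: () => number

  // `now` is injectable so expiry can be tested without real waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Returns the cached value, or undefined when absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (entry === undefined) return undefined
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }
}
```

A read path would call `get` first and only hit the API, followed by `set`, on a miss.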

Common Scenarios

Scenario	Recommendation
Importing many users	Throttle to ~10 requests/second and monitor remaining quota
Real-time POS scanning	Rate limits are generous for normal usage; no special handling needed
Polling for updates	Switch to webhooks — they're more efficient and real-time
Batch stamp/point operations	Space requests over time rather than sending them all at once
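
For the import and batch scenarios above, requests can be spaced with a simple sequential throttle. The sketch below uses a hypothetical `runThrottled` helper; each task stands in for one API call, and the default of 10 requests per second mirrors the figure in the table.

```typescript
// Hypothetical helper: runs tasks one at a time with a fixed pause between them,
// so the request rate never exceeds requestsPerSecond.
async function runThrottled<T>(
  tasks: Array<() => Promise<T>>,
  requestsPerSecond = 10
): Promise<T[]> {
  const intervalMs = 1000 / requestsPerSecond
  const results: T[] = []
  for (let i = 0; i < tasks.length; i++) {
    // Pause before every task except the first
    if (i > 0) {
      await new Promise<void>(resolve => setTimeout(resolve, intervalMs))
    }
    results.push(await tasks[i]())
  }
  return results
}
```

Because tasks run sequentially, the actual rate is somewhat below the target whenever a request takes noticeable time, which errs on the safe side.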
