Gateway API

The Cachecore gateway is an OpenAI-compatible HTTP proxy. All standard chat completions requests are forwarded transparently with caching applied.

Base URL: https://api.cachecore.it

Authentication: Cached requests must carry a Cachecore JWT as a bearer token:

Authorization: Bearer cc_live_xxxxx.eyJ...

Requests without a valid token are handled in bypass mode: they are routed directly to OpenAI without caching and are rate-limited by IP (100 req/min by default).

POST /v1/chat/completions

Proxies to OpenAI's chat completions endpoint with L1/L2 caching applied.

Request body: Identical to the OpenAI Chat Completions API. All parameters (model, messages, temperature, tools, etc.) are forwarded unchanged.

Request headers

| Header | Required | Description |
|--------|----------|-------------|
| Authorization | Yes* | Bearer cc_live_xxxxx.eyJ... |
| X-Cachecore-Token | Yes* | Alternative auth (backward compat) |
| X-Cachecore-Deps | No | Base64url-encoded JSON array of dependency declarations |

*One of the two auth headers is required.

Bypass: Omit both auth headers to skip caching entirely (BYPASS mode). The Python client's request_context(bypass=True) does this automatically.
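As a rough illustration of the headers above, the following sketch builds (but does not send) a chat completions request using only the standard library. The helper name `build_chat_request` and the `Content-Type: application/json` header are assumptions made for this example, not part of the documented API; passing `token=None` simply omits the auth header, which corresponds to bypass mode.

```python
import base64
import json
import urllib.request

# Base URL taken from the documentation above.
BASE_URL = "https://api.cachecore.it"


def build_chat_request(body, token=None, deps=None):
    """Build a POST /v1/chat/completions request without sending it.

    body:  dict in OpenAI Chat Completions format.
    token: Cachecore JWT, or None to omit auth (bypass mode).
    deps:  optional list of {"dep_id": ..., "expected_hash": ...} dicts.
    """
    # Content-Type is an assumption; the docs only list the Cachecore headers.
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    if deps:
        # Base64url-encoded JSON array, as described for X-Cachecore-Deps.
        encoded = base64.urlsafe_b64encode(json.dumps(deps).encode()).decode()
        headers["X-Cachecore-Deps"] = encoded
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers=headers,
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call; any other HTTP client works equally well as long as it sends the same headers and JSON body.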

Response headers

| Header | Values | Description |
|--------|--------|-------------|
| X-Cache | HIT_L1, HIT_L1_STALE, HIT_L2, MISS, BYPASS | Cache result |
| X-Cache-Similarity | 0.00–1.00 | Cosine similarity (L2 hits); 1.00 for L1, 0.00 for MISS |
| X-Cache-Age | integer seconds | Age of the cached entry; 0 for MISS/BYPASS |
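A client that wants to log or act on the cache outcome could read these headers roughly as follows. This is a minimal sketch: `CacheInfo` and `parse_cache_headers` are hypothetical names, and the fallback values used when a header is absent are an assumption of this example rather than documented gateway behavior.

```python
from dataclasses import dataclass


@dataclass
class CacheInfo:
    status: str        # HIT_L1, HIT_L1_STALE, HIT_L2, MISS, or BYPASS
    similarity: float  # cosine similarity reported by the gateway
    age_seconds: int   # age of the cached entry in seconds

    @property
    def is_hit(self):
        # Every hit status listed in the table starts with "HIT".
        return self.status.startswith("HIT")


def parse_cache_headers(headers):
    """Extract the cache headers from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return CacheInfo(
        status=lowered.get("x-cache", "").strip().upper(),
        # Missing or empty values fall back to 0 (an assumption of this sketch).
        similarity=float(lowered.get("x-cache-similarity") or 0),
        age_seconds=int(lowered.get("x-cache-age") or 0),
    )
```

For example, `parse_cache_headers(response.headers).is_hit` distinguishes the three hit statuses from MISS and BYPASS without comparing strings at each call site.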

Response body: Identical to OpenAI's response format.

POST /v1/invalidate

Deletes all cache entries (L1 and L2) tagged with a given dependency.

Request body

{
  "dep_id": "doc:contract-123",
  "new_hash": "v2"
}

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| dep_id | string | Yes | The dependency identifier to invalidate |
| new_hash | string | No | New hash value. If omitted, a random UUID is generated |

Response

{
  "ok": true,
  "dep_id": "doc:contract-123",
  "keys_deleted": 14
}

Requires a valid Authorization: Bearer JWT header.
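An invalidation call could be assembled along these lines. The sketch only constructs the request; `build_invalidate_request` is a hypothetical helper, and the `Content-Type` header is assumed rather than documented.

```python
import json
import urllib.request

# Base URL taken from the documentation above.
BASE_URL = "https://api.cachecore.it"


def build_invalidate_request(token, dep_id, new_hash=None):
    """Build a POST /v1/invalidate request without sending it."""
    payload = {"dep_id": dep_id}
    # new_hash is optional; when omitted, the gateway generates a random UUID.
    if new_hash is not None:
        payload["new_hash"] = new_hash
    return urllib.request.Request(
        f"{BASE_URL}/v1/invalidate",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumed for this sketch; not stated in the docs.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` and decoding the JSON body should yield the `ok`, `dep_id`, and `keys_deleted` fields shown in the response example.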

X-Cachecore-Deps format

The dependency header carries a base64url-encoded JSON array:

[
  {"dep_id": "doc:contract-123", "expected_hash": "v1"},
  {"dep_id": "table:products", "expected_hash": "2024-03-15"}
]

Encoding example in Python:

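import base64
import json

# JSON array of dependency declarations.
deps = [{"dep_id": "doc:contract-123", "expected_hash": "v1"}]
# Serialize to JSON, then base64url-encode for the X-Cachecore-Deps header.
header_value = base64.urlsafe_b64encode(json.dumps(deps).encode()).decode()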

If the dependency hash stored with a cached entry differs from the expected_hash in the incoming request, the entry is rejected (treated as a miss), and a fresh response is fetched and cached.
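The rejection rule can be modeled in a few lines. This is an illustrative sketch of the comparison described above, not the gateway's actual implementation: `decode_deps` and `entry_is_fresh` are hypothetical names, and ignoring declared dependencies that the cached entry does not record is an assumption, since the documentation does not cover that case.

```python
import base64
import json


def decode_deps(header_value):
    """Decode an X-Cachecore-Deps header value back into a list of dicts."""
    # Restore any stripped "=" padding so the decoder accepts both forms.
    padded = header_value + "=" * (-len(header_value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def entry_is_fresh(entry_hashes, declared_deps):
    """Return False if any declared dependency hash mismatches the entry.

    entry_hashes:  dict mapping dep_id -> hash stored with the cached entry.
    declared_deps: list of {"dep_id": ..., "expected_hash": ...} dicts.
    """
    for dep in declared_deps:
        stored = entry_hashes.get(dep["dep_id"])
        # Assumption: dependencies not recorded on the entry are ignored.
        if stored is not None and stored != dep["expected_hash"]:
            return False
    return True
```

Under this model, an entry stored with `{"doc:contract-123": "v1"}` stops matching as soon as a request declares `expected_hash: "v2"` for that dependency, which is how bumping the hash on the client side forces a refresh.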

Rate limits

| Scope | Default |
|-------|---------|
| Authenticated (per tenant) | Configurable via JWT claims |
| Unauthenticated (bypass) | 100 req/min per IP |

Requests that exceed a limit receive 429 Too Many Requests, optionally with a Retry-After header.
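Because Retry-After is optional, a client needs a fallback when it is absent. The sketch below picks a wait time for a 429 response: it honors Retry-After when present (either delta-seconds or an HTTP-date, the two forms the HTTP specification allows) and otherwise uses capped exponential backoff. The function name `retry_delay` and the `base` and `cap` defaults are choices made for this example, not values recommended by Cachecore.

```python
import datetime
import email.utils


def retry_delay(retry_after, attempt, now=None, base=1.0, cap=60.0):
    """Return how many seconds to wait before retrying after a 429.

    retry_after: raw Retry-After header value, or None if absent.
    attempt:     zero-based retry count, used for the backoff fallback.
    """
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            # Delta-seconds form, e.g. "30".
            return float(value)
        try:
            # HTTP-date form, e.g. "Mon, 01 Jan 2024 00:00:10 GMT".
            when = email.utils.parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                # Treat dates without a usable zone as UTC.
                when = when.replace(tzinfo=datetime.timezone.utc)
            now = now or datetime.datetime.now(datetime.timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # No usable header: exponential backoff, capped at `cap` seconds.
    return min(cap, base * 2 ** attempt)
```

In a retry loop, the result would typically be passed to `time.sleep` before the request is reissued; adding random jitter is a common refinement when many clients share one tenant limit.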