Gateway API
The Cachecore gateway is an OpenAI-compatible HTTP proxy. All standard chat completions requests are forwarded transparently with caching applied.
Base URL: https://api.cachecore.it
Authentication: All requests require a Cachecore JWT:
Authorization: Bearer cc_live_xxxxx.eyJ...
Requests without a valid token are treated as bypass mode: routed directly to OpenAI without caching, and rate-limited by IP (100 req/min by default).
POST /v1/chat/completions
Proxies to OpenAI's chat completions endpoint with L1/L2 caching applied.
Request body: Identical to the OpenAI Chat Completions API. All parameters (model, messages, temperature, tools, etc.) are forwarded unchanged.
Request headers
| Header | Required | Description |
|--------|----------|-------------|
| Authorization | Yes* | Bearer cc_live_xxxxx.eyJ... |
| X-Cachecore-Token | Yes* | Alternative auth (backward compat) |
| X-Cachecore-Deps | No | Base64url-encoded JSON array of dependency declarations |
*One of the two auth headers is required.
Bypass: Omit both auth headers to skip caching entirely (BYPASS mode). The Python client's request_context(bypass=True) does this automatically.
Response headers
| Header | Values | Description |
|--------|--------|-------------|
| X-Cache | HIT_L1, HIT_L1_STALE, HIT_L2, MISS, BYPASS | Cache result |
| X-Cache-Similarity | 0.00–1.00 | Cosine similarity (L2 hits); 1.00 for L1, 0.00 for MISS |
| X-Cache-Age | integer seconds | Age of the cached entry; 0 for MISS/BYPASS |
Response body: Identical to OpenAI's response format.
POST /v1/invalidate
Delete all cache entries (L1 and L2) tagged with a given dependency.
Request body
{
"dep_id": "doc:contract-123",
"new_hash": "v2"
}
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| dep_id | string | Yes | The dependency identifier to invalidate |
| new_hash | string | No | New hash value. If omitted, a random UUID is generated |
Response
{
"ok": true,
"dep_id": "doc:contract-123",
"keys_deleted": 14
}
Requires a valid Authorization: Bearer JWT header.
X-Cachecore-Deps format
The dependency header carries a base64url-encoded JSON array:
[
{"dep_id": "doc:contract-123", "expected_hash": "v1"},
{"dep_id": "table:products", "expected_hash": "2024-03-15"}
]
Encoding example in Python:
import base64, json
deps = [{"dep_id": "doc:contract-123", "expected_hash": "v1"}]
header_value = base64.urlsafe_b64encode(json.dumps(deps).encode()).decode()
If a cached entry's dep hash differs from the expected_hash in the incoming request, the entry is rejected (treated as a miss) and a fresh response is fetched and cached.
Rate limits
| Scope | Default | |-------|---------| | Authenticated (per tenant) | Configurable via JWT claims | | Unauthenticated (bypass) | 100 req/min per IP |
Exceeded limits return 429 Too Many Requests with an optional Retry-After header.