Python Client
The cachecore package provides tenant namespace isolation, per-request dependency tagging, and programmatic cache invalidation. It plugs in at the httpx transport layer, so it works with any OpenAI-compatible SDK that accepts a custom httpx client.
Installation
pip install cachecore-python
Requires Python 3.10+. Only dependency: httpx >= 0.25.0.
Quick start
import asyncio
import httpx
from openai import AsyncOpenAI
from cachecore import CachecoreClient
async def main():
    cc = CachecoreClient(
        gateway_url="https://api.cachecore.it",
        tenant_jwt="cc_live_xxxxx.eyJ...",
    )
    openai = AsyncOpenAI(
        base_url="https://api.cachecore.it/v1",
        api_key="cc_live_xxxxx.eyJ...",
        http_client=httpx.AsyncClient(transport=cc.transport),
    )
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Summarise contract #123"}],
    )
    print(response.choices[0].message.content)
    await cc.aclose()

asyncio.run(main())
Three-rung adoption ladder
Rung 1: Base URL swap
No client library needed. Swap base_url on the OpenAI SDK:
from openai import OpenAI
client = OpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key="cc_live_xxxxx.eyJ...",
)
You get L1 + L2 caching immediately.
Rung 2: Tenant isolation
Use CachecoreClient to scope caching per end-user. Each JWT encodes a tenant_id, so namespaces are automatically isolated:
from cachecore import CachecoreClient
import httpx
from openai import AsyncOpenAI
cc = CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt=user.cachecore_token,
)
openai = AsyncOpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key=user.cachecore_token,
    http_client=httpx.AsyncClient(transport=cc.transport),
)
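Each JWT's payload is standard base64url-encoded JSON, so you can inspect which tenant a token scopes to with nothing but the stdlib. A minimal debugging sketch; the tenant_id claim name comes from the description above, and the sample token below is synthetic, not a real cachecore token:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT.

    Debugging aid only: signature verification is the gateway's job,
    not the client's.
    """
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding that JWTs strip off
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a synthetic header.payload.signature token for the demo
header = base64.urlsafe_b64encode(b'{"alg":"HS256"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(b'{"tenant_id":"acme"}').rstrip(b"=").decode()
token = f"{header}.{payload}.sig"

print(jwt_claims(token)["tenant_id"])  # acme
```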
Rung 3: Dependency invalidation
Tag cache entries with data dependencies. Invalidate when the data changes:
from cachecore import CachecoreClient, Dep
import httpx
from openai import AsyncOpenAI
cc = CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt="cc_live_xxxxx.eyJ...",
)
openai = AsyncOpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key="cc_live_xxxxx.eyJ...",
    http_client=httpx.AsyncClient(transport=cc.transport),
)
# Tag the response with a dependency
with cc.request_context(deps=[Dep("doc:contract-123")]):
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Summarise contract #123"}],
    )
# When contract #123 changes:
result = await cc.invalidate("doc:contract-123", new_hash="v2")
print(result.ok) # True
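new_hash can be any version string. One natural choice is a digest of the dependency's current content, so the same bytes always produce the same version and repeated invalidations with unchanged data are harmless. A sketch under that assumption; content_hash is a hypothetical helper, not part of cachecore:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Deterministic version string for a dependency: same bytes, same hash."""
    return hashlib.sha256(data).hexdigest()[:16]

v1 = content_hash(b"contract text, draft 1")
v2 = content_hash(b"contract text, draft 2")
print(v1 != v2)  # True: changed content yields a new version string

# Hypothetical wiring on document save:
#   await cc.invalidate("doc:contract-123", new_hash=content_hash(doc_bytes))
```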
Bypass caching
Skip cache for a specific request. The transport omits the X-Cachecore-Token header, which causes the gateway to route directly to OpenAI in BYPASS mode:
with cc.request_context(bypass=True):
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Generate a unique report ID"}],
    )
Bulk invalidation
results = await cc.invalidate_many(
    dep_ids=["table:products", "table:prices"],
    new_hash="2024-03-15",
)
for r in results:
    print(r.dep_id, r.ok)
Runs invalidations concurrently. Each dependency receives the same new_hash (or a random UUID if not provided).
Async context manager
async with CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt="cc_live_xxxxx.eyJ...",
) as cc:
    # ... use cc
    pass  # aclose() called automatically on exit
Reading cache status
from cachecore import CacheStatus
# Parse cache metadata from the httpx response headers
status = CacheStatus.from_headers(httpx_response.headers)
print(status.status) # "HIT_L1", "HIT_L2", "MISS", etc. (from X-Cache)
print(status.similarity) # 0.0-1.0 (from X-Cache-Similarity)
print(status.age_seconds) # seconds since entry was written (from X-Cache-Age)
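If you are not using the helper, the same metadata can be read straight off the headers the comments above name (X-Cache, X-Cache-Similarity, X-Cache-Age). A rough stand-alone equivalent, assuming those header semantics; note that real httpx.Headers lookups are case-insensitive, while the plain dict used in this demo is not:

```python
from dataclasses import dataclass

@dataclass
class ParsedStatus:
    status: str         # e.g. "HIT_L1", "HIT_L2", "MISS"
    similarity: float   # 0.0-1.0
    age_seconds: int    # seconds since the entry was written

def parse_cache_headers(headers) -> ParsedStatus:
    """Read cachecore's response headers from any mapping with .get()."""
    return ParsedStatus(
        status=headers.get("X-Cache", "MISS"),
        similarity=float(headers.get("X-Cache-Similarity", 0.0)),
        age_seconds=int(headers.get("X-Cache-Age", 0)),
    )

demo = parse_cache_headers(
    {"X-Cache": "HIT_L2", "X-Cache-Similarity": "0.97", "X-Cache-Age": "42"}
)
print(demo.status, demo.similarity, demo.age_seconds)  # HIT_L2 0.97 42
```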
Error handling
import asyncio

from cachecore.errors import CachecoreAuthError, CachecoreRateLimitError

try:
    response = await openai.chat.completions.create(...)
except CachecoreAuthError:
    # JWT is invalid or expired (401/403)
    pass
except CachecoreRateLimitError as e:
    # 429 Too Many Requests
    if e.retry_after:
        await asyncio.sleep(e.retry_after)
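A common pattern on top of these exceptions is a bounded retry loop that honors retry_after. The sketch below stays self-contained by defining a stand-in exception with the same shape (a retry_after attribute) instead of importing cachecore; in real code you would catch CachecoreRateLimitError:

```python
import asyncio

class RateLimitError(Exception):
    """Stand-in for cachecore's CachecoreRateLimitError (429 + retry_after)."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

async def with_retries(call, max_attempts: int = 3):
    """Retry a coroutine factory on rate limits, sleeping retry_after between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(e.retry_after or 1.0)

# Demo: fails twice with a tiny retry_after, then succeeds
attempts = 0

async def flaky():
    global attempts
    attempts += 1
    if attempts < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"

print(asyncio.run(with_retries(flaky)))  # ok
```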
See the Python Client API Reference for full documentation of every class and method.