Python Client
The cachecore package provides tenant namespace isolation, per-request dependency tagging, and programmatic cache invalidation. It plugs in at the httpx transport layer, so it works with any OpenAI-compatible SDK that accepts a custom httpx client.
Installation
pip install cachecore-python
Requires Python 3.10+. Only dependency: httpx >= 0.25.0.
Quick start
import asyncio
import httpx
from openai import AsyncOpenAI
from cachecore import CachecoreClient
async def main():
    cc = CachecoreClient(
        gateway_url="https://api.cachecore.it",
        tenant_jwt="cc_live_xxxxx.eyJ...",
    )
    openai = AsyncOpenAI(
        base_url="https://api.cachecore.it/v1",
        api_key="cc_live_xxxxx.eyJ...",
        http_client=httpx.AsyncClient(transport=cc.transport),
    )
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Summarise contract #123"}],
    )
    print(response.choices[0].message.content)
    await cc.aclose()

asyncio.run(main())
Three-rung adoption ladder
Rung 1: Base URL swap
No client library needed. Swap base_url on the OpenAI SDK:
from openai import OpenAI
client = OpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key="cc_live_xxxxx.eyJ...",
)
You get L1 + L2 caching immediately.
Rung 2: Tenant isolation
Use CachecoreClient to scope caching per end-user. Each JWT encodes a tenant_id, so namespaces are automatically isolated:
from cachecore import CachecoreClient
import httpx
from openai import AsyncOpenAI
cc = CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt=user.cachecore_token,
)
openai = AsyncOpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key=user.cachecore_token,
    http_client=httpx.AsyncClient(transport=cc.transport),
)
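Each JWT's payload is standard base64url-encoded JSON, so you can inspect which tenant a token scopes to with nothing but the stdlib. A minimal debugging sketch; the tenant_id claim name comes from the description above, and the sample token below is synthetic, not a real cachecore token:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT.

    Debugging aid only: signature verification is the gateway's job,
    not the client's.
    """
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding that JWTs strip off
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a synthetic header.payload.signature token for the demo
header = base64.urlsafe_b64encode(b'{"alg":"HS256"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(b'{"tenant_id":"acme"}').rstrip(b"=").decode()
token = f"{header}.{payload}.sig"

print(jwt_claims(token)["tenant_id"])  # acme
```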
Rung 3: Dependency invalidation
Tag cache entries with data dependencies. Invalidate when the data changes:
from cachecore import CachecoreClient, Dep
import httpx
from openai import AsyncOpenAI
cc = CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt="cc_live_xxxxx.eyJ...",
)
openai = AsyncOpenAI(
    base_url="https://api.cachecore.it/v1",
    api_key="cc_live_xxxxx.eyJ...",
    http_client=httpx.AsyncClient(transport=cc.transport),
)
# Tag the response with a dependency
with cc.request_context(deps=[Dep("doc:contract-123")]):
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Summarise contract #123"}],
    )
# When contract #123 changes:
result = await cc.invalidate("doc:contract-123", new_hash="v2")
print(result.ok) # True
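new_hash can be any version string. One natural choice is a digest of the dependency's current content, so the same bytes always produce the same version and repeated invalidations with unchanged data are harmless. A sketch under that assumption; content_hash is a hypothetical helper, not part of cachecore:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Deterministic version string for a dependency: same bytes, same hash."""
    return hashlib.sha256(data).hexdigest()[:16]

v1 = content_hash(b"contract text, draft 1")
v2 = content_hash(b"contract text, draft 2")
print(v1 != v2)  # True: changed content yields a new version string

# Hypothetical wiring on document save:
#   await cc.invalidate("doc:contract-123", new_hash=content_hash(doc_bytes))
```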
Bypass caching
Skip cache for a specific request. The transport omits the X-Cachecore-Token header, which causes the gateway to route directly to OpenAI in BYPASS mode:
with cc.request_context(bypass=True):
    response = await openai.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[{"role": "user", "content": "Generate a unique report ID"}],
    )
Bulk invalidation
results = await cc.invalidate_many(
    dep_ids=["table:products", "table:prices"],
    new_hash="2024-03-15",
)
for r in results:
    print(r.dep_id, r.ok)
Runs invalidations concurrently. Each dependency receives the same new_hash (or a random UUID if not provided).
Async context manager
async with CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt="cc_live_xxxxx.eyJ...",
) as cc:
    # ... use cc
    pass  # aclose() called automatically on exit
Reading cache status
from cachecore import CacheStatus
# Parse cache metadata from the httpx response headers
status = CacheStatus.from_headers(httpx_response.headers)
print(status.status) # "HIT_L1", "HIT_L2", "MISS", etc. (from X-Cache)
print(status.similarity) # 0.0-1.0 (from X-Cache-Similarity)
print(status.age_seconds) # seconds since entry was written (from X-Cache-Age)
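If you are not using the helper, the same metadata can be read straight off the headers the comments above name (X-Cache, X-Cache-Similarity, X-Cache-Age). A rough stand-alone equivalent, assuming those header semantics; note that real httpx.Headers lookups are case-insensitive, while the plain dict used in this demo is not:

```python
from dataclasses import dataclass

@dataclass
class ParsedStatus:
    status: str         # e.g. "HIT_L1", "HIT_L2", "MISS"
    similarity: float   # 0.0-1.0
    age_seconds: int    # seconds since the entry was written

def parse_cache_headers(headers) -> ParsedStatus:
    """Read cachecore's response headers from any mapping with .get()."""
    return ParsedStatus(
        status=headers.get("X-Cache", "MISS"),
        similarity=float(headers.get("X-Cache-Similarity", 0.0)),
        age_seconds=int(headers.get("X-Cache-Age", 0)),
    )

demo = parse_cache_headers(
    {"X-Cache": "HIT_L2", "X-Cache-Similarity": "0.97", "X-Cache-Age": "42"}
)
print(demo.status, demo.similarity, demo.age_seconds)  # HIT_L2 0.97 42
```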
Error handling
import asyncio

from cachecore.errors import CachecoreAuthError, CachecoreRateLimitError

try:
    response = await openai.chat.completions.create(...)
except CachecoreAuthError:
    # JWT is invalid or expired (401/403)
    pass
except CachecoreRateLimitError as e:
    # 429 Too Many Requests
    if e.retry_after:
        await asyncio.sleep(e.retry_after)
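A common pattern on top of these exceptions is a bounded retry loop that honors retry_after. The sketch below stays self-contained by defining a stand-in exception with the same shape (a retry_after attribute) instead of importing cachecore; in real code you would catch CachecoreRateLimitError:

```python
import asyncio

class RateLimitError(Exception):
    """Stand-in for cachecore's CachecoreRateLimitError (429 + retry_after)."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

async def with_retries(call, max_attempts: int = 3):
    """Retry a coroutine factory on rate limits, sleeping retry_after between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(e.retry_after or 1.0)

# Demo: fails twice with a tiny retry_after, then succeeds
attempts = 0

async def flaky():
    global attempts
    attempts += 1
    if attempts < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"

print(asyncio.run(with_retries(flaky)))  # ok
```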
See the Python Client API Reference for full documentation of every class and method.