CacheCore
LangChain / LangGraph

CacheCore works with LangChain and LangGraph because both use the OpenAI SDK internally. Set openai_api_base on ChatOpenAI to point at the gateway, and every LLM call routes through the caching layer.

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-5.4-mini",
    openai_api_base="https://api.cachecore.it/v1",
    openai_api_key="cc_live_xxxxx.eyJ...",
)

response = llm.invoke("Classify this support ticket as: billing, technical, or general.")
print(response.content)

LangGraph agents

LangGraph nodes that invoke LLMs benefit from caching automatically. Tool selection, classification, and routing steps repeat frequently across agent runs.

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from typing import TypedDict

llm = ChatOpenAI(
    model="gpt-5.4-mini",
    openai_api_base="https://api.cachecore.it/v1",
    openai_api_key="cc_live_xxxxx.eyJ...",
)

class State(TypedDict):
    ticket: str
    category: str

def classify(state: State) -> State:
    result = llm.invoke(
        f"Classify this ticket as billing, technical, or general: {state['ticket']}"
    )
    return {"category": result.content.strip()}

graph = StateGraph(State)
graph.add_node("classify", classify)
graph.set_entry_point("classify")
graph.add_edge("classify", END)

app = graph.compile()

The classify node caches responses for repeated or similar tickets, saving tokens and latency on every run.

What caches well in LangChain workloads

| Call type | Cache effectiveness | Why |
|-----------|---------------------|-----|
| Classification | High | Same structure, varying inputs, semantic matches |
| Tool routing | High | Limited set of intents map to tools |
| Document summarisation | High | Same chunks produce same summaries |
| Multi-turn conversation | Low | Full message history changes every turn |

See Caching for AI Agents for a deeper analysis.
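The pattern in the table comes down to key stability: calls whose request payloads repeat exactly (or nearly) produce repeated cache keys, while growing message histories never do. The sketch below illustrates this with a simple exact-match key derivation; it is an illustration only, not CacheCore's actual keying scheme, and every name in it is hypothetical.

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    """Derive a deterministic key from the request payload.

    Illustration only: a real gateway would also normalise
    parameters such as temperature and tool definitions.
    """
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Identical classification prompts produce identical keys...
a = cache_key("gpt-5.4-mini", [{"role": "user", "content": "Classify: refund request"}])
b = cache_key("gpt-5.4-mini", [{"role": "user", "content": "Classify: refund request"}])

# ...while a growing multi-turn history changes the key every turn.
c = cache_key("gpt-5.4-mini", [
    {"role": "user", "content": "Classify: refund request"},
    {"role": "assistant", "content": "billing"},
    {"role": "user", "content": "Are you sure?"},
])

print(a == b, a == c)  # exact repeats hit; longer histories miss
```

Semantic (L2) matching relaxes the exact-match requirement, which is why classification and routing with slightly varying inputs still cache well.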

Dependency invalidation

LangChain does not expose httpx transport injection directly. For full dependency invalidation support, either:

  1. Use the Python Client with the OpenAI SDK directly for LLM calls that need dependency tagging
  2. Call the invalidation API via HTTP when your data changes
# Invalidate after a data change, independent of LangChain
import asyncio

from cachecore import CachecoreClient

cc = CachecoreClient(
    gateway_url="https://api.cachecore.it",
    tenant_jwt="cc_live_xxxxx.eyJ...",
)

# invalidate() is a coroutine, so run it in an event loop
asyncio.run(cc.invalidate("doc:contract-123", new_hash="v2"))
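The semantics of tagging cached entries with dependency keys and bumping a version to evict them can be pictured with a small in-memory model. This is a toy sketch of the general technique, not CacheCore's implementation, and all names in it are hypothetical.

```python
class ToyDepCache:
    """Toy model of dependency-tagged caching: each cached response
    records the dependency versions it was built from."""

    def __init__(self):
        self.entries = {}       # cache_key -> (response, deps seen at cache time)
        self.dep_versions = {}  # dep_key -> current hash/version

    def put(self, key, response, deps):
        self.entries[key] = (response, deps)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        response, deps = entry
        # Any stale dependency version makes the entry invalid.
        if any(self.dep_versions.get(d) != v for d, v in deps.items()):
            del self.entries[key]
            return None
        return response

    def invalidate(self, dep_key, new_hash):
        # Bumping the version lazily evicts every entry tagged with it.
        self.dep_versions[dep_key] = new_hash


cache = ToyDepCache()
cache.dep_versions["doc:contract-123"] = "v1"
cache.put("summary-req", "Contract summary...", {"doc:contract-123": "v1"})

print(cache.get("summary-req"))   # hit while the document is unchanged
cache.invalidate("doc:contract-123", new_hash="v2")
print(cache.get("summary-req"))   # miss after the data change
```

The lazy-eviction design (checking versions on read rather than deleting on write) keeps invalidation cheap regardless of how many entries share a dependency.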

Constraints

LangChain's ChatOpenAI does not support custom httpx transports, so the CachecoreTransport injection pattern from the Python Client page cannot be used directly. The base URL swap still provides L1 + L2 caching.