Docs & quickstart
gemini-switch speaks the OpenAI Chat Completions API. Point your client at our base URL, use your gsw_live_ key, and set model: "auto". That's the whole integration.
1. Get a key
Create a free account and your first key is minted automatically. Manage keys from the dashboard.
2. Python (OpenAI SDK)
python
from openai import OpenAI
client = OpenAI(
    base_url="https://switch.aiskillhub.info/v1",
    api_key="gsw_live_...",
)
resp = client.chat.completions.create(
    model="auto",  # or pin: "gemini-2.5-flash", "claude-sonnet-4.6"
    messages=[{"role": "user", "content": "Explain RAG in two sentences"}],
    extra_body={"quality": "low"},  # optional: low | medium | high
)
print(resp.choices[0].message.content)
print(resp.gemini_switch)  # routing + savings metadata
3. Node / TypeScript
typescript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://switch.aiskillhub.info/v1",
  apiKey: process.env.GSW_KEY,
});
const r = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a haiku about caching" }],
});
console.log(r.choices[0].message.content);
4. cURL
bash
curl https://switch.aiskillhub.info/v1/chat/completions \
  -H "Authorization: Bearer gsw_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "quality": "medium",
    "messages": [{"role":"user","content":"Hello"}]
  }'
Controlling the route
- model: "auto" — we pick the cheapest model that meets the inferred quality bar.
- quality: "low" | "medium" | "high" — pin the bar yourself instead of letting us infer it.
- model: "gemini-2.5-flash" (etc.) — force a specific model and skip routing.
- baseline: "claude-opus-4.8" — set what we compare savings against (default Claude Sonnet 4.6).
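The routing fields above can be collected in one place before a request is sent. The following is a minimal sketch, not part of gemini-switch or the OpenAI SDK: `routing_options` is a hypothetical helper name, and it assumes `baseline` travels as a top-level body field in the same way `quality` does in the examples.

```python
from typing import Optional

# The three quality levels listed in the routing options.
ALLOWED_QUALITY = {"low", "medium", "high"}


def routing_options(quality: Optional[str] = None,
                    baseline: Optional[str] = None) -> dict:
    """Build the extra body fields that steer routing (hypothetical helper)."""
    if quality is not None and quality not in ALLOWED_QUALITY:
        raise ValueError(f"quality must be one of {sorted(ALLOWED_QUALITY)}")
    options = {}
    if quality is not None:
        options["quality"] = quality    # pin the quality bar yourself
    if baseline is not None:
        # Assumption: baseline is sent as a body field, like quality.
        options["baseline"] = baseline
    return options
```

With the Python SDK, the result could be passed as `extra_body=routing_options(quality="high", baseline="claude-opus-4.8")`. Calling it with no arguments returns an empty dict, which leaves the quality bar to be inferred.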
Response metadata
Every response includes a gemini_switch object and x-gsw-* headers: the chosen model, the baseline, cost, and exact savings for that call.
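Since the individual header names are not listed here, one way to inspect them is to filter a response's headers by the `x-gsw-` prefix. This is a sketch under that assumption; `gsw_headers` is a hypothetical helper name, and it accepts any mapping of header names to values.

```python
from typing import Mapping


def gsw_headers(headers: Mapping[str, str], prefix: str = "x-gsw-") -> dict:
    """Return only the headers whose name starts with the prefix (hypothetical helper).

    Header names are matched case-insensitively and returned lowercased.
    """
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }
```

With the OpenAI Python SDK, the raw HTTP response is available through `client.chat.completions.with_raw_response.create(...)`; its `.headers` can be passed to `gsw_headers`, and `.parse()` returns the usual completion object.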
List models
bash
curl https://switch.aiskillhub.info/v1/models
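The same call can be made from Python with only the standard library. This sketch assumes the endpoint returns the OpenAI-style list shape (`{"data": [{"id": ...}, ...]}`), which this page does not confirm; `model_ids` and `fetch_model_ids` are hypothetical helper names.

```python
import json
import urllib.request

MODELS_URL = "https://switch.aiskillhub.info/v1/models"


def model_ids(payload: dict) -> list:
    """Extract model IDs, assuming the OpenAI list shape {"data": [{"id": ...}]}."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(url: str = MODELS_URL) -> list:
    """Fetch the model list; no Authorization header is sent, matching the cURL call."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return model_ids(json.load(response))
```

Calling `fetch_model_ids()` performs the network request. Any ID it returns could then be used as the `model` value to skip routing.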