Docs & quickstart

gemini-switch speaks the OpenAI Chat Completions API. Point your client at our base URL (https://switch.aiskillhub.info/v1), authenticate with your gsw_live_ key, and set model: "auto". That's the whole integration.

1. Get a key

Create a free account and your first key is minted automatically. Manage keys from the dashboard.

2. Python (OpenAI SDK)

python

from openai import OpenAI

client = OpenAI(
    base_url="https://switch.aiskillhub.info/v1",
    api_key="gsw_live_...",
)

resp = client.chat.completions.create(
    model="auto",                 # or pin: "gemini-2.5-flash", "claude-sonnet-4.6"
    messages=[{"role": "user", "content": "Explain RAG in two sentences"}],
    extra_body={"quality": "low"},  # optional: low | medium | high
)
print(resp.choices[0].message.content)
print(resp.gemini_switch)         # routing + savings metadata

3. Node / TypeScript

typescript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://switch.aiskillhub.info/v1",
  apiKey: process.env.GSW_KEY,
});

const r = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a haiku about caching" }],
});
console.log(r.choices[0].message.content);

4. cURL

bash

curl https://switch.aiskillhub.info/v1/chat/completions \
  -H "Authorization: Bearer gsw_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "quality": "medium",
    "messages": [{"role":"user","content":"Hello"}]
  }'

Controlling the route

model: "auto" — we pick the cheapest model that meets the inferred quality bar.
quality: "low" | "medium" | "high" — pin the bar yourself instead of letting us infer it.
model: "gemini-2.5-flash" (etc.) — force a specific model and skip routing.
baseline: "claude-opus-4.8" — set what we compare savings against (default Claude Sonnet 4.6).
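
The controls above can be combined in a single request body. The following is a minimal sketch, not the service's reference implementation. `build_body` is a hypothetical helper, and it assumes `baseline` is sent as a top-level body field in the same way the cURL example sends `quality`; nothing above confirms where `baseline` goes.

```python
from typing import Optional


def build_body(prompt: str, model: str = "auto",
               quality: Optional[str] = None,
               baseline: Optional[str] = None) -> dict:
    """Assemble a Chat Completions request body with optional routing fields."""
    if quality is not None and quality not in ("low", "medium", "high"):
        raise ValueError(f"quality must be low, medium, or high, got {quality!r}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if quality is not None:
        body["quality"] = quality
    if baseline is not None:
        # Assumption: baseline is a top-level field, like quality.
        body["baseline"] = baseline
    return body


# Let the router choose, but pin the quality bar.
auto_high = build_body("Summarize this contract", quality="high")
# Force a specific model and skip routing.
pinned = build_body("Hello", model="gemini-2.5-flash")
# Compare savings against a different baseline.
versus_opus = build_body("Hello", baseline="claude-opus-4.8")
```

With the OpenAI Python SDK, fields other than model and messages would go through extra_body, as in the Python quickstart example.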

Response metadata

Every response includes a gemini_switch object in the body and x-gsw-* HTTP headers. Both report the chosen model, the baseline, the cost, and the exact savings for that call.
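
To read the headers in Python, one option is to filter the response headers by prefix. This is a sketch under assumptions: `gsw_headers` is a hypothetical helper, and the header names in `sample` are placeholders for illustration, since the exact x-gsw-* names are not listed here.

```python
from typing import Mapping


def gsw_headers(headers: Mapping[str, str]) -> dict:
    """Return only the x-gsw-* headers, with lowercased names."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith("x-gsw-")
    }


# Placeholder header names, for illustration only.
sample = {
    "Content-Type": "application/json",
    "X-GSW-Example-A": "value-a",
    "x-gsw-example-b": "value-b",
}
routing = gsw_headers(sample)
print(routing)
```

With the OpenAI Python SDK, `client.chat.completions.with_raw_response.create(...)` returns an object whose `.headers` can be passed to `gsw_headers`, and whose `.parse()` returns the usual completion.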

List models

bash

curl https://switch.aiskillhub.info/v1/models
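
The same call can be made from Python with the standard library. This sketch assumes the endpoint returns an OpenAI-style list payload (`{"data": [{"id": ...}, ...]}`), which is not confirmed above. `model_ids` and `fetch_model_ids` are hypothetical helper names.

```python
import json
import urllib.request

MODELS_URL = "https://switch.aiskillhub.info/v1/models"


def model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-style model list payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(url: str = MODELS_URL) -> list:
    """GET the model list and return its IDs (performs a network call)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return model_ids(json.load(resp))


# Offline check against a hand-written payload in the assumed shape.
sample = {
    "object": "list",
    "data": [{"id": "gemini-2.5-flash"}, {"id": "claude-sonnet-4.6"}],
}
print(model_ids(sample))
```

Calling `fetch_model_ids()` performs the live request. If the response shape differs from the assumption, only `model_ids` needs to change.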