curl -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  https://YOUR_DOMAIN/api/tags | jq
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-instruct-q4_K_M",
    "prompt": "Write a Python prime checker",
    "stream": true,
    "options": { "temperature": 0.7, "num_predict": 512 }
  }' \
  https://YOUR_DOMAIN/api/generate
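With `"stream": true`, `/api/generate` returns newline-delimited JSON objects, each carrying a `response` fragment. A minimal sketch for reassembling the text on the command line (the host is a placeholder, as above):

```shell
# Stream a generation and concatenate the "response" fragments.
# -N disables curl's output buffering so tokens appear as they arrive;
# jq -rj prints raw output without inserting newlines between fragments.
curl -N -s -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-instruct-q4_K_M",
    "prompt": "Write a Python prime checker",
    "stream": true
  }' \
  https://YOUR_DOMAIN/api/generate \
  | jq -rj '.response // empty'
```

The final object in the stream has `"done": true` and no text; `// empty` skips it.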
…
Session …
About storage, analytics & request timeline
Chats are stored only in this browser (localStorage), keyed by your device ID. They are not sent to a separate “chat cloud” unless you use the API yourself.
Optional server beacons: `POST /api/playground/analytics` with the same auth as the API. Nginx must allow this path (see `setup-domain.sh`). Each payload includes `clientProfile` (user agent, screen, timezone, language) for attribution.
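A hedged sketch of such a beacon call. The `clientProfile` keys follow the description above, but the exact schema (and the `event` field) is an assumption; the playground defines the real payload:

```shell
# Hypothetical analytics beacon -- field names inside clientProfile are
# illustrative; check the playground source for the actual schema.
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "chat_turn",
    "clientProfile": {
      "userAgent": "Mozilla/5.0 ...",
      "screen": "1920x1080",
      "timezone": "Europe/Berlin",
      "language": "en-US"
    }
  }' \
  https://YOUR_DOMAIN/api/playground/analytics
```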
Request timeline (extra phases): When traffic goes through the `ollama_logger` proxy (default port 11435), responses can include `X-SDL-Proxy-Timing` and stream markers for Ollama/proxy timing; these do not appear when the browser talks to Ollama directly. Queue depth is not exposed by Ollama over HTTP. Chat and Generate calls send `keep_alive: "30m"` so Ollama is more likely to keep the model loaded between turns (subject to server RAM settings).
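The `keep_alive` field described above is a standard per-request parameter in Ollama's API, so you can set it explicitly in your own calls too. A minimal sketch (placeholder host):

```shell
# Ask Ollama to keep this model resident for 30 minutes after the call,
# avoiding a cold model load on the next turn.
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-instruct-q4_K_M",
    "prompt": "ping",
    "stream": false,
    "keep_alive": "30m"
  }' \
  https://YOUR_DOMAIN/api/generate
```

Other accepted values include durations like `"5m"` and `-1` (keep loaded indefinitely).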
Local diagnostics for this playground: what the browser knows about your API usage, storage, and a live probe of the server. Nothing here replaces server-side logs (e.g. the SQLite database from `ollama_logger`).
| Model | Turns | OK | Avg latency | Prompt Σ | Reply Σ |
|---|---|---|---|---|---|
Timeline columns are from the saved phase log (new events only). Click Show input / output to expand that turn’s text.
| Time | OK | HTTP | Proxy | Stream↑ | Output | Total | Model | In | Out | Client | I/O | Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-instruct-q4_K_M",
    "messages": [
      {"role": "system", "content": "You are a coding assistant."},
      {"role": "user", "content": "Explain async/await"}
    ],
    "stream": false
  }' \
  https://YOUR_DOMAIN/api/chat
`nomic-embed-text`: pull it with `ollama pull nomic-embed-text`. It's much faster and lighter than using a general-purpose model for embeddings.
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b-instruct-q4_K_M",
    "prompt": "The quick brown fox"
  }' \
  https://YOUR_DOMAIN/api/embeddings
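The `/api/embeddings` response is a JSON object with a single `embedding` array. A quick way to sanity-check the vector's dimensionality with `jq`, here using the lighter `nomic-embed-text` model recommended above (placeholder host):

```shell
# Embed a string and print the number of dimensions in the vector.
curl -X POST -s \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "prompt": "The quick brown fox"
  }' \
  https://YOUR_DOMAIN/api/embeddings \
  | jq '.embedding | length'
```

Two inputs embedded by the same model always yield vectors of the same length, so this is a useful check before storing them in a vector index.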