Every retrieval call runs on your machine. Anonymized usage events stream to a
Cloudflare Worker at api.windags.ai. Server-side aggregates feed
back into ranking so every install gets sharper as the network sees more tasks.
Honest status of each piece is below.
The MCP server ships with the skill catalog, the BM25 index, the Tool2Vec embeddings, the cross-encoder weights, and an attribution DB. The retrieval cascade runs there — no API key, no network round-trip on the hot path.
The Cloudflare Worker is a one-way street for writes (telemetry events) and a read-only API for aggregates (popular skills, gap clusters, per-skill health). Aggregates are what makes the flywheel turn across installs.
Solid arrows = synchronous request/response. Dashed grey = anonymous fire-and-forget telemetry, never blocks the user. Dashed blue = aggregates read-back (the cross-user feedback path — fetched by every MCP at startup and blended into cascade Stage 6).
The Worker is public-read. You can hit it right now:
# health check $ curl https://api.windags.ai/v1/health {"status":"ok","service":"windags-api",...} # most-grafted skills in the last 7 days $ curl "https://api.windags.ai/v1/popular?window=7d&limit=5" # task clusters that scored below 0.4 — i.e. skills the catalog is missing $ curl "https://api.windags.ai/v1/gaps?min_requests=3" # aggregated health for one skill $ curl https://api.windags.ai/v1/skills/api-architect/health
And you can simulate a telemetry event the way the MCP does:
$ curl -X POST https://api.windags.ai/v1/telemetry \ -H "Content-Type: application/json" \ -d '{ "event_type":"graft", "installation_id":"my-test-id", "mcp_version":"2.3.1", "manifest_hash":"deadbeef", "runtime":"claude-code", "task_hash":"abc123", "selected_skills":["api-architect"], "top_score":0.82 }' {"ok":true}
That row is now visible in /v1/popular and contributing to
/v1/skills/api-architect/health within a few seconds.
| Piece | Where | Status |
|---|---|---|
| Stage 1 — BM25 (stemmed + bigrams) | Local MCP, always on | live |
| Stage 2 — Tool2Vec bi-encoder | Local MCP, when embeddings load | live |
| Stage 3 — RRF fusion (k=60) | Local MCP, when Stage 2 ran | live |
| Stage 4 — Cross-encoder re-rank (MS-MARCO MiniLM) | Local MCP, when ML deps loaded | live |
| Stage 5 — Attribution k-NN over local outcomes | Local MCP, reads .windags/triples/ |
live |
Telemetry write — POST /v1/telemetry |
MCP → Worker → D1, fire-and-forget | live |
Aggregate reads — /v1/popular, /v1/gaps, /v1/skills/:id/health |
Worker → D1, public | live |
Version & manifest check — GET /v1/version |
MCP checks at startup | live |
| Cross-user feedback into ranking — MCP reads global priors (3-tier: manifest ⊆ exact ⊆ any) and blends as Stage 6 | Worker: /v1/priors/batch + /v1/priors/freshness. Client: GlobalPriorsClient with adaptive refresh. |
live |
Gap-driven skill authoring — auto-PR from /v1/gaps clusters |
Worker → GitHub Actions | planned |
The loop is closed and live.
The cascade runs locally, telemetry flows to the Worker, the Worker
aggregates correctly, and POST /v1/priors/batch serves
3-tier priors (manifest_match ⊆ exact_match ⊆ any_match)
back into ranking as cascade Stage 6. The MCP attaches a
GlobalPriorsClient at startup, warms the cache once
(chunked at 500 skills per call), then polls /v1/priors/freshness
every five minutes. A skill re-fetches mid-session only when its new event
count would measurably shift the mean (Δn ≥ 10 absolute AND
Δn / n_cached ≥ 25% relative, rate-limited to one refresh per hour).
Every install learns from every other.
Each stage runs on your machine against the skill catalog that ships with the plugin. Each stage gracefully degrades — BM25 alone returns a coherent answer; later stages refine it as data and compute become available.
Why local? The skill bodies, the BM25 index, the
embeddings, and the cross-encoder weights all ship with the plugin. Keeping
the cascade local means no API key, no network on the hot path, no per-call
cost, and no provider can see the prompts you're routing through it.
Stage 5's attribution k-NN reads from ~/.windags/triples/ — also
local.
Every graft, reference load, and feedback event the MCP fires becomes a row
in the Worker's D1 table. SQL aggregates surface popularity, gap clusters, and
per-skill health. The MCP fetches those aggregates at startup via
/v1/priors/batch and blends them into cascade Stage 6 — three
nested tiers (manifest ⊆ exact ⊆ any), decomposed into
exclusive subsets, capped at a 0.2 blend weight. A background freshness probe
keeps trending skills current without thundering the API.
Agent calls graft. Cascade returns primary skill bodies + reference manifests, all from disk.
MCP fires anonymized event to /v1/telemetry. Worker inserts into telemetry_events.
SQL aggregates: graft counts, top scores, gap clusters when top_score < 0.4, reference load rates.
MCP pulls /v1/priors/batch at startup; blends 3-tier priors as cascade Stage 6 (capped 0.2). A background freshness probe refreshes only when Δ would measurably shift the mean.
Skills with high acceptance climb across every install; stale ones demote; gaps surface to skill authors via /v1/gaps.
Telemetry is opt-out via WINDAGS_TELEMETRY=off. Default is
anonymous mode: task hash, not task text. The skill catalog, your prompts,
your repo, your CLAUDE.md, and the actual graft response body never leave
your machine.
api.windags.ai.| Table | What's in it |
|---|---|
| telemetry_events | Append-only event log. event_type ∈ {graft, reference, search, catalog}. Indexed on type + installation + timestamp. |
| tool_call_events | Lighter-weight per-tool-call stream, sampled to once per 24h per machine. Drives the internal usage dashboard. |
| installations | One row per anonymous installation UUID. First/last seen, runtime, MCP version, event count. |
| skill_gaps | Auto-populated when a graft scores below 0.4. Same task pattern recurring → "skill that should exist." |
Aggregates like /v1/popular are computed on the fly with
json_each(selected_skills) over telemetry_events —
no precomputed rollup tables today, which keeps the schema small and the
truth in one place.