Prove every skill
is worth shipping.
Skill Lab grades every SKILL.md in a public GitHub repo against 37 quality and security checks — and can rewrite the failing ones for you. No clone, no sign-up.
Swap one word in any GitHub URL.
That's the entire onboarding. Works on any public repo containing SKILL.md files.
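The swap can be sketched in a few lines of Python. The page does not say which word changes; the browser example further down (`https://skill-lab.dev/anthropics/skills`) suggests it is the hostname, so this sketch assumes `github.com` becomes `skill-lab.dev`. The function name `to_skill_lab_url` is hypothetical, and dropping anything past `owner/repo` is a choice made here, not documented behavior.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the swap is the hostname, github.com -> skill-lab.dev,
# inferred from the example URL https://skill-lab.dev/anthropics/skills.
SKILL_LAB_HOST = "skill-lab.dev"


def to_skill_lab_url(github_url: str) -> str:
    """Return the assumed Skill Lab URL for a public GitHub repo URL."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url!r}")
    # Keep only owner/repo; deeper paths (tree/, blob/) are dropped here.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected https://github.com/<owner>/<repo>")
    owner, repo = segments[0], segments[1].removesuffix(".git")
    return urlunsplit(("https", SKILL_LAB_HOST, f"/{owner}/{repo}", "", ""))


print(to_skill_lab_url("https://github.com/anthropics/skills"))
```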
How it works
Three steps, all real endpoints.
Skill Lab reads every SKILL.md in a public GitHub repo, runs the same 37 checks the sklab CLI runs locally, then lets you run the LLM-powered judge, optimize, and triggers passes on demand.
Scan
Paste any GitHub URL. Skill Lab fetches every SKILL.md via the GitHub API — no clone, no sign-up. Results cache by commit SHA.
Check
Structure, naming, description, content, security. Every failure ships with a severity and a one-line fix.
Improve
Optional LLM passes: a judge verdict, an optimize rewrite that lifts the score, and a triggers test plan. All returned as JSON for CI.
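Because results come back as JSON with a severity on every failure, a CI job can gate on them. The sketch below is a rough illustration only: the field names (`skills`, `path`, `findings`, `severity`, `check`, `message`) and the `high` severity level are assumptions, not the documented schema, and `blocking_findings` is a hypothetical helper. The sample data reuses the two mcp-builder findings shown on this page.

```python
import json

# Hypothetical response shape: every field name here is an assumption,
# not the documented Skill Lab schema.
SAMPLE = """
{
  "skills": [
    {
      "path": "skills/mcp-builder",
      "findings": [
        {"severity": "med", "check": "content.broken-internal-links",
         "message": "Broken internal link(s)"},
        {"severity": "low", "check": "content.compatibility-prereqs",
         "message": "Command runners missing from compatibility: npx"}
      ]
    }
  ]
}
"""

# "low" and "med" appear on the page; "high" is assumed to exist.
SEVERITY_RANK = {"low": 0, "med": 1, "high": 2}


def blocking_findings(report: dict, threshold: str = "med") -> list[tuple[str, str]]:
    """Return (skill path, check id) pairs at or above the severity threshold."""
    floor = SEVERITY_RANK[threshold]
    blocked = []
    for skill in report.get("skills", []):
        for finding in skill.get("findings", []):
            # Unknown severities are treated as blocking, to fail safe.
            if SEVERITY_RANK.get(finding["severity"], floor) >= floor:
                blocked.append((skill["path"], finding["check"]))
    return blocked


report = json.loads(SAMPLE)
failures = blocking_findings(report)
for path, check in failures:
    print(f"{path}: {check}")
# In CI, exit non-zero when anything blocks:
# raise SystemExit(1 if failures else 0)
```

Gating at `med` lets low-severity findings through as warnings; passing `threshold="low"` blocks on everything.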
Setting the bar
How real skills score.
Six public skills from anthropics/skills, re-evaluated on every deploy. We pick the skills — the scores are whatever the scanner finds.
- anthropics/skills, skills/pdf: 37/37 checks passed, score 100
  Read, write, OCR, merge, and watermark PDFs.
- anthropics/skills, skills/algorithmic-art: 37/37 checks passed, score 100
  Generative art in p5.js with seeded randomness.
- anthropics/skills, skills/frontend-design: 36/37 checks passed, score 99
  Production-grade UI components and layouts.
- anthropics/skills, skills/mcp-builder: 35/37 checks passed, score 96
  Build well-designed MCP servers and tools.
  2 findings:
  - [med] content.broken-internal-links: Broken internal link(s): ./reference/mcp_best_practices.md, ./reference/node_mcp_server.md, ./reference/python_mcp_server.md, ./reference/evaluation.md
  - [low] content.compatibility-prereqs: Command runners missing from compatibility: npx (needs Node.js)
- anthropics/skills, skills/skill-creator: 35/37 checks passed, score 95
  Create, edit, and benchmark new skills.
  2 findings:
  - [med] content.token-budget: Body exceeds 5000 token budget (8156 estimated)
  - [med] content.asset-paths-exist: Asset path(s) not found on disk: assets/eval_review.html
- anthropics/skills, skills/pptx: 34/37 checks passed, score 94
  Build, parse, and edit PowerPoint decks.
  3 findings:
  - [med] content.script-paths-exist: Script path(s) not found on disk: scripts/thumbnail.py
  - [med] content.broken-internal-links: Broken internal link(s): editing.md, pptxgenjs.md
  - [low] content.metadata-token-budget: Metadata exceeds 150 token budget (173 estimated)
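To make a finding like content.broken-internal-links concrete, here is a rough sketch of what such a check could do on a local checkout. It is not Skill Lab's implementation: the regex, the rule for skipping external URLs, and the name `broken_internal_links` are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Markdown inline links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URL scheme (http:, mailto:, ...) counts as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_internal_links(skill_md: Path) -> list[str]:
    """Return relative link targets in skill_md that do not exist on disk."""
    text = skill_md.read_text(encoding="utf-8")
    missing = []
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # external URL or in-page anchor
        rel = target.split("#", 1)[0]  # drop any fragment before checking
        if not (skill_md.parent / rel).exists():
            missing.append(target)
    return missing
```

Run against a skill directory, a non-empty return value would correspond to the kind of message listed above, for example `editing.md, pptxgenjs.md` for skills/pptx.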
What 'optimize' actually does
One call, a rewritten SKILL.md.
POST /v1/repos/:o/:r/optimize returns the original and a higher-scoring rewrite, plus the deltas. Below is a frozen example for a deliberately weak refund-handler skill — illustrative numbers, real response shape.
Original
---
name: Refund Handler
description:
---
Handle customer refund requests. Look up the order, check the refund
policy, and issue a refund if eligible.
Rewrite
---
name: refund-handler
description: Use when a customer asks for a refund. Looks up the order, applies the refund policy, and either issues the refund or routes the request for human review.
---
# Refund Handler
Use this skill when a customer requests a refund. The skill verifies
eligibility against the refund policy and either issues the refund
directly or escalates to a human reviewer.
## When to use
- Customer explicitly asks for a refund, return, or money back
- A previous order had a defect, shipping issue, or pricing error
## Inputs
- `order_id` — required. The order being refunded.
- `reason` — customer-supplied; preserved verbatim for audit.
## Steps
1. Look up the order via `scripts/get_order.py`.
2. Check eligibility: within 30 days, marked delivered, not previously refunded.
3. If eligible, issue the refund via the payments API.
4. Otherwise, escalate to a human reviewer with the reason and order summary.
## Example
```
> Refund order 81022 — wrong size
✓ Eligible · refunded $42.00 to original payment method
```
## Safety
- Never refund without a verified order ID.
- Do not promise refund amounts before eligibility passes.
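Requesting a rewrite like the one above could look roughly like this in Python. The route comes from this page; the host is assumed to match the evaluate example (`api.skill-lab.dev`), and the page does not say whether the call needs a request body, a skill selector, or an auth header, so none is sent. `build_optimize_request` is a hypothetical helper, and the sketch only builds the request so it runs without network access.

```python
import urllib.request

# Assumption: the API host matches the evaluate example on this page.
API_BASE = "https://api.skill-lab.dev"


def build_optimize_request(owner: str, repo: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/repos/:o/:r/optimize."""
    url = f"{API_BASE}/v1/repos/{owner}/{repo}/optimize"
    # Assumption: no request body or auth header is required.
    return urllib.request.Request(
        url, method="POST", headers={"Accept": "application/json"}
    )


req = build_optimize_request("anthropics", "skills")
print(req.get_method(), req.full_url)

# To actually send it (network call, so left commented out):
# import json
# with urllib.request.urlopen(req) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```

The response is printed as raw JSON rather than picked apart, since its field names are not shown here.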
Web ↔ CLI
Same checks, on a server or your laptop.
sklab is the CLI that ships with the skill-lab PyPI package: same 37 checks, same judge, same optimizer. Run it in CI or on a directory that hasn't been pushed yet.
# scan a repo from anywhere
curl https://api.skill-lab.dev/v1/repos/anthropics/skills/evaluate
# or just open it in a browser
open https://skill-lab.dev/anthropics/skills
Stop shipping skills you can't measure.
Paste any public GitHub repo with SKILL.md files and Skill Lab will scan it against the rubric.