Skill Lab — the evaluation layer for agent skills

sklab evaluate

Static checks plus LLM quality review with 0–100 scoring.

Also available over HTTP. The Evaluate Skills endpoint runs the same logic on the server against a GitHub repository. This is useful when you want results in a browser, in a CI job that cannot install Python, or from an agent that calls skill-lab.dev directly.

Usage

```bash
sklab evaluate [SKILL_PATH] [OPTIONS]
```

Runs 37 static checks across Structure, Naming, Description, Content, and Security, then sends the skill to an LLM judge that scores it on 9 criteria across the Activation and Instruction axes. Use --skip-review for a static-only run, or --format json to emit the same payload shape as the Evaluate Skills HTTP endpoint.

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| `SKILL_PATH` | no | Path to the skill directory. Defaults to the current directory. |

Options

| Flag | Value | Description |
| --- | --- | --- |
| `--output`, `-o` | `<PATH>` | Write the report to a file (implies --format json if --format is not set). |
| `--format`, `-f` | `json\|console` default: `console` | Output format. |
| `--verbose`, `-V` | flag | Show all checks (including passing ones) and LLM reasoning. |
| `--spec-only`, `-s` | flag | Run only the checks required by the Agent Skills spec. |
| `--all`, `-a` | flag | Discover and evaluate every skill under the current directory. |
| `--repo` | flag | Discover and evaluate every skill from the git repo root. |
| `--skip-review` | flag | Skip the LLM judge (static checks only). |
| `--model`, `-m` | `<MODEL_ID>` default: `claude-haiku-4-5-20251001` | Model for the LLM judge. Supports Anthropic, OpenAI (gpt-), and Gemini (gemini-) models; the provider is auto-detected from the prefix. |
| `--optimize` | flag | Automatically chain into sklab optimize after evaluation (no interactive prompt). |

Examples

Evaluate one skill

```bash
$ sklab evaluate ./my-skill
```

JSON report to disk

```bash
$ sklab evaluate ./my-skill -f json -o report.json
```

Static checks only (no API key)

```bash
$ sklab evaluate ./my-skill --skip-review
```

Every skill in the current repo

```bash
$ sklab evaluate --repo
```

Evaluate then optimize in one step

```bash
$ sklab evaluate ./my-skill --optimize
```

Output

The console rendering groups checks by dimension, shows pass/fail status for each, and lists the LLM judge's per-criterion scores. With --format json, the output matches the /v1/repos/{owner}/{repo}/evaluate response payload.

Exit Codes

| Code | Meaning |
| --- | --- |
| `0` | All high-severity checks passed. |
| `1` | One or more checks failed, or a CLI error occurred. |
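Because the command reports its result through the exit code, a CI step can gate on it directly. The sketch below is one possible wrapper, not part of sklab itself: `gate_skill` is a hypothetical helper name, and it assumes `sklab` is on the PATH. It uses only the documented --skip-review flag so that no API key is needed.

```bash
# Hypothetical CI helper: gate a pipeline on the exit code of `sklab evaluate`.
# Assumes `sklab` is installed and on PATH.

gate_skill() {
  skill_path="${1:-.}"

  # --skip-review keeps the run static-only, so no API key is required.
  if sklab evaluate "$skill_path" --skip-review; then
    echo "PASS: $skill_path"
    return 0
  else
    # Exit code 1 covers both failed checks and CLI errors, so the two
    # cases cannot be told apart from the exit code alone.
    echo "FAIL: $skill_path (failed checks or CLI error)" >&2
    return 1
  fi
}
```

Calling `gate_skill ./my-skill || exit 1` in a pipeline script would stop the job on a non-zero result. Since code 1 is shared by check failures and CLI errors, the console output is still needed to see which one occurred.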

Notes

LLM review requires ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY. Which environment variable is read depends on the model prefix.
--all and --repo are mutually exclusive, and neither can be combined with a positional SKILL_PATH.
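A script can check for the right API key before starting an LLM review. The following is a minimal sketch of that pre-flight check, with hypothetical helper names (`api_key_var_for_model`, `has_api_key_for_model`). The gpt- and gemini- prefixes come from the --model description above. Treating every other model ID as Anthropic is an assumption made for illustration, and may not match how sklab resolves providers internally.

```bash
# Hypothetical pre-flight helpers; not part of sklab.
# Assumption: any model ID without a gpt- or gemini- prefix is Anthropic.

api_key_var_for_model() {
  case "$1" in
    gpt-*)    echo "OPENAI_API_KEY" ;;
    gemini-*) echo "GEMINI_API_KEY" ;;
    *)        echo "ANTHROPIC_API_KEY" ;;
  esac
}

# Return 0 if the variable needed for the given model is set and non-empty.
has_api_key_for_model() {
  key_var="$(api_key_var_for_model "$1")"
  # key_var is always one of the three fixed names above, so eval is safe here.
  eval "key_value=\${$key_var:-}"
  [ -n "$key_value" ]
}
```

One way to use this is to run the full evaluation when `has_api_key_for_model "$MODEL"` succeeds and fall back to `sklab evaluate ./my-skill --skip-review` when it does not.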

sklab check: Quick pass/fail check that exits 0 or 1, designed for CI pipelines.
sklab info: Skill metadata and token cost estimates (discovery / activation / on-demand).
sklab optimize: LLM-powered SKILL.md rewrite with diff preview and score delta.
