skill-evaluation

Here are 22 public repositories matching this topic...

Evol-ai / SkillCompass

Evaluate agent skill quality. Find the weakest link. Fix it. Prove it worked.

ai-agents skill-evaluation anthropic agent-skills claude-code skill-rating claude-code-skill openclaw openclaw-skill

Updated Apr 23, 2026
JavaScript

AndrewNgGirl / SkillLens

Open-source self-hosted web tool for evaluating Agent Skills with rubric scores, Deep Review, and improvement suggestions.

typescript skills skill nextjs self-hosted developer-tools cursor ai-agents claude rubric skill-evaluation llm agent-skills claude-code openclaw

Updated May 17, 2026
TypeScript

mega-edo / mega-tron

The skill OS for Codex, Claude Code, and Gemini CLI. One pool, one router, one feedback loop, across all three hosts. Per-turn semantic top-K with dynamic context sizing, session-end self-evaluation, and evidence-blended re-ranking that gets better the more you use it.

skill-evaluation self-improving-ai skill-routing skill-eval

Updated Jun 8, 2026
Python

ALEX-nlp / OpenSkillEval

OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents

benchmark ai-agents skill-evaluation llm-eval agent-evaluation

Updated Jun 15, 2026
Python

lizhiyao / oh-my-knowledge

Evaluation framework for LLM knowledge inputs — prompts, RAG corpora, skills, agent workflows. Fix the model, vary the artifact. Built-in statistical rigor: bootstrap CI, Krippendorff α, length-debias, saturation curves.

benchmark ai evaluation-framework claude knowledge-engineering skill-evaluation llm prompt-engineering prompt-testing llm-evaluation rag-evaluation llm-judge claude-code agent-evaluation bootstrap-ci krippendorff-alpha evaluation-as-code multi-judge-ensemble

Updated Jun 30, 2026
TypeScript

huajielong / skill-evaluator

Your AI agent skill doctor - 5-dimension scoring + security gate + leaderboard

security evaluation cursor ai-agents skill-evaluation ai-agent llm claude-code codex-cli claude-skills meta-skill openclaw skill-review hemerss

Updated Jun 10, 2026
Shell

liuchunming033 / hands-on-skill

A complete, open-source guide to Agent Skill design: from cognition to production. 28 chapters + 6 appendices.

skill skill-evaluation agent-skills skill-design skill-practice

Updated Jun 29, 2026
HTML

VKirill / mcp-annas-archive-create-skill

MCP server for Claude Code: Anna's Archive search/download + Gemini methodology extraction → audited Claude Code SKILL.md. One tool call, end-to-end.

Updated May 26, 2026
TypeScript

SirryChen / triage-skill-creator

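Triage-trainer: build a customized triage Skill for your personal assistant from scratch, giving it the ability to accurately recommend the right medical department.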

skill triage skill-evaluation skill-creator agent-trainer

Updated Mar 28, 2026
HTML

gaoguo / pg-skill-forge

opencode skill factory · Create, evaluate & optimize agent skills via a DB physical-evidence eval engine. Cross-platform packaging (trae/claude/generic) + governance. Uses physical evidence from opencode.db to eliminate LLM self-evaluation bias.

python opencode skill-evaluation ai-agent llm prompt-engineering agent-skills

Updated Jun 20, 2026
Python

Earnest-clockworkuniverse497 / mcp-annas-archive-create-skill

Convert methodology books from Anna's Archive into Claude Code skills using the Model Context Protocol.

Updated Jun 30, 2026
TypeScript

pasunboneleve / skilpel

A focused Go evaluator for agent-skills CI

testing go cli ci developer-tools ai-agents skill-evaluation llm-evaluation agent-skills

Updated Jun 1, 2026
Go

yadinae / agent-evolution

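🧬 Agent self-evolution system - a data-driven platform for improving AI agent capabilities | ✨ Task monitoring / skill evaluation / intelligent scheduling / automatic evolution | 📊 95%+ test coverage, <20ms latency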

python open-source machine-learning automation data-driven performance-monitoring self-improvement skill-evaluation ai-optimization ai-agent intelligent-scheduling agent-evolution

Updated Jun 16, 2026
JavaScript

WilliamWJHuang / agent-skill-evaluator

Evaluate agent SKILL.md files for structure, security, quality, and domain correctness.

linter quality-assurance security-analysis ai-agents skill-evaluation agent-skills

Updated Apr 18, 2026
Python

duck-ai-yy / skill-safety-reviewer

A skill that reviews whether skills found online are safe to install for non-tech-background developers

ai-safety cowork skill-evaluation tool-evaluation claude-ai claude-skills safety-reviewer

Updated Mar 22, 2026

joinalahmed / skilleval

100% deterministic evaluation framework for AI agent skills with dual-phase scoring, comprehensive security scanning, and transparent grading

python testing security ai skills evaluation owasp deterministic agents skill-evaluation llm agent-skills

Updated Jun 26, 2026
Python

yoligehude14753 / heyi-eval-v10

Automated model evaluation pipeline v10 (post-incident architecture rewrite)

benchmark mcp chinese skill-evaluation huggingface ai-agent vllm llm-evaluation claude-code agent-evaluation

Updated Jun 2, 2026
Python

saniyaacharya04 / interviewforge

AI-powered mock interview platform with automated scoring, role-based questions, modern React UI, FastAPI backend, and a fully implemented freemium SaaS architecture.

machine-learning-application technical-interviews fastapi skill-evaluation ai-tools assessment-platform interview-simulator ai-scoring

Updated Dec 13, 2025
Python

zinan92 / repo-evals

Claim-first repository evaluation framework. in: owner/repo + repo_type → out: eval scaffold + reliability bucket (unusable/usable/reusable/recommendable)

bash evaluation testing-framework skill-evaluation claim-first

Updated Jun 27, 2026
HTML

agentskillexchange / agent-skills-resources

Source-backed companion guide for agent skills, AI agent workflows, framework resources, evaluation templates, and rollout playbooks.

Updated Jun 29, 2026
Python
