# qiushi-skill · HughYau/qiushi-skill

2026-05-05 · main@HEAD (npm v1.4.1)

🛠 73 / 100 (score tiers: 🛑 0–29 · ⚠️ 30–49 · 🛠 50–79 · 🏭 80–100)
- ✓ 6 claims passed, no critical failures
- ✓ MIT / Apache / etc., installable per deployment.install_methods
- ✓ release_pipeline_score=2 + pushed in 90-day window
- ✓ multilingual_readme=true
- ⚪ static-only eval; live e2e pending
| Install method | Platforms | Difficulty |
|---|---|---|
| `npx qiushi-skill install <platform>` | Claude Code / Codex / Cursor / Hermes / NanoBot / OpenClaw / OpenCode | easy |
| `git clone` + `cp skills/` to the platform's skills dir | any | moderate |
- Requires an AI agent runtime (Claude Code / Codex / Cursor / Hermes / NanoBot / OpenClaw / OpenCode) as the host that loads the skills.
- Cost: standard agent-side cost; the skill itself is pure markdown (no extra API calls).
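The two install paths can be sketched as shell commands. The platform argument `claude-code` and the target directory `~/.claude/skills/` are illustrative assumptions, not values confirmed by this report; check the repo README for the exact per-platform names:

```shell
# Path 1: npm-based install (easy); the platform name is an assumed example
npx qiushi-skill install claude-code

# Path 2: manual clone + copy (moderate); the destination skills dir
# varies per platform and ~/.claude/skills/ is an assumed example
git clone https://github.com/HughYau/qiushi-skill.git
cp -r qiushi-skill/skills/. ~/.claude/skills/
```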
Claims: 7 total · 5 passed / 1 passed-with-concerns / 1 untested

Score components: +40 · +18 · +12 · +3 · 0 · 0 (sum 73)
6 / 7 claims passed:
- claim-001 — passed
- claim-002 — passed
- claim-003 — passed
- claim-004 — passed
- claim-005 — passed
- claim-006 — passed
- claim-007 — untested
| Check | Result |
|---|---|
| input_contract | |
| output_contract | |
| determinism | |
| idempotence | |
| no_skill_callouts | |
| failure_mode_clarity | |

| Check | Result |
|---|---|
| workflow_correctness | |
| declared_call_graph | |
| stop_conditions | |
| handoff_points | |
| atom_evidence | |
| error_propagation | |
| partial_failure_handling | |
- core user-facing layer untested → capped at 'usable'
- hybrid-repo rule: archetype 'hybrid-skill' requires end-to-end evaluation of the user-facing layer
- evidence_completeness='partial' (not portable) → capped at 'usable'
- only 3/4 critical claims covered
Decision chain: archetype: hybrid-skill → core_layer_tested: false → evidence: partial → recommended: usable → final: usable
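A minimal sketch of how the ceilings compose, assuming each triggered ceiling simply caps the recommendation at 'usable' (the logic and grade names are inferred from the decision chain, not taken from the scorer's code):

```shell
archetype="hybrid-skill"
core_layer_tested=false
evidence_completeness="partial"

recommended="production"   # hypothetical pre-ceiling grade
# ceilings 1 + 2: hybrid-skill with an untested core user-facing layer
if [ "$archetype" = "hybrid-skill" ] && [ "$core_layer_tested" = "false" ]; then
  recommended="usable"
fi
# ceiling 3: evidence_completeness='partial' (not portable)
if [ "$evidence_completeness" = "partial" ]; then
  recommended="usable"
fi
echo "final: $recommended"   # final: usable
```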
| ID | Claim | Severity | Area | Status |
|---|---|---|---|---|
| claim-001 | All 10 SKILL.md files really exist (9 core methodologies + 1 overarching principle) | critical | skill-coverage | ● passed |
| claim-002 | Each of the 7 platform install paths has a dedicated config directory at the repo root | critical | cross-platform | ● passed |
| claim-003 | The npm package is genuinely installable; the bin entry is a real JS CLI | critical | install | ● passed |
| claim-004 | Bilingual README + cross-platform validation scripts under tests/ | high | i18n + testing | ● passed |
| claim-005 | Every skill ships an original-texts.md citing the classical source texts | high | depth | ◐ partial |
| claim-006 | LICENSE is MIT and explicit in both Chinese and English | high | licensing | ● passed |
| claim-007 | End-to-end happy path: after install, the agent actually invokes the methodologies in its workflow | critical | end-to-end | ○ untested |
Run: run-static-checks · 2026-05-05 · 0.00s · 0 · 0% — tokens in ? / out ?
# qiushi-skill — final verdict (2026-05-05)
## Repo
- **Name:** HughYau/qiushi-skill · **Stars:** 3,007
- **Archetype:** hybrid-skill (reclassified from default prompt-skill)
- **Layer:** molecule
- **License:** MIT · **Language:** JavaScript · **Pushed:** 2026-05-01
## What was evaluated
| Claim | Status | Notes |
|---|---|---|
| 001 10 methodology skills | passed | All 10 SKILL.md exist (HTTP 200) |
| 002 7-platform install configs | passed | Each platform has a dedicated config dir with files |
| 003 npm + bin | passed | 307-line CLI; npm registry has v1.4.1 |
| 004 bilingual + cross-platform tests | passed | EN README + bash + PowerShell validators |
| 005 original-texts depth | passed_with_concerns | 1 of 3 sampled is empty (arming-thought/original-texts.md = 0 bytes) |
| 006 LICENSE | passed | MIT |
| 007 live agent workflow | untested | needs real Claude Code / OpenClaw session |
## Real findings
1. **`arming-thought/original-texts.md` is empty (0 bytes).** The
   other two sampled skills each carry ~2 KB of classical-text
   excerpts. arming-thought is the overarching principle (总原则,
   实事求是, "seek truth from facts") — the most important skill —
   and it is missing its references. One-line upstream fix.
2. **Genuinely cross-platform install path.** 7 dedicated
`.<platform>/` config dirs (Claude Code / Codex / Cursor / Hermes
/ NanoBot / OpenClaw / OpenCode). Most personal skill catalogs
target 1-3; this one cared enough to ship 7.
3. **Cross-platform test discipline.** validate.sh (216 lines) +
validate.ps1 (212 lines) — Windows install path is actually
tested, not just "should work".
4. **Methodology granularity is honest.** 10 distinct methodologies
are genuinely different reflexes (contradiction analysis vs
investigation-first vs protracted-strategy). User picks 2-3 that
match a workflow; not 10 pieces of one skill.
5. **Cultural / branding consideration.** Methodology rooted in
Mao-era dialectical materialism. README explicitly disclaims
("this is methodology, not politics"), but corporate adopters
should think before installing in a public skill catalog.
Worth surfacing in `watch_out`.
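Finding 1 is mechanically detectable. A small check the existing validators could adopt (the glob below assumes the `skills/<name>/original-texts.md` layout described above; it is not code from the repo):

```shell
# flag any zero-byte reference file; -s is true only for non-empty files
for f in skills/*/original-texts.md; do
  if [ ! -s "$f" ]; then
    echo "EMPTY: $f"
  fi
done
```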
## Why the score lands where it does
Predicted ~70 (🛠 Self-use OK). Drivers:
- 6 of 7 claims passed (mostly +5 each, capped at +30)
- claim-005 passed_with_concerns
- maintainer evidence: release_pipeline=2 (+5) + multilingual (+2) + recently_active (+5) = +12 maintainer
- ecosystem: 3K stars → +3
- layer_bonus: molecule → 0
- penalties: 0 (MIT)
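The driver arithmetic checks out in shell; the +40 and +18 components carry no labels in the export, so they are treated as opaque claim-related points here:

```shell
maintainer=$((5 + 2 + 5))   # release_pipeline + multilingual + recently_active
total=$((40 + 18 + maintainer + 3 + 0 + 0))   # + ecosystem, layer_bonus, penalties
echo "maintainer=$maintainer total=$total"    # maintainer=12 total=73
```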
## Path forward
1. Fill `skills/arming-thought/original-texts.md` (the most important
skill is missing references).
2. Run a live agent workflow on a complex problem; verify the agent
actually invokes contradiction-analysis (or another methodology)
rather than going straight to the answer.
3. Log under `runs/<date>/run-live-agent/`.
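A possible layout for step 3 (the `runs/<date>/run-live-agent/` path comes from the text; the summary file name is an assumption):

```shell
run_dir="runs/$(date +%F)/run-live-agent"
mkdir -p "$run_dir"
# capture the session outcome next to the static-check runs
printf 'status: pending\n' > "$run_dir/summary.md"
```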