2026-04-24

skills-manage

iamzhihuix/skills-manage

🛠65 / 100

Score bands: 🛑 0–29 · ⚠️ 30–49 · 🛠 50–79 · 🏭 80–100

🛠 · 65 / 100

✓ 8 claims passed, no critical failures
✓ MIT / Apache / etc., installable per `deployment.install_methods`
✓ `release_pipeline_score=2` + pushed in 90-day window
✓ `multilingual_readme=true`
◐ `evidence_completeness=portable`


anthropics/skill-creator · 🏭 81 · molecule

obra/superpowers · 🛠 77 · compound


`Download Tauri DMG (unsigned — needs xattr workaround)`	macOS	moderate
`Build from source (Tauri / Rust)`	any	hard


AI coding tools (Claude Code, Cursor, Codex, Gemini CLI, Trae, +24 more)

Target tools whose skill folders are managed

Manages local skill folders; doesn't talk to upstream APIs itself

9 claims: 6 passed, 1 partial, 1 failed, 1 untested

Score components: +40, +25, 0, 0, 0, 0

8 / 9

partial claim-001

passed claim-002

passed claim-003

passed claim-004

passed claim-005

failed claim-006

passed claim-007

untested claim-009

`input_contract`
`output_contract`
`determinism`
`idempotence`
`no_skill_callouts`
`failure_mode_clarity`

`workflow_correctness`
`declared_call_graph`
`stop_conditions`
`handoff_points`
`atom_evidence`
`error_propagation`
`partial_failure_handling`

core user-facing layer untested → capped at 'usable'
evidence_completeness='portable' → capped at 'reusable'

only 5/6 critical claims covered

archetype: adapter → core_layer_tested? False → evidence: portable → recommended: usable → final: usable



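| Claim | Statement | Severity | Category | Status | Note |
| --- | --- | --- | --- | --- | --- |
| claim-001 | Every declared platform is registered as an adapter at the code level | critical | platform-coverage | ◐ partial | |
| claim-002 | Every install produces the same shape (symlink → central) | critical | shape-conformance | ● passed | |
| claim-003 | Uninstalled or unlisted platforms fail clearly | critical | failure-transparency | ● passed | |
| claim-004 | GitHub import has real auth plus retry fallback | high | auth | ● passed | |
| claim-005 | Installing the same skill repeatedly is idempotent | high | deduplication | ● passed | |
| claim-006 | Data stays local; no telemetry | critical | upstream-drift | ✕ failed | |
| claim-007 | Prebuilt macOS DMG works straight after download | critical | platform-coverage | ● passed | |
| claim-008 | Supports bidirectional centralize between Central and platforms | medium | shape-conformance | ● passed | |
| claim-009 | End-to-end runtime verification of actual launch + scan + install | critical | platform-coverage | ○ untested | On Wendy's current machine, `~/.claude/skills/` (260+) and `~/.agents/skills/` are in use by this session; launching an unsigned, ad-hoc-signed app to scan and rewrite those directories is too risky. Re-verify in a clean account or VM. |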

run-source-and-dmg-integrity · 2026-04-24

50% · 3.38s · tokens in 2903 / out 40

claim-001 · partial
claim-002 · passed
claim-003 · passed
claim-004 · passed
claim-005 · passed
claim-006 · failed
claim-007 · passed
claim-008 · passed
claim-009 · untested


# Final Verdict — iamzhihuix/skills-manage

## Repo

- **Name**: iamzhihuix/skills-manage
- **Version tested**: v0.9.1 (2026-04-23)
- **Date**: 2026-04-24
- **Archetype**: adapter
- **Final bucket**: `usable`
- **Confidence**: low (per `verdict_calculator.py`)

## Verdict Calculator Output

```
Recommended bucket: usable
Final bucket: usable
Confidence: low

Ceiling reasons:
- core user-facing layer untested → capped at 'usable'
- evidence_completeness='portable' → capped at 'reusable'

Blocking issues:
- only 5/6 critical claims covered
```

Inputs: 8 of 9 claims passed (claim-001 as `passed_with_concerns`; claim-009 untested). 5 of 6 critical claims are covered; claim-009 is the uncovered one.
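The capping behaviour in the calculator output can be illustrated with a minimal sketch. This is not the code in `verdict_calculator.py`; the function name `apply_ceilings` and the bucket ordering (`usable` below `reusable` below `recommendable`) are assumptions inferred from how this report uses those names.

```python
# Bucket names come from this report; their relative ordering is an assumption.
BUCKET_ORDER = ["usable", "reusable", "recommendable"]


def apply_ceilings(recommended: str, ceilings: list[str]) -> str:
    """Return the lowest-ranked bucket among the recommendation and all caps."""
    candidates = [recommended, *ceilings]
    return min(candidates, key=BUCKET_ORDER.index)


# The two ceiling reasons reported above cap at 'usable' and 'reusable'.
final_bucket = apply_ceilings("usable", ["usable", "reusable"])
print(final_bucket)  # usable
```

Under this reading, the strictest ceiling always wins, which is why the `reusable` cap has no visible effect here: the `usable` cap is already lower.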

## Why This Bucket

### Core Outcome — code path exists, end-to-end not proven

Every claim about install, uninstall, symlinking, detection, GitHub import, and local-first storage has a concrete, reviewable code path. The prebuilt DMG's digest matches the release asset digest. But no real user workflow was executed through the GUI. For the adapter archetype, failing to exercise the *actual user-facing layer* caps the verdict at `usable`, and that rule applied here.
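A digest comparison of this kind could be reproduced with a short script. The sketch below assumes SHA-256 (the report does not name the hash algorithm), and the helper names `sha256_of` and `matches_published_digest` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large DMG never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, published_hex: str) -> bool:
    """Compare the local file's digest with a published hex digest string."""
    return sha256_of(path) == published_hex.strip().lower()
```

A match shows only that the downloaded file is the one that was published; it says nothing about whether the app behaves correctly once launched.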

### Scenario Breadth — narrow on purpose

Only one scenario was tested: "open the source + download + inspect bundle". There was no per-platform install smoke test, no collection batch-install, and no discover scan against a real project tree. Breadth is therefore one scenario, not the 28 platforms the README lists.

### Repeatability — not tested

No repeat runs were performed. Idempotency was verified at the *schema level* (`ON CONFLICT(skill_id, agent_id) DO UPDATE`) but not by running the same install twice and inspecting on-disk state.
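To show what schema-level idempotency means in practice, here is a small sketch using Python's `sqlite3` module. Only the `ON CONFLICT(skill_id, agent_id) DO UPDATE` clause comes from the inspected code; the table name, the `link_path` column, and the `record_install` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the (skill_id, agent_id) conflict key is from the source.
conn.execute(
    """
    CREATE TABLE skill_installations (
        skill_id  TEXT NOT NULL,
        agent_id  TEXT NOT NULL,
        link_path TEXT NOT NULL,
        PRIMARY KEY (skill_id, agent_id)
    )
    """
)


def record_install(skill_id: str, agent_id: str, link_path: str) -> None:
    """Insert an installation row, or update it if the pair already exists."""
    conn.execute(
        """
        INSERT INTO skill_installations (skill_id, agent_id, link_path)
        VALUES (?, ?, ?)
        ON CONFLICT(skill_id, agent_id) DO UPDATE SET link_path = excluded.link_path
        """,
        (skill_id, agent_id, link_path),
    )


# Recording the same install twice leaves a single row.
record_install("demo-skill", "claude-code", "/tmp/a")
record_install("demo-skill", "claude-code", "/tmp/a")
row_count = conn.execute("SELECT COUNT(*) FROM skill_installations").fetchone()[0]
print(row_count)  # 1
```

This guarantees one database row per skill and agent pair. It does not show that the symlink on disk ends up in the same state after a second install, which is the gap noted above.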

### Failure Transparency — good signals in code

- `is_agent_detected()` honestly returns false when both dir and parent are missing.
- `ensure_centralized()` errors out with explicit messages when the source skill is missing.
- GitHub import falls back through 4 mirrors, so a single network failure won't produce a misleading empty import.
- Zero telemetry libraries, so a failure can't be silently phoned home.
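The mirror fallback described in the list above follows a common pattern: try each source in order and fail loudly only when every one has failed. The sketch below is a generic illustration, not the project's Rust implementation; `fetch_with_fallback`, `AllSourcesFailed`, and the injected `fetch` callable are hypothetical names.

```python
from typing import Callable, Sequence


class AllSourcesFailed(RuntimeError):
    """Raised when every source in the fallback chain has failed."""


def fetch_with_fallback(
    sources: Sequence[str], fetch: Callable[[str], bytes]
) -> bytes:
    """Return the first successful fetch; raise with all errors if none succeed."""
    errors = []
    for source in sources:
        try:
            return fetch(source)
        except Exception as exc:  # record the failure and try the next source
            errors.append(f"{source}: {exc}")
    raise AllSourcesFailed("; ".join(errors))
```

Raising an error that lists every failed source, rather than returning an empty result, is what keeps a network outage from looking like an empty import.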

## What I Would Say In Plain English

skills-manage is a well-built young project (910 stars in 11 days is not an accident — the code shows it). The README's claims about "central library + symlink to per-platform" are not marketing: they are literally implemented with `std::os::unix::fs::symlink` and a relative-path computation that makes the links portable. Privacy claims are honest — the database is where they say it is, and there is no analytics dependency anywhere.
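For readers who want the relative-link idea spelled out, here is a Python analogue of what the Rust code is described as doing. It is a sketch under assumptions: `link_skill` and the directory layout are hypothetical, and the project itself uses `std::os::unix::fs::symlink`, not this code.

```python
import os
from pathlib import Path


def link_skill(central_skill: Path, platform_skills_dir: Path) -> Path:
    """Create a relative symlink in the platform dir pointing at the central skill."""
    platform_skills_dir.mkdir(parents=True, exist_ok=True)
    link = platform_skills_dir / central_skill.name
    # A relative target keeps the link valid if the common parent directory moves.
    target = os.path.relpath(central_skill, start=platform_skills_dir)
    if link.is_symlink():
        link.unlink()  # replace an existing link instead of failing on re-install
    # If a real file or directory already sits at `link`, this raises FileExistsError.
    os.symlink(target, link, target_is_directory=True)
    return link
```

With a central skill at `root/central/demo-skill` and a platform directory at `root/claude/skills`, the stored link target is `../../central/demo-skill`.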

But this evaluation did not prove the app works for a real user. It proved the code for each claim exists. For a pre-1.0 Tauri desktop app that requires `xattr -dr com.apple.quarantine` to launch and will scan/modify directories many other tools are already managing, "code exists" is not enough to recommend.

**Use it if** you have a clean macOS account or a VM, and your skills live in one place today.
**Wait if** your `~/.claude/skills/`, `~/.agents/skills/`, or `~/.cursor/skills/` are already managed by another tool (dbskill, lobster lock file, plugin registries) — test in isolation first.

## Remaining Risks

1. **claim-009 (runtime E2E) untested**. Everything downstream of "user clicks install" is inferred, not observed.
2. **EasyClaw V2 listed in README but not seeded in code** (`builtin_agents()` has 27 ids vs README's 28 platforms). If a user specifically needs that platform, the adapter is missing.
3. **API keys stored unencrypted** in `~/.skillsmanage/db.sqlite` (README self-discloses; still a real constraint).
4. **Ad-hoc signing + no notarization**. The `xattr` workaround remains required until the maintainer signs and notarizes the build.
5. **3 legacy failing frontend tests** (CLAUDE.md self-discloses). Not in the core path but worth noting.
6. **Schema drift**: README and code disagree about Hermes category and React version. Low-risk but pattern-of-minor-drift is a smell to watch at 1.0.

## What Would Move It To `reusable`

- A live run on a clean macOS user account: launch app, detect platforms, install one skill to two platforms, verify symlinks on disk, uninstall, verify cleanup — all with screenshots/log evidence.
- A repeat run proving idempotency at the filesystem level.
- An unsupported-input run (e.g., custom agent with a read-only dir) proving the failure is loud.
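The on-disk checks in the list above could be scripted so that a live run leaves log evidence rather than relying on inspection by eye. This is a minimal sketch: the helper names are hypothetical, and it assumes that uninstall removes only the platform link and leaves the central copy in place, which this evaluation did not observe.

```python
from pathlib import Path


def is_installed_via_central(link: Path, central_root: Path) -> bool:
    """True if `link` is a symlink whose resolved target lives under `central_root`."""
    if not link.is_symlink():
        return False
    resolved = link.resolve()
    return resolved.exists() and central_root.resolve() in resolved.parents


def is_cleanly_uninstalled(link: Path, central_skill: Path) -> bool:
    """True if the platform link is gone while the central copy still exists."""
    return not link.is_symlink() and not link.exists() and central_skill.is_dir()
```

Running `is_installed_via_central` after a first and a second install of the same skill, and `is_cleanly_uninstalled` after removal, would cover the install, repeat, and cleanup steps of such a run.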

## What Would Move It To `recommendable`

- Everything above, plus:
- A 1.0 release with notarized macOS build and a Linux build (currently source-only).
- The 3 legacy failing frontend tests fixed.
- README-code consistency pass (EasyClaw V2, React version, Hermes category).

## Related Artifacts

- Claim map: `claims/claim-map.yaml`
- Plan: `plans/2026-04-24-eval-plan.md`
- Run: `runs/2026-04-24/run-source-and-dmg-integrity/`
- DMG: `artifacts/skills-manage_0.9.1_macos_universal.dmg`
- DMG integrity log: `logs/dmg-integrity.log`
- Source inspection log: `logs/source-inspection.log`
- Business notes: `business-notes.md`
- Verdict calculator input: `verdicts/2026-04-24-verdict-input.yaml`