repo·evals · 2026-05-04 · main@HEAD (skill v1.0.0, Python 3.11+)

xiaohongshu-skills

autoclaw-cc/xiaohongshu-skills

Score: 🛠 59 / 100

Pipeline stages: 01 Signal scanning · 02 Content acquisition · 03 Content understanding · 04 Topic curation · 05 Content production · 06 Creative assembly · 07 Distribution & feedback · 08 Learning
Platform: xiaohongshu

Score bands: 🛑 0–29 · ⚠️ 30–49 · 🛠 50–79 · 🏭 80–100 (this repo: 🛠 59 / 100)
  • 5 claims passed, no critical failures
  • MIT / Apache / etc., installable per deployment.install_methods
  • release_pipeline=1, recently_active=True
  • EN-only or ZH-only README
  • static-only eval; live e2e pending
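
As a minimal sketch of how the score bands in the legend above map a numeric score to its icon, assuming the ranges are inclusive as printed. The function name `band_for` is hypothetical and not part of the eval framework:

```python
# Band boundaries copied from the legend; treated as inclusive ranges (assumption).
BANDS = [
    (0, 29, "🛑"),
    (30, 49, "⚠️"),
    (50, 79, "🛠"),
    (80, 100, "🏭"),
]


def band_for(score: int) -> str:
    """Return the legend icon whose inclusive range contains the score."""
    for low, high, icon in BANDS:
        if low <= score <= high:
            return icon
    raise ValueError(f"score out of range: {score}")


print(band_for(59))  # the score reported for this repo
```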


Natural-language intent (Claude Code / OpenClaw) → Root SKILL.md (intent router) → xhs-auth (login / status) · xhs-explore (search / detail) · xhs-publish (image / video) · xhs-interact (like / comment / bookmark) · xhs-content-ops (compound flows) → XHS Bridge Chrome extension → real account

Install: download ZIP + uv sync + load the Chrome extension unpacked (macOS / Linux; moderate)
Install: git clone into the agent skills dir + uv sync (macOS / Linux; moderate)
  • Xiaohongshu (real account): drives operations on the user's logged-in XHS session. Uses your own XHS account; aggressive use can trigger anti-automation limits.
  • Anthropic Claude / OpenClaw: the agent that invokes the skill. Standard agent-side cost.
Score contributions: +40, +14, +5, +3, 0, -3 (sum: 59)

Claims passed: 5 / 6
claim-001: passed
claim-002: passed
claim-003: passed
claim-004: passed
claim-005: passed
claim-006: untested

input_contract
output_contract
determinism
idempotence
no_skill_callouts
failure_mode_clarity

workflow_correctness
declared_call_graph
stop_conditions
handoff_points
atom_evidence
error_propagation
partial_failure_handling

  • core user-facing layer untested → capped at 'usable'
  • hybrid-repo rule: archetype 'hybrid-skill' requires end-to-end evaluation of the user-facing layer
  • evidence_completeness='partial' (not portable) → capped at 'usable'

  • only 3/4 critical claims covered

archetype: hybrid-skill · core_layer_tested? False · evidence: partial · recommended: usable · final: usable
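
The three ceilings above can be read as independent caps on the recommended bucket. The following is a sketch of that logic only, not the framework's actual `verdict_calculator`: the `Evidence` class, both function names, and the two-element bucket ordering (limited to the buckets named in this report) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed ordering, lowest first; the real framework may define more buckets.
BUCKET_ORDER = ["usable", "reusable"]


@dataclass
class Evidence:
    archetype: str
    core_layer_tested: bool
    evidence_completeness: str  # 'portable' is assumed to be the uncapped value


def triggered_ceilings(e: Evidence) -> list[str]:
    """Collect the reasons that cap the bucket, mirroring the three ceilings."""
    reasons = []
    if not e.core_layer_tested:
        reasons.append("core user-facing layer untested")
    if e.archetype == "hybrid-skill" and not e.core_layer_tested:
        reasons.append("hybrid-repo rule: end-to-end evaluation required")
    if e.evidence_completeness != "portable":
        reasons.append(f"evidence_completeness={e.evidence_completeness!r}")
    return reasons


def final_bucket(candidate: str, e: Evidence, cap: str = "usable") -> str:
    """Return the lower of the candidate bucket and the cap if any ceiling fires."""
    if not triggered_ceilings(e):
        return candidate
    return min(candidate, cap, key=BUCKET_ORDER.index)


this_run = Evidence("hybrid-skill", core_layer_tested=False, evidence_completeness="partial")
print(triggered_ceilings(this_run))
print(final_bucket("reusable", this_run))
```

With the values reported for this run, all three ceilings fire, so even a `reusable` candidate would come out as `usable`.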

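| Claim | Description | Severity | Category | Status |
|---|---|---|---|---|
| claim-001 | All 5 sub-skill SKILL.md files listed in the README actually exist | critical | skill-coverage | ● passed |
| claim-002 | The root SKILL.md really references all 5 sub-skills (the routing layer is not an empty shell) | critical | orchestration | ● passed |
| claim-003 | Chrome extension permission requests match its functional scope (no over-authorization) | critical | privacy-security | ◐ partial |
| claim-004 | Python dependencies are minimal and consistent with the README install steps | high | install | ● passed |
| claim-005 | Compatible with OpenClaw and Claude Code (the README's dual declaration is backed by a real contract) | high | cross-platform | ◐ partial |
| claim-006 | End to end: a composite command completes "search → bookmark → summarize" on a real account | critical | end-to-end | ○ untested |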
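
The "only 3/4 critical claims covered" figure reported earlier can be reproduced from the claim list, assuming "covered" means any status other than `untested`. That definition is an assumption, and the tuple layout and function name below are illustrative only:

```python
# (claim id, severity, status) as listed in the claims above.
claims = [
    ("claim-001", "critical", "passed"),
    ("claim-002", "critical", "passed"),
    ("claim-003", "critical", "partial"),
    ("claim-004", "high", "passed"),
    ("claim-005", "high", "partial"),
    ("claim-006", "critical", "untested"),
]


def critical_coverage(rows):
    """Count critical claims that have any result other than 'untested'."""
    critical = [r for r in rows if r[1] == "critical"]
    covered = [r for r in critical if r[2] != "untested"]
    return len(covered), len(critical)


covered, total = critical_coverage(claims)
print(f"{covered}/{total} critical claims covered")  # prints "3/4 critical claims covered"
```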

Run: run-static-checks · 2026-05-04 · 0% · tokens in ? / out ?

# xiaohongshu-skills — final verdict (2026-05-04)

## Repo

- **Name:** autoclaw-cc/xiaohongshu-skills
- **Branch evaluated:** main@HEAD (skill v1.0.0)
- **Archetype:** hybrid-skill (LLM + Python bridge + Chrome extension)
- **Layer:** **molecule** (5 atomic sub-skills wired together by a root routing layer)
- **Eval framework:** repo-evals layer model v1 (4acbd5d)

## Bucket

**`usable`**: the static layer is strong, but the molecule rule caps
the bucket at `usable` until a composite workflow is logged on a real
XHS account. Two disclosure gaps also need surfacing: the extension's
privileged permissions and the homepage repo mismatch.

## What was evaluated

### Atom + molecule level (static, this run)

| Claim | Status | Notes |
|---|---|---|
| 001 5 sub-skills | passed | All 5 SKILL.md files present (HTTP 200) |
| 002 root routing | passed | 11 sub-skill mentions in root SKILL.md; explicit "intent → sub-skill" router role |
| 003 extension permissions | passed_with_concerns | `debugger` + `cookies` + `scripting` privileged perms; README doesn't enumerate them |
| 004 minimal Python deps | passed | python-socks + requests + websockets; nothing surprising |
| 005 OpenClaw / Claude Code contract | passed_with_concerns | `metadata.openclaw` block real; but `homepage` field points to a different repo (xpzouying/xiaohongshu-skills) |

### Molecule level (deferred — live)

| Claim | Status | Required |
|---|---|---|
| 006 composite workflow | untested | Real XHS account + Chrome + Claude Code session running the "search → bookmark → summarize" composite |

## Real findings worth surfacing

1. **`debugger` permission is a real privilege escalation.** Combined
   with `cookies` and `scripting`, it gives the extension full access
   to the user's xiaohongshu.com session, including DevTools-level
   page manipulation. The design intent (driving a real account) is
   stated honestly, but the README does not list these permissions;
   anyone evaluating for production should open
   `extension/manifest.json` first.

2. **Homepage field points to a different repo.** Root SKILL.md says
   `metadata.openclaw.homepage: https://github.com/xpzouying/xiaohongshu-skills`
   — not `autoclaw-cc/xiaohongshu-skills`. Likely a fork or rename
   without metadata update. Mostly cosmetic but confusing for skill
   discovery.

3. **Windows not supported by design.** `metadata.openclaw.os` is
   `[darwin, linux]`. README install steps don't mention this; a
   Windows user following the install path would only discover this
   after it failed.

4. **Rate-limit risk acknowledged.** README explicitly warns about
   triggering XHS anti-automation limits. Driving a real logged-in
   account is gentler than headless scraping, but the same platform
   ToS applies either way.

## Why not higher

`usable` because:

- No live composite-workflow run logged on this evaluator's machine.
- Privilege escalation in the extension is real and undisclosed in
  the README — promotion past `usable` should require either the
  README disclosing it or the perms being narrowed.

## Path to `reusable`

1. Disclose extension permissions in README (one-line link to
   `extension/manifest.json` rationale).
2. Update `metadata.openclaw.homepage` to point to this repo rather
   than `xpzouying/xiaohongshu-skills`.
3. Run a composite workflow on a real XHS account in Claude Code,
   log under `runs/<date>/run-composite/`.
4. Run an "expired login" scenario; verify `xhs-auth` re-triggers.
5. Update claim-006 to `passed`; re-run verdict_calculator.
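
Step 2, together with the Windows finding, lends itself to a small pre-flight check before re-running the verdict. The sketch below assumes the `metadata.openclaw` block has already been parsed into a dict. The function name and message wording are hypothetical; the `homepage` and `os` values are the ones quoted in this report.

```python
import sys


def metadata_issues(meta: dict, expected_repo: str, platform: str = sys.platform) -> list[str]:
    """List mismatches between the openclaw metadata and the evaluated repo/host."""
    issues = []
    homepage = meta.get("homepage", "")
    if not homepage.rstrip("/").endswith(expected_repo):
        issues.append(f"homepage points to {homepage!r}, expected {expected_repo!r}")
    supported = meta.get("os", [])
    # sys.platform is 'darwin', 'linux', or 'win32' on the common hosts.
    if supported and not any(platform.startswith(name) for name in supported):
        issues.append(f"platform {platform!r} not in supported list {supported}")
    return issues


meta = {
    "homepage": "https://github.com/xpzouying/xiaohongshu-skills",
    "os": ["darwin", "linux"],
}

for issue in metadata_issues(meta, "autoclaw-cc/xiaohongshu-skills", platform="win32"):
    print(issue)
```

With the values above, a Windows host would see both issues reported, while a macOS or Linux host would see only the homepage mismatch.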

## Recommended

```yaml
current_bucket: usable
status: evaluated
```