repo·evals · 2026-05-04 · main@HEAD

Awesome-finance-skills

RKiding/Awesome-finance-skills

🛠 59 / 100

01 Market data
02 Intel collection
03 Signal synthesis
04 Methodology routing
05 Backtest validation
06 Risk management
07 Execution engine
08 Journal & review

🛑 0–29 · ⚠️ 30–49 · 🛠 50–79 · 🏭 80–100
🛠 59 / 100
  • 5 claims passed, no critical failures
  • MIT / Apache / etc., installable per deployment.install_methods
  • no recent release pipeline + not recently active
  • multilingual_readme=true
  • static-only eval; live e2e pending
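
The band legend above maps an integer score to one of four icons. A minimal sketch of that lookup, assuming the bands are inclusive on both ends as printed; the names `BANDS` and `band_for` are illustrative and not taken from repo-evals:

```python
# Band boundaries copied from the legend; the data structure is an assumption.
BANDS = [(0, 29, "🛑"), (30, 49, "⚠️"), (50, 79, "🛠"), (80, 100, "🏭")]


def band_for(score: int) -> str:
    """Return the legend icon for an integer score in 0..100."""
    for low, high, icon in BANDS:
        if low <= score <= high:
            return icon
    raise ValueError(f"score out of range: {score}")


print(band_for(59))
```

With the reported score of 59 the lookup lands in the 50–79 band.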


User question (natural language)
  • alphaear-news: 10+ sources + Polymarket
  • alphaear-stock: A / HK / US, OHLCV
  • alphaear-sentiment: FinBERT / LLM, -1 ~ +1
  • alphaear-predictor: Kronos + news-aware
  • alphaear-logic-visualizer: draw.io XML
  • alphaear-signal-tracker: strengthen / falsify
  • alphaear-reporter: plan → write → edit → chart

npx skills add RKiding/Awesome-finance-skills@<skill> · any (npm) · easy
git clone + cp skills/* to ~/.config/opencode/skills/ · any · moderate
  • Agent host: Anthropic Claude / OpenCode / OpenClaw. Standard agent-side cost.
  • News + market data: alphaear-news data sources (Cailian / Polymarket / etc.). Each skill calls its own external sources; check each skill's SKILL.md for keys.
  • Sentiment + price forecasting: FinBERT / Kronos forecasting. Local models; CPU-friendly.
Claims: 6 (5 passed, 1 untested)
Score breakdown: +40, +14, +2, +3, 0, 0

5 / 6 claims passed
  • passed: claim-001, claim-002, claim-003, claim-004, claim-005
  • untested: claim-006

input_contract
output_contract
determinism
idempotence
no_skill_callouts
failure_mode_clarity

workflow_correctness
declared_call_graph
stop_conditions
handoff_points
atom_evidence
error_propagation
partial_failure_handling

  • only 2/3 critical claims covered

archetype: hybrid-skill · core_layer_tested? False · evidence: partial · recommended: usable · final: usable
ceiling 1 · core user-facing layer untested → capped at 'usable'
ceiling 2 · hybrid-repo rule: archetype 'hybrid-skill' requires end-to-end evaluation of the user-facing layer
ceiling 3 · evidence_completeness='partial' (not portable) → capped at 'usable'
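
The three ceilings act as caps on the recommended bucket. A rough sketch of how such caps could be derived and applied, under stated assumptions: only `usable` and `reusable` appear in this report, so the other bucket names and the ordering in `BUCKET_ORDER` are hypothetical, as are both function names.

```python
from typing import List

# Hypothetical ladder, lowest first. Only "usable" and "reusable" are named
# in the report; the remaining names are placeholders.
BUCKET_ORDER = ["unusable", "usable", "reusable", "recommendable"]


def ceilings_for(archetype: str, core_layer_tested: bool, evidence: str) -> List[str]:
    """Collect the caps that the three ceiling rules would contribute."""
    caps = []
    if not core_layer_tested:
        caps.append("usable")  # ceiling 1: core user-facing layer untested
        if archetype == "hybrid-skill":
            caps.append("usable")  # ceiling 2: hybrid-repo rule needs an e2e run
    if evidence == "partial":
        caps.append("usable")  # ceiling 3: evidence completeness is partial
    return caps


def apply_ceilings(recommended: str, ceilings: List[str]) -> str:
    """Return the lowest bucket among the recommendation and every cap."""
    return min([recommended, *ceilings], key=BUCKET_ORDER.index)


caps = ceilings_for("hybrid-skill", core_layer_tested=False, evidence="partial")
print(apply_ceilings("usable", caps))
```

Taking the minimum means any single ceiling is enough to hold the final bucket at `usable`, which matches the `recommended: usable` / `final: usable` pair reported for this run.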

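| ID | Claim | Severity | Category | Status |
|---|---|---|---|---|
| claim-001 | All 8 alphaear-* skills listed in the README actually exist | critical | skill-coverage | ● passed |
| claim-002 | Each skill is genuinely hybrid (SKILL.md + scripts + references + tests) | critical | skill-shape | ● passed |
| claim-003 | The actual number of skills under skills/ is ≥ the 8 listed in the README (no "skill not found") | high | completeness | ● passed |
| claim-004 | Install paths for multiple agent frameworks (Antigravity / OpenCode / OpenClaw) are documented | high | cross-platform | ● passed |
| claim-005 | SKILL.md follows the standard frontmatter (name + description) | high | contract | ● passed |
| claim-006 | End to end: installed into a real agent, a news-analysis query returns a result | critical | end-to-end | ○ untested |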
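
The "5 / 6" and "only 2/3 critical claims covered" figures follow directly from the claim list. A small sketch that recomputes them, assuming a simple record per claim; the `Claim` class and `coverage` helper are illustrative, not part of repo-evals:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Claim:
    claim_id: str
    severity: str  # "critical" or "high", as labelled in the claim list
    status: str    # "passed" or "untested"


# Values transcribed from the six claims listed above.
CLAIMS = [
    Claim("claim-001", "critical", "passed"),
    Claim("claim-002", "critical", "passed"),
    Claim("claim-003", "high", "passed"),
    Claim("claim-004", "high", "passed"),
    Claim("claim-005", "high", "passed"),
    Claim("claim-006", "critical", "untested"),
]


def coverage(claims: List[Claim], severity: Optional[str] = None) -> Tuple[int, int]:
    """Return (passed, total), optionally restricted to one severity."""
    selected = [c for c in claims if severity is None or c.severity == severity]
    passed = sum(1 for c in selected if c.status == "passed")
    return passed, len(selected)


print(coverage(CLAIMS))              # (5, 6)
print(coverage(CLAIMS, "critical"))  # (2, 3)
```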

run-static-checks

2026-05-04
0% tokens in ? / out ?

# Awesome-finance-skills — final verdict (2026-05-04)

## Repo

- **Name:** RKiding/Awesome-finance-skills
- **Branch evaluated:** main@HEAD
- **Archetype:** hybrid-skill (reclassified from default `prompt-skill`)
- **Layer:** **molecule** at the repo level; a catalog of 10 individually
  hybrid skills
- **Eval framework:** repo-evals layer model v1 (f9ed1e9)

## Bucket

**`usable`** — clean static layer; all 8 README-listed skills exist
and are non-trivial; install paths cover three agent frameworks. The
multi-skill agentic value (the catalog's promise) is unverified
without a real session.

## What was evaluated

### Atom + molecule level (static, this run)

| Claim | Status | Notes |
|---|---|---|
| 001 8 skills present | passed | All 8 alphaear-* SKILL.md files return HTTP 200 |
| 002 hybrid shape | passed | Sampled alphaear-news has SKILL.md + 4 scripts + references + tests |
| 003 catalog ≥ docs | passed | 10 skill dirs, 8 listed in README + 2 extras (deepear-lite, skill-creator) |
| 004 multi-agent install paths | passed | README Integration Guide covers Antigravity / OpenCode / OpenClaw with workspace + global paths |
| 005 SKILL.md frontmatter | passed | Sampled SKILL.md has standard `name + description` with "Use when..." trigger phrase |
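
The frontmatter check behind claim 005 can be approximated with a few lines of standard-library Python. This is a sketch, not the evaluator's actual check: it assumes the frontmatter is a leading `---` delimited block of single-line `key: value` pairs (no nested or multi-line YAML), and `SAMPLE` is a made-up file, not text from the repository.

```python
from typing import Dict, List


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse a leading '---' delimited block of simple 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as missing frontmatter


def check_skill_md(text: str) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    fields = parse_frontmatter(text)
    problems = [f"missing field: {k}" for k in ("name", "description") if not fields.get(k)]
    description = fields.get("description", "")
    if description and "use when" not in description.lower():
        problems.append("description has no 'Use when...' trigger phrase")
    return problems


# Hypothetical SKILL.md content used only to exercise the check.
SAMPLE = """---
name: example-skill
description: Fetches example data. Use when the user asks for example data.
---
# Example skill
"""

print(check_skill_md(SAMPLE))
```

A real run would read each `SKILL.md` from disk and would likely want a full YAML parser; the string-based version keeps the sketch dependency-free.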

### Molecule level (deferred)

| Claim | Status | Required |
|---|---|---|
| 006 multi-skill chain agent run | untested | Real Claude Code / OpenCode / OpenClaw session running "analyze gold crash impact on A-shares" through news → visualizer → reporter chain |

## Real findings worth surfacing

1. **Each skill is genuinely hybrid.** Not a markdown-only catalog —
   alphaear-news ships 4 Python helpers (content_extractor,
   database_manager, news_tools, plus __init__). README's "use
   `scripts/news_tools.py` via NewsNowTools" claim has real code
   behind it.

2. **Catalog under-promises.** README headlines 8 skills; the
   directory has 10. The extras are `alphaear-deepear-lite` (mentioned in
   README's "New" badge) and `skill-creator` (a 371-line utility for
   authoring more skills). Users won't be misled, but the table could
   include them for discoverability.

3. **Frontmatter is rigorous.** Sampled SKILL.md ends its
   description with a clear "Use when..." trigger phrase — that's
   the discriminator that lets LLM auto-discovery pick the right
   skill out of a packed registry. Suggests intentional skill-loader
   compatibility.

4. **3-framework install paths are concrete.** Workspace + global
   path pairs for Antigravity / OpenCode / OpenClaw — concrete enough
   that a new user can copy-paste-edit without reading source. Few
   skill catalogs in this space provide this.
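
The manual install route (`git clone`, then copy `skills/*` into the agent's skills directory) can be sketched as a small helper. The function name, the clone location, and the rule of copying only directories that contain a `SKILL.md` are assumptions for illustration; the repository's README remains the authority on the exact paths per framework.

```python
import shutil
from pathlib import Path
from typing import List


def install_skills(clone_root: Path, target: Path) -> List[str]:
    """Copy every skills/<name>/ directory that contains SKILL.md into target."""
    installed = []
    target.mkdir(parents=True, exist_ok=True)
    for skill_dir in sorted((clone_root / "skills").iterdir()):
        # Skip stray files and directories without a SKILL.md.
        if skill_dir.is_dir() and (skill_dir / "SKILL.md").is_file():
            shutil.copytree(skill_dir, target / skill_dir.name, dirs_exist_ok=True)
            installed.append(skill_dir.name)
    return installed
```

For the OpenCode global path quoted in the install methods, a call might look like `install_skills(Path("Awesome-finance-skills"), Path.home() / ".config" / "opencode" / "skills")`, where the first argument is wherever the clone happens to live. `dirs_exist_ok=True` requires Python 3.8 or later and lets a re-run overwrite an earlier install.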

## Why not higher

`usable` because:

- No live multi-skill chain logged. The catalog's "Wall Street
  analyst" promise is the joint behavior of multiple skills cooperating;
  static verification can show each skill exists, not that they
  cooperate well.
- Each skill's analytical accuracy (sentiment, prediction, logic
  chain quality) is independent and unverified. A `reusable` bucket
  would imply both that the chain runs and that its outputs are useful.

## Path to `reusable`

1. Install full catalog into Claude Code or OpenCode.
2. Ask a multi-skill question (the README example: "Analyze how the
   gold crash affects A-shares").
3. Capture: which skills got called, in what order, with what inputs.
4. Run a "data source down" failure scenario (e.g., Polymarket
   unreachable) and verify error visibility.
5. Update claim-006 to `passed` if the chain produces a useful
   artefact (text + draw.io XML transmission diagram).
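
Steps 3 and 4 amount to keeping a trace of skill calls and checking it afterwards. A minimal sketch of such a trace, assuming the agent host exposes each call in some form; `SkillCall`, `ChainTrace`, and the recorded inputs are hypothetical, and the skill names are the ones listed for this catalog.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SkillCall:
    skill: str
    inputs: Dict[str, str]
    error: Optional[str] = None  # set when a data source or the skill failed


@dataclass
class ChainTrace:
    calls: List[SkillCall] = field(default_factory=list)

    def record(self, skill: str, inputs: Dict[str, str], error: Optional[str] = None) -> None:
        self.calls.append(SkillCall(skill, inputs, error))

    def order(self) -> List[str]:
        return [call.skill for call in self.calls]

    def follows(self, expected: List[str]) -> bool:
        """True if the expected skills appear in this order; other calls may interleave."""
        remaining = iter(self.order())
        return all(name in remaining for name in expected)

    def visible_errors(self) -> List[SkillCall]:
        return [call for call in self.calls if call.error]


# Illustrative session: inputs and the error text are made up.
trace = ChainTrace()
trace.record("alphaear-news", {"query": "gold crash impact on A-shares"},
             error="Polymarket unreachable")
trace.record("alphaear-logic-visualizer", {"format": "draw.io XML"})
trace.record("alphaear-reporter", {"topic": "gold crash impact on A-shares"})

print(trace.order())
print(trace.follows(["alphaear-news", "alphaear-logic-visualizer", "alphaear-reporter"]))
print(len(trace.visible_errors()))
```

Recording the error on the call itself, rather than dropping the call, is what makes the "data source down" scenario checkable after the session ends.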

## Recommended

```yaml
current_bucket: usable
status: evaluated
```