2026-05-04 · main@HEAD (3.0.3)

QuantDinger

brokermr810/QuantDinger

🛠 76 / 100

Score bands: 🛑 0–29 · ⚠️ 30–49 · 🛠 50–79 · 🏭 80–100

✓ 7 claims passed, no critical failures
✓ MIT / Apache / etc., installable per `deployment.install_methods`
✓ `release_pipeline_score=3` + pushed in 90-day window
✓ `multilingual_readme=true`
⚪ compound layer needs a logged scenario run

tukuaiai/tradecat: 🛠 66 · molecule

zinan92/repo-evals: 🛠 78 · molecule


| Install method | Platform | Difficulty |
|---|---|---|
| `docker compose up -d --build` | any (Docker) | moderate |
| `AWS Marketplace AMI` | AWS EC2 | easy |
| `git clone + manual setup` | any | hard |

| Service | Purpose | Notes |
|---|---|---|
| OpenAI API | LLM for AI strategy / indicator generation | One of three LLM providers; pick at least one |
| DeepSeek API | Alternative LLM provider | Cheaper alternative; CN-friendly |
| Grok API (x.ai) | Alternative LLM provider | Third LLM option |
| Interactive Brokers (IBKR) | US stock execution | Brokerage account; paper trading available for free |
| MetaTrader 5 | Forex execution (Windows-only) | Windows host required; Linux not supported |
| Crypto exchanges (via ccxt) | Crypto execution | ccxt supports many exchanges; trading fees apply |
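
The Windows-only constraint on MetaTrader 5 is the kind of thing a deployment can check before any broker is wired up. The following is a minimal sketch, not QuantDinger's own code: the function names and the broker identifiers (`"ibkr"`, `"mt5"`, `"ccxt"`) are hypothetical, and the only fact relied on is that Python reports `sys.platform == "win32"` on Windows.

```python
import sys


def mt5_available(platform=None):
    """Return True only on Windows, where MetaTrader5 wheels are published."""
    platform = sys.platform if platform is None else platform
    return platform == "win32"


def enabled_brokers(requested, platform=None):
    """Split requested brokers into (usable, skipped) for the given platform.

    The identifiers are illustrative labels, not QuantDinger config keys.
    """
    usable, skipped = [], []
    for broker in requested:
        if broker == "mt5" and not mt5_available(platform):
            skipped.append(broker)
        else:
            usable.append(broker)
    return usable, skipped


usable, skipped = enabled_brokers(["ibkr", "mt5", "ccxt"], platform="linux")
print(usable, skipped)
```

On a Linux host this reports `mt5` as skipped rather than failing later at import time, which is the failure a casual reader of the README would otherwise hit.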

Claims: 8 total · 7 passed · 1 untested (7 / 8)

Score components: +40, +24, +12, +3, -3, 0
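
The six signed values above appear to be the score components, since they sum to the headline 76. A minimal sketch of that arithmetic and of the band lookup follows. The component labels did not survive in the report, and both the clamping to 0–100 and the function names are assumptions rather than anything taken from repo-evals.

```python
# Score bands as shown in the report legend: (low, high, icon).
BANDS = [(0, 29, "🛑"), (30, 49, "⚠️"), (50, 79, "🛠"), (80, 100, "🏭")]

# The six component values listed in the report (labels not available).
COMPONENTS = [40, 24, 12, 3, -3, 0]


def total_score(components):
    """Sum the components and clamp to the 0-100 scale (clamping is assumed)."""
    return max(0, min(100, sum(components)))


def band_icon(score):
    """Return the legend icon whose inclusive range contains the score."""
    for low, high, icon in BANDS:
        if low <= score <= high:
            return icon
    raise ValueError(f"score out of range: {score}")


score = total_score(COMPONENTS)
print(score, band_icon(score))  # 76 🛠
```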

passed claim-001

passed claim-002

passed claim-003

passed claim-004

passed claim-005

passed claim-006

untested claim-007

passed claim-008

`input_contract`
`output_contract`
`determinism`
`idempotence`
`no_skill_callouts`
`failure_mode_clarity`

`workflow_correctness`
`declared_call_graph`
`stop_conditions`
`handoff_points`
`atom_evidence`
`error_propagation`
`partial_failure_handling`

`goal_achievement`
`direction_judgment`
`quality_judgment`
`meaningful_autonomy`
`handoff_timing`
`observed_call_graph`
`failure_recovery`

only 4/5 critical claims covered

archetype: orchestrator → core_layer_tested? False → evidence: partial → recommended: usable → final: usable

ceiling 1 · core user-facing layer untested → capped at 'usable'

ceiling 2 · hybrid-repo rule: archetype 'orchestrator' requires end-to-end evaluation of the user-facing layer

ceiling 3 · evidence_completeness='partial' (not portable) → capped at 'usable'
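
One way to read the ceiling chain is as a minimum over an ordered list of buckets. The sketch below is an illustration under stated assumptions, not the actual `verdict_calculator`: only `usable` and `reusable` appear in this report, so the other two bucket names are placeholders, and the trigger conditions in `ceilings_for` are a guess at how the three ceilings relate to the evaluation state.

```python
# Assumed bucket ordering, lowest to highest. Only `usable` and `reusable`
# appear in the report; the other two names are placeholders.
BUCKETS = ["unusable", "usable", "reusable", "recommendable"]


def apply_ceilings(recommended, ceilings):
    """Return the lowest of the recommended bucket and every ceiling."""
    return min([recommended, *ceilings], key=BUCKETS.index)


def ceilings_for(archetype, core_layer_tested, evidence):
    """Collect the ceilings triggered by the evaluation state (hypothetical rules)."""
    caps = []
    if not core_layer_tested:
        caps.append("usable")  # ceiling 1: core user-facing layer untested
        if archetype == "orchestrator":
            caps.append("usable")  # ceiling 2: hybrid-repo rule
    if evidence == "partial":
        caps.append("usable")  # ceiling 3: evidence not portable
    return caps


caps = ceilings_for("orchestrator", core_layer_tested=False, evidence="partial")
print(len(caps), apply_ceilings("usable", caps))  # 3 usable
```

Under this reading, even a higher recommendation such as `reusable` would be pulled down to `usable` while any of the three ceilings is active.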


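| ID | Claim | Priority | Area | Status |
|---|---|---|---|---|
| claim-001 | The "Try in 2 minutes" one-command install really brings up 4 services | critical | install | ● passed |
| claim-002 | The backend base image matches the declared Python version | high | install-consistency | ● passed |
| claim-003 | Multi-LLM provider support (OpenAI / DeepSeek / Grok) is genuinely configurable | critical | ai-providers | ● passed |
| claim-004 | Multi-broker integrations are actually declared in requirements.txt | high | brokers | ● passed |
| claim-005 | The MCP server is a standalone Python package that an AI agent can launch | critical | mcp-integration | ● passed |
| claim-006 | The default docker-compose does not expose services to the public internet (ports bind to 127.0.0.1) | high | security | ● passed |
| claim-007 | End-to-end happy path: an MCP agent triggers one backtest and gets a result back | critical | end-to-end | ○ untested |
| claim-008 | AI-generated strategies do not place real orders automatically (manual approval) | critical | safety | ● passed |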
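
The "4/5 critical claims covered" figure can be reproduced from the claim table. A small sketch, with the claim IDs, priorities, and statuses transcribed from the table; the tuple layout and function names are illustrative and not the repo-evals schema.

```python
# Claim records transcribed from the table: (id, priority, status).
CLAIMS = [
    ("claim-001", "critical", "passed"),
    ("claim-002", "high", "passed"),
    ("claim-003", "critical", "passed"),
    ("claim-004", "high", "passed"),
    ("claim-005", "critical", "passed"),
    ("claim-006", "high", "passed"),
    ("claim-007", "critical", "untested"),
    ("claim-008", "critical", "passed"),
]


def critical_coverage(claims):
    """Return (covered, total) counts for claims marked critical."""
    critical = [c for c in claims if c[1] == "critical"]
    covered = [c for c in critical if c[2] != "untested"]
    return len(covered), len(critical)


def uncovered_critical(claims):
    """List the IDs of critical claims that have not been tested."""
    return [cid for cid, prio, status in claims
            if prio == "critical" and status == "untested"]


covered, total = critical_coverage(CLAIMS)
print(f"{covered}/{total}", uncovered_critical(CLAIMS))  # 4/5 ['claim-007']
```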

0.00s

run-static-checks

2026-05-04

0% — tokens in ? / out ?


# QuantDinger — final verdict (2026-05-04)

## Repo

- **Name:** brokermr810/QuantDinger
- **Branch evaluated:** main@HEAD (3.0.3)
- **Archetype:** orchestrator
- **Layer:** **compound** — multi-agent AI research, LLM-driven
  strategy and indicator generation, ensemble + reflection
- **Eval framework:** repo-evals layer model v1 (4acbd5d)

## Bucket

**`usable`**: strong static layer. The compound rule caps the bucket at
`usable` until at least one agent-driven scenario is logged, and a
platform that lets an LLM trade real money needs a verified
manual-approval gate before any higher bucket can be claimed.

## What was evaluated

### Atom + molecule level (static, this run)

| Claim | Status | Notes |
|---|---|---|
| 001 4-service compose | passed | postgres + redis + backend + frontend, all healthchecked |
| 002 base-image consistency | passed | python:3.12-slim-bookworm in both Dockerfile and compose |
| 003 multi-LLM | passed | OpenAI / DeepSeek / Grok all with `*_BASE_URL` overrides |
| 004 multi-broker | passed | ccxt + ib_insync + finnhub + yfinance + akshare in requirements; MetaTrader5 conditional + Windows-only note |
| 005 MCP server | passed | quantdinger-mcp 0.1.0 with `mcp>=1.2.0`; supports 5 named agent runtimes |
| 006 default port binding | passed | postgres/redis/backend bind 127.0.0.1; frontend public-by-design |
| 008 live-trading off by default | passed | `AGENT_LIVE_TRADING_ENABLED=false`; env.example references paper-only force-pin |
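
The port-binding check behind claim 006 can be approximated by inspecting Compose short-syntax port strings. The sketch below is an assumption-laden illustration rather than the check that was actually run: it handles only IPv4 short syntax (`IP:HOST_PORT:CONTAINER_PORT`), and the service-to-port mapping is made up for the example, not read from QuantDinger's compose file.

```python
def host_ip(port_spec):
    """Return the host IP of a short-syntax port mapping, or None if unspecified."""
    spec = port_spec.split("/", 1)[0]  # drop an optional "/tcp" or "/udp" suffix
    parts = spec.split(":")
    # "IP:HOST_PORT:CONTAINER_PORT" has three parts; fewer means no explicit IP.
    return parts[0] if len(parts) == 3 else None


def publicly_bound(services):
    """Map service name -> port specs that are not pinned to 127.0.0.1."""
    exposed = {}
    for name, ports in services.items():
        open_ports = [p for p in ports if host_ip(p) != "127.0.0.1"]
        if open_ports:
            exposed[name] = open_ports
    return exposed


# Illustrative port numbers only; the real compose file may differ.
SERVICES = {
    "postgres": ["127.0.0.1:5432:5432"],
    "redis": ["127.0.0.1:6379:6379"],
    "backend": ["127.0.0.1:5000:5000"],
    "frontend": ["8888:80"],
}

print(publicly_bound(SERVICES))  # {'frontend': ['8888:80']}
```

A mapping with no explicit host IP is treated as exposed, because Docker publishes such ports on all interfaces by default. With the sample data only the frontend is flagged, matching the "public-by-design" note in the table.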

### Compound level (deferred)

| Claim | Status | Required |
|---|---|---|
| 007 MCP-agent e2e | untested | install + LLM key + Cursor/Claude Code session running a real backtest end-to-end via MCP |
| 008 live-order gating in practice | untested | flip flag in paper-broker test, verify manual approval is enforced (not just UI) |

## Real findings worth surfacing

1. **The default safety posture is real.** `AGENT_LIVE_TRADING_ENABLED=false`
   + `paper_only` force-pinned + localhost-only bindings on
   sensitive services together mean the default deploy doesn't
   auto-fire live orders or auto-leak postgres/redis. That's the
   right baseline for a platform where an LLM writes trading code.

2. **MetaTrader5 is structurally Linux-incompatible.** The Python
   package only ships Windows wheels. README mentions "MT5 forex"
   alongside crypto and stocks as a peer; the requirements file is
   honest (Windows-only comment), but a casual reader of the README
   could miss that and pick a Linux server expecting forex to work.
   This belongs in `watch_out`.

3. **OSS / SaaS / Marketplace overlap.** README links to
   ai.quantdinger.com (SaaS) and an AWS Marketplace AMI, and the OSS
   repo itself contains a billing primitive. A user asking "is this
   open source?" should check which features are gated and which are
   genuinely free to self-host.

4. **MCP integration is a separately-versioned package.** Not a
   stub or marketing phrase — `mcp_server/` has its own pyproject,
   its own version (0.1.0), its own console_scripts entry. Easier
   to audit than a "we mention MCP somewhere" claim.
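
The static half of finding 1 comes down to reading one flag from an env file. A minimal sketch of such a check follows, assuming plain `KEY=VALUE` lines. Only `AGENT_LIVE_TRADING_ENABLED` is taken from the report; the set of truthy spellings, the function names, and the sample text are assumptions, and this says nothing about whether the running backend honours the flag.

```python
TRUTHY = {"1", "true", "yes", "on"}  # assumed truthy spellings


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def live_trading_default(env):
    """Classify the flag as 'enabled', 'disabled', or 'unset'."""
    raw = env.get("AGENT_LIVE_TRADING_ENABLED")
    if raw is None:
        return "unset"
    return "enabled" if raw.lower() in TRUTHY else "disabled"


# Illustrative excerpt, not the project's actual env.example.
SAMPLE = """\
# agent safety
AGENT_LIVE_TRADING_ENABLED=false
"""

print(live_trading_default(parse_env(SAMPLE)))  # disabled
```

Returning `unset` separately from `disabled` keeps an auditor from mistaking a missing variable for a safe default.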

## Why not higher

`usable` is the right ceiling because:

- No live agent-driven scenario logged. Compound layer is exactly
  the case where static evidence cannot translate to user-facing
  trust without a real session.
- The most consequential claim — that an LLM cannot auto-fire live
  orders — is verifiable only with a live test, and is too
  important to assume from one default-off env var.

## Path to `reusable`

1. Bring up the stack on a fresh host with `docker compose up -d --build`.
2. Wire MCP into Claude Code (or Cursor) per README Step 2.
3. Ask the agent to run one backtest and capture: tool calls
   actually used, structured artefact returned, token usage.
4. With paper account: enable live trading, attempt to submit an
   order through the agent, confirm a manual-approval step
   intercedes.
5. Log under `runs/<date>/run-{compound-happy,compound-safety}/`.
6. Update claim-007 to `passed`, and record claim-008 as verified in
   practice rather than only statically, if both work as advertised;
   then re-run `verdict_calculator`.
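
Steps 3 and 5 above amount to writing one structured record per run under `runs/<date>/`. The following is a rough sketch of such a logger. The `result.json` file name and the record keys (`tool_calls`, `artefact`, `token_usage`) are hypothetical choices based on what step 3 asks to capture, not a format defined by repo-evals.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def log_run(root, run_name, record, run_date=None):
    """Write `record` to <root>/runs/<date>/<run_name>/result.json; return the path."""
    run_date = run_date or date.today().isoformat()
    run_dir = Path(root) / "runs" / run_date / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    out = run_dir / "result.json"
    out.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return out


if __name__ == "__main__":
    # Placeholder record; a real run would fill these from the agent session.
    record = {
        "tool_calls": [],
        "artefact": None,
        "token_usage": {"in": None, "out": None},
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = log_run(tmp, "run-compound-happy", record, run_date="2026-05-04")
        print(path.relative_to(tmp))
```

The same helper would be called a second time with `run-compound-safety` for the manual-approval test in step 4.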

## Recommended

```yaml
current_bucket: usable
status: evaluated
```