Every merge to modelai-main is gated by adversarial review, CI tests, and benchmark verification. This page documents the process.
Every implementation slice goes through a 13-section hostile review before merge. The reviewer's job is to try to break the code, not confirm it looks reasonable.
| # | Section | What It Checks |
|---|---|---|
| 0 | Scope Gate | Changes stay within declared scope |
| 1 | Spec Match | Code matches plan exactly (no contract drift) |
| 2 | Contract Boundary | Caller/callee assumptions match |
| 3 | Concrete Traces | 6 mandatory traces with real values |
| 4 | Multi-Variant Model | Tested across architecture variants |
| 5 | State Machine | Before/after/failure/rollback states traced |
| 6 | Precondition Audit | Every assumption enforced or documented |
| 7 | Test Reality Check | Tests actually prove correctness (not just exist) |
| 8 | Disprove-It Pass | Systematic attempt to find the one bug |
| 9 | Skepticism Hierarchy | Most skeptical of indexing, layouts, integer division |
| 10 | Dependency Check | License, CVE, version pinning |
| 11 | Performance Check | No regression on hot paths |
| 12 | Cross-Repo Contract | API contracts match across repositories |
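The Scope Gate (Section 0) is mechanical enough to sketch in code. This is a hypothetical illustration, not the actual tooling: the path prefixes and file names are made up, and the real gate may work differently.

```python
# Hypothetical Section 0 scope gate: verify every changed file falls
# under one of the path prefixes declared in the slice's scope.
# Paths below are illustrative, not the real repo layout.
from pathlib import PurePosixPath

def out_of_scope(changed_files, declared_scope):
    """Return changed files not covered by any declared scope prefix."""
    scope = [PurePosixPath(p) for p in declared_scope]
    return [
        f for f in changed_files
        if not any(PurePosixPath(f).is_relative_to(p) for p in scope)
    ]

violations = out_of_scope(
    ["src/kv_cache.cpp", "docs/README.md"],  # changed in the PR
    ["src/"],                                # declared scope
)
print(violations)  # ['docs/README.md'] -> scope gate fails
```

Any non-empty result means the slice touched files outside its declared scope and fails Section 0 before deeper review begins.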
Section 3 (Concrete Traces) requires six mandatory traces, each walked through with real substituted values:

| Trace | Purpose |
|---|---|
| Production | Dominant real-world case with real numbers |
| Boundary | Smallest/closest-to-threshold valid input |
| Adversarial | Hostile input designed to break the code |
| Integer Arithmetic | Every division/modulo/stride with substituted values |
| Security | Injection, SSRF, auth bypass, secrets exposure |
| Concurrency | Race conditions, deadlocks, orphaned work |
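The Integer Arithmetic trace means writing out every division, modulo, and stride with concrete numbers rather than trusting the symbols. A minimal illustration (the sizes are invented, not taken from modelai):

```python
# Illustrative integer-arithmetic trace: substitute real values into
# every division/modulo, the way a Section 3 trace requires.
n_tokens = 1000   # hypothetical input size
block = 256       # hypothetical block size

n_full   = n_tokens // block      # 1000 // 256 = 3 full blocks
tail     = n_tokens % block       # 1000 % 256  = 232 leftover tokens
n_blocks = -(-n_tokens // block)  # ceil division: 4 blocks total

assert n_full * block + tail == n_tokens  # the partition is exact
print(n_full, tail, n_blocks)             # 3 232 4
```

Tracing with substituted values like this is what catches off-by-one and floor-vs-ceil bugs that read correctly in symbolic form.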
Beyond review, every merge must pass an eight-tier test suite:

| Tier | Name | What It Tests |
|---|---|---|
| 1 | Server Pytests | 22 existing server unit tests, CI-gated |
| 2 | API Snapshots | Response schema validation against frozen snapshots |
| 3 | Perf Regression | Speed, RSS, compaction latency vs baseline |
| 4 | Quality Gate | Cosine floor, KV survival, state round-trip |
| 5 | Windows CI | MSVC x64 build + full test suite |
| 6 | Stress Tests | KV exhaustion, multi-slot concurrent compaction |
| 7 | Contract Tests | ModelAI-specific endpoints, metrics, schema |
| 8 | Live Dashboard | Automated data pipeline to public benchmark page |
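The Tier 4 quality gate includes a cosine floor. As a hedged sketch (the threshold and vectors here are illustrative assumptions, not the real gate values), the check reduces to comparing a similarity score against a fixed floor:

```python
# Sketch of a Tier 4-style cosine-floor check; COSINE_FLOOR and the
# example vectors are made up for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

COSINE_FLOOR = 0.99  # hypothetical quality threshold
baseline  = [0.12, 0.80, 0.55]  # e.g. logits before the change
candidate = [0.11, 0.81, 0.56]  # e.g. logits after the change

sim = cosine(baseline, candidate)
assert sim >= COSINE_FLOOR, f"quality gate failed: {sim:.4f} < {COSINE_FLOOR}"
```

A floor on cosine similarity catches output drift that raw pass/fail tests miss, since a change can compile and run while quietly degrading model quality.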
Seven automated workflows drive the pipeline:

| Workflow | Trigger | Purpose |
|---|---|---|
| modelai-ci | Push, PR | Build + 49 main-label tests |
| modelai-server-smoke | Push, PR | Server smoke test + pytests |
| modelai-perf-smoke | Push, PR | Performance regression detection |
| modelai-ci-windows | Push, PR | Windows MSVC build + test |
| modelai-upstream-sync | Saturday 2PM PDT | Weekly upstream merge + build + test |
| modelai-dashboard | After CI success | Aggregate bench data |
| modelai-auto-label | Issues, PRs | Auto-label by path/keyword |
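The modelai-auto-label workflow assigns labels from changed paths and title keywords. A minimal sketch of that idea, assuming made-up label names and rules (the real workflow's configuration is not shown here):

```python
# Hypothetical path/keyword auto-labeler; rules are illustrative.
PATH_RULES = {
    "src/server/": "server",
    ".github/workflows/": "ci",
    "docs/": "documentation",
}
KEYWORD_RULES = {"compaction": "kv-cache", "benchmark": "performance"}

def labels_for(changed_paths, title):
    labels = set()
    for path in changed_paths:
        for prefix, label in PATH_RULES.items():
            if path.startswith(prefix):
                labels.add(label)
    lowered = title.lower()
    for keyword, label in KEYWORD_RULES.items():
        if keyword in lowered:
            labels.add(label)
    return sorted(labels)

print(labels_for(["src/server/http.cpp"], "Fix compaction stall"))
# ['kv-cache', 'server']
```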
Every Saturday at 2PM PDT, the automated sync workflow runs. The process is designed so that no upstream change reaches modelai-main without passing build + test + human review.
The sync flows from upstream/master into an upstream-master tracking branch, then through an upstream-sync branch into modelai-main. When the upstream delta touches compaction-related files (KV cache, graph construction, attention paths), the merge is held for manual reviewer sign-off before integration.
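The hold-for-review decision can be sketched as a simple path filter. This is a hypothetical illustration: the sensitive path fragments below are assumptions, not the actual file list the workflow checks.

```python
# Hypothetical hold-for-review gate: flag an upstream delta for manual
# review when it touches compaction-related paths.
SENSITIVE_FRAGMENTS = ("kv_cache", "graph", "attn")  # illustrative

def needs_manual_review(changed_files):
    return any(
        frag in f for f in changed_files for frag in SENSITIVE_FRAGMENTS
    )

print(needs_manual_review(["src/llama-kv_cache.cpp"]))  # True
print(needs_manual_review(["docs/README.md"]))          # False
```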
6 upstream KV cache changes have been audited and verified compatible: #10873, #12695, #13194, #17450, #12253, #11213.
The tracking branch mirrors upstream/master on every sync. The sync branch is cut from modelai-main, merged with upstream, and CI-gated before integration. Security patches (CVEs, RPC vulnerabilities) bypass the weekly schedule: they get a same-day sync via manual workflow_dispatch, followed by expedited review and merge. The RPC RCE patch was synced within hours of upstream disclosure.
Full bug list: BUGS-AND-FIXES.md