Feedback QA Procedure
End-to-end validation of the training feedback loop across CLI, API, and web dashboard.
Prerequisites
- Start infrastructure and services:
./run infra # Redis + Postgres
./run dev:api # NestJS API on :3780
./run dev:web # React dashboard on :3781
- Clean state:
rm -f ~/.kthulu/feedback.jsonl
1. CLI Smoke Tests
1.1 Approve a session
./run train feedback approve test-session-001 --reason "good tool use"
Verify: ~/.kthulu/feedback.jsonl exists with 1 line, feedbackType=approve
1.2 Reject a session
./run train feedback reject test-session-002 --reason "hallucinated code"
Verify: feedback.jsonl has 2 lines, second has feedbackType=reject
1.3 Annotate with dimensions
./run train feedback annotate test-session-003 \
--tool-efficiency 0.9 --code-correctness 0.3 \
--reasoning-quality 0.7 --task-completion 0.8
Verify: 3 lines, third has 4 dimension scores
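The line-count and feedbackType checks in 1.1 through 1.3 can be scripted. This is a minimal sketch, assuming each record is written as one JSON line containing a "feedbackType" key with a string value (the field name used in the API payloads in section 2). check_feedback is a hypothetical helper for this procedure, not a subcommand of ./run.

```shell
# check_feedback FILE EXPECTED_LINES EXPECTED_TYPE
# Hypothetical helper: succeeds when FILE has exactly EXPECTED_LINES lines
# and its last line carries "feedbackType": "EXPECTED_TYPE".
check_feedback() {
  file=$1 expected_lines=$2 expected_type=$3
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  # wc -l counts newline characters; tr strips the padding some wc builds add.
  actual_lines=$(wc -l < "$file" | tr -d ' ')
  [ "$actual_lines" -eq "$expected_lines" ] || {
    echo "expected $expected_lines lines, found $actual_lines" >&2
    return 1
  }
  # Optional whitespace around the colon is tolerated (assumption about the writer).
  tail -n 1 "$file" |
    grep -Eq "\"feedbackType\"[[:space:]]*:[[:space:]]*\"$expected_type\"" || {
      echo "last line is not feedbackType=$expected_type" >&2
      return 1
    }
}
```

Usage: run check_feedback ~/.kthulu/feedback.jsonl 1 approve after step 1.1, then 2 reject after step 1.2, then 3 annotate after step 1.3. The helper does not inspect the dimension scores, so the "4 dimension scores" check in 1.3 still needs a manual look at the third line.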
1.4 Summary
./run train feedback summary
Verify: Total=3, Approved=1, Rejected=1, Annotated=1
1.5 Export
./run train feedback export -o /tmp/feedback_export.jsonl
Verify: /tmp/feedback_export.jsonl has 3 JSON records
2. API Smoke Tests
2.1 Submit approve via API
curl -X POST http://localhost:3780/feedback \
-H 'Content-Type: application/json' \
-d '{"sessionId":"api-session-001","feedbackType":"approve","reason":"via API"}'
Verify: 201 response, record returned with source=api
2.2 Submit annotate via API
curl -X POST http://localhost:3780/feedback \
-H 'Content-Type: application/json' \
-d '{"sessionId":"api-session-002","feedbackType":"annotate","dimensions":{"tool_efficiency":0.8}}'
Verify: 201 response, dimensions present
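The 201 checks in 2.1 and 2.2 can be made explicit by asking curl for the status code alone. The sketch below uses curl's -w '%{http_code}' write-out variable. post_feedback_status and expect_status are hypothetical helper names, and the URL and payload are the ones from step 2.1.

```shell
# post_feedback_status URL JSON_BODY
# Prints only the HTTP status code of a POST; the response body is discarded.
post_feedback_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -X POST "$1" \
    -H 'Content-Type: application/json' \
    -d "$2"
}

# expect_status EXPECTED ACTUAL
# Hypothetical helper: succeeds only when the two status codes match.
expect_status() {
  [ "$1" = "$2" ] || { echo "expected HTTP $1, got $2" >&2; return 1; }
}

# Example (use in place of the plain curl in 2.1, not in addition to it,
# because every POST stores another record):
#   status=$(post_feedback_status http://localhost:3780/feedback \
#     '{"sessionId":"api-session-001","feedbackType":"approve","reason":"via API"}')
#   expect_status 201 "$status"
```

Because the body is discarded, checking that the returned record has source=api or that dimensions are present still requires a separate request or the original curl output.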
2.3 Summary endpoint
curl http://localhost:3780/feedback/summary
Verify: total includes both CLI and API records
2.4 Session lookup
curl http://localhost:3780/feedback/test-session-001
Verify: returns array with the CLI-submitted record
3. Pipeline Integration Test
3.1 Create sample training data
mkdir -p /tmp/kthulu-qa/training
cat > /tmp/kthulu-qa/training/sessions.jsonl << 'EOF'
{"id":"test-session-001","messages":[{"role":"user","content":"fix the bug"},{"role":"assistant","content":"done"}],"metadata":{"source":"test","projectPath":"/tmp","timestamp":"2026-01-01T00:00:00Z","quality":0.6}}
{"id":"test-session-002","messages":[{"role":"user","content":"add feature"},{"role":"assistant","content":"added"}],"metadata":{"source":"test","projectPath":"/tmp","timestamp":"2026-01-02T00:00:00Z","quality":0.7}}
{"id":"test-session-003","messages":[{"role":"user","content":"refactor"},{"role":"assistant","content":"refactored"}],"metadata":{"source":"test","projectPath":"/tmp","timestamp":"2026-01-03T00:00:00Z","quality":0.5}}
EOF
3.2 Export with feedback applied
./run train export \
--threshold 0.0 \
--output /tmp/kthulu-qa/export \
--feedback ~/.kthulu/feedback.jsonl
Verify:
- train.jsonl + eval.jsonl exclude test-session-002 (rejected)
- test-session-001 quality = 1.0 (approved)
- test-session-003 quality = average of annotation scores
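Two of these checks lend themselves to a quick script. The sketch below makes two assumptions that the procedure does not confirm: exported records keep the top-level "id" field from sessions.jsonl, and "average" means the unweighted mean of the four scores from step 1.3. session_absent and mean_scores are hypothetical helpers.

```shell
# session_absent SESSION_ID FILE...
# Succeeds when none of the files contains a record with "id": "SESSION_ID".
session_absent() {
  id=$1
  shift
  for f in "$@"; do
    # A missing export file is a failure, not a silent pass.
    [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
    if grep -Eq "\"id\"[[:space:]]*:[[:space:]]*\"$id\"" "$f"; then
      echo "$id still present in $f" >&2
      return 1
    fi
  done
}

# mean_scores SCORE...
# Prints the unweighted mean of its arguments with three decimals.
mean_scores() {
  printf '%s\n' "$@" |
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.3f\n", sum / n }'
}

# Scores from step 1.3: (0.9 + 0.3 + 0.7 + 0.8) / 4
mean_scores 0.9 0.3 0.7 0.8   # prints 0.675
```

Usage: session_absent test-session-002 /tmp/kthulu-qa/export/train.jsonl /tmp/kthulu-qa/export/eval.jsonl. If the exporter weights dimensions differently, the expected quality for test-session-003 will not be 0.675.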
4. Web Dashboard Terminal Verification
- Navigate to http://localhost:3781/feedback
- Verify terminal renders with cyberpunk theme
- Type help → verify command list appears
- Type approve qa-test-001 --reason "from browser" → verify success message
- Type summary → verify total count incremented
- Screenshot final state
5. Cross-Channel Verification
- Record created via web terminal visible in ./run train feedback summary
- Record created via CLI visible in GET /feedback/summary
- All sources (cli, api, web) appear in summary.bySource
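The last check can be approximated from the command line. This sketch assumes that bySource is a JSON object keyed by source name, for example {"bySource":{"cli":3,"api":2,"web":1}}. That shape is a guess, since the procedure only names the field. has_all_sources is a hypothetical helper and does a plain text match, so it does not prove the keys sit inside bySource.

```shell
# has_all_sources SUMMARY_JSON
# Hypothetical helper: succeeds when cli, api and web all occur as JSON keys
# somewhere in the given summary text.
has_all_sources() {
  for src in cli api web; do
    printf '%s\n' "$1" | grep -Eq "\"$src\"[[:space:]]*:" || {
      echo "source missing from summary: $src" >&2
      return 1
    }
  done
}

# Example against the running API:
#   has_all_sources "$(curl -s http://localhost:3780/feedback/summary)"
```

If the real response lists sources differently (for instance as an array of objects), adjust the pattern or inspect the response by hand.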
6. Cleanup
rm -f ~/.kthulu/feedback.jsonl
rm -rf /tmp/kthulu-qa /tmp/feedback_export.jsonl