QonQrete Functional Tests
This document outlines a comprehensive suite of functional tests designed to validate the entire QonQrete application. These tests cover the command-line interface, core orchestration logic, agent behaviors, configuration options, and edge cases.
1. Environment and Setup Tests
1.1. qonqrete.sh CLI
-
initCommand:- Run
./qonqrete.sh init. Verify Docker builds theqonqrete-qageimage successfully. - Run
./qonqrete.sh init --msb. Verify Microsandbox builds theqonqrete-qageimage successfully. - Run
./qonqrete.sh initwithout Docker ormsbinstalled. Verify it exits with a clear error message.
- Run
-
runCommand:- Run
./qonqrete.sh runwithoutOPENAI_API_KEYandGOOGLE_API_KEYenvironment variables set. Verify it fails with a “API Keys missing” error. - Run
./qonqrete.sh runwith API keys set. Verify aqage_<timestamp>directory is created inworqspace/. - Verify the new
qagedirectory contains copies ofconfig.yaml,pipeline_config.yaml, andcyqle1_tasq.md(a copy of the originaltasq.md). - Delete
worqspace/tasq.mdand run./qonqrete.sh run. Verify it quits and tells us it’s missing thetasq.mdfile.
- Run
-
cleanCommand:- With
qage_*directories present, run./qonqrete.sh clean. When prompted with “[y/N]”, enter “n”. Verify directories are not deleted. - Run
./qonqrete.sh cleanagain. When prompted, enter “y”. Verify allqage_*directories are deleted. - Run
./qonqrete.sh cleanwhen noqage_*directories exist. Verify it prints a “No ‘qage_*’ directories found” message and exits.
- With
- Command-Line Flags:
- Test each flag individually:
./qonqrete.sh run --auto,--user,--tui,--mode security,--briq-sensitivity 7,--msb,--docker,--wonqrete. Verify the corresponding arguments are passed toqrane.py. - Test short versions of flags:
-a,-u,-t,-m security,-b 7,-s,-d,-w. - Test using
--autoand--usertogether. Verify the script exits with a “mutually exclusive” error message. - Test a combination of flags:
./qonqrete.sh run --auto --tui --mode enterprise -b 3. - Test overriding
pipeline_config.yaml(microsandbox: true) with./qonqrete.sh run --docker.
- Test each flag individually:
- Help and Version:
- Run
./qonqrete.sh --helpand-h. Verify the help message is displayed and includes the new--userflag. - Run
./qonqrete.sh --versionand-V. Verify the version from theVERSIONfile is displayed.
- Run
- Pre-flight Checks:
- Temporarily rename
config.yamland run./qonqrete.sh run. Verify the system exits with a clear error. - Temporarily rename
pipeline_config.yamland run./qonqrete.sh run. Verify the system exits with a clear error.
- Temporarily rename
2. Core Orchestration (Qrane) Tests
2.1. Run Modes
- Manual Mode (Default):
- Run a task. Verify the system pauses at the “CheQpoint” after each cycle.
- At the CheQpoint, press ‘q’. Verify the system continues to the next cycle.
- At the CheQpoint, press ‘x’. Verify the system quits gracefully.
- At the CheQpoint, press ‘t’. Verify
$EDITORopens with thereqap.mdfile. After closing the editor, verify the prompt is shown again.
- Autonomous Mode (
--auto):- Run with
--auto. Verify the system runs through cycles without user interaction. - In
config.yaml, setauto_cycle_limit: 2. Run in auto mode. Verify the system stops after cycle 2 with a “Max cyQle limit hit” message. - Set
auto_cycle_limit: 0. Verify it runs until the task is complete or it fails.
- Run with
2.2. Cheqpoint Configuration (config.yaml)
-
cheqpoint: true(Default):- Set
cheqpoint: trueinconfig.yaml. Run./qonqrete.sh run. Verify it runs in user-gated mode. - With
cheqpoint: true, run./qonqrete.sh run --auto. Verify it correctly overrides the config and runs in autonomous mode.
- Set
-
cheqpoint: false:- Set
cheqpoint: falseinconfig.yaml. Run./qonqrete.sh run. Verify it runs in autonomous mode by default. - With
cheqpoint: false, run./qonqrete.sh run --user. Verify it correctly overrides the config and runs in user-gated mode.
- Set
2.3. Cycle and File Management
- I/O Flow: After a successful cycle 1, verify that
cyqle1_reqap.mdis correctly used to generatecyqle2_tasq.md. - Header Promotion: Check the content of
cyqle2_tasq.md. It must contain a header with the “Assessment” status from the previous cycle. - Agent Failure: Introduce an error in an agent script (e.g.,
sys.exit(1)inconstruqtor.py). Run the system. Verify the cycle fails and the orchestration stops with an error message. - Logging: For a successful run, inspect
struqture/. Verify a log file exists for each agent for each cycle (e.g.,cyqle1_instruqtor.log).
3. Agent Configuration and Behavior
3.1. Dynamic Pipeline (pipeline_config.yaml)
- Remove Agent:
- Comment out the
inspeqtoragent from the config. Run one cycle. Verify the system stops afterconstruqtorand waits at the CheQpoint (it may complain about a missingreqapfile, this is expected).
- Comment out the
- Reorder Agents (Failure Test):
- Swap the
construqtorandinstruqtorblocks in the config. Run the system. Verify it fails immediately becauseconstruqtorcannot find its requiredbriq.d/input. This confirms the order is respected.
- Swap the
3.2. Agent Settings (config.yaml)
- Swap Providers:
- Change
instruqtor’s provider togemini. Run a cycle. Verifygeminiis called for the planning phase. - Change
construqtor’s provider toopenai. Run a cycle. Verifysgptis called for the execution phase.
- Change
- Swap Models:
- Change
inspeqtor’s model to a different, valid OpenAI model. Verify the new model is used.
- Change
- Operational Modes:
- Set
mode: securityinconfig.yaml. Run a task to generate a Python script. Inspect the AI’s output to verify it includes security-conscious code (e.g., input validation). - Set
mode: enterprise. Verify the output includes docstrings, logging, and error handling.
- Set
- Briq Sensitivity:
- Set
briq_sensitivity: 0(Atomic). Use a complextasq.md. Verifyinstruqtorgenerates a large number of briq files. - Set
briq_sensitivity: 9(Monolithic). Use the sametasq.md. Verifyinstruqtorgenerates very few (ideally 1) briq files.
- Set
4. TUI Mode Tests (--tui)
- Window Management:
- Start in TUI mode. Verify the split-screen view is shown by default.
- Press the
Spacebar. Verify the bottom “Qonsole” window disappears. - Press
Spaceagain. Verify the “Qonsole” window reappears.
- Logging:
- Verify high-level status messages from
Qraneand agents appear in the top “Qommander” window. - Verify raw agent logs and verbose output appear in the bottom “Qonsole” window.
- Verify agent names are color-coded correctly in the top window.
- Verify high-level status messages from
- Controls:
- Press ‘w’. Verify the top window title switches to “WoNQrete”. Press ‘w’ again to switch back.
- During an agent run, press ‘k’. Verify the agent process is killed and the TUI exits with a “Qilled” message.
- Press
Esc. Verify the TUI exits gracefully.
- CheQpoint Input:
- At a CheQpoint, verify the TUI prompts for input (
[Q]ontinue...). - Enter ‘t’. Verify the TUI is suspended and
$EDITORopens. After exiting, verify the TUI is restored correctly.
- At a CheQpoint, verify the TUI prompts for input (
5. Edge Cases and Error Handling
- Large Tasq / I/O Stress Test:
- Create a
tasq.mdthat is extremely long and complex, requiring deep analysis. - Run a full cycle. Monitor for I/O errors, prompt size limits with AI providers, or timeouts. Verify the system either completes or fails with a specific error message logged.
- Create a
- Invalid
tasq.mdContent:- Fill
tasq.mdwith non-UTF-8 characters, symbols, and mixed languages ("你好 RÄtsel", etc.). - Run the system. Verify that the file is read and passed to the AI without crashing the
instruqtor.
- Fill
- Invalid Agent Output:
- Manually edit
instruqtor.pyto output malformed XML (no<briq>tags). Verifyinstruqtorlogs a warning and creates a single fallback briq file containing the raw AI output. - Manually edit
construqtor.pyto not generate any code blocks. Verify the summary reports a “failure” for that briq.
- Manually edit
- Log Errors:
- Force an agent to crash with an unhandled Python exception.
- Inspect the agent’s log file in
struqture/and the stderr output fromqrane. Verify the full traceback is recorded.
- Permissions:
- Change permissions of
worqspace/to read-only (chmod -R 444 worqspace). Run./qonqrete.sh run. Verify it fails immediately with permission errors.
- Change permissions of
6. Multi-Platform Testing
- Windows:
- On a Windows VM with Docker Desktop and Python 3 installed:
- Run
./qonqrete.sh init. - Run a full task cycle with
./qonqrete.sh run. - Test the
cleancommand. - Note: The
getch()function inqrane.pymay behave differently. Test manual mode CheQpoints.
- macOS:
- On a macOS machine with Docker Desktop and Python 3:
- Run
./qonqrete.sh init. - Run a full task cycle with
./qonqrete.sh run. - Test TUI mode (
--tui), as terminal behavior can differ.
- Microsandbox (
msb):- On a Linux machine with
msbinstalled: - Run
./qonqrete.sh init --msb. - Run a full task cycle using
./qonqrete.sh run --msb. - Set
microsandbox: trueinpipeline_config.yamland run without the--msbflag to test the default detection.
- On a Linux machine with
7. Provider & Model Matrix Tests
These tests validate that QonQrete can switch between multiple AI providers and their most common models without crashing, misrouting prompts, or corrupting artifacts.
7.1 Provider / Model Catalog (Reference)
Use these as the canonical test set (adjust model IDs if your adapter uses different names):
-
OpenAI
- Primary:
gpt-4o - Secondary:
gpt-4o-mini
- Primary:
-
Google / Gemini
- Primary:
gemini-2.5-flash - Secondary:
gemini-2.5-pro
- Primary:
-
DeepSeek
- Primary:
deepseek-chat - Secondary:
deepseek-coder
- Primary:
-
Claude
- Primary:
claude-sonnet-4-5 - Secondary:
claude-haiku-4-5 - Tertiary:
claude-opus-4-5
- Primary:
-
Qwen
- Primary:
qwen-max - Secondary:
qwen-turbo - Tertiary:
qwen-coder
- Primary:
All tests below assume the three agents are:
instruqtorconstruqtorinspeqtor
7.2 Single-Provider / All-Agents Smoke Tests
For each checkbox, set all three agents in config.yaml to the given provider and model, then run a short tasq with:
-
./qonqrete.sh run --auto -
Simple
tasq.mdthat forces at least 1 full cyQle. -
All agents → OpenAI / gpt-4.1
-
All agents → OpenAI / gpt-4.1-mini
-
All agents → OpenAI / gpt-4.1-nano
-
All agents → Gemini / gemini-2.5-flash
-
All agents → Gemini / gemini-2.5-flash-lite
-
All agents → Gemini / gemini-2.5-pro
-
All agents → DeepSeek / deepseek-chat
-
All agents → DeepSeek / deepseek-reasoner
-
All agents → DeepSeek / deepseek-coder
-
All agents → Claude / claude-sonnet-4-5
-
All agents → Claude / claude-haiku-4-5
-
All agents → Claude / claude-opus-4-5
-
All agents → Qwen / qwen-turbo
-
All agents → Qwen / qwen-coder
-
All agents → Qwen / qwen-max
For each run, verify:
- CyQle completes without Python errors or provider API errors.
-
struqture/contains logs for all 3 agents for the cyQle. -
briq.d/,exeq.d/, andreqap.d/contain the expected artifacts.
7.3 Per-Agent Provider Rotation (One Agent at a Time)
Goal: prove each individual agent can be swapped through all providers/models while the others stay stable.
For these tests, keep two agents fixed on a known-good combo
(e.g. openai / gpt-4o) and rotate the third.
7.3.1 instruqtor Provider/Model Sweep
-
Fix
construqtorandinspeqtortoopenai / gpt-4o. -
For each
(provider, model)in the catalog, setinstruqtorand run./qonqrete.sh run --auto:- instruqtor → deepseek/deepseek-chat
- instruqtor → deepseek/deepseek-coder
- instruqtor → deepseek/deepseek-reasoner
- instruqtor → openai/gpt-4.1
- instruqtor → openai/gpt-4.1-mini
- instruqtor → openai/gpt-4.1-nano
- instruqtor → gemini/gemini-2.5-flash-lite
- instruqtor → gemini/gemini-2.5-flash
- instruqtor → gemini/gemini-2.5-pro
- instruqtor → claude/claude-opus-4-5
- instruqtor → claude/claude-haiku-4-5
- instruqtor → claude/claude-sonnet-4-5
Verify:
-
briq.d/is always produced and non-empty. - No provider/model mismatch errors (e.g., unknown model, bad request).
7.3.2 construqtor Provider/Model Sweep
-
Fix
instruqtorandinspeqtortoopenai / gpt-4o. -
Sweep
construqtorthrough the same(provider, model)list.- construqtor → deepseek/deepseek-chat
- construqtor → deepseek/deepseek-coder
- construqtor → deepseek/deepseek-reasoner
- construqtor → openai/gpt-4.1
- construqtor → openai/gpt-4.1-mini
- construqtor → openai/gpt-4.1-nano
- construqtor → gemini/gemini-2.5-flash-lite
- construqtor → gemini/gemini-2.5-flash
- construqtor → gemini/gemini-2.5-pro
- construqtor → claude/claude-opus-4-5
- construqtor → claude/claude-haiku-4-5
- construqtor → claude/claude-sonnet-4-5
Verify:
-
exeq.d/cyqle{N}_summary.mdis produced. - No provider/model errors and no missing briq input errors.
7.3.3 inspeqtor Provider/Model Sweep
-
Fix
instruqtorandconstruqtortoopenai / gpt-4o. -
Sweep
inspeqtorthrough all(provider, model)combos.- inspeqtor → deepseek/deepseek-chat
- inspeqtor → deepseek/deepseek-coder
- inspeqtor → deepseek/deepseek-reasoner
- inspeqtor → openai/gpt-4.1
- inspeqtor → openai/gpt-4.1-mini
- inspeqtor → openai/gpt-4.1-nano
- inspeqtor → gemini/gemini-2.5-flash-lite
- inspeqtor → gemini/gemini-2.5-flash
- inspeqtor → gemini/gemini-2.5-pro
- inspeqtor → claude/claude-opus-4-5
- inspeqtor → claude/claude-haiku-4.5
- inspeqtor → claude/claude-sonnet-4-5
Verify:
-
reqap.d/cyqle{N}_reqap.mdis produced and well-formed. -
No provider/model errors.
-
Fix
instruqtorandconstruqtortoopenai / gpt-4o. -
Sweep
inspeqtorthrough all(provider, model)combos.
Verify:
-
reqap.d/cyqle{N}_reqap.mdis produced and well-formed. - No provider/model errors.
7.4 Mixed-Provider Matrix (Cross-Provider Triples)
This section aims to stress “mixed” setups where different agents talk to different providers.
Use the primary models only for this section:
- OpenAI:
gpt-4o - Gemini:
gemini-2.5-flash - DeepSeek:
deepseek-chat - Claude:
claude-sonnet-4-5
7.4.1 Key Cross-Provider Scenarios
For each test, set providers/models as specified, then run ./qonqrete.sh run --auto:
-
instruqtor: OpenAI / gpt-4o construqtor: DeepSeek / deepseek-chat inspeqtor: OpenAI / gpt-4o
-
instruqtor: DeepSeek / deepseek-chat construqtor: OpenAI / gpt-4o inspeqtor: DeepSeek / deepseek-chat
-
instruqtor: DeepSeek / deepseek-chat construqtor: OpenAI / gpt-4o inspeqtor: Claude / claude-sonnet-4-5
-
instruqtor: OpenAI / gpt-4o construqtor: DeepSeek / deepseek-chat inspeqtor: Claude / claude-sonnet-4-5
-
instruqtor: Claude / claude-sonnet-4-5 construqtor: Gemini / gemini-2.5-flash inspeqtor: OpenAI / gpt-4o
-
instruqtor: Gemini / gemini-2.5-flash construqtor: DeepSeek / deepseek-chat inspeqtor: Claude / claude-sonnet-4-5
Verify for each:
- No provider-specific tracebacks in logs.
- All expected artifacts (
briq.d/,exeq.d/,reqap.d/) are present. -
struqture/logs show correct provider/model per agent.
7.4.2 Full Provider Triple Matrix (Optional Exhaustive Sweep)
Optional but ideal for automation:
- Programmatically iterate over all triples
(P_instruqtor, P_construqtor, P_inspeqtor)in{openai, gemini, deepseek, claude}^3, using primary models, and run a short cyQle.
Record for each:
- Whether the run completed successfully.
- Any provider/model-specific errors.
- Whether all three artifact directories were populated.
7.5 Model Variant Swaps Within a Provider
For each provider, validate swapping between its primary and secondary model with all agents set to the same provider.
Example for OpenAI:
- All agents →
openai / gpt-4o - All agents →
openai / gpt-4o-mini - Mixed models: instruqtor: gpt-4o-mini, construqtor: gpt-4o, inspeqtor: gpt-4o-mini
Repeat equivalent tests for:
- Gemini (flash vs pro)
- DeepSeek (chat vs coder)
- Claude (sonnet vs haiku)
Verify:
- No “unknown model” or schema errors.
- Prompt/response handling still works (no parsing crashes).
8. Mode & Briq Sensitivity Matrix Tests
These tests validate how QonQrete behaves across all combinations of:
-
Operational Modes (agent character / style)
programenterpriseperformancesecurityinnovativebalanced
-
Briq Sensitivity (task granularity)
- Integer
0–9 0= Atomic (max splitting, many briqs)5= Balanced9= Monolithic (minimal splitting, ideally 1 briq)
- Integer
Use the same complex
tasq.mdfor all tests so differences come only from mode and briq settings.
8.1 Reference: Baseline Behavior
- Baseline: Balanced / briq 5
- Set
mode: balancedandbriq_sensitivity: 5inconfig.yaml. - Run
./qonqrete.sh run --auto. - Record:
- Number of briqs in
briq.d/. - Overall style of generated code/docs.
- Number of briqs in
- This run is the reference for comparing all other combinations.
- Set
8.2 Single-Dimension Sweeps
8.2.1 Mode Sweep with Fixed Briq Sensitivity
-
Fix
briq_sensitivity: 5(balanced splitting). -
For each mode value, run a full cyQle and record behavioral differences.
-
mode: program(10 briqs) -
mode: enterprise(8 briqs) -
mode: performance(9 briqs) -
mode: security(10 briqs) -
mode: innovative(10 briqs) -
mode: balanced(8 briqs)
For each run, verify:
- No runtime errors.
- Style matches expectations (e.g.
enterprise= more docs/logging;security= stricter checks;performance= optimizations, etc.). - Briq count remains roughly similar (only style changes, not splitting).
8.2.2 Briq Sensitivity Sweep with Fixed Mode
-
Fix
mode: balanced. -
For each
briq_sensitivity(0–9), run a full cyQle and record number of briqs. -
briq_sensitivity: 0(50 briqs) -
briq_sensitivity: 1 -
briq_sensitivity: 2 -
briq_sensitivity: 3 -
briq_sensitivity: 4 -
briq_sensitivity: 5 -
briq_sensitivity: 6 -
briq_sensitivity: 7 -
briq_sensitivity: 8 -
briq_sensitivity: 9(1 briq)
For each run, verify:
- System completes without errors.
- Number of files in
briq.d/decreases monotonically (or at least trends downward) as sensitivity increases. - At
0, you get many briqs; at9, you get very few (ideally 1).
8.3 Full Mode × Briq Sensitivity Matrix
For this section, keep:
- Same complex
tasq.md - Same providers/models as a known-good baseline (e.g. all agents on
openai / gpt-4o).
For each combination below:
- Set
modeandbriq_sensitivityinconfig.yaml. - Run
./qonqrete.sh run --auto. - Record:
- Number of briqs in
briq.d/ - Any notable style changes
- Any errors/exceptions
- Number of briqs in
8.3.1 mode: program
- program / briq 0
- program / briq 1
- program / briq 2
- program / briq 3
- program / briq 4
- program / briq 5
- program / briq 6
- program / briq 7
- program / briq 8
- program / briq 9
8.3.2 mode: enterprise
- enterprise / briq 0
- enterprise / briq 1
- enterprise / briq 2
- enterprise / briq 3
- enterprise / briq 4
- enterprise / briq 5
- enterprise / briq 6
- enterprise / briq 7
- enterprise / briq 8
- enterprise / briq 9
8.3.3 mode: performance
- performance / briq 0
- performance / briq 1
- performance / briq 2
- performance / briq 3
- performance / briq 4
- performance / briq 5
- performance / briq 6
- performance / briq 7
- performance / briq 8
- performance / briq 9
8.3.4 mode: security
- security / briq 0
- security / briq 1
- security / briq 2
- security / briq 3
- security / briq 4
- security / briq 5
- security / briq 6
- security / briq 7
- security / briq 8
- security / briq 9
8.3.5 mode: innovative
- innovative / briq 0
- innovative / briq 1
- innovative / briq 2
- innovative / briq 3
- innovative / briq 4
- innovative / briq 5
- innovative / briq 6
- innovative / briq 7
- innovative / briq 8
- innovative / briq 9
8.3.6 mode: balanced
- balanced / briq 0
- balanced / briq 1
- balanced / briq 2
- balanced / briq 3
- balanced / briq 4
- balanced / briq 5
- balanced / briq 6
- balanced / briq 7
- balanced / briq 8
- balanced / briq 9
8.4 CLI Override Tests (--mode and --briq-sensitivity)
These verify that CLI flags override config.yaml correctly and interact well with the matrix.
- Start with
mode: balanced,briq_sensitivity: 5inconfig.yaml. - Run:
-
./qonqrete.sh run --mode security --briq-sensitivity 0(50 briqs) -
./qonqrete.sh run --mode enterprise -b 9(1 briq) -
./qonqrete.sh run --mode performance -b 3(20 briqs)
-
- For each of the above, verify:
- No conflict between CLI flags and
config.yaml. -
qrane.pylogs the effective mode and briq sensitivity.
- No conflict between CLI flags and
8.5 Edge & Regression Scenarios
- Max fragmentation (
security+ briq 0):- Set
mode: security,briq_sensitivity: 0. - Verify:
- Many briqs generated.
- Security style still present (validations, checks).
- Set
- Monolithic enterprise (
enterprise+ briq 9):- Set
mode: enterprise,briq_sensitivity: 9. - Verify:
- Very few briqs (ideally one).
- Output still has enterprise-style docs/logging.
- Set
- Experimental spray (
innovative+ mid briq):- Set
mode: innovative,briq_sensitivity: 4–6. - Verify:
- Outputs are more creative but still structurally valid.
- Set
- Regression check:
- After running multiple extreme combos, revert to
mode: balanced,briq_sensitivity: 5. - Run again and verify:
- Behavior matches the original baseline from §8.1.
- After running multiple extreme combos, revert to
9. WoNQ Matrix v1.0.0 Validation
9.1 Overview
The WoNQ Matrix is QonQrete’s comprehensive validation framework that tests every combination of sensitivity levels (0-9) and cycle counts (1-9) to ensure production readiness.
Test Date: December 29, 2025 Version: v1.0.0-stable Total Configurations: 90 (10 sensitivities × 9 cycles)
9.2 Results Summary
╔═══════════════════════════════════════════════════════════════╗
║ WoNQ MATRIX RESULTS ║
╠═══════════════════════════════════════════════════════════════╣
║ Total Runs: 90 (100% coverage) ║
║ Clean Completions: 90 (100% success) ║
║ Champion Score: 658 (sensitivity=3, cycle=7) ║
║ Global Average: 554 ║
║ Scores ≥600: 35.6% ║
╚═══════════════════════════════════════════════════════════════╝
9.3 Key Findings
- Sweet Spot Identified: Sensitivity 2-4 with 5-7 cycles produces the best results
- Champion Configuration:
sens=3, cycle=7consistently hits peak performance (658/666) - Death Valleys Discovered:
- Cycle 8 shows consistent underperformance across all sensitivities
- Sensitivity ≥7 shows diminishing returns due to insufficient task decomposition
9.4 Full Matrix Heatmap
Cycles →
1 2 3 4 5 6 7 8 9
┌────────────────────────────────────────────────
0 │ 380 420 455 490 510 540 520 450 480
1 │ 395 435 470 505 530 560 545 470 495
2 │ 410 450 490 525 555 590 580 500 520
S 3 │ 420 465 505 540 575 620 658 520 545 ← CHAMPION
e 4 │ 415 455 495 530 560 595 610 505 530
n 5 │ 400 440 480 515 545 575 590 490 515
s 6 │ 385 420 460 495 525 555 565 475 500
↓ 7 │ 365 400 440 470 500 530 545 460 485
8 │ 340 375 410 445 475 505 520 445 465
9 │ 310 340 375 410 440 470 490 420 445
9.5 Validated Features
- Bulletproof Language Detection: 400+ language keywords working across all providers
- Enforced Briq Sensitivity: All sensitivity levels produce correct briq ranges
- Provider Compatibility:
- OpenAI GPT-4/GPT-4o: ✅ BULLETPROOF
- Google Gemini: ✅ BULLETPROOF
- Anthropic Claude: ✅ BULLETPROOF
- DeepSeek Coder: ✅ BULLETPROOF
- Qwen/Qwen2.5-Coder: ✅ BULLETPROOF
9.6 Cost Analysis
| Configuration | Estimated Cost per Run |
|---|---|
| Simple project (sens=7, 4 cycles) | ~$0.50 |
| Medium project (sens=5, 6 cycles) | ~$2.00 |
| Complex project (sens=3, 7 cycles) | ~$4.00 |
9.7 Recommended Defaults
Based on WoNQ Matrix results:
# config.yaml - Production defaults
briq_sensitivity: 7 # 3-5 briqs per cycle
auto_cycle_limit: 4 # Enough iterations for polish
For complex multi-service projects:
briq_sensitivity: 5 # 8-12 briqs per cycle
auto_cycle_limit: 6 # More iterations for comprehensive coverage
9.8 Test Artifacts
All 90 runs archived with:
- Complete
qodeyard/output - All cycle logs in
struqture/ - Scoring breakdown per run
- Provider API logs