Paul Luckey, Product Architect
These are not concepts — every demo runs live AI inference in the browser. Try them!

Portfolio

Working prototypes of structured thinking tools I've designed and built.

Built by Paul Luckey, Product Architect

MCQ Generator

Goodhart-aware quality detection

Try the demo

Three bias analysts and a validity grader measure both the surface signals (position, length, vocabulary overlap) and the failure mode those signals miss: every metric reads healthy, yet every option is in fact defensibly correct.
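As a rough illustration of what the three surface signals could look like in code, here is a minimal Python sketch. It is not the demo's implementation: the function names (`surface_signals`, `summarize`), the item shape `(stem, options, key_index)`, and the definitions themselves are assumptions. Length bias is taken to mean "the keyed answer is the single longest option", and vocabulary overlap to mean "the keyed answer shares the largest fraction of its words with the stem".

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens as a set."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def is_unique_max(values, index):
    """True when values[index] is strictly larger than every other value."""
    top = max(values)
    return values[index] == top and values.count(top) == 1


def surface_signals(stem, options, key_index):
    """Hypothetical per-item signals: key position, length, stem overlap."""
    lengths = [len(option.split()) for option in options]
    stem_tokens = tokens(stem)
    overlaps = []
    for option in options:
        option_tokens = tokens(option)
        shared = len(option_tokens & stem_tokens)
        overlaps.append(shared / len(option_tokens) if option_tokens else 0.0)
    return {
        "position": key_index,
        "key_is_longest": is_unique_max(lengths, key_index),
        "key_has_top_overlap": is_unique_max(overlaps, key_index),
    }


def summarize(items):
    """Aggregate the per-item signals into rates over a batch of items."""
    signals = [surface_signals(stem, options, key) for stem, options, key in items]
    n = len(signals)
    positions = Counter(s["position"] for s in signals)
    return {
        "position_share": {pos: count / n for pos, count in positions.items()},
        "length_bias": sum(s["key_is_longest"] for s in signals) / n,
        "overlap_bias": sum(s["key_has_top_overlap"] for s in signals) / n,
    }
```

Under these assumed definitions, a batch in which the key is always the longest option scores a length bias of 1.0. None of the three rates says anything about whether exactly one option is defensible, which is why a separate validity check is needed.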

Sample output
production-rewrite · claude-opus-4-7

In one hour, Ana can either write 10 marketing emails or design 2 web pages. In the same hour, her assistant Ben can either write 4 marketing emails or design 1 web page. Ana's startup needs both done. How should they divide the work to maximize total output, and why?

  A. Ana should focus on emails and Ben on web pages, because Ana writes more emails per hour (10 vs. 4); that productivity gap is larger than her web-page gap (2 vs. 1), so emails are where she's most clearly ahead.
  B. Ben should focus on emails and Ana on web pages, because Ana gives up 5 emails per web page while Ben gives up only 4, so assigning the cheaper email-producer to emails minimizes forgone design work.
  C. Ana should do both tasks herself, since she produces more emails per hour and more web pages per hour than Ben; bringing Ben in on either task reduces total output.
  D. Ben should focus on web pages and Ana on emails, because Ben can only produce 1 web page per hour while Ana can produce 2, so assigning the slower worker to the slower-output task balances the workload.

Generated 2026-05-20. Bias passes (3 / 3 analysts). Validity passes — single-correct grader confirms one defensibly correct option.
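The arithmetic in the sample question can be checked with a short Python sketch. It uses only the hourly rates stated in the stem; the names `output_per_hour`, `opportunity_cost`, and `lowest_cost_producer` are hypothetical and not part of the generator.

```python
from fractions import Fraction

# Hourly rates taken from the question stem.
output_per_hour = {
    "Ana": {"emails": 10, "web_pages": 2},
    "Ben": {"emails": 4, "web_pages": 1},
}


def opportunity_cost(worker, task, other):
    """Units of `other` forgone to produce one unit of `task`."""
    rates = output_per_hour[worker]
    return Fraction(rates[other], rates[task])


def lowest_cost_producer(task, other):
    """Worker who gives up the least of `other` per unit of `task`."""
    return min(output_per_hour, key=lambda w: opportunity_cost(w, task, other))


for worker in output_per_hour:
    cost = opportunity_cost(worker, "web_pages", "emails")
    print(f"{worker} gives up {cost} emails per web page")

print("Web pages:", lowest_cost_producer("web_pages", "emails"))
print("Emails:", lowest_cost_producer("emails", "web_pages"))
```

Ana gives up 5 emails per web page and Ben gives up 4, so Ben is the lower-cost web-page producer and Ana the lower-cost email producer.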

Length bias: 88% → 13%. All-false BAS: 88% → 20%. The simplest sufficient intervention beat every more-complex approach across 24 phases of iteration (Sonnet 4, March 2026).

Claude Code Skills

More experiments →

About the Architect

Paul Luckey

Product Architect

Austin, Texas

Product architect with a background in psychology and over a decade of enterprise systems engineering. I design software tools that help people think more clearly — structured reasoning, knowledge retrieval, real-time analysis. The work above is representative.