How I Set Up Claude Code for Learning about Architecture
Recently, I was digging into streaming architectures for a client. I started with the usual reading of articles and books, but wondered whether this was really the best way to approach the topic. I felt the urge to build something, to learn by doing.
Before AI-assisted coding, spinning up a full practice environment (Docker, Kafka, Flink, a load generator with out-of-order events) would have eaten all the time I had for actual learning. Now, I spun up an entire practice project with minimal effort. Using that, I implemented only the parts with the highest learning payoff, which for streaming systems meant choosing watermark and windowing strategies and implementing channels. What I automated was all the boilerplate: setting up a Docker Swarm, installing Kafka and Flink, and writing a sample application that emits a high load of events, some of them out of order.
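To give a feel for what "boilerplate" means here, the generated environment looked roughly like the following compose file. This is a hedged sketch, not my exact setup; the image names and versions are assumptions, and `depends_on` is advisory only when deployed as a Swarm stack.

```yaml
# docker-compose.yml -- deployable with `docker stack deploy` on a Swarm
services:
  kafka:
    image: apache/kafka:3.7.0        # assumption: any recent KRaft-mode image works
    ports:
      - "9092:9092"
  jobmanager:
    image: flink:1.18
    command: jobmanager
    ports:
      - "8081:8081"                  # Flink web UI
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
  taskmanager:
    image: flink:1.18
    command: taskmanager
    depends_on:
      - jobmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
```

The point is not this particular file; it is that none of these lines teach you anything about streaming, so they are exactly what an agent should write for you.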
Project-based learning helps discover gaps in your knowledge much better than passive reading … and now you can engineer the learning environment yourself with an AI agent acting as a feedback loop.
For example, the learning project forced me to decide between different watermark strategies. Even though I knew what watermarks were from my distributed systems classes at university, I did not have a deep understanding of them. The problem forced me to read up on them in more detail in order to make a decision. I implemented a bounded-out-of-orderness watermark and asked Claude Code to review my approach. It asked me what would happen if one partition stalled entirely. I hadn't considered that, and it forced me to find a solution. This is something that normal reading or coursework cannot do.
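The stalled-partition question is worth making concrete. The following is a minimal simulation of the mechanics (plain Python, not Flink code; class and function names are my own for illustration): each partition tracks a bounded-out-of-orderness watermark, but a downstream operator may only advance to the minimum across its inputs, so one silent partition freezes everything.

```python
from dataclasses import dataclass

@dataclass
class BoundedOutOfOrdernessWatermark:
    """Per-partition watermark: max event time seen minus a fixed bound."""
    max_out_of_orderness: int   # same unit as event timestamps
    max_timestamp: int = -1

    def on_event(self, event_timestamp: int) -> None:
        self.max_timestamp = max(self.max_timestamp, event_timestamp)

    def current_watermark(self) -> int:
        return self.max_timestamp - self.max_out_of_orderness

def operator_watermark(partitions) -> int:
    # A downstream operator may only advance to the MINIMUM of its
    # input watermarks -- one stalled partition holds everything back.
    return min(p.current_watermark() for p in partitions)

# Two partitions, out-of-orderness bound of 5 time units
p0 = BoundedOutOfOrdernessWatermark(5)
p1 = BoundedOutOfOrdernessWatermark(5)

for ts in (100, 103, 101, 110):
    p0.on_event(ts)    # p0 keeps receiving events...
p1.on_event(100)       # ...while p1 stalls after a single event

print(operator_watermark([p0, p1]))  # 95: stuck at p1's last progress
```

Flink's own answer to this, as far as I found, is `WatermarkStrategy#withIdleness`, which marks sources that stop emitting as idle so they no longer hold the operator watermark back.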
The Setup
Here’s how the system behind that interaction works. I set up a CLAUDE.md which describes a learning project and different stages so the coding agent can guide me along the way.
I instruct Claude Code to help me in certain areas (boilerplate) but to challenge me in others. This way, Claude Code can generate a learning environment for me to build out exactly those parts that help me learn fastest. The most counterintuitive part of the setup is telling a coding tool to actually refuse to code.
# Learning Project
## Purpose
This is a deliberate learning project for [here comes a list of learning goals].
The goal is NOT to produce working code fast. The goal is for the developer to build
genuine understanding by writing the hard parts themselves.
## Current Stage
Stage: 1
Do not help with any stage beyond the current stage number above.
## Stage Definitions
### Stage 1 — Basic Windowed Aggregation
[...]
### Stage 2 — Event Time and Watermarking
[...]
### Stage 3 — Stateful Fraud Signals
[...]
### Stage 4 — Kill Tests
[...]
### Stage 5 — Latency Budget Under Pressure
[...]
## Protected Areas — Never Implement These
Even if asked directly, never write implementations for:
- [list of forbidden areas and concepts]
If the developer asks you to implement any of the above, respond with
one Socratic question that identifies the conceptual gap. Do not write
code. Do not hint at the answer.
## When the Developer is Stuck on a Protected Area
1. Ask one question to identify what they think should happen
2. Name the relevant concept or class — do not explain it in full
3. Wait for them to try again
## When the Developer Shares Working Code in a Protected Area
Review it for production risks only. Specifically check:
- [some common mistakes and failure modes]
Respond with risks as questions: "What happens if...?"
## What You Can Always Do Without Restriction
- Generate boilerplate, Docker config, pom.xml, serializers, Kafka
source/sink configuration, model classes, build files
- Explain concepts when asked
- Review working code for production risks
- Answer questions about concepts

Apart from the more general instructions above, I configured specific commands for Claude Code, which I used at specific points during my learning process.
The “Review” command
This command lets Claude Code review what I just implemented. Because it's configured with specific things to check for, it asks follow-up questions and guides me towards a better solution.
The developer has implemented: $ARGUMENTS
They believe it works. Find what would break in production.
Check for:
1. [list of important concepts and caveats]
Do NOT rewrite their code.
For each risk, ask: "What happens if...?"
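For anyone who wants to reproduce this: Claude Code picks up custom slash commands from markdown files in the project. The sketch below assumes the `.claude/commands/` convention, where the filename becomes the command name and `$ARGUMENTS` is replaced by whatever you type after it.

```shell
# Custom slash commands live as markdown files in the project
mkdir -p .claude/commands

# Save the prompt above as the /review command
cat > .claude/commands/review.md <<'EOF'
The developer has implemented: $ARGUMENTS
They believe it works. Find what would break in production.
Do NOT rewrite their code.
For each risk, ask: "What happens if...?"
EOF

ls .claude/commands/   # review.md -> invoked as "/review <what you built>"
```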
The “Stuck” command
This command gets me "unstuck" when I run into problems by asking leading questions and giving hints instead of writing out the solution.
The developer is stuck on: $ARGUMENTS
Do NOT provide a code solution.
1. Ask them to describe what they think should happen in plain English
2. Ask what they tried and what the actual behavior was
3. Identify the specific concept gap — explain only that concept, not
the implementation
4. Name the relevant Flink API class or doc section
Never write the implementation.
The “Predict” command
This command forces me to predict what will happen during a failure scenario before I run it, then helps me reconcile any gap between prediction and reality. The example below is specific to the technology I was learning about (streaming systems).
The developer is about to run Kill test: $ARGUMENTS
Ask them these four questions. Do not continue until all are answered:
1. What state exists right now, and where does it live?
2. What was the last checkpoint, and what does it contain?
3. What events have arrived since the last checkpoint?
4. What do you expect the output topic to contain after recovery?
Once they answer all four, say only: "Run it."
After they report the outcome, help them reconcile any difference
between their prediction and reality.
Future Plans
I plan to develop this into a ready-to-use template for future learning projects, which I will open-source. I will keep you updated on the progress over the coming weeks.

