📺 Watch Ralph in Action: Ralph implementing its own features
🔗 Links: GitHub Repository
Shoutout to Matt Pocock whose viral tweet about "Ralph Wiggum" sparked this whole thing.
Why, You Ask?
Because AI coding assistants have a dirty little secret: they forget.
You start strong. "Refactor this auth module." The AI gets it. Code looks good. You're feeling productive.
Then 15 minutes later... it forgets what you discussed. It starts solving problems you never asked about. It rewrites code you specifically said to leave alone.
Sound familiar?
The Problem
Most AI coding tools are built for short bursts. Quick questions, small fixes, code snippets. Try to run them for hours on a bigger task? Good luck. Context degrades. Focus drifts. You spend more time course-correcting than coding.
I always wanted something different. An AI that could work autonomously on well-defined tasks. Start it before bed, wake up to working code. The missing piece? An agent that you can programmatically control.
Now, with the GitHub Copilot SDK and the Ralph loop, this has become too easy not to do.
So I built yet another Ralph.
What Is Ralph?
Ralph is a .NET CLI tool that treats AI as an autonomous agent, not a chat buddy. The key insight: structure breeds autonomy.
Here's how it works:
1. You describe what you want. Ralph asks clarifying questions, then generates a structured PRD with prioritized user stories.
2. Ralph executes autonomously. It picks up one story at a time, implements it, runs verification scripts to confirm it works, commits, and moves on.
3. Repeat until done. You come back to find working, tested code.
The Secret Sauce: The PRD
At the heart of Ralph is a simple JSON file: the Product Requirements Document (PRD). It lists the user stories, each with its acceptance criteria and a passes flag.
Each story is small enough to complete in one go. Ralph picks the next one where passes: false, implements it, runs verification, and marks it complete. If the build fails, it sees the errors and fixes them. If tests fail, same thing.
The magic is in the ralph_verify() calls. These run actual C# verification scripts - compiled at runtime with Roslyn - that confirm the work is correct before moving on.
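To make the "pick the next story where passes: false" rule concrete, here's a minimal sketch. Only the `passes` flag and the `ralph_verify(...)` criteria syntax come from this post; the other field names (`id`, `title`, `priority`, `acceptanceCriteria`), the `Story` and `Prd` records, and the `PrdQueries` helper are assumptions for illustration, not Ralph's actual schema or code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical PRD shape: field names other than "passes" are assumptions.
const string prdJson = """
{
  "stories": [
    { "id": "US-001", "title": "Create User entity and migration", "priority": 1, "passes": true,
      "acceptanceCriteria": ["ralph_verify(Build) passes"] },
    { "id": "US-002", "title": "Implement password hashing service", "priority": 2, "passes": false,
      "acceptanceCriteria": ["ralph_verify(Build) passes", "ralph_verify(Test) passes"] },
    { "id": "US-003", "title": "Create login endpoint", "priority": 3, "passes": false,
      "acceptanceCriteria": ["ralph_verify(Build) passes", "ralph_verify(Test) passes"] }
  ]
}
""";

// Web defaults give camelCase names and case-insensitive matching.
var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
var prd = JsonSerializer.Deserialize<Prd>(prdJson, options)
          ?? throw new InvalidOperationException("PRD is empty");

var next = PrdQueries.NextStory(prd);
Console.WriteLine(next is null ? "All stories pass" : $"Next: {next.Id} {next.Title}");

record Story(string Id, string Title, int Priority, bool Passes, List<string> AcceptanceCriteria);
record Prd(List<Story> Stories);

static class PrdQueries
{
    // The highest-priority story that has not passed yet, or null when done.
    public static Story? NextStory(Prd prd) =>
        prd.Stories.Where(s => !s.Passes).OrderBy(s => s.Priority).FirstOrDefault();
}
```

With the sample data this selects US-002, because US-001 already passes and US-002 has the lowest remaining priority number.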
Verification Scripts
Traditional AI assistants produce code and hope for the best. Ralph takes a different approach: every story must pass verification before it's marked complete.
The ralph_verify() function in acceptance criteria tells Ralph which scripts to run. When Ralph sees ralph_verify(Build) passes, it executes the Build script and confirms it succeeds before moving on.
Ralph ships with Build and Test scripts out of the box. They're C# classes compiled at runtime with Roslyn:
```csharp
using Ralph.Core.Abstractions;

namespace Ralph.Scripts;

[VerificationScript("Build", "Verifies solution compiles")]
public class Build : IVerificationScript
{
    public VerificationResult Execute()
    {
        // Spawn dotnet build, capture output, return pass/fail
    }
}
```
The key insight: Ralph loops until all verification scripts pass. If Build fails, Ralph sees the compiler errors and attempts to fix them. If Test fails, Ralph sees which tests failed and why. This self-correcting loop is what enables hours of autonomous work.
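The loop described above might look roughly like this. It's a sketch under assumptions: `Outcome`, `StoryLoop`, the `fix` callback, and the `maxAttempts` cap are all hypothetical names, and the "agent" is simulated by a Build script that fails once and then passes.

```csharp
using System;
using System.Collections.Generic;

// Simulated Build script: fails on the first call, passes afterwards,
// standing in for "the agent fixed the compiler error".
var buildAttempts = 0;
var scripts = new Dictionary<string, Func<Outcome>>
{
    ["Build"] = () => ++buildAttempts < 2
        ? new Outcome(false, "error CS1002: ; expected")
        : new Outcome(true, "Build succeeded"),
    ["Test"] = () => new Outcome(true, "Tests passed"),
};

var feedback = new List<string>();
bool done = StoryLoop.RunUntilVerified(
    new[] { "Build", "Test" }, scripts,
    fix: errors => feedback.Add(errors),   // a real agent would edit code here
    maxAttempts: 5);

Console.WriteLine($"verified={done}, fixes requested={feedback.Count}");
// prints "verified=True, fixes requested=1"

record Outcome(bool Passed, string Message);

static class StoryLoop
{
    public static bool RunUntilVerified(
        IEnumerable<string> required,
        IReadOnlyDictionary<string, Func<Outcome>> scripts,
        Action<string> fix,
        int maxAttempts)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            Outcome? failure = null;
            foreach (var name in required)
            {
                var outcome = scripts[name]();
                if (!outcome.Passed) { failure = outcome; break; }
            }
            if (failure is null) return true;   // every required script passed
            fix(failure.Message);               // feed the errors back to the agent
        }
        return false;                           // give up; the story stays passes: false
    }
}
```

The attempt cap is my own addition to keep the sketch from spinning forever on a story that can't be fixed.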
Why custom scripts instead of just telling the AI "run dotnet build"? Context efficiency. A successful build returns "Build succeeded" - not the entire compiler output. Only on failure does Ralph see the errors it needs to fix. This keeps the context window lean and focused on what matters.
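Here's one way the "short on success, detailed on failure" idea could be implemented with `System.Diagnostics.Process`. `RunResult` and `LeanRunner` are hypothetical stand-ins, not Ralph's `VerificationResult` API, which isn't shown in this post.

```csharp
using System;
using System.Diagnostics;

// Demonstrate the summarizing rule with canned output (no process is spawned here).
Console.WriteLine(LeanRunner.Summarize(0, "...hundreds of lines...", "Build succeeded").Message);
Console.WriteLine(LeanRunner.Summarize(1, "Program.cs(3,1): error CS1002: ; expected\n", "Build succeeded").Message);
// A real script would call: LeanRunner.Run("dotnet", "build", "Build succeeded");

record RunResult(bool Passed, string Message);

static class LeanRunner
{
    // Success collapses to a one-line summary; failure keeps the captured output.
    public static RunResult Summarize(int exitCode, string output, string successText) =>
        exitCode == 0
            ? new RunResult(true, successText)
            : new RunResult(false, output.Trim());

    public static RunResult Run(string fileName, string arguments, string successText)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
        };
        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException($"Could not start {fileName}");

        // Read stderr asynchronously so neither pipe can fill up and block the child.
        var stderrTask = process.StandardError.ReadToEndAsync();
        var stdout = process.StandardOutput.ReadToEnd();
        var stderr = stderrTask.Result;
        process.WaitForExit();

        return Summarize(process.ExitCode, stdout + stderr, successText);
    }
}
```

The agent only ever sees `Message`, so a green build costs a couple of tokens while a red one carries the full error text.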
Want custom verification? Drop a script in .ralph/scripts/:
```csharp
[VerificationScript("Lint", "Runs dotnet format to check code style")]
public class Lint : IVerificationScript
{
    public VerificationResult Execute()
    {
        // Run dotnet format --verify-no-changes
        // Return pass/fail based on exit code
    }
}
```
Now you can use ralph_verify(Lint) passes in your acceptance criteria. Security scans, API tests, whatever you need.
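As an illustration of how acceptance criteria could be mapped to script names, here's a small sketch that extracts every `ralph_verify(Name)` reference with a regular expression. The `CriteriaParser` class and the exact pattern are assumptions; Ralph's real parsing may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var criteria = new[]
{
    "Password hashes use a per-user salt",
    "ralph_verify(Build) passes",
    "ralph_verify(Lint) passes",
    "ralph_verify(Build) passes",
};

Console.WriteLine(string.Join(", ", CriteriaParser.RequiredScripts(criteria)));
// prints "Build, Lint"

static class CriteriaParser
{
    // Captures the script name inside ralph_verify(...).
    private static readonly Regex VerifyCall = new(@"ralph_verify\((\w+)\)");

    // Script names in first-seen order, without duplicates.
    public static IReadOnlyList<string> RequiredScripts(IEnumerable<string> criteria) =>
        criteria
            .SelectMany(c => VerifyCall.Matches(c).Cast<Match>())
            .Select(m => m.Groups[1].Value)
            .Distinct()
            .ToList();
}
```

Criteria without a `ralph_verify(...)` call are simply ignored here; they remain instructions for the agent rather than something a script checks.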
Behind the Scenes
Every AI model has a context window: the amount of text it can "see" at once. Ralph works within these limits by keeping each story self-contained.
A story must fit entirely within a single context window:
- The system prompt (Ralph's instructions)
- The current story's details and acceptance criteria
- Relevant source files Ralph needs to read
- Space for Ralph to "think" and generate code
Practical guidance:
| Story Size | Files Touched | Recommendation |
|---|---|---|
| Small | 1-3 files | ✅ Ideal |
| Medium | 4-8 files | ✅ Good |
| Large | 9-15 files | ⚠️ Consider splitting |
| Too Large | More than 15 files | ❌ Split required |
Ralph doesn't have persistent memory across stories. Each time it picks one up, it starts fresh. This isn't a bug; it's a feature that forces clean design. Each story is a complete, verifiable unit of work.
"Add user authentication" becomes:
- Create User entity and migration
- Implement password hashing service
- Create login endpoint
- Add JWT token generation
- Implement authentication middleware
- Create registration endpoint
Each story is small enough to complete in one iteration, yet together they deliver the full feature.
State Management
Ralph maintains state at two levels:
PRD state (persistent) - which stories exist, which have passed, notes from completed work. This file persists across sessions and gets committed to your repo.
Session state (ephemeral) - current working story, in-progress changes, retry attempts. Lives only for one execution.
If Ralph crashes mid-story, that story remains passes: false and will be retried on next run. Completed stories are never redone.
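The crash-safety property falls out of one rule: flip `passes` to true only after verification succeeds, and persist right away. A minimal sketch, assuming a hypothetical `PrdStore` helper and a trimmed-down PRD shape (only `passes` is taken from this post):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

var path = Path.Combine(Path.GetTempPath(), $"prd-{Guid.NewGuid():N}.json");
PrdStore.Save(path, new PrdFile(new List<StoryState>
{
    new("US-001", Passes: true),
    new("US-002", Passes: false),
}));

// Simulate a fresh run: reload from disk, finish the pending story, persist.
var prd = PrdStore.Load(path);
var pending = prd.Stories.First(s => !s.Passes);
var updated = PrdStore.MarkPassed(prd, pending.Id);
PrdStore.Save(path, updated);

Console.WriteLine(PrdStore.Load(path).Stories.All(s => s.Passes));  // prints "True"
File.Delete(path);

record StoryState(string Id, bool Passes);
record PrdFile(List<StoryState> Stories);

static class PrdStore
{
    private static readonly JsonSerializerOptions Options =
        new(JsonSerializerDefaults.Web) { WriteIndented = true };

    public static PrdFile Load(string path) =>
        JsonSerializer.Deserialize<PrdFile>(File.ReadAllText(path), Options)
        ?? throw new InvalidDataException("PRD file is empty");

    // Returns a copy with the given story marked as passed; other stories are untouched.
    public static PrdFile MarkPassed(PrdFile prd, string storyId) =>
        new(prd.Stories.Select(s => s.Id == storyId ? s with { Passes = true } : s).ToList());

    // Write to a temp file, then move it into place, so a crash mid-write
    // cannot leave a half-written PRD behind.
    public static void Save(string path, PrdFile prd)
    {
        var temp = path + ".tmp";
        File.WriteAllText(temp, JsonSerializer.Serialize(prd, Options));
        File.Move(temp, path, overwrite: true);
    }
}
```

If the process dies anywhere before `Save`, the file on disk still says `passes: false` for that story, so the next run picks it up again.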
Getting Started
You'll need the .NET 10 SDK and the GitHub Copilot CLI installed and authenticated.
# Build and pack
# Install globally
Then in your project:
# Option A: Plan and implement in one go
# Option B: Plan first, then implement separately
That's it. Ralph asks "What would you like to accomplish?", creates a PRD from your description, and implements all stories. Go grab coffee.
Why This Works
Traditional AI assistants are designed for conversation. Ralph is designed for execution.
| | Traditional AI | Ralph |
|---|---|---|
| Duration | Minutes | Hours |
| Context | Degrades over time | Fresh per story |
| Verification | You check manually | Scripts verify automatically |
| Human involvement | Constant supervision | Set and forget |
The AI model is the same. The difference is the workflow architecture around it.
The Catch
Stories need to be small enough to fit in a context window. If a story requires understanding 50 files at once, it'll struggle. Keep them focused: 1-5 files per story, one concern each.
Also, this isn't magic. Ralph still needs good acceptance criteria. Vague requirements produce vague results. But that's true of any engineering work.
Wrapping Up
AI coding assistants are great at short bursts but fall apart on longer tasks. Ralph fixes that by giving AI what it needs: structure. A PRD breaks work into discrete stories. Verification scripts confirm each one works. The result? An agent that can run for hours without wandering off course.
The shift from "AI assistant" to "AI agent" isn't just semantic. It's the difference between babysitting and delegating.
What's Next?
- Watch the demo - See Ralph implement its own features in the YouTube video
- Clone the repo - github.com/svnscha/ralph
- Try it on something small - Pick a well-defined feature, let Ralph work on it overnight, come back to commits
That moment when you return to working, tested code you didn't write line by line? That's when it clicks.