In the beginning, my digital colleague was a Telegram forwarder. I'd send a message, it would pipe it to Claude, and send back the response. No memory. No context. No verification. A fancy ChatGPT wrapper with terminal access.
Six months later, it runs three businesses. It has persistent memory. It verifies its own output. It handles multiple tasks in parallel. It knows what it's allowed to do and what requires my approval.
This is the story of how that happened. Not in one big rewrite. in dozens of small problems that each got solved.
It forgot everything
Every conversation started from zero. "I'm your AI assistant, how can I help?" Meanwhile, I'd explained the same business structure ten times.
So I gave it memory. Before every message, my colleague now loads identity files. who it is, what it knows, what it's allowed to do, what tools it has. Plus a memory file that persists across sessions. Plus searchable daily logs from previous conversations.
I never re-explain context. It knows the vault, the hierarchy, which project uses which tools. It picks up where we left off.
It did dangerous things
Terminal access is powerful. It's also terrifying. My colleague executes commands without asking. by design, I don't want to approve every git status. But that means a prompt injection could trick it into running something destructive.
So I added a command blocklist. Every command gets checked before execution. Dangerous pattern? Process killed immediately.
Then my colleague blocked itself five times in a row. It was writing a document that mentioned a destructive command as text. not as something to execute. The filter couldn't tell the difference. So I taught it to distinguish between commands and quoted text. Problem solved.
It lied about finishing
I'd ask it to create a file. "Done!" But was the file actually there? Did the commit succeed? I had no way of knowing without checking myself.
So I added verification. After every write operation, the system checks: does the file exist? Does the git commit exist? Is there a report of what was done, how to check it, and how to roll it back?
Not blocking. just flagging. If something's missing, I see a warning. Trust but verify.
It mixed up conversations
A task about invoices would carry context from a previous conversation about website design. Unrelated information leaking everywhere.
So I separated task mode from chat mode. Tasks are isolated. each request stands alone. Chat mode keeps continuity when I actually want a conversation. And if there's a gap of more than two hours between messages, old context gets dropped automatically.
It couldn't multitask
Send two messages quickly, and the second would kill the first. The system assumed one task at a time.
So I made it parallel. Each message spawns its own process. Stop one task, the others keep running. What used to be "kill everything" became surgical.
It didn't know its boundaries
Day 1 of any AI system: it tries to do everything. Not because it's malicious. because nothing tells it not to.
So I built a phase system. Like onboarding an employee:
- Day 1: Read and ask questions only
- Day 2: Execute approved task types
- Day 3: Routine operations independently
- Day 4: Make suggestions, ask for approval on decisions
- Day 5: Small decisions independently
My colleague is on Day 2. It can execute tasks I've approved before. New types? It asks first. I know exactly what it will and won't do without checking.
What I actually learned
My colleague started dumb. Every problem I encountered made it smarter. Not through some grand architectural plan. through concrete problems with concrete fixes.
Memory problem? Add memory. Security problem? Add a filter. Verification problem? Add checks. Context problem? Add boundaries.
Each fix was small. Each fix was tested. Each fix made the next day a little better.
That's the real insight: you don't need to be smart on day one. You need to be learning on every day. The AI is the engine. The system you build around it is the car. And an engine without a car goes nowhere.