Simeon Tan
It’s easy to string together a prompt, plug it into GPT, and call it an “agent.” But building a real agent—one that can plan, execute, reflect, and adapt reliably—is far more complex than it seems.
Early on, hallucinations were a major roadblock. But what surprised us even more was how unpredictable large models can be. The same prompt, same tools, same inputs—yet occasionally, the agent would respond in wildly different ways. The issue wasn’t always the prompt. Often, it came down to tool latency, subtle context gaps, or a lack of grounding signals the model could reason with.
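One pattern that helps tame this kind of variance is refusing to accept free-form model output at all: pin sampling as deterministically as the API allows and reject any response that doesn’t parse into an expected schema. The sketch below is a minimal illustration of that idea, not our actual pipeline; `call_model` is a hypothetical stand-in for a real LLM call.

```python
import json

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; in practice this would
    # hit an API with temperature pinned low to reduce run-to-run variance.
    return json.dumps({"verdict": "benign", "confidence": 0.92})

def grounded_step(prompt: str, retries: int = 2) -> dict:
    """Call the model, but only accept output that parses into the
    expected schema -- a grounding signal the caller can audit."""
    for _ in range(retries + 1):
        raw = call_model(prompt)
        try:
            result = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: retry rather than propagate noise
        if {"verdict", "confidence"} <= result.keys():
            return result
    raise RuntimeError("model never produced schema-conformant output")
```

The point is that unpredictability becomes a handled failure mode rather than a silent one: a bad sample costs a retry, not a wrong answer downstream.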
We experimented with a number of design principles:
In the end, what mattered most wasn’t a clever prompt or a better model—though both helped. The breakthrough came when we embraced the model’s generative nature, while anchoring every step in structured, auditable logic. That’s what turned an unpredictable LLM into a reliable analyst.
In the early days, our agents felt a bit like black boxes. They would run long, linear, “autonomous” investigations, sometimes taking up to 10 minutes per task, with no visibility, control, or way to interrupt. The outcomes? Overconfident guesses, occasional silence, and understandable user frustration.
That’s when it clicked: in cyber, transparency builds trust. Users need to stay informed because context matters, and assumptions can be costly.
We rethought the UX around a few key principles:
These changes made all the difference. In cybersecurity, explainability isn’t a luxury; it’s essential. And trust doesn’t come from intelligence alone; it comes from clarity.
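The shift from a black-box run to a transparent one can be sketched as a loop that surfaces every step as it happens and checks for interruption between steps. This is an illustrative toy, not Protos AI’s actual event model; the names (`StepEvent`, `TransparentAgent`) are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class StepEvent:
    name: str
    detail: str

@dataclass
class TransparentAgent:
    """Runs a plan step by step, surfacing each action to the user and
    checking an interrupt flag between steps instead of running blind."""
    on_event: Callable[[StepEvent], None]
    interrupted: bool = False
    log: list = field(default_factory=list)

    def run(self, plan):
        for name, action in plan:
            if self.interrupted:
                self.on_event(StepEvent("stopped", "user interrupted run"))
                return self.log
            self.on_event(StepEvent(name, "starting"))
            result = action()
            self.log.append((name, result))  # auditable trail of every step
            self.on_event(StepEvent(name, f"done: {result}"))
        return self.log

# Usage: stream events to any UI callback, e.g. a list for inspection.
events = []
agent = TransparentAgent(on_event=events.append)
agent.run([("lookup_ip", lambda: "clean")])
```

Because the log and the event stream are the same data the agent acted on, the user sees exactly what the agent did, in order, and can stop it at any step.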
When we began, technologies like Anthropic’s Model Context Protocol (MCP) had just emerged. Five months on, they’ve become vital integration standards. Similarly, LangGraph matured just in time, allowing us to build declarative, tool-aware DAG workflows with clear separation of memory and control.
From the outset, we made a conscious decision: we wouldn’t rebuild what others had already solved. Our focus would remain on the agent’s reasoning (how it plans, adapts, and decides), not on the plumbing behind it, and that decision paid off. It meant we could move faster where it mattered most:
By standing on the shoulders of best-in-class tools, we focused on what makes Protos AI smart — not just functional.
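The “declarative DAG with clear separation of memory and control” idea is worth making concrete. The toy below is not the LangGraph API, just a dependency-free sketch of the pattern: nodes only read and write a shared state (the memory), while the edge map alone decides what runs next (the control).

```python
class Workflow:
    """Toy declarative DAG in the spirit of LangGraph: nodes transform a
    shared state dict (memory), while the edge map owns control flow."""

    def __init__(self):
        self.nodes, self.edges = {}, {}

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_edge(self, src, dst):
        self.edges[src] = dst

    def run(self, start, state):
        current = start
        while current is not None:
            state = self.nodes[current](state)  # nodes touch memory only
            current = self.edges.get(current)   # edges own control flow
        return state

# Hypothetical two-step investigation: plan, then act on the plan.
wf = Workflow()
wf.add_node("plan", lambda s: {**s, "plan": ["triage alert"]})
wf.add_node("act", lambda s: {**s, "result": f"ran {s['plan'][0]}"})
wf.add_edge("plan", "act")
```

Keeping the two concerns apart is what makes the workflow auditable: you can inspect the graph without running it, and inspect the state without knowing the graph.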
What we’ve launched today is our starting point, not the finish line.
Protos AI already supports:
And what’s ahead?
Our belief is simple: AI shouldn’t replace humans, but rather empower them, especially in environments where context is vast, signals are noisy, and time is tight. In cyber, that’s not the exception; it’s the rule.
We didn’t build Protos AI just to impress with flashy demos. We built it because modern cyber defence demands reasoning, not just summarisation. Agentic AI isn’t about wrapping GPT in a UI. It’s a system of memory, planning, tools, reflection, and outputs woven into an experience that’s built to earn trust. We hope what we’ve built sparks ideas, collaboration, and maybe even a few breakthroughs of your own. If you’re exploring similar paths, or are curious where Protos AI fits into your workflows, we’d love to chat.