simple10's comments

Really cool. I've been building a mission control system (multi agent orchestration) that follows very similar patterns of spec driven development, steering, and task management. Having this baked into an IDE is a great idea.

For observability, would be amazing to have session replay or at least session exploration built in. Kinda like git history but tied to tasks and tool use instead of file diffs.


Yep. I finally realized what "green" accounts are for on HN. Recently created accounts.

Right on. Good luck! You might also want to play around with https://github.com/simple10/agent-super-spy if you want to see the raw prompts claude is sending. It was really helpful for me to see the system prompts and how tool calls and message threads are handled.

Sub-agent trees are fully tracked by the dashboard. When an agent is spawned, it always has a parent agent id - claude is sending this in the hooks payload. When you mouse over an agent in the dashboard, it shows what agent spawned it. There currently isn't a tree view of agents in the UI, but it would be easy to add. The data is all there.

[Edit] When claude spawns sub-agents, they inherit the parent's hooks. So all sub-agent activity gets logged by default.
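A tree view can be derived from the parent ids alone. Here is a minimal sketch of the idea, not the dashboard's actual code. The field names `agent_id` and `parent_agent_id` and the sample agents are assumptions for illustration; the real hooks payload may use different keys.

```python
from collections import defaultdict


def build_agent_tree(agents):
    """Group agent ids under their parent id; root agents are stored under None."""
    children = defaultdict(list)
    for agent in agents:
        # "parent_agent_id" is a hypothetical key name for the parent reference.
        children[agent.get("parent_agent_id")].append(agent["agent_id"])
    return children


def render_tree(children, parent=None, depth=0):
    """Return the tree as indented lines, depth-first."""
    lines = []
    for agent_id in children.get(parent, []):
        lines.append("  " * depth + agent_id)
        lines.extend(render_tree(children, agent_id, depth + 1))
    return lines


# Made-up sample data: one root agent with two sub-agents, one of which spawns its own.
agents = [
    {"agent_id": "main", "parent_agent_id": None},
    {"agent_id": "researcher", "parent_agent_id": "main"},
    {"agent_id": "tester", "parent_agent_id": "main"},
    {"agent_id": "linter", "parent_agent_id": "tester"},
]

tree_lines = render_tree(build_agent_tree(agents))
print("\n".join(tree_lines))
```

Since every spawned agent carries exactly one parent id, a single pass over the flat list is enough to rebuild the hierarchy.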


I hit a lot of limits on the Pro plan. Upgraded to the Max $200/mo plan and haven't hit limits for a while.

It's super important to check your plugins or use a proxy to inspect raw prompts. If you have a lot of skills and plugins installed, you'll burn through tokens 5-10x faster than normal.

Also have claude use sub-agents and agent teams. They're significantly lighter on token usage when they're spawned with fresh context windows. You can see in the Agents Observe dashboard exactly what prompt and response claude is using for spawning sub-agents.


I'm not actually reading the jsonl files. Agents Observe just uses hooks and sends all hook data to the server (running as a docker container by default).

Basic flow:

1. Plugin registers hooks that call a dumb pipe script that sends hook event data to the API server

2. Server parses events and stores them in sqlite by session and agent id - mostly just stores data, minimal processing

3. Dashboard UI uses websockets to get real-time events from the server

4. UI does most of the heavy lifting by parsing events, grouping by agent / sub-agent, extracting out tool calls to dynamically create filters, etc.

It took a lot of iterations to keep things simple and performant.

You can easily modify the app/client UI code to fully customize the dashboard. The API app/server is intentionally unopinionated about how events will be rendered. This was by design to add support for other agent events soon.
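To make the flow concrete, here is a rough sketch of steps 2 and 4: store raw events keyed by session and agent id, then group them and collect tool names for filters on the read side. This is not the project's code. The table layout, the `tool_name` key, and the sample events are assumptions, and the real dashboard does the grouping in the browser rather than in Python.

```python
import json
import sqlite3
from collections import defaultdict


def open_store(path=":memory:"):
    """Create the (hypothetical) events table; payloads are kept as raw JSON."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " session_id TEXT NOT NULL,"
        " agent_id TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return db


def store_event(db, event):
    """Server side: store the event as-is, with minimal processing."""
    db.execute(
        "INSERT INTO events (session_id, agent_id, payload) VALUES (?, ?, ?)",
        (event["session_id"], event["agent_id"], json.dumps(event)),
    )
    db.commit()


def group_by_agent(db, session_id):
    """Reader side: group events per agent and collect tool names for filters."""
    grouped = defaultdict(list)
    tools = set()
    rows = db.execute(
        "SELECT agent_id, payload FROM events WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    for agent_id, payload in rows:
        event = json.loads(payload)
        grouped[agent_id].append(event)
        if "tool_name" in event:  # assumed key for tool-call events
            tools.add(event["tool_name"])
    return dict(grouped), sorted(tools)


# Made-up sample events for one session with a main agent and one sub-agent.
db = open_store()
for sample in [
    {"session_id": "s1", "agent_id": "main", "tool_name": "Read"},
    {"session_id": "s1", "agent_id": "main", "message": "thinking"},
    {"session_id": "s1", "agent_id": "sub-1", "tool_name": "Bash"},
]:
    store_event(db, sample)

grouped, tool_filters = group_by_agent(db, "s1")
```

Keeping the payload as an opaque JSON column is what lets the storage layer stay unopinionated: new event shapes need no schema change, only new rendering logic on the reader side.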


The hooks approach seems much cleaner for real-time. Did you run into any issues with the blocking hooks degrading performance before you switched to background?

Sort of. It wasn't really noticeable until I did an intentional audit of performance, then noticed the speed improvements.

Node has a 30-50ms cold start overhead. Then there's overhead in the hook script to read local config files, make http request to server, and check for callbacks. In practice, this was about 50-60ms per hook.

The background hook shim reduces latency to around 3-5ms (roughly a 10x improvement). It was noticeable when using agent teams with 5+ sub-agents running in parallel.
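The general idea of a background shim is that the hook process hands the payload to a detached child and returns right away, so the agent never waits on the HTTP round trip. Below is a minimal sketch of that pattern in Python, assuming a POSIX system. It is not the actual shim: `forward_in_background` and the `sender_cmd` argument are hypothetical names, and the real implementation may differ.

```python
import subprocess
import sys


def forward_in_background(payload: bytes, sender_cmd: list[str]) -> int:
    """Hand the payload to a detached child process and return without waiting.

    sender_cmd is a hypothetical command that reads the event from stdin and
    delivers it to the server. The caller never waits for it to finish.
    """
    child = subprocess.Popen(
        sender_cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: detach the child from the hook's session
    )
    child.stdin.write(payload)
    child.stdin.close()  # signal end of input; the child keeps running on its own
    return child.pid


# Example wiring inside a hook script (the sender script name is made up):
#   forward_in_background(sys.stdin.buffer.read(), [sys.executable, "send_event.py"])
```

The trade-off is that a fire-and-forget hook cannot return a decision to the agent, so this only fits hooks that purely observe.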

But the real speedup was disabling all the other plugins I had been collecting. They pile up fast, and it's easy for me to forget what's installed globally.

I've also started periodically asking claude to analyze its prompts to look for conflicts. It's shockingly common for plugins and skills to end up with contradictory instructions. Opus works around it just fine, but it's unnecessary overhead for every turn.


If you're just saving it into sqlite, why is server even needed?

This tool is useful if you want to see all the internal commands claude agents are making in real-time:

https://github.com/simple10/agents-observe


Thanks! This was step one in my daily driver stack - better observability. I also bundled up a bunch of other observability services in https://github.com/simple10/agent-super-spy so I can see the raw prompts and headers.

The next big layer for my personal stack is full orchestration. Something like Paperclip but much more specialized for my use cases.


Yes, this. They need as much lock-in as possible before IPO. Most likely less about cash flow and more about IPO story telling.

We'll know for sure when they add full OpenClaw-like features to Claude Code like remote channels & heartbeat support. Both are partially implemented already.


NemoClaw is mostly a trojan horse of sorts to get corporate OpenClaw users quickly ported over to Nvidia's inference cloud.

It's a neat piece of architecture - the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.

But NemoClaw is pre-configured to intercept all OpenClaw LLM requests and proxy them to Nvidia's inference cloud. That's kinda the whole point of them releasing it.

It can be modified to allow for other providers, but at the time of launch, there was no mention of how to do this in their docs. Kinda a brilliant marketing move on their part.


> the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.

I think the experimental Docker AI Sandboxes do this as well: https://docs.docker.com/ai/sandboxes/ Plus free choice of inference model.

