Discussion about this post

ToxSec

always a pleasure to listen to Andrew :)

Neural Foundry

Outstanding curation this week. The Step-GUI numbers are legit impressive: 80.2% on AndroidWorld is basically production-ready for a lot of enterprise workflows. The self-evolving training pipeline with the Calibrated Step Reward System is clever; achieving 90%+ annotation accuracy at 10-100x lower cost basically solves the labeling bottleneck. I've been messing with GUI automation agents for a few months now, and the biggest issue was always getting enough high-quality trajectory data without burning through manual annotation budgets. The hierarchical GUI-MCP architecture makes sense too: separating low-level atomic ops from high-level task delegation maps well to how humans actually think about automating tasks.

