Research

Robots Learn to Ignore Our Mistakes with SiLRI

A new reinforcement learning algorithm helps robots treat human 'help' as advice—not gospel—boosting their ability to master complex tasks.

by Analyst Agentnews

Humans are clumsy, even when trying to help. A new reinforcement learning algorithm called SiLRI (state-wise Lagrangian reinforcement learning) makes sure robots don’t follow us off a cliff when we mess up during training.

Training robots for precise manipulation has long relied on "human-in-the-loop" systems. The assumption: human input is expert data. But anyone who's ever fumbled a joystick knows that’s not true. Existing methods struggle when imperfect human actions mix with robot-collected data. The result? Slower learning or robots that hit a performance ceiling.

SiLRI fixes this by treating training as a constrained optimization problem. Instead of blindly following every human nudge, it maintains a state-wise Lagrange multiplier that gauges how reliable a human intervention looks in each situation. If the intervention seems noisy or off-target, the system lowers the "trust" weight for that action. This reality check lets robots learn from our intentions without copying our mistakes.
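To make that idea concrete, here is a minimal Python sketch. It is not the authors' code: the discretized state space, the dual-ascent update, and names such as StateWiseTrust, advantage_of_human_action, and imitation_weight are all illustrative assumptions. It only shows how a per-state multiplier could scale how strongly a policy is pulled toward a human intervention.

```python
# Illustrative sketch only; not SiLRI's actual implementation.
import numpy as np

class StateWiseTrust:
    """Keeps one Lagrange-style multiplier per (discretized) state that scales
    how strongly the policy is pushed toward human interventions there."""

    def __init__(self, num_states, lr=0.05, lam_max=5.0):
        self.lam = np.ones(num_states)  # initial trust weight for every state
        self.lr = lr                    # dual-ascent step size
        self.lam_max = lam_max          # cap so one state can't dominate training

    def update(self, state, advantage_of_human_action):
        # Dual ascent: if the human's action looks worse than the robot's own
        # choice (negative advantage), the multiplier shrinks -> trust it less.
        self.lam[state] += self.lr * advantage_of_human_action
        self.lam[state] = np.clip(self.lam[state], 0.0, self.lam_max)
        return self.lam[state]

    def imitation_weight(self, state):
        # Weight applied to the intervention-following (imitation) loss term.
        return self.lam[state]


# Toy usage: a noisy human intervention in state 3 with negative advantage
trust = StateWiseTrust(num_states=10)
w = trust.update(state=3, advantage_of_human_action=-0.8)
print(f"trust weight for state 3 after a bad intervention: {w:.2f}")
```

The design choice worth noting is that the multiplier is indexed by state: the robot can keep trusting the human in situations where interventions have historically helped, while discounting them where they have not.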

The research team, including experts from Beijing University of Posts and Telecommunications, tested SiLRI against top methods like HIL-SERL. The results were clear: SiLRI reached a 90% success rate at least 50% faster than the baselines. Even better, it hit 100% success on long-horizon tasks where other models stumbled. By filtering out the noise of human error, the robot can focus on the actual task.

This is a big step for industrial automation and logistics. But don’t expect a robot to fold your laundry perfectly just yet. SiLRI shows robots are getting better at ignoring bad advice. Still, true autonomy means closing the gap between lab wins and real-world chaos. For now, we’ve taught machines one thing: we aren’t always right—a lesson some humans still need to learn.