Microsoft Research announces SentinelStep: AI agents that can wait, monitor, and act before disaster strikes

Written by Dave W. Shanahan

October 21, 2025

Microsoft Research has introduced SentinelStep, a lightweight mechanism that lets AI agents “wait, monitor, and act” reliably over hours or days by combining dynamic polling with careful context management. It is implemented in the Magentic‑UI research prototype, which is available on GitHub or via pip install magentic-ui. The team reports markedly higher success rates on long-running monitoring tasks (1–2 hours) than baseline agents achieve, positioning SentinelStep as a practical step toward always‑on assistants that stay efficient and aligned with user intent.

The announcement's lead author is Senior Researcher Hussein Mozannar, whose work focuses on augmenting humans with AI via agentic systems that act on users’ behalf for real‑world tasks. Contributors include Matheus Kunzler Maldaner, Maya Murad, Jingya Chen, Gagan Bansal, Rafah Hosn, and Adam Fourney from Microsoft Research’s AI Frontiers and adjacent product teams.

Why it matters

Modern LLM agents excel at discrete tasks but typically fail at the simple act of “waiting”: they either check too often and exhaust their context, or give up too soon. Both failures undermine real‑world monitoring use cases. Because monitoring needs span email replies, news or price watches, and operational telemetry, a robust “wait and act” pattern can unlock meaningful time savings and reliability for users and teams.

How SentinelStep works

SentinelStep tunes the polling interval for the task at hand, treating an email watch differently than quarterly earnings, and then adjusts the interval dynamically based on observed behavior. To avoid context overflow on multi‑hour or multi‑day runs, it snapshots agent state after the first check and reuses that state across subsequent checks to keep memory bounded and focused. Conceptually, the behavior is simple: every [polling interval] do [actions] until [condition] is satisfied, with those three components defined in Magentic‑UI’s co‑planning interface.

  1. Actions: the concrete steps the agent takes to collect the needed information in each cycle.

  2. Condition: the explicit stop criterion that determines when monitoring is complete.

  3. Polling interval: the timing schedule, initially estimated and then adapted over time.
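
The "every [polling interval] do [actions] until [condition]" pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not Magentic‑UI's actual API: the names MonitorStep and run_monitor_step are hypothetical, the interval is fixed rather than adaptive, and the price feed is made-up sample data.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class MonitorStep:
    """Hypothetical container for the three components (not Magentic-UI's API)."""
    actions: Callable[[], Any]          # collect the needed information for one cycle
    condition: Callable[[Any], bool]    # stop criterion applied to what was collected
    polling_interval_s: float           # seconds to wait between checks


def run_monitor_step(
    step: MonitorStep,
    sleep: Callable[[float], None] = time.sleep,
    max_checks: int = 1000,
) -> Tuple[int, Any]:
    """Repeat the actions until the condition holds; return (checks used, last observation)."""
    for check in range(1, max_checks + 1):
        observation = step.actions()
        if step.condition(observation):
            return check, observation
        sleep(step.polling_interval_s)  # wait before the next cycle
    raise TimeoutError(f"condition not met after {max_checks} checks")


# Usage: watch a made-up price feed until it drops below 100.
prices = iter([120.0, 110.0, 95.0])
step = MonitorStep(
    actions=lambda: next(prices),
    condition=lambda price: price < 100.0,
    polling_interval_s=60.0,
)
# A no-op sleep keeps the demo instant; a real run would use time.sleep.
checks, last_price = run_monitor_step(step, sleep=lambda seconds: None)
print(checks, last_price)
```

Passing the sleep function as a parameter is a design choice for the sketch: it lets the loop be exercised without actually waiting.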

Processing flow

The three main components in Magentic-UI’s co-planning interface (Image: Microsoft).

Magentic‑UI assigns the most appropriate specialist agent for each action (such as web surfing, code execution, or calling external MCP servers), while the orchestrator evaluates the condition after each cycle. If the condition is not met, the orchestrator computes the next check time, resets agent state to prevent context bloat, and repeats until completion.
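
The cycle described above (check, evaluate, reschedule, reset) can be sketched as follows. Everything here is an assumption made for illustration: ToyAgent, monitor, and next_interval are hypothetical names, and the rule of halving the interval when the observation changes and doubling it otherwise is a stand-in heuristic, since the article does not say how SentinelStep actually computes the next check time.

```python
import copy
from typing import Callable, Iterable, List, Optional, Tuple


def next_interval(current_s: float, changed: bool,
                  min_s: float = 30.0, max_s: float = 3600.0) -> float:
    """Assumed heuristic: poll sooner after a change, back off when nothing moved."""
    proposed = current_s / 2 if changed else current_s * 2
    return max(min_s, min(max_s, proposed))


class ToyAgent:
    """Hypothetical agent whose context grows by one entry per check."""

    def __init__(self) -> None:
        self.context: List[str] = ["system: watch the page"]

    def check(self, reading: str) -> str:
        self.context.append(f"observed: {reading}")
        return reading


def monitor(agent: ToyAgent, readings: Iterable[str],
            is_done: Callable[[str], bool], interval_s: float,
            sleep: Callable[[float], None]) -> Tuple[Optional[int], float]:
    """Return (number of checks until the condition held, final interval)."""
    snapshot: Optional[List[str]] = None
    previous: Optional[str] = None
    for n, reading in enumerate(readings, start=1):
        observation = agent.check(reading)
        if snapshot is None:
            # Save the state once, after the first check.
            snapshot = copy.deepcopy(agent.context)
        if is_done(observation):
            return n, interval_s
        changed = previous is not None and observation != previous
        interval_s = next_interval(interval_s, changed)
        previous = observation
        # Restore the saved state so the context does not grow per check.
        agent.context = copy.deepcopy(snapshot)
        sleep(interval_s)
    return None, interval_s


# Usage: record the waits instead of sleeping so the demo runs instantly.
waits: List[float] = []
agent = ToyAgent()
result = monitor(agent, ["a", "a", "b", "done"], lambda o: o == "done", 60.0, waits.append)
print(result, waits, len(agent.context))
```

Because the context is restored from the snapshot after every unsuccessful check, its size stays constant no matter how many cycles run, which is the property the state reset is meant to provide.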

Evaluation results

SentinelStep improves success rates on longer-running tasks (1–2 hours) while maintaining comparable performance on shorter tasks (Image: Microsoft).

Because real‑world monitoring events are inherently one‑off and hard to reproduce, the team built SentinelBench, a suite of 28 configurable synthetic web scenarios that make long‑running experiments repeatable. In initial tests, success on 1‑hour tasks rose from 5.6% without SentinelStep to 33.3% with it, and on 2‑hour tasks from 5.6% to 38.9%. Results on short tasks remained comparable, which suggests the gains come from sustaining performance as task duration grows.

Availability and guidance

SentinelStep ships open‑source as part of Magentic‑UI and can be installed via pip. Microsoft advises validating its behavior before any production use and reviewing the accompanying Transparency Note for safety and privacy considerations. The work lays groundwork for proactive, always‑on assistants that act when it matters without burning resources or overrunning context limits.


I'm Dave W. Shanahan, a Microsoft enthusiast with a passion for Windows, Xbox, Microsoft 365 Copilot, Azure, and more. I started MSFTNewsNow.com to keep the world updated on Microsoft news. I'm based in Massachusetts, and you can email me at davewshanahan@gmail.com.