Robots Need SOTIF Too

Abstract: On June 2, 2026, the Chinese national standard project 机器人预期功能安全实施指南 (roughly, “Implementation Guide for Robot Safety of the Intended Functionality”) entered public notice, with the comment period scheduled to close on July 2, 2026. I have added this direction to OpenTopic as its second open research theme: Robot SOTIF. The goal is not to copy autonomous-driving SOTIF directly into robotics, but to build an evidence chain that runs from standards through the operational design domain (ODD), scenarios, triggering conditions, physical interaction, and LLM/VLA decision safety to a defensible safety case.

Figure 1: Evidence chain for Robot SOTIF

I recently saw a national standard project notice that I think deserves a careful note.

On June 2, 2026, the Chinese national standard project 机器人预期功能安全实施指南 entered public notice. The project has an 18-month cycle and falls under the National Technical Committee on Robotics Standardization. The public notice period closes on July 2, 2026.

The most important word in that title is not “robot.”

It is “预期功能安全.”

This should not be translated word by word. In this domain, it corresponds to SOTIF: Safety of the Intended Functionality.

The question is not: what if a component fails?

The question is:

If the system has no fault, and the function is operating as designed, but it still produces unsafe behavior in a complex open scenario, how do we identify, quantify, verify, and mitigate that risk?

The automated-driving industry has been wrestling with this question for many years. Now it is formally entering the standardization context for robots.

1. Why This Is an Important Signal

Traditional robot safety has often centered on separation, guarding, emergency stop, power and force limitation, speed limitation, and collaborative applications.

These remain important. Standards such as ISO 10218, ISO/TS 15066, and ISO 13482 provide essential safety foundations for industrial robots, collaborative robots, and personal care robots.

But robots are changing.

They no longer work only inside fences, and they no longer repeat fixed motions only on structured production lines. Mobile service robots, cleaning and disinfection robots, security inspection robots, logistics and delivery robots, elder-care robots, and medical robots will all enter open environments, facing ordinary users, complex tasks, and unstable scenarios.

More importantly, large language models and vision-language-action models are moving robot decision systems from rule logic toward semantic understanding, task decomposition, and end-to-end action generation.

At that point, the safety question is no longer only “did it hit someone?”

It also includes:

Did the robot misunderstand the user’s intent?
Did it continue to operate outside its ODD boundary?
Did it get too close, move too fast, or stop too abruptly in order to finish the task?
Did risk accumulate step by step over a long-horizon task?
Can refusal, fallback, human intervention, and runtime logs become auditable evidence?

That is the value of Robot SOTIF.

2. Why I Put It Into OpenTopic

The original intent of OpenTopic is to open-source research questions in the same spirit that we open-source datasets.

Honestly, I have had some materials on hand for a while that could be developed further. I had already organized three reports on embodied-intelligence safety, physical-interaction safety, and LLM/VLA decision safety. But if those reports only sit in a folder, their value is limited.

After this standard project notice appeared, I reread the materials and concluded that the best response was not another ordinary commentary article. It was to turn the topic into a maintainable open problem entry.

So I added the second theme to OpenTopic:

Robot SOTIF: Open Research Theme on Robot Safety of the Intended Functionality

The theme starts with three lines of inquiry:

My handling rule is deliberately conservative: standards are cited only from official sources. SOTA papers and experimental figures that have not been checked item by item are kept as research leads, not written as factual conclusions.

What this field needs most right now is not noise. It needs evidence chains.

3. What Can Migrate From Automated Driving

The most valuable lesson Robot SOTIF can take from automated driving is not a direct copy of ISO 21448.

What can really migrate is the methodology.

First, ODD.

In automated driving, ODD usually covers roads, weather, lighting, traffic participants, speed ranges, and similar dimensions. Robots also need ODD, but the dimensions will be richer: environment, task, user, mobile body, contacted object, communication state, floor condition, payload, and regulatory boundary.
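To make the idea of a robot ODD concrete, here is a minimal sketch in Python. It assumes only four of the dimensions listed above (environment, floor condition, payload, communication state). The class names, field names, and limits are all hypothetical and are not taken from any standard or from the draft project.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RobotODD:
    """Hypothetical ODD declaration; fields and limits are illustrative only."""
    allowed_environments: frozenset
    allowed_floor_conditions: frozenset
    max_payload_kg: float
    max_comm_latency_ms: float


@dataclass(frozen=True)
class RuntimeState:
    """Snapshot of the conditions the robot currently observes."""
    environment: str
    floor_condition: str
    payload_kg: float
    comm_latency_ms: float


def odd_violations(odd: RobotODD, state: RuntimeState) -> list[str]:
    """Return the names of the ODD dimensions that the current state violates."""
    violations = []
    if state.environment not in odd.allowed_environments:
        violations.append("environment")
    if state.floor_condition not in odd.allowed_floor_conditions:
        violations.append("floor_condition")
    if state.payload_kg > odd.max_payload_kg:
        violations.append("payload")
    if state.comm_latency_ms > odd.max_comm_latency_ms:
        violations.append("communication")
    return violations


odd = RobotODD(
    allowed_environments=frozenset({"hospital_corridor", "office_lobby"}),
    allowed_floor_conditions=frozenset({"dry"}),
    max_payload_kg=20.0,
    max_comm_latency_ms=200.0,
)
state = RuntimeState("hospital_corridor", "wet", 12.0, 350.0)
print(odd_violations(odd, state))  # ['floor_condition', 'communication']
```

Returning the violated dimensions, rather than a single in/out flag, keeps the result usable as a log entry: it records which boundary was crossed, not just that one was.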

Second, scenarios.

Automated driving has already built a scenario-based safety-evaluation framework. ISO 34502 provides a three-level abstraction of functional scenarios, logical scenarios, and concrete scenarios. Robots need a similar framework, except the scenarios no longer revolve only around road traffic. They must also incorporate task semantics, human-robot distance, contact state, and user behavior.
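The step from a logical scenario to concrete scenarios can be sketched as follows. This is an illustration under my own assumptions: a logical scenario is modeled as a functional description plus discrete candidate values per parameter, and concrete scenarios are enumerated as a full grid. The parameter names and values are hypothetical, and real programs would more likely sample or prioritize than enumerate exhaustively.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalScenario:
    """A functional description plus candidate values per parameter."""
    functional_description: str
    parameter_values: dict  # parameter name -> list of candidate values


def concrete_scenarios(logical: LogicalScenario) -> list:
    """Enumerate concrete scenarios as the Cartesian product of parameter values."""
    names = list(logical.parameter_values)
    combos = itertools.product(*(logical.parameter_values[n] for n in names))
    return [dict(zip(names, combo)) for combo in combos]


logical = LogicalScenario(
    functional_description="Delivery robot passes a pedestrian in a corridor",
    parameter_values={
        "human_robot_distance_m": [0.5, 1.0, 2.0],
        "contact_state": ["none", "incidental"],
        "user_behavior": ["walking", "sudden_stop"],
    },
)
cases = concrete_scenarios(logical)
print(len(cases))  # 12
print(cases[0])
```

The grid grows multiplicatively with every added parameter, which is one practical reason the robot-specific dimensions (task semantics, contact state, user behavior) need careful selection.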

Third, triggering conditions.

The core of SOTIF is not to say vaguely that “there is risk.” It is to identify the conditions under which risk is triggered. For robots, triggering conditions may come from perception insufficiency, semantic misinterpretation, invalid task assumptions, a user suddenly moving closer, a slippery floor, payload shift, communication delay, or abnormal model confidence.
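One way to make triggering conditions checkable is to write each one as a named predicate over a runtime snapshot. The sketch below assumes a flat dictionary snapshot. The condition names follow the examples above, but the snapshot keys and every threshold are placeholders I chose for illustration, not validated limits.

```python
from typing import Callable, Dict, List

# Hypothetical triggering conditions expressed as predicates over a snapshot.
# All thresholds are illustrative placeholders.
TRIGGERING_CONDITIONS: Dict[str, Callable[[dict], bool]] = {
    "user_suddenly_closer": lambda s: s["human_distance_m"] < 0.5,
    "slippery_floor": lambda s: s["floor_friction"] < 0.3,
    "communication_delay": lambda s: s["comm_latency_ms"] > 200.0,
    "abnormal_model_confidence": lambda s: s["model_confidence"] < 0.6,
}


def active_triggers(snapshot: dict) -> List[str]:
    """Return the names of all triggering conditions that hold for the snapshot."""
    return [name for name, holds in TRIGGERING_CONDITIONS.items() if holds(snapshot)]


snapshot = {
    "human_distance_m": 0.4,
    "floor_friction": 0.8,
    "comm_latency_ms": 50.0,
    "model_confidence": 0.55,
}
print(active_triggers(snapshot))
# ['user_suddenly_closer', 'abnormal_model_confidence']
```

Keeping the conditions in a named table means the same list can drive test-case selection, runtime monitoring, and log annotation.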

Fourth, safety argumentation.

Robot SOTIF should not end with a single score. It needs a safety case: claims, arguments, and evidence. Data, simulation, physical testing, runtime monitoring, fallback strategies, and logs should all become reviewable evidence.
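The claims, arguments, and evidence structure can be sketched as a small tree, with a check that finds leaf claims nobody has backed with evidence yet. This is a simplified illustration under my own assumptions, not a full safety-case notation. The class names, example claims, and the identifier "test-report-001" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    kind: str       # e.g. "simulation", "physical_test", "runtime_log"
    reference: str  # hypothetical identifier of the underlying artifact


@dataclass
class Claim:
    statement: str
    argument: str = ""  # why the subclaims or evidence support the statement
    evidence: list = field(default_factory=list)
    subclaims: list = field(default_factory=list)


def unsupported_claims(claim: Claim) -> list:
    """Return the statements of leaf claims that have no evidence attached."""
    if claim.subclaims:
        missing = []
        for sub in claim.subclaims:
            missing.extend(unsupported_claims(sub))
        return missing
    return [] if claim.evidence else [claim.statement]


root = Claim(
    statement="The robot behaves acceptably within its declared ODD",
    argument="Argue over each identified triggering condition",
    subclaims=[
        Claim(
            statement="A slippery floor is detected and speed is reduced",
            evidence=[Evidence("physical_test", "test-report-001")],
        ),
        Claim(statement="Low model confidence triggers a fallback stop"),
    ],
)
print(unsupported_claims(root))
# ['Low model confidence triggers a fallback stop']
```

A check like this does not judge whether the evidence is convincing. It only makes gaps visible, which is the minimum a reviewable safety case needs.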

4. Mobile Physical AI Is the Larger Background

In a recent proposal for a special guideline call, I defined “mobile physical AI” as AI’s ability to understand the laws of the real physical world and interact with that world, achieving autonomous perception, decision-making, motion, and execution in complex open environments.

That definition is broader than “robots.”

It covers intelligent vehicles, mobile robots, drones, logistics equipment, flying cars, and other mobile embodied systems. The real question is whether these systems can act safely, naturally, and effectively in the real physical world.

One key methodological framework here is HMRM: Human-like Mobility Reference Model. Its purpose is to establish comparable reference behavior distributions across different mobile bodies.

The paired concept is HLMB: Human-like Mobility Behaviour. It refers to the externally observable behavior of the evaluated object during tasks such as passing, avoidance, following, yielding, delivery, and handoff.

In short, HMRM is the reference model, and HLMB is the evaluated object. Together, they allow Safety, Smoothness, and Efficiency to enter the same cross-body evaluation chain.

This connects naturally to Robot SOTIF, because in many cases safety is not an isolated threshold. It is a behavior-distribution problem.
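As a rough illustration of what treating safety as a distribution can mean, the sketch below compares observed passing distances against a reference sample. It is not the HMRM definition. The statistic (the share of observed behavior that falls below a low quantile of the reference), the function name, and all numbers are assumptions made for this example.

```python
def tail_exceedance(reference, observed, quantile=0.1):
    """Share of observed values below a low quantile of the reference sample.

    The quantile uses a simple nearest-lower-rank rule on the sorted reference.
    """
    ref_sorted = sorted(reference)
    index = int(quantile * (len(ref_sorted) - 1))
    threshold = ref_sorted[index]
    below = sum(1 for value in observed if value < threshold)
    return threshold, below / len(observed)


# Hypothetical minimum passing distances in meters.
reference_distances = [0.8, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 2.0, 2.2]
observed_distances = [0.6, 0.9, 1.2, 1.5, 2.0]

threshold, share = tail_exceedance(reference_distances, observed_distances)
print(threshold, share)  # 1.0 0.4
```

In this toy data no single pass is a collision, yet 40 percent of the observed passes are closer than all but the closest tenth of the reference behavior. That is the kind of signal a pure collision count would miss.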

Take jerk, the rate of change of acceleration, as an example. In a passenger car, it relates to ride comfort. In a commercial vehicle, it also relates to cargo integrity. In a last-mile delivery robot, it relates to parcel stability. In a larger mobile robot, it may relate to chassis stability, equipment protection, and pedestrian expectation.
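Jerk is the time derivative of acceleration, so it can be estimated from uniformly sampled velocities with a second-order central difference. The sketch below assumes uniform sampling and noise-free data. The per-platform limits are hypothetical placeholders that only show how the same trajectory can pass in one context and fail in another.

```python
def jerk_from_velocity(velocities, dt):
    """Estimate jerk (m/s^3) as (v[i+1] - 2*v[i] + v[i-1]) / dt**2."""
    if len(velocities) < 3:
        raise ValueError("need at least three velocity samples")
    return [
        (velocities[i + 1] - 2.0 * velocities[i] + velocities[i - 1]) / dt**2
        for i in range(1, len(velocities) - 1)
    ]


def peak_abs_jerk(velocities, dt):
    """Largest jerk magnitude over the trajectory."""
    return max(abs(j) for j in jerk_from_velocity(velocities, dt))


# Hypothetical limits in m/s^3; placeholders, not recommended values.
JERK_LIMITS = {"passenger_car": 2.0, "delivery_robot": 0.8}

# Velocity in m/s sampled every 0.5 s, following v = 0.5 * t**2.
velocities = [0.0, 0.125, 0.5, 1.125, 2.0]
peak = peak_abs_jerk(velocities, dt=0.5)
print(peak)  # 1.0

for platform, limit in JERK_LIMITS.items():
    print(platform, "within limit" if peak <= limit else "exceeds limit")
```

With real sensor data the second difference amplifies noise, so a filtering step before differentiation would normally be needed.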

If we only ask whether the system collided with something, we will miss many of the safety problems that matter in real deployment.

5. What To Do Next

This theme is not a conclusion. It is an open problem entry for deeper research, corporate validation, and product implementation.

I hope researchers, enterprise engineering teams, evaluation organizations, and factory-site teams in related fields can treat it as a living list of open problems that keeps evolving.

There are many possible directions:

Classify Robot SOTIF scenarios and triggering conditions, and convert them into reviewable test items
Build ODD templates and deployment-admission checklists for mobile service robots
Evaluate approach, yielding, stopping, and avoidance behaviors for last-mile delivery robots
Design safety monitors and runtime fallback strategies for LLM/VLA robot task planning
Develop robot decision-safety evaluation protocols based on real-trajectory replay
Map HMRM across mobile robots and intelligent vehicles, and turn it into product evaluation metrics
Build robot safety-case templates for standardization and product admission

I will continue maintaining this theme. After the public notice period closes, if the project content changes, I will update it again.

In the short term, this is a standard project. In the long term, it may become a new research direction.

The robotics and mobile physical AI communities do not need to repeat every stumble that automated-driving safety has already made over the past decade.

The valuable transfer is not copying answers. It is reusing the problem awareness, evidence structure, and engineering discipline.
