VDA AI in QM: Germany Sets the Rules for AI First — What Does It Mean for China's Autonomous Driving Industry?

Abstract: In March 2026, Germany’s VDA published the global automotive industry’s first standardized guideline for AI quality management — VDA 20 AI in Quality Management (191 pages). This article provides an in-depth analysis of its AIQM three-tier risk classification, 80-item checklist, and 12 application cases. It examines the reference value for China’s autonomous driving industry and explores China’s leading advantages and window of opportunity in end-to-end evaluation methodologies and data infrastructure.

Figure 1

Full document download link at the end of this article

Chapter Guide

I. Background — What is VDA AI in QM? Why should the global automotive industry pay attention?

II. Core Framework — Risk classification, 80-item checklist, application cases, and the AI safety standards landscape

III. Reference Value for China’s AD Industry — Filling the AI quality management gap, change management classification

IV. Additional Considerations for China — Regulatory differences, organizational culture differences, data ecosystem differences

V. Challenges and Recommendations — Inherent limitations + end-to-end explainability, data drift, supply chain coordination

VI. China’s Leading Advantages — AI adoption speed, end-to-end evaluation, data infrastructure

VII. Key Takeaways — Action items for OEMs/Tier 1s, standards bodies, and researchers

VIII. One-Line Summary — Whoever sets the rules for AI on the road first defines the next decade

I. Background

In March 2026, the German Association of the Automotive Industry (VDA) published VDA 20 — AI in Quality Management (hereafter the “Yellow Volume”), the global automotive industry’s first systematic standardized guideline for AI quality management, totaling 191 pages.

Why does this matter?

The VDA standards carry influence in the global automotive industry on par with ISO standards — VDA 6.3 (Process Audit) and VDA 6.5 (Product Audit) are virtually mandatory for any automotive supplier entering the German market. The release of VDA 20 signals that the use of AI in automotive quality management is no longer optional — it now has a normative framework and evaluation methodology.

Figure 2: Yellow Volume chapter framework

II. Core Framework of the Yellow Volume

2.1 AIQM Three-Tier Risk Classification — Assigning “Safety Levels” to AI Systems

The Yellow Volume’s most significant contribution is the AIQM (AI in Quality Management) three-tier risk classification methodology, which evaluates AI systems across seven risk dimensions.

Seven Risk Dimensions:

Classification Results:

AIQM-3 (Highest requirements): Triggered when any of the seven dimensions rates “High” → All assessment items must be satisfied

AIQM-2 (Moderate requirements): A subset of assessment items applies

AIQM-1 (Baseline requirements): A minimal set of assessment items applies

A key insight: whenever an AI system involves functional safety at ASIL C/D or SOTIF type 3/4, it automatically triggers AIQM-3. By the same logic, AI components in L2+ driver assistance and L3+ autonomous driving that carry these safety ratings would need to meet the highest quality management tier.
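The classification logic above can be sketched in a few lines of Python. This is an illustration only, not VDA's own procedure: the dimension keys (`dim_1` to `dim_7`), the `Rating` scale, and the function name are hypothetical, and the rule that separates AIQM-2 from AIQM-1 is an assumption, since the article only states when AIQM-3 is triggered.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Hypothetical three-step rating scale for one risk dimension."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify_aiqm(ratings, asil=None, sotif_type=None):
    """Return 1, 2 or 3 for AIQM-1, AIQM-2 or AIQM-3.

    ratings: dict mapping seven dimension names to a Rating.
    asil: optional ASIL letter ("A".."D"); sotif_type: optional int.
    """
    if len(ratings) != 7:
        raise ValueError("expected ratings for seven risk dimensions")
    # From the article: ASIL C/D or SOTIF type 3/4 forces the top tier.
    if asil in {"C", "D"} or sotif_type in {3, 4}:
        return 3
    worst = max(ratings.values())
    # From the article: any dimension rated "High" triggers AIQM-3.
    if worst == Rating.HIGH:
        return 3
    # ASSUMPTION (not stated in the article): any "Medium" gives AIQM-2,
    # otherwise AIQM-1.
    return 2 if worst == Rating.MEDIUM else 1
```

For example, a system rated Low on every dimension lands in AIQM-1 under these assumptions, but the same system tagged `asil="D"` is lifted straight to AIQM-3.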

2.2 Eight-Stage Assessment Checklist — Roughly 80 “Hard Questions”

After AIQM risk classification, the second step is a detailed technical assessment of the AI system. VDA defines eight stages aligned with the AI system development lifecycle:

Application Domain Definition → Data Understanding → Data Collection → Data Preparation → Modeling → Evaluation → Deployment → Operations & Maintenance

Each stage contains specific assessment questions. For example:

Application Domain: Have realistic and achievable objectives been defined? Have explainability requirements been specified?

Data Collection: Is data collection adequately documented? Is it reproducible? Is version management in place?

Modeling: Is the model selection documented? Is hyperparameter tuning systematic?

Operations: Is there a continuous monitoring mechanism? How is data drift detected? How are changes managed?
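One way to make such a checklist operational is to store each question with its lifecycle stage and the lowest AIQM tier at which it applies, then filter by the system's tier. The sketch below uses a handful of the questions quoted above. The `min_level` values are invented for illustration, because the article does not say which items belong to which tier, and all names are hypothetical.

```python
from dataclasses import dataclass

# The eight lifecycle stages named in the article.
STAGES = (
    "Application Domain Definition", "Data Understanding", "Data Collection",
    "Data Preparation", "Modeling", "Evaluation", "Deployment",
    "Operations & Maintenance",
)


@dataclass(frozen=True)
class Item:
    stage: str
    question: str
    min_level: int  # ASSUMPTION: lowest AIQM tier at which the item applies


CHECKLIST = [
    Item("Application Domain Definition",
         "Have realistic and achievable objectives been defined?", 1),
    Item("Data Collection", "Is data collection adequately documented?", 1),
    Item("Modeling", "Is hyperparameter tuning systematic?", 2),
    Item("Operations & Maintenance", "How is data drift detected?", 3),
]


def applicable_items(level):
    """Items that apply to a system classified at AIQM-<level>."""
    return [item for item in CHECKLIST if item.min_level <= level]


def open_items(level, answers):
    """Applicable items not yet answered 'yes'.

    answers: dict mapping question text to True/False.
    """
    return [item for item in applicable_items(level)
            if not answers.get(item.question, False)]
```

With this layout, an AIQM-3 system sees every item, while lower tiers see progressively smaller subsets, which mirrors the tier descriptions in section 2.1.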

2.3 Twelve Application Cases — From Theory to Practice

Chapter 6 of the Yellow Volume provides 12 specific application cases for AI in automotive quality management, each described using a standardized template:

Figure 3

Each case description includes: Description, Framework Conditions, Added Value, Challenges, Implementation Process, Specific Examples, Change Management Plan, and Methods for Interpreting and Evaluating AI Output. The last two items are where VDA truly distinguishes itself — it doesn’t just tell you “how to use AI,” it also tells you “how to audit AI results” and “how to manage changes to AI.”

2.4 VDA 20’s Position in the AI Automotive Safety Standards Landscape

VDA 20 does not exist in isolation — to understand its value, it must be placed within the broader landscape of AI automotive safety standards.

A key finding: in its 191 pages, VDA 20 does not cite ISO/PAS 8800:2024, the world’s first automotive AI safety standard, published in December 2024. This is likely because VDA 20 had a lengthy development cycle (typically 2-3 years), and ISO/PAS 8800 appeared when VDA 20 was already in late-stage editing. For practitioners, however, VDA 20 and ISO/PAS 8800 should be used together:

VDA 20 answers: “How do we manage AI quality?”

ISO/PAS 8800 answers: “How do we ensure AI safety in vehicles?”

ISO 26262/21448 answers: “How do we ensure vehicle system safety?”

The three are complementary — none can be omitted.

III. Reference Value for China’s Autonomous Driving Industry

3.1 Filling the “AI Component Quality Management” Standards Gap

In China, the standards framework for functional safety (GB/T 34590, aligned with ISO 26262) and SOTIF (GB/T 43267-2023, aligned with ISO 21448) is largely established, but these standards primarily target traditional systems engineering approaches — they assume system behavior is deterministic and traceable.

End-to-end AI components break this assumption:

Model behavior is probabilistic, not deterministic

The same input may yield different outputs

The decision-making process is a “black box”

Model performance can “drift” over time

VDA 20 provides a systematic quality management framework for precisely these issues. It does not address “Is the AI safe?” (that falls under ISO 26262/21448/8800) — it addresses “How do we manage AI quality?”

3.2 Three-Tier Change Management Classification — Highly Practical

VDA defines a three-tier change classification for each application case:

Reference value for autonomous driving: OEMs currently push end-to-end model updates over the air (OTA) at high frequency, in some cases weekly, but the change management processes behind these updates are often inadequate. VDA’s three-tier classification can be adopted directly: classify each OTA model update as A, B, or C, and attach a different approval workflow to each tier.
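A minimal sketch of such a release gate is shown below. The criteria used to sort an update into A, B, or C are hypothetical placeholders, not VDA's definitions. The two-person requirement for Class A follows the dual-control practice described in section 4.2; the approver counts for B and C are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelUpdate:
    version: str
    architecture_changed: bool
    training_data_changed: bool


# Approvers required per change class. "A": 2 reflects the two-person
# confirmation mentioned in the article; "B" and "C" are ASSUMPTIONS.
REQUIRED_APPROVERS = {"A": 2, "B": 1, "C": 0}


def classify_change(update):
    """Assign a change class using HYPOTHETICAL criteria."""
    if update.architecture_changed:
        return "A"
    if update.training_data_changed:
        return "B"
    return "C"


def may_release(update, approvers):
    """True if enough distinct people have approved this OTA update."""
    needed = REQUIRED_APPROVERS[classify_change(update)]
    return len(set(approvers)) >= needed
```

Counting distinct approvers (via `set`) means the same person signing off twice does not satisfy a two-person rule.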

3.3 “Interpretation and Evaluation of AI Output” — Addressing the Trust Problem

VDA requires each application case to define “how to interpret and evaluate AI output.” This includes:

Transparency labeling of AI output (which outputs are AI-generated, with what confidence level)

Reference example validation (testing AI output against known correct answers)

Feedback mechanisms (users can flag AI output as “correct/incorrect”)

Trigger conditions for human review (under what circumstances must a human confirm)

This is highly relevant to SOTIF verification processes in autonomous driving — particularly when validating end-to-end model decisions, where explaining “why the model made this decision” remains a core challenge.
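Three of the four elements listed above (transparency labeling, reference example validation, and human-review triggers) can be expressed compactly in code. The sketch below is one possible shape, not a VDA-prescribed interface: the `AIOutput` fields, the 0.8 confidence threshold, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIOutput:
    value: str
    confidence: float          # model-reported confidence in [0, 1]
    ai_generated: bool = True  # transparency label


def needs_human_review(output, threshold=0.8):
    """Trigger human confirmation for low-confidence AI output.

    The 0.8 threshold is an arbitrary illustrative default.
    """
    return output.ai_generated and output.confidence < threshold


def reference_accuracy(predict, reference_examples):
    """Share of reference examples the model answers correctly.

    predict: callable returning an AIOutput for one input.
    reference_examples: list of (input, expected_value) pairs.
    """
    if not reference_examples:
        raise ValueError("need at least one reference example")
    hits = sum(1 for x, expected in reference_examples
               if predict(x).value == expected)
    return hits / len(reference_examples)
```

A user feedback mechanism could reuse the same structure by appending flagged outputs, together with the corrected value, to the reference set.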

IV. Additional Considerations for Adoption in China

4.1 Regulatory Differences

Key differences:

China does not have a unified AI regulation like the EU AI Act; instead, it has multiple fragmented regulations spread across agencies (CAC, MIIT, MOST). AIQM Risk Dimension 1 needs to be mapped to China’s regulations on algorithm filing and deep synthesis management.

China’s cross-border data transfer restrictions are stricter than the EU’s. If driving behavior data involves geographic location information, it may trigger the “important data” designation under the Data Security Law, restricting data export. This directly impacts Risk Dimension 2.

China’s autonomous driving market access system is evolving rapidly. The GB standard series and the Administrative Provisions on Intelligent and Connected Vehicle Access are being developed, and some elements (such as L3 access conditions) have not yet been finalized. The ISO 21448 classification referenced in AIQM Risk Dimension 7 needs to be mapped to China’s access tier system.

4.2 Organizational Culture Differences

VDA 20 Chapter 3 describes in detail the “organizational culture elements for AI adoption” — reflecting the German approach to systematic management. When implementing in China, note the following:

Germany emphasizes “process first” — define complete processes and approval mechanisms before rolling out. Chinese enterprises tend to prefer “use it first, standardize later.”

Germany’s dual-control review culture — nearly all Class A changes require two-person confirmation. Chinese enterprises may find this difficult to implement during rapid iteration.

Germany’s extremely high documentation requirements — VDA requires documentation at nearly every step. Chinese teams need to build the habit of maintaining “AI usage logs.”

4.3 Data Ecosystem Differences

The data environment VDA assumes differs from the reality in China:

VDA assumes data can be shared across organizations (e.g., suppliers providing quality data to OEMs) — China’s data silo problem is more severe

VDA assumes standardized data interfaces (e.g., MES/ERP integration) — the level of digitalization varies significantly among Chinese automakers

VDA does not address the specifics of Chinese NLP. In use cases 6.2 (8D Problem Solving), 6.9 (VDA Knowledge Q&A), and other NLP applications, word segmentation, polysemy, and domain-specific terminology are harder to handle in Chinese than in German

V. Challenges and Recommendations

5.1 Inherent Limitations of the Yellow Volume

5.2 Technical Challenges in China

Challenge 1: Explainability of End-to-End AI Models

VDA 20 Risk Dimension 4 requires “model explainability.” However, current end-to-end autonomous driving models (such as Tesla FSD) are fundamentally black boxes — which directly triggers an AIQM-3 rating. How to satisfy explainability requirements while maintaining the end-to-end architecture remains an unsolved technical challenge.

Possible directions:

Attention map visualization as partial explainability evidence

Building “evaluation benchmarks” (e.g., a Driver Foundation Model/DFM) to externally assess whether model behavior is human-like, rather than explaining the internal decision process

Challenge 2: Sustained Data Quality Assurance

VDA requires “data drift detection” and “continuous data quality monitoring.” For autonomous driving:

Training data distributions drift as cities expand, seasons change, and roads are reconstructed

A “data quality dashboard” needs to be established — similar to our typical scenario parameter convergence matrix — to continuously monitor the stability of data baselines
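As one concrete example of a drift check that could feed such a dashboard, the sketch below computes the Population Stability Index (PSI) between a baseline sample and a current sample of a single numeric feature. PSI is a technique chosen here for illustration; the article does not say which drift metric VDA or the authors use. The bin count and the `eps` floor are arbitrary defaults.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index of `current` relative to `baseline`.

    Bins are equal-width over the baseline range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread")
    width = (hi - lo) / bins

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is defined.
        return [max(c / len(values), eps) for c in counts]

    base, cur = shares(baseline), shares(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift; those cut-offs are conventions, not requirements from VDA 20, and would need to be calibrated per feature.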

Challenge 3: AI Quality Management Across the Supply Chain

The automotive industry is highly specialized. An autonomous driving system may incorporate AI components from multiple suppliers (different suppliers for driving automation and cabin intelligence). The VDA 20 framework currently operates at the single-system level and lacks a supply-chain-level AI quality management coordination mechanism.

VI. China’s Leading Advantages and Opportunities

6.1 AI Adoption Speed and Scale Far Exceed Europe’s

China’s automotive industry is adopting AI at a pace far beyond Germany’s:

Opportunity: China can build on the VDA 20 framework and combine it with the rich domestic AI application experience to develop AI quality management standards better suited to the end-to-end era — covering not only “AI as a tool in QM” but also “quality management of AI as a core vehicle function component.”

6.2 First-Mover Advantage in End-to-End Evaluation Methodologies

VDA 20 explicitly states that it “does not cover vehicle-function AI components,” which leaves a standards vacuum. China has already developed multiple research tracks in this area:

End-to-end behavioral evaluation: Multiple universities and companies are exploring the use of naturalistic driving data to build statistical benchmarks for “how humans actually drive,” serving as a reference for evaluating end-to-end system behavior — moving beyond “did it crash or not” to “does it drive like a human”

Scenario-driven SOTIF evaluation: Preventability assessment methods based on real crash data are transitioning from academic research to industry adoption (the implementation of GB/T 43267 provides the institutional foundation)

Mass-production data closed loop: Multiple automakers have established complete closed loops of “data collection → scenario mining → simulation verification → OTA updates,” accumulating extensive end-to-end evaluation experience

These methodologies can fill the “vehicle-function AI” gap left by VDA 20. Our team has also been exploring this direction — using aerial naturalistic driving data to build a Driver Foundation Model (DFM) that quantitatively evaluates how closely end-to-end systems approximate human driving behavior.

6.3 Differentiated Data Infrastructure Advantage

China’s data ecosystem has several structural advantages that Germany lacks:

Traffic scenario complexity: China’s mixed traffic (motor vehicles + non-motorized vehicles + pedestrians), urban road density, and driving behavior diversity far exceed Europe’s, meaning AI models trained and validated in China inherently face higher-difficulty scenarios

Traffic accident big data: The Ministry of Public Security’s system has accumulated nationwide crash data at a scale and coverage level that is globally rare

Roadside perception data: Intelligent connected vehicle demonstration zones (Beijing Yizhuang, Suzhou Xiangcheng, Changsha, etc.) have established roadside perception infrastructure at scale

Vehicle-end mass-production data: Massive data collected from multiple end-to-end production vehicle fleets creates a continuously growing data flywheel

Aerial naturalistic driving data: Multiple research groups, including our team, are building drone-based naturalistic driving datasets that provide “bird’s-eye view” multi-vehicle interaction behavior data

These data layers are complementary — crash data defines “where is it dangerous,” aerial data describes “how humans drive,” and vehicle-end data reveals “how the car drives” — together providing a verification foundation for AI quality management that Germany cannot match.

VII. Key Takeaways

For OEMs/Tier 1s

VDA 20 is not “something for the future” — it is something you need to start preparing for now. If your product uses AI components (whether for visual inspection or end-to-end decision-making), VDA 20 gives you a clear starting point: perform AIQM risk classification first, then evaluate item by item using the checklist.

For Standards Bodies

VDA 20 provides an excellent reference framework for developing China’s AI quality management standards — but it cannot be adopted wholesale. China needs to:

Add quality management methods for vehicle-function AI (explicitly not covered by VDA)

Map to China’s AI regulatory framework (replacing EU AI Act references)

Include special requirements for end-to-end models (explainability alternatives, OTA change management, etc.)

For Researchers

VDA 20 exposes a massive research gap — how to quantitatively assess AI system quality.

Current evaluation methods are mostly qualitative “yes/no” judgments, lacking quantitative metrics. From naturalistic driving behavior statistical benchmarks, to scenario preventability quantitative assessment, to statistical convergence methods for safety parameters — these directions all contain a wealth of scientific questions waiting to be answered.

VIII. One-Line Summary

VDA has set “rules” for AI, but these rules only cover AI in the factory and the office. AI on the road — end-to-end autonomous driving — is still waiting for its rules. Whoever defines them first will set the rules of the game for the next decade.

Figure 4

VDA 20 AI in QM Full Document Download Link:

https://vda-qmc.de/wp-content/uploads/2026/03/VDA-AI-in-QM_Yellow-Volume.pdf
