Trends: A Decisive Shift

The DoD’s new assurance frameworks move autonomy vendors from a “trust us” to a “prove it” model, focusing on the need for predictability, modularity and scalability.

Trust in autonomous systems cannot be built on assumptions.

When deploying unmanned vehicles, warfighters must know that, beyond a doubt, these systems will perform as expected—no matter the domain. A single unexpected move could be the difference between warfighters making it out of a contested environment safely and not making it out at all.

The new SkyCarrier autonomous drone launch and recovery system from Teledyne FLIR Defense has been successfully trialed on multiple maritime vessels, including fast attack boats and rigid hull inflatable boats (RHIBs) at speeds up to 50 km/h (30 mph), demonstrating its reliability in real-world conditions. Photo courtesy of Teledyne FLIR Defense.

So-called “trusted autonomy” enables operators to safely delegate to systems that are proven to be secure, reliable, controllable, lawful and predictable, Forterra Chief Growth Officer Scott Sanders said. It’s become essential, as “modern warfare moves at machine speeds, and the U.S. needs to be ready to move and react in seconds.”

“When communications drop, the network of vehicles and equipment will still have to follow the commander’s intent and make the right call in that moment,” he said. “If we can’t trust that, we’ll either move slow and lose or move fast and cause real, unintended harm.”

In defense applications, autonomy can be leveraged for anything from small weapons to large logistics drones, Joby Aviation Autonomy Lead Maxime Gariel said. Joby, for example, builds systems that can support logistics missions. All of these systems have to make decisions and react to changing environments, and vendors must ensure they make the right ones—especially as their abilities and use cases continue to expand—to keep surrounding aircraft and people safe.

The Department of Defense (DoD) has developed new assurance frameworks for autonomous vendors to follow, requiring them to prove their systems will work, without question, in every scenario. These frameworks reshape how autonomous systems, whether operating in the air, on the ground or in the water, are tested, validated and fielded. With this shift, trusted autonomy has become the prerequisite for autonomous systems to scale.

New DoD guidebooks signal a “decisive shift,” said Miguel Ángel de Frutos Carro, director and CTO at UAV Navigation-Grupo Oesía, and require autonomy vendors to “demonstrate not just technological capability, but also trust, resilience, cybersecurity and interoperability.” They also must prove they can perform in contested and degraded environments.

“In practice,” de Frutos Carro said, “this translates into systems that perform predictably under stress, are inherently cyber-secure, and can integrate seamlessly into joint and coalition operations.”

But demonstrating trust isn’t easy. It requires rigorous testing, evaluation, verification and validation, along with governance through regulation and engineering safeguards, Sanders said. While this can be a long, difficult process, it’s the only way to ensure autonomous systems predictably perform their jobs.

“Trusted autonomy in defense goes beyond whether it can drive a vehicle and intelligently execute on missions. It’s about whether soldiers, operators and regulators believe it will behave predictably and in the way it’s intended to when things go awry,” Sanders said. “Proving reliability in those moments requires millions of testing miles, explainable AI and bulletproof cybersecurity.”

Joby’s Superpilot-equipped Cessna 208 flies autonomously over the Pacific Ocean during the Resolute Force Pacific (REFORPAC) exercise. Photo courtesy of Joby Aviation.

Evolving Guidance

Guidebooks from the DoD move autonomy vendors from a “trust us” to a “prove it” model, Sanders said. Vendors now must show how their AI works, build in fail-safes and back everything up with transparent testing, cybersecurity and supply-chain evidence.

One such guidebook, published by the office of Developmental Test, Evaluation and Assessment (DTE&A), focuses on AI, providing “recommended practices for early and developmental test and evaluation (DT&E) of AI applications and AI-enabled systems,” according to the office website. The test and evaluation (T&E) guidance focuses on how AI models impact test strategy, planning, preparation, execution, analysis and reporting.

Comprehensive T&E for AI-enabled systems is a challenge. How soldiers leverage AI continues to evolve, but the tools required to test it are still emerging, according to the website. Traditional strategies and test designs may not work, and both testing and assessment must be iterative and continue even after fielding.

“Because you can’t exhaustively test AI under every condition, the data drifts, adversaries adapt, and comms can drop, it can be an extra challenge,” Sanders said. “With the DoD’s higher bar, we have to provide evidence of what systems will and won’t do, and offering evidence of performance in every possible scenario is an ongoing process. We do this by enabling the department to see exactly how we test our full instrumentation and validation pipeline.”

Other guidance focuses on cybersecurity as well. The DoD Artificial Intelligence Cybersecurity Risk Management Tailoring Guide, published in July, “establishes the DoD cybersecurity risk management tailoring guidance for the acquisition, development, use, sustainment, monitoring, and disposal of artificial intelligence (AI) systems” and “seeks to help DoD organizations manage cybersecurity risks in the use of AI systems throughout the system lifecycle and thus encourage warfighter trust.”

Cyber tailoring guidance, de Frutos Carro said, “underscores the need for resilience by design: zero-trust architectures, rejection of compromised inputs, secure supply chains, and autonomous fault management.” The company’s proprietary autopilot and Attitude Heading Reference System (AHRS) “embody this philosophy, ensuring that only trusted data is used for navigation and control, even in GNSS-contested environments.”
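The “rejection of compromised inputs” de Frutos Carro describes can be illustrated with a simple plausibility gate. The sketch below is purely illustrative—it is not UAV Navigation’s implementation, and the function names and 25-meter threshold are hypothetical—but it shows the basic idea: a GNSS fix is trusted only if it agrees with a dead-reckoned prediction, and a spoofed “teleport” jump is discarded in favor of the onboard estimate.

```python
from dataclasses import dataclass
import math

@dataclass
class Fix:
    x: float  # east position, meters
    y: float  # north position, meters

def plausible(predicted: Fix, measured: Fix, max_jump_m: float = 25.0) -> bool:
    """Accept a GNSS fix only if it lies within max_jump_m of the
    dead-reckoned prediction; otherwise treat it as spoofed or faulty."""
    return math.hypot(measured.x - predicted.x,
                      measured.y - predicted.y) <= max_jump_m

def fuse(predicted: Fix, measured: Fix) -> Fix:
    """Blend prediction and measurement when the fix is trusted;
    fall back to dead reckoning alone when it is not."""
    if not plausible(predicted, measured):
        return predicted  # reject the compromised input
    return Fix((predicted.x + measured.x) / 2,
               (predicted.y + measured.y) / 2)
```

A fielded system would use a statistically grounded gate (e.g., an innovation test inside a Kalman filter) rather than a fixed distance, but the trust decision—use only data that passes the check—is the same.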

Interoperability is another critical DoD demand, de Frutos Carro said: “The DoD stresses open architectures to simplify integration across multiple platforms and reduce training burdens.”

There’s also been an effort by the DoD to standardize interfaces to make it easier to purchase and integrate systems from various vendors so they can communicate with each other, Gariel said.

“Vendors are working with government partners, the DoD and others,” Gariel said, “to figure out how these systems of systems work together and how they can better build confidence in the end user.”

The DoD frameworks and guidebooks promote careful testing to ensure these systems will work in “real-world situations and can handle tough 21st century operational challenges,” said Anastacia MacAllister, technical director of autonomy and artificial intelligence for General Atomics Aeronautical Systems. And autonomy vendors are taking heed.

“In essence, these guidebooks codify what we have long prioritized,” de Frutos Carro said. “Resilient, cyber-secure and interoperable autonomy.”

UAV Navigation-Grupo Oesía’s VECTOR autopilots in the production chain at one of its centers of excellence. Image: UAV Navigation-Grupo Oesía

The Response

Defense customers, MacAllister said, have sent a clear demand signal: create secure systems that can handle tough situations on their own while keeping humans in control.

These systems must earn trust before they can be fielded at scale, and that is what new DoD assurance frameworks aim to achieve. Autonomy vendors have adjusted, with trusted autonomy now directly shaping how companies like UAV Navigation develop technology roadmaps.

To support trusted autonomy, the company has retooled development and assurance pipelines to be adaptive and responsive, de Frutos Carro said. Simulations, hardware-in-the-loop testing, daily flight validations and feedback from operational deployments allow the company to refine systems and ensure they remain robust and predictable even in contested or changing environments.

For Forterra, the new DoD demands mean demonstrating clear safety limits, real human oversight, cyber resilience, and supply-chain integrity, Sanders said. Long duration, full system testing also has become critical to ensure safety and reliability.

Systems should be tested daily to ensure they can be trusted in the field, Sanders said. Once deployed, autonomous systems will have to operate for weeks, even months, without on-site support. Warfighters “must build trust with their robotic counterparts to work every time.” Long-term testing ensures both hardware and software remain robust, and can demonstrate the “technical endurance” required for defense operations.

“Autonomous systems will undoubtedly face edge use cases, especially on the battlefield,” Sanders said. “If a system isn’t being tested every day, it won’t have the data and validation needed to enable an end user to trust the platform.”

General Atomics executes live test flights at least monthly, MacAllister said, on platforms like the MQ-20 Avenger. Pilots receive hands-on experience working alongside autonomous systems during these flights, preparing them for a successful pairing on the battlefield.

“This [live test flights] helps them build familiarity and develop a shared mental model,” MacAllister said. “Ultimately, these exercises help operators train like they will fight, with autonomy, making it easier to trust the system as a reliable teammate.”

The DoD’s new assurance frameworks reshape the focus from theoretical capability to demonstrable, field-proven performance, de Frutos Carro said. The work isn’t done once the solution is delivered; it’s critical to continuously engage with users and monitor the systems so adjustments can be made as needed.

“They require that autonomous systems not only meet specifications in controlled environments but also prove resilience in flight tests, operational scenarios, and contested conditions such as GNSS denial, jamming and spoofing,” he said. “Importantly, these evaluations are increasingly observed or conducted directly by DoD personnel, ensuring an independent layer of trust.”

Image: General Atomics Aeronautical Systems

It Won’t Be Easy

One of the biggest challenges in developing autonomous systems, de Frutos Carro said, is the pace of change in operational requirements.

“Threats, countermeasures and mission profiles evolve rapidly, so vendors must be able to adapt architectures, software and integration pathways without disrupting safety or reliability,” he said. “This demands internal processes that are both agile and disciplined, capable of managing complexity while maintaining strict assurance.”

Integrating across diverse platforms and ecosystems also can be difficult, de Frutos Carro said. Autonomous systems must be able to integrate with other vehicles, command systems and communications networks. Adding interoperability while also “preserving security and predictability adds significant complexity,” and has become “a critical driver of cost and schedule.”

The greatest pressure, de Frutos Carro said, typically comes in late-stage integration and customer acceptance. This is where costs and delays “hit hardest” as “aligning system behavior with real-world mission requirements, training users and demonstrating performance under operationally relevant conditions can be more resource-intensive than the design itself.”

“The mandate for trusted autonomy completely reshapes development,” Sanders said. “There’s an inherent increase in cost, time and complexity, but it creates a moat, meaning if your technology doesn’t pass muster, it’s dead in the water. But if you nail it, you have a deployable product that will be trusted and in demand.”

Technology must be built with the warfighter in mind, MacAllister said, which means it must be “safe, reliable, easy to work with and have the ability to explain its decisions.” This need for explainability and trust can help guide algorithmic choices, user interface designs or logging strategies.

“The challenge with trusted autonomy,” MacAllister said, “is making sure these systems can handle unexpected situations, stay protected from cyber threats, and explain what they’re doing so people can trust them.”

Another challenge is building confidence in the end user, said Teledyne FLIR’s Rich Cunha, who oversees SkyCarrier, a new autonomous launch and recovery platform. Users must understand how these systems of systems work together, and not feel the need to constantly monitor them.

Transparency will help autonomy vendors build that trust, starting from day one of development. And that trust must be built on evidence.

“They [the DoD] want to be sure autonomous systems are safe and reliable in the field; they must work in both a lab and the real world,” Sanders said. “Focusing on testing the technology, how humans interact with it, and how it all fits into bigger systems makes validation faster and clearer, and it gives everyone more confidence when these systems are actually deployed.”

The solution: data-centric architecture. Image: RTI

Enabling Autonomy

The DoD has also put a focus on upgradability, MacAllister said, “to enable rapid best of breed capability insertion for the warfighter.”

And the DoD wants those upgrades to happen much faster than in the past. Twenty years ago, that meant reducing development time from 10 years to just a few, said David Barnett, VP of market intelligence for RTI. Now, the focus is on a microservices architecture, making systems faster to update—within months, weeks or even days.

RTI enables autonomy via highly modular communication software developed to connect sensor data to the applications that process it in real time. Autonomous systems must be very loosely coupled; if you’re going to add a new AI algorithm, you don’t want to have to change the software in the camera or LiDAR sensor that feeds the new AI application.
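The loose coupling described above is the essence of data-centric publish/subscribe middleware such as DDS, the open standard RTI’s products implement. The minimal sketch below is not RTI’s API—the `DataBus` class and topic name are invented for illustration—but it shows the pattern: a sensor publishes to a named topic, and a new AI consumer subscribes to that topic without either side holding a reference to the other.

```python
from collections import defaultdict
from typing import Any, Callable

class DataBus:
    """Minimal topic-based publish/subscribe bus: producers and
    consumers share only a topic name, never a direct reference."""
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, sample: Any) -> None:
        # Deliver the sample to every handler registered on this topic.
        for handler in self._subscribers[topic]:
            handler(sample)

bus = DataBus()
detections = []

# A new AI application subscribes to lidar data; the sensor driver
# publishing that data needs no modification at all.
bus.subscribe("lidar/points", lambda pts: detections.append(len(pts)))
bus.publish("lidar/points", [(1.0, 2.0, 0.5), (3.1, 0.2, 1.8)])
```

Swapping in a different perception algorithm means registering a new subscriber on the same topic—the decoupling that makes rapid capability insertion practical.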

“The one thing that’s certain, and especially with the rapid advancements in AI, is that you’re going to want to evolve functionality quickly over time,” Barnett said, “so you can field new capabilities fast to respond to new competitive threats but also as the technology itself improves.”

He’s also seen a push for enhanced security. These systems aren’t safe if they’re not secure, he said, noting “safety and security go hand in hand, especially if you start to look at collaborative autonomy.”

You can’t always depend on having connectivity between the operator and the vehicles themselves, Barnett said, but the vehicles may be collaborating. Ensuring comms between the systems is secure can be a challenge as they’re operating in real time and often over networks that aren’t reliable.

“When you’re deploying something at the edge in a mission environment where systems are very dynamic, you can’t depend on any cloud or central service,” Barnett said. “Everything has to communicate sort of ad hoc and peer to peer over challenging networks and still be secure.”

He continued: “We’ve had to put a lot of effort into coming up with security approaches that can meet those requirements. You can’t just take what you’re using in your web browser and put that into your unmanned vehicle and expect it to talk securely and reliably with another unmanned vehicle in real time.”
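The kind of lightweight, connectionless message authentication Barnett alludes to can be sketched with a keyed hash: each vehicle tags its outgoing messages with an HMAC computed over a pre-shared key, and peers silently drop anything whose tag fails to verify. This is an illustrative standard-library sketch, not RTI’s security implementation; a real deployment would use certificate-based key management and per-topic access control rather than a single shared key.

```python
import hashlib
import hmac
import json

PSK = b"pre-shared-mission-key"  # hypothetical key, provisioned before launch

def sign(message: dict) -> bytes:
    """Serialize the message and append a hex-encoded HMAC-SHA256 tag."""
    body = json.dumps(message, sort_keys=True).encode()
    tag = hmac.new(PSK, body, hashlib.sha256).hexdigest().encode()
    return body + b"|" + tag

def verify(packet: bytes):
    """Return the message if the tag checks out, else None (drop it)."""
    body, _, tag = packet.rpartition(b"|")
    expected = hmac.new(PSK, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return None  # tampered or forged: discard without processing
    return json.loads(body)

# A peer accepts an authentic position report but rejects a tampered one.
packet = sign({"vehicle": "uv-07", "lat": 36.6, "lon": -121.9})
tampered = packet.replace(b"36.6", b"35.0")
```

Because verification needs no round trip to a central server, the check works peer to peer over intermittent links—exactly the environment Barnett describes.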

Working Together

Companies like Teledyne FLIR are focused on enhancing collaboration between system sensors and outside sensors, Cunha said, so they can provide warfighters with a more complete picture.

An example, he said, is mine clearing. Various sensors are leveraged to detect hazards between point A and point B, providing feedback on high and low risk areas.

“We’re quickly seeing a desire from our customers to move from single system autonomous systems to a system of systems approach, where you have many systems working together toward an end goal,” Cunha said. “That adds another layer of complexity, especially if the systems aren’t from the same OEM.”

That’s another reason defining standards is so critical, Cunha said. Everything must be as interoperable as possible.

SMET vehicles fully integrated with Forterra’s AutoDrive capabilities. Image: Forterra

Looking Ahead

Manufacturers will continue to build updatable systems for the defense industry that are “much more agile and flexible from a software perspective,” Barnett said. Hardware is more difficult to upgrade, especially if you’re talking about legacy systems. Equipment in the field will receive algorithm updates as needed. Barnett likens it to what we already experience with consumer devices.

Autonomous systems will be leveraged in a growing number of defense applications, Cunha said, and will have various levels of autonomy. Assurance of these systems will shift toward more continuous and collaborative models, de Frutos Carro said.

“Rather than relying solely on late-stage certification milestones, assurance will be distributed across the lifecycle supported by digital engineering, model-based certification, and closer feedback loops between developers, operators and regulators,” de Frutos Carro said. “This approach should help contain costs, accelerate fielding, and deliver systems that evolve at the speed of operational needs.”

And while meeting these standards, now and as they evolve, presents challenges and more work for vendors, it’s time and money well spent. When vendors prove autonomy can be trusted, time and time again, it opens the door to scale and, most importantly, helps ensure our soldiers stay safe.

“Demonstrating trust is a lot of work from the very beginning to make sure we have the right processes in place and that those processes are set up correctly,” Gariel said. “It does slow down the initial development, but over time, we get a more secure system and something that’s less prone to attacks. I think that’s pretty beneficial.”