3. Evaluation

The design evaluation process is intended to assess and validate the system’s architecture, or to differentiate between various design alternatives, thereby facilitating incremental improvements in successive development cycles. The SCE methodology outlines two key components pertinent to system evaluation: (1) the prototype and/or simulation, and (2) the evaluation, which details the assessment approach and findings.

Within the SYNERGISE project, TNO's focus is on developing a framework that describes optimal human-machine teaming (HMT) for the human-AI system, not on developing a complete human-AI system. The evaluation in this section therefore focuses on evaluating and iterating on the envisioned HMT framework, which encompasses team design patterns and describes what information users want from the system and how they want it communicated.

Within the SYNERGISE project, initial assessments were conducted through CFTs (Component Field Tests), followed by comprehensive SFTs (System Field Tests). The CFTs involved only selected technologies, whereas the SFTs integrated all developed technologies for thorough testing.

This evaluation chapter presents all prototypes created for the HMT framework, together with their corresponding CFTs. The prototype documentation illustrates the concepts and principles of HMT, while the test reports detail the interactions with users and summarize the results obtained.

=== CFT1: Data Visualisation ===

The first CFT evaluated how sensor data (specifically health, location, and communication data) can support decision-making in USAR operations at the tactical and operational levels. Questionnaires were used to determine users' data needs and preferences for [[visualization formats>>doc:.a\. Prototype.WebHome]]. The [[study found>>doc:.b\. Test.WebHome]] that different roles require different types and levels of data detail: tactical users prefer aggregated overviews, while operational users need real-time, detailed information.

=== CFT2: Concept of Operations ===

During the second CFT, the Concept of Operations (ConOps) was evaluated. Through [[structured sessions>>doc:.d\. Test.WebHome]] involving presentations of multiple [[use cases>>doc:.c\. Prototype .WebHome]] and participant feedback, the study explored operational feasibility, safety, and team dynamics. [[Key findings>>doc:.d\. Test.WebHome]] highlighted the importance of context in technology deployment, the need for rapid and logical task distribution, and the critical role of human oversight.

=== CFT3: Two Detailed Use Cases ===

The third CFT evaluated [[two use cases>>doc:.e\. Prototype .WebHome]]: autonomous indoor drones for victim detection, and physiological and environmental sensors for monitoring first responder health. The study assessed usability, trust, and integration into emergency workflows through structured sessions with firefighters, USAR personnel, and drone pilots. [[Key findings highlight>>doc:.f\. Test.WebHome]] that while both technologies show promise, their success depends on seamless integration into operational processes, respecting roles, protocols, and situational constraints, so that they become trusted tools in the field.

=== CFT4 and onwards: Human-Machine Teaming ===

In total, four use cases with claims regarding their human-machine teamwork have been developed using the SCE method. The upcoming CFT4 will further explore human-machine teaming with a number of additional technologies, such as the ANYmal and the snake robot. All use cases will also be iterated on during the coming IFTs (Integration Field Tests).
