Abstract

This brief extends prior research by the authors on studying the impacts of interventions provided by either a human or an artificial intelligence (AI) process manager on team behaviors. Our earlier research found that an AI process manager we created matched the capabilities of human process management. Here, these data are studied further to identify the impact of different types of interventions on team behaviors and outcomes. This deeper dive is done via two unique perspectives: comparing teams’ problem-solving processes before and after interventions are provided, and through a regression analysis between intervention counts and performance. Results show overall mixed adherence to the provided interventions, with adherence also depending on the intervention type. The most significant impact on the team process arises from the communication frequency interventions. Furthermore, a regression analysis identifies the interventions with the greatest correlation with team performance, indicating a better selection of interventions by the AI process manager. Paired together, the results show the feasibility of automated process management via AI and shed light on the effective implementation of intervention strategies for future development and deployment.

1 Introduction

Process management can mitigate process deficiencies in team problem solving [1]. In fact, our earlier research has shown that an artificial intelligence (AI) agent acting as a process manager can perform just as well as its human counterparts in the same role [2]. Such hybrid teaming, which coalesces the strengths of humans with AI agents, is emerging as one way to increase team performance across a variety of research domains and tasks [3–5]. Compared to humans, we posit that the advantage of automating process management via an AI agent is twofold. First, AI possesses the ability to track multiple metrics simultaneously and over time. Second, AI can ascertain underlying patterns within complex data, such as team communication, that may not be readily perceivable via direct inspection from a human manager.

Accordingly, this current work gains deeper insights on the impact of the process manager interventions from our prior AI process manager study [2,6]. In that work, results compared the differences between a constructed AI process manager agent and human process managers during an interdisciplinary drone design and path-planning problem. The comparisons encompassed overall team performance, manager intervention strategy, and perceived effectiveness of the interventions, with results revealing similar team outcomes and manager behaviors across many of these dimensions. The AI-managed teams’ performance even trended towards being marginally better than the human-managed teams after a sudden change to the problem constraints. Post-study questionnaires with team members showed that they perceived themselves as following the interventions provided by the managers. However, the analyses to date have not established a relationship between the managers’ interventions and the actual intended changes produced in the behaviors of the teams. Consequently, this current paper examines this relationship and investigates how certain interventions selected by the managers correlate with team performance.

To work towards these goals, Sec. 2 provides a high-level overview of the earlier AI manager study, including the collected data and results. Then Secs. 3.1–3.3 present the methods used to analyze interventions and the before and after impacts on the team process. Section 3.4 shows a regression analysis exploring the correlations between the managers’ selection of interventions and team performance. Finally, Secs. 4 and 5 conclude with implications of automated process management and suggestions for furthering its development.

2 Data Source—Prior Artificial Intelligence Manager Study

While only a brief overview of the experiment and methodology is provided here, more details can be found in the original work [2]. During our prior study, participants worked together via an online, collaborative research platform called HyForm. In interdisciplinary teams of five members, participants designed drones and built routes to maximize the profit from food and parcel deliveries to customer markets [7]. Within the HyForm platform, team members communicated with each other via a chat interface through channels to specific team members. Furthermore, HyForm recorded every distinct action from each team member. Because members could only act or chat with other team members within HyForm, researchers could completely reconstruct the teams’ problem-solving behaviors and processes.

The problem-solving teams consisted of two drone designers who built drones, two operations specialists who designed delivery plans with these drones, and one problem manager who submitted final plans and communicated between the two disciplines. The experiment consisted of two 20-min sessions, between which a change to the customer market imposed a “problem shock.” For the first session, teams began with an initial $15,000 budget and received $100 in profit per pound of package delivered and $200 in profit per pound of food order delivered. During session 2, following the shock, teams solved the same problem, but the cost of drones decreased by 70% and deliveries switched to lower-weight, personal protective equipment for COVID-19.

Two experimental conditions dictated whether a human (15 teams) or the AI process manager agent (16 teams) oversaw the problem-solving process of a team. At pre-determined intervals, the process managers (whether human or AI) could intervene from a prescribed list of interventions. The list incorporated 12 possible interventions to choose from, six related to team communication and six related to team member actions (see Sec. 3 and Table 3). During these discrete instances, the managers could choose not to intervene when they felt it was unnecessary or inopportune for their team. An earlier study by the authors, which compared human-only and AI-assisted teams on a similar task using HyForm, inspired the types of interventions populating this list [8]. The process managers observed/tracked several measures in real-time, including features of teams’ communication and team members’ actions. Starting at five minutes into each problem-solving session, the managers could intervene every 2.5 min, allowing for 12 possible instances. The human managers used an interface in the HyForm platform that enabled them to see real-time team communication and action data. While the human managers took this observational approach in determining when to intervene, the AI agent was trained on earlier problem-solving data to determine appropriate interventions. See Gyory et al. [2] for details on the training process of the AI agent.

Results uncovered remarkable similarities between the two types of managers [2]. These similarities extended across a variety of dimensions, including team performance, perceived effectiveness of the interventions, and general process manager behaviors. Figure 1 shows the similarity in the distribution of intervention types provided by the managers. However, some differences emerged: a trend towards higher performance from the AI-managed teams, the number of interventions provided by each type of manager, and to whom the managers provided interventions. In total, the human process managers intervened 127 times, using all 12 interventions from the prescribed set, while the AI agent intervened 167 distinct times, but only used 8 out of the 12 interventions. Both manager types focused 40% more on communication-based than action-based interventions (Fig. 1(a)). Figure 1(b) depicts the proportion of interventions across all managers (combining conditions), dividing communication-based interventions into those related to communication frequency and those related to communication content. These categorizations are determined based on what the interventions are meant to promote within team behaviors. While the earlier work provided this holistic overview of the impact of interventions on team performance, the goal of this current work is to study the direct impacts of the interventions from a new and more granular perspective. These impacts include adherence by team members and correlations with performance.

3 Methods and Results

Section 3 studies the impacts of the managers’ interventions (combining conditions) on team process, analyzing team behaviors that directly relate to a specific intervention, immediately before and following that intervention. As a different comparison is taken depending on the intervention type, the sections are broken down by categories (shown in Fig. 1(b)). Section 3.1 studies the interventions related to communication frequency, Sec. 3.2 studies those related to communication content, and Sec. 3.3 studies the action-based interventions. Lastly, Sec. 3.4 presents the regression analysis identifying the correlations between specific interventions and team performance across manager types.

3.1 Communication Frequency Interventions.

The first set of interventions relates to communication frequency:

  1. “Drone designers, please try to communicate with each other more.”

  2. “Ops team, please try to communicate with each other more.”

  3. “Hi problem manager, please try to communicate with your team more.”

The process managers offer these 117 times, constituting 35% of the total interventions. As these interventions intend to increase communication in a specific discipline (either drone designers, operations specialists, or the problem manager), their impact is determined by directly measuring the communication frequency of all members in a discipline before and after the intervention. Communication frequency is a strict count of the number of distinct chat communications, regardless of content. Figure 2 shows the total communication count across all 117 instances, each bin representing 30 s of data. The figure only reflects the communication within the specific disciplines receiving one of these interventions. For example, for a specific team, if “Ops team, please try to communicate with each other more,” is given, only the communication of the operations discipline is considered during the timeframe, rather than the entire team’s discourse.

In Fig. 2, each bin contains 30 s of real-time data, ending at the time labeled on the x-axis. For example, the bin labeled 30 s contains the communications occurring between 0 s and 30 s after an intervention. The figure shows a sharp, over threefold increase in communication frequency within the 30 s following these interventions, and the increase lasts for up to the succeeding 90 s. This threefold increase indicates a large and immediate influence of this intervention type on the communication frequency of the teams.
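The binning described above can be sketched in a few lines. The following is a minimal Python sketch, not the authors’ analysis code; it assumes chat events are available as sorted timestamps (in seconds) for the discipline that received each intervention, and labels each 30-s bin by its end time, as in Fig. 2.

```python
from bisect import bisect_left


def binned_comm_counts(chat_times, intervention_times, bin_s=30, n_bins=3):
    """Total chat counts in 30-s bins around each intervention.

    chat_times: sorted timestamps (s) of chats from the discipline
    that received the intervention. Each bin [lo, hi) is labeled by
    its end time relative to the intervention (e.g., 30 = 0-30 s after).
    """
    counts = {}
    for t0 in intervention_times:
        for k in range(-n_bins, n_bins):
            lo, hi = t0 + k * bin_s, t0 + (k + 1) * bin_s
            n = bisect_left(chat_times, hi) - bisect_left(chat_times, lo)
            label = (k + 1) * bin_s  # bin labeled by its end time
            counts[label] = counts.get(label, 0) + n
    return counts
```

Summing these counts over all 117 intervention instances yields a histogram of the form shown in Fig. 2.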

3.2 Communication Content Interventions.

The next set of interventions relates to communication content:

  4. “Team, I think you should try focusing more on adjusting the design parameters to meet the goals of the problem, and share this with each other (cost, capacity, speed, budget, weight, etc.).”

  5. “Team, try focusing more on your strategy. Try optimizing and increasing/decreasing size of components and share this with each other.”

  6. “Hi team, try sharing your goals with each other a bit more and make sure they’re aligned.”

The managers offer these 96 distinct times, constituting 29% of the total interventions used. These interventions concentrate on the content of the discourse amongst the team rather than purely on its frequency. Here, the entire 2.5-min windows before and after an intervention are used rather than 30-s increments. More communication data are necessary to make semantic comparisons, and this is also the largest window of time that can be used without overlapping interventions, since possible interventions occur every 2.5 min.

Measuring the impact of the first two interventions involves identifying specific keywords related to design parameters and design strategy, respectively. Non-exhaustively, these include keywords such as “velocity,” “payload,” “miles,” “houses,” and “profit” for design parameters, and “increase,” “decrease,” “minimize,” “optimize,” and “balance” for design strategy. Earlier problem-solving studies conducted with HyForm motivated these keyword sets, which were selected by the authors [8]. The final communication content intervention focuses on the alignment, or cohesion, of the team. Algorithmically, the AI agent triggers this intervention through a natural language processing technique, latent semantic analysis (LSA), to measure discourse similarity [9]. LSA has previously been used to study design communication, which is why it is chosen here [10,11]. The overall team similarity is measured by averaging across all pairwise similarity comparisons amongst the members of a team.
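The averaging over pairwise similarities can be illustrated with a simplified sketch. Full LSA projects term vectors into a reduced semantic space before comparison; the stand-in below uses raw bag-of-words cosine similarity instead, so it is an illustration of the pairwise-averaging step only, with hypothetical inputs (one concatenated chat string per member within a 2.5-min window).

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def team_cohesion(member_texts):
    """Average pairwise discourse similarity across all team members.

    member_texts: one string per member, concatenating that member's
    chats within the analysis window. (LSA would first map each
    bag-of-words vector into a reduced semantic space.)
    """
    vecs = [Counter(t.lower().split()) for t in member_texts]
    pairs = list(combinations(vecs, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

A team whose members use identical vocabulary scores near 1, while fully disjoint vocabularies score 0; the LSA-based measure used in the study additionally credits semantically related, non-identical word choices.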

To quantitatively assess the impact of all three of the communication content interventions, the net change between the 2.5-min windows preceding and following an intervention is computed. A positive net change indicates an increase in a measure while a negative net change indicates a decrease. The assumption is that a positive influence from the process manager will cause an increase, or positive net change, in the corresponding measure (i.e., higher cohesion or more keyword usages). After computing the net change across all 96 instances of these three interventions, results indicate little impact on cohesion, with a net increase in similarity of only +0.05 (whereas the LSA similarity measure can range from −1 to 1). The design strategy keywords show no net change (+0 keywords). However, there exists a larger impact from the design parameter interventions, with a total net increase in design parameter keyword usage of +104 across all design teams.

3.3 Design Action Interventions.

The last set of interventions analyzed is design action interventions, offered by the process managers 80 distinct times and constituting 24% of the total interventions. These include:

  7. “Drone designers, it would be helpful if you can continue working on and refining your drone designs a bit more.”

  8. “Hey drone design team, I would recommend evaluating and submitting your current design and starting fresh.”

  9. “Hey drone design team, check out the suggestions from the drone design agent.”

  10. “Ops planners, it would be good to continue working on and refining your plans a bit more.”

  11. “Hey operations team, I suggest that you try evaluating and submitting your plan and starting fresh.”

  12. “Hey operations team, try running the path-planning agent to help.”

These interventions intend to elicit specific actions from team members. For example, “Hey operations team, try running the path-planning agent to help,” intends to push the operations discipline to run an assistive, path-planning agent in HyForm. Accordingly, whether a specific action occurs within a 1-min period after an intervention determines the effectiveness of the intervention. The assumption underlying the 1-min interval is that beyond that period, the action is less likely to have been caused by the input from the process manager. Moreover, design actions occur more frequently than team communication, so these smaller time intervals can be used to better study the immediacy of any effects. The exception to this approach involves the two interventions related to drone/path iteration (interventions 7 and 10). Instead of looking for one specific action, the impact is measured with an aggregate count of any action related to drone/path plan iterations within the 1-min intervals.
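The binary adherence check described above reduces to a window test over the action log. The sketch below is illustrative only: the action names and the `(timestamp, action)` log format are hypothetical stand-ins for HyForm's recorded action data.

```python
def intervention_followed(intervention_time, action_log,
                          target_action, window_s=60):
    """True if the intended action occurs within the window.

    action_log: list of (timestamp_s, action_name) tuples recorded
    for the discipline receiving the intervention.
    """
    return any(
        name == target_action and 0 <= t - intervention_time <= window_s
        for t, name in action_log
    )
```

For the iteration interventions (7 and 10), the same window would instead aggregate a count over any iteration-related action rather than test for a single action type.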

Results across the set of action-based interventions differ quite dramatically. Figure 3 shows the overall action count for the design iteration-based interventions. The designation of time intervals on the x-axis in the figure follows the same nomenclature as that in Fig. 2. As shown in Fig. 3, there is no substantial increase in the aggregate of design iteration actions following these specific interventions. In fact, the trend across time shows a decrease, indicating that these interventions are not well followed by teams. Again, the figure reflects the actions within the specific disciplines that receive one of these interventions rather than over entire teams.

Table 1 (rows one to three) presents the results for the remaining four design action interventions. Since these interventions are intended to induce one specific type of design action, the impact is considered in a binary sense (i.e., whether the intended action occurs within the 1 min period). In the first row, the table presents the total number of intervention instances. The second row shows the total number of effective instances where the intended action occurs within one minute. The third row shows the percentage of effective interventions. Out of these four interventions, the only one showing high adherence from team members prompts the drone designers to evaluate and submit their designs (column 1). Otherwise, the interventions show low levels of adherence. These results sharply contrast with team members claiming to follow the interventions as indicated through the post-study questionnaires. However, the questionnaires did not evaluate at the individual intervention level, but rather overall adherence.

In addition to quantifying these interventions in a binary sense, time-to-action is also explored. The time-to-action disregards the 1-min interval assumption and measures the time between when the process manager provides the intervention and the first time that the intended action actually occurs. The underlying assumption is that if these actions are caused by the process managers, as opposed to random or natural occurrences, the time-to-action will be shorter during the time periods where one of these four specific interventions is provided. Rows four and five of Table 1 provide this additional perspective. Row four shows the time-to-action for the specific time intervals when these four interventions occur. Row five provides the time-to-action across all intervention intervals, including the intervals when other interventions occurred. Overall, the time-to-action is revealed to be shorter during the specific instances of occurrence (row four) for three out of the four interventions. It should be noted that statistical analysis of this data presents challenges with these small sample sizes. Thus, while these results are descriptive, more data should be collected to allow for more definitive, inferential assessments.
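Time-to-action differs from the binary check only in dropping the window cutoff and returning the delay to the first matching action. A minimal sketch, again assuming a hypothetical `(timestamp, action)` log format:

```python
def time_to_action(intervention_time, action_log, target_action):
    """Seconds from an intervention to the first later intended action.

    Returns None if the intended action never occurs afterward,
    so absent actions can be excluded from the row-four/row-five
    averages rather than treated as zero delay.
    """
    delays = [t - intervention_time
              for t, name in action_log
              if name == target_action and t >= intervention_time]
    return min(delays) if delays else None
```

Averaging these delays over the intervals where a given intervention was issued gives row four of Table 1; averaging over all intervention intervals gives row five.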

3.4 Regressing Interventions and Team Performance.

Linear regression models are also trained on the data to determine the correlation across the set of interventions with team performance. The overarching rationale is that certain interventions may be more predictive of team performance than others. The predictor variables in the model are the counts of the number of times managers use each of the 12 interventions for a team, including “No intervention”, totaling 13 independent variables. Recall that at any decision point, the process managers could choose not to intervene. Maximum profit represents the response variable as the indicator of team performance. A separate linear regression model is trained for each problem-solving session and each team condition, resulting in four models.
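The model structure is ordinary least squares over per-team intervention counts. The following is a minimal NumPy sketch of one such fit, not the authors’ implementation; in the study each design matrix would have 13 count columns (the 12 interventions plus “No intervention”) and one row per team in a given session and condition.

```python
import numpy as np


def fit_intervention_model(X_counts, profits):
    """OLS fit of team profit on intervention counts.

    X_counts: (n_teams, n_interventions) array of per-team counts.
    profits:  (n_teams,) maximum profit per team.
    Returns (coefficients, intercept, r_squared); each coefficient
    estimates the change in profit per single use of that intervention.
    """
    X = np.column_stack([np.ones(len(profits)),
                         np.asarray(X_counts, float)])
    y = np.asarray(profits, float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta[1:], beta[0], 1.0 - ss_res / ss_tot
```

With 15-16 teams per condition and 13 predictors, such models are close to saturated, which is one reason the parameter estimates in Table 3 should be read descriptively.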

Table 2 presents the model statistics results from the four linear regressions, including the R2-values, F-statistics, and p-values. For both process manager conditions, the models for the first problem-solving session are significantly more predictive of overall performance than the models for the second problem-solving session (measured by their R2 values). Due to this difference, these two models are analyzed further to gain insights into the effects of interventions on team performance and differences between the manager types.

Table 3 offers more details for these two regression models. The second column shows each predictor variable: the 12 possible interventions and “No intervention”. The third and fourth columns present the resulting estimates on each of the dimensions for the two models. Columns five and six show the resulting p-values on each of these estimates for the two models. The final column shows the specific intervention each dimension represents. Comparing the two models in this way highlights some interesting insights among the interventions, as well as differences between the AI agent and human process managers. A positive value on a parameter estimate signals a positive correlation between interventions and team profit, whereas a negative value indicates an inverse impact. Since the regression models are over counts, the parameter estimates indicate the likely impact on profit from a single use of each intervention.

4 Discussion

The before and after analyses presented in Secs. 3.1–3.3 reveal intriguing differences in which interventions produce the intended changes within the behaviors of the teams. Overall, the three communication frequency interventions produce the largest impact on the team process. Perhaps because they are the most straightforward interventions of the entire set (from both comprehension and implementation perspectives), it is rational that team members would be more likely to follow them. Communication content shows mixed results, with design parameter discourse showing the highest adherence. The nature of how different types of information need to be communicated, especially within the specific team structure in the study, supports this finding. For example, the design parameters are concepts that are more likely needed to be, and more easily able to be, shared across the entire team. On the other hand, design strategies can be more easily understood on an individual basis. Since both the drone designers and operations planners focus on their own specific designs, design strategy is not as critical to share across the entire team. The design action interventions (x7, x8, x9, x10, x11, x12) also show mixed adherence. While the 1-min period to identify an appropriate action is an assumption of this work, the time-to-action analysis shows that intended actions occur more quickly during applicable interventions.

The regression models from Sec. 3.4 provide deeper insight into how different interventions correlate with team performance. For both models, dimension x13 (“No intervention”) has a small influence on team performance, indicated by the small magnitude of its estimates relative to the other estimates within their respective model. This finding also provides a validation of the “No intervention” option, as one would not expect this option to have a large influence on team outcome, either positively or negatively. Dimensions x4 and x5 of the models highlight an additional insight. Both represent the interventions focused on communication content—design parameters and design strategy, respectively. These dimensions have relatively large and positive magnitudes on their parameter estimates, indicating a greater, positive impact on team performance. In fact, dimension x4 has one of the largest values for each model. Compared with the before and after analyses, this intervention also had a large impact on team process behaviors. The interventions not used by the AI process manager result in no estimator (x3, x7, x9, x11, and x12). Comparing across manager conditions, three of these interventions (x3, x9, and x12) hurt teams in the human process manager condition, resulting in large, negative magnitudes of the parameter estimates. Integrating this with results from the before and after analyses, four out of the five interventions not used by the AI process managers (x7, x9, x11, and x12) are the design actions. Thus, these interventions might not have been impactful on team performance because they did not induce the associated behavioral changes intended by the managers.

Comparing across manager models, several interventions yield different predictions of team performance. Significant differences are identified by computing the 95% confidence intervals on the parameter estimates and finding those that do not overlap. The interventions resulting in these differences include x2, x4, x6, and x8. While the first three all correlate positively with performance, x8 does not: in the human-managed teams, it negatively correlates with performance, while in the AI-managed teams it positively correlates with performance. In fact, overall, the AI process manager model has only one intervention that negatively correlates with performance, x1, whereas the human process manager model has six. Thus, there are fewer interventions that hurt the team in the AI process manager condition. Also, recall that the AI only selected 8 out of the 12 interventions. These differences in parameter estimates strongly indicate an overall better strategy, or selection of interventions, by the AI process manager. More analyses should be conducted to fully understand the nuances of these implications.
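The non-overlap criterion used above is straightforward to state in code. The sketch below assumes each model reports a parameter estimate and standard error; the 95% interval is taken as estimate ± 1.96 standard errors, and non-overlap between conditions is the (conservative) flag for a significant difference.

```python
def ci_overlaps(est_a, se_a, est_b, se_b, z=1.96):
    """Check whether two 95% confidence intervals overlap.

    est/se: a parameter estimate and its standard error from each
    of the two manager-condition models. Non-overlapping intervals
    flag a significant difference between conditions.
    """
    lo_a, hi_a = est_a - z * se_a, est_a + z * se_a
    lo_b, hi_b = est_b - z * se_b, est_b + z * se_b
    return lo_a <= hi_b and lo_b <= hi_a
```

Note that non-overlap of individual intervals is a stricter criterion than a direct test on the difference of estimates, so it errs toward under-reporting differences.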

To develop future AI process managers, the agent developed in this work can serve as a valuable testbed for future research. From an implementation standpoint, different team process features and strategies can be tested in a computational setting before deployment in practice. Specifically, from this work, the before and after analyses guide the implementation of an additional feedback mechanism into the AI agent. These analyses identify whether the teams adhere to the interventions via the production of intended process behavior changes within the team. This information (whether the intended changes occur) can be fed back into the process manager as an additional measure for deciding the next intervention. With the current methodology, the AI process manager often used the same interventions multiple times in a row, especially if teams did not heed the manager’s advice the first time. The AI can instead use this additional feedback to mitigate this repetition and use a different wording or phrasing to get the same information across. This reoccurrence of interventions was noted by many of the participants in post-study questionnaires and could be one reason for low adherence to some of the interventions.

5 Conclusion

The results from this work provide deeper insights into the effects of interventions from automated process management. In our previous work, it was shown that an AI process manager performed just as well as a human manager. While members on the teams reported that they acted in accordance with the interventions provided, this work presents before and after analyses to uncover whether this was actually the case. Results show mixed adherence to the interventions depending on the intervention type. While the interventions focused on communication frequency and design parameter communication content show high influence on team process, others such as those focused on design actions show less of an impact. However, measuring the time-to-action shows a tendency for teams to perform certain actions more quickly under a relevant intervention. In addition to analyzing these direct impacts on team processes, regression models show interventions that yield higher correlations with team performance. Most critically, the regression analysis indicates a better selection of interventions by the AI agent compared to the human process managers. Overall, the results from this research guide further strategies to support the automation of process management of engineering design teams.

Acknowledgment

The authors would like to acknowledge Nicolás F. Soria Zurita for his help running the study, and Jay Martin and Corey Balon for their assistance and support with the HyForm platform. This work was supported by the Air Force Office of Scientific Research under Grant No. FA9550-18-1-0088 and by the Defense Advanced Research Projects Agency through cooperative agreement N66001-17-1-4064. Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the sponsors.

Conflict of Interest

There are no conflicts of interest.

References

1. Gyory, J. T., Cagan, J., and Kotovsky, K., 2019, “Are You Better Off Alone? Mitigating the Underperformance of Engineering Teams During Conceptual Design Through Adaptive Process Management,” Res. Eng. Des., 30(1), pp. 85–102.
2. Gyory, J. T., Soria Zurita, N. F., Martin, J., Balon, C., McComb, C., Kotovsky, K., and Cagan, J., 2021, “Human Versus Artificial Intelligence: A Data-Driven Approach to Real-Time Process Management During Complex Engineering Design,” ASME J. Mech. Des., 144(2), p. 021405.
3. Dellermann, D., Ebel, P., Söllner, M., and Leimeister, J. M., 2019, “Hybrid Intelligence,” Bus. Inf. Syst. Eng., 61(5), pp. 637–643.
4. Roll, I., Wiese, E. S., Long, Y., Aleven, V., and Koedinger, K. R., 2014, Design Recommendations for Intelligent Tutoring Systems, Vol. 2, U.S. Army Research Laboratory, Orlando, FL, pp. 169–182.
5. Liew, C., 2018, “The Future of Radiology Augmented With Artificial Intelligence: A Strategy for Success,” Eur. J. Radiol., 102, pp. 152–156.
6. Zurita, N. F. S., Gyory, J. T., Balon, C., Martin, J., Kotovsky, K., Cagan, J., and McComb, C., 2022, “Data on the Human Versus Artificial Intelligence Process Management Experiment,” Data Br., 41, p. 107917.
7. “HyForm GitHub,” https://github.com/hyform/drone-testbed-server/releases/tag/2021-March-v2, Accessed April 23, 2021.
8. Gyory, J. T., Song, B., Cagan, J., and McComb, C., 2021, “Communication in AI-Assisted Teams During an Interdisciplinary Drone Design Problem,” International Conference on Engineering Design, Gothenburg, Sweden, Aug. 16–20.
9. Landauer, T. K., Foltz, P. W., and Laham, D., 1998, “An Introduction to Latent Semantic Analysis,” Discourse Process., 25(2–3), pp. 259–284.
10. Dong, A., 2005, “The Latent Semantic Approach to Studying Design Team Communication,” Des. Stud., 26(5), pp. 445–461.
11. Foltz, P. W., and Martin, M. J., 2008, Team Effectiveness in Complex Organizations, Routledge, New York, pp. 445–466.