Detecting and Optimising Team Interactions in Software Development

Christian Zingg; Alexander von Gernler; Carsten Arzig; Frank Schweitzer; Christoph Gote

arXiv:2302.14609·cs.SE·March 1, 2023

TL;DR

This paper introduces a data-driven, real-time method to detect and optimize team interaction structures in software development, enhancing knowledge sharing and team performance.

Contribution

It presents a novel approach using a block-constrained configuration model to analyze team interactions from platform traces, validated through a case study and interviews.

Findings

1. Accurately detects team interaction preferences

2. Enables comparison against synthetic benchmarks

3. Identifies areas for improving knowledge diffusion

Tables

Table 1: Summary statistics on team members and interactions for the four roles.

Role	Team members (total)	Team members (per year)	Interactions (total)	Interactions (per year)
Developers	67	30 – 51	483,878	33,108 – 73,928
Documenters	8	3 – 6	19,885	180 – 3,614
Product Owner	5	1 – 3	18,888	372 – 5,751
Stakeholders	62	18 – 40	21,451	551 – 5,650

Table 2: Summary statistics on team members and interactions for the three development platforms.

Platform	Team members (total)	Team members (per year)	Interactions (total)	Interactions (per year)
Issue tracker	118	44 – 68	77,616	3,662 – 15,745
Code review platform	65	17 – 28	93,256	4,759 – 14,571
Version control system	75	34 – 57	101,179	8,152 – 18,093

Equations

$$\omega_{r_{1}r_{2}}:=-\log\left(1-\frac{A_{r_{1}r_{2}}}{\Xi_{r_{1}r_{2}}}\right).$$

$$\text{Pot}:=\frac{H^{\text{observed}}}{H^{\text{max}}}\in[0,1].$$

Taxonomy

Topics: Data Visualization and Analytics · Team Dynamics and Performance · Business Strategy and Innovation

Full text

Christian Zingg (1), Alexander von Gernler (2), Carsten Arzig (2), Frank Schweitzer (1, 3), Christoph Gote (1, 4)

(1) Chair of Systems Design, ETH Zurich, Zurich, Switzerland

(2) genua GmbH, Kirchheim bei München, Germany

(3) Complexity Science Hub, Vienna, Austria

(4) Data Analytics Group, Department of Informatics, University of Zurich, Zurich, Switzerland

Abstract

The functional interaction structure of a team captures the preferences with which members of different roles interact. This paper presents a data-driven approach to detect the functional interaction structure for software development teams from traces team members leave on development platforms during their daily work. Our approach considers differences in the activity levels of team members and uses a block-constrained configuration model to compute interaction preferences between members of different roles. We apply our approach in a case study to extract the functional interaction structure of a product team at the German IT security company genua GmbH. We subsequently validate the accuracy of the detected interaction structure in interviews with five team members. Finally, we show how our approach enables teams to compare their functional interaction structure against synthetically created benchmark scenarios. Specifically, we evaluate the level of knowledge diffusion in the team and identify areas where the team can further improve. Our approach is computationally efficient and can be applied in real time to manage a team’s interaction structure.

1 Introduction

Designing and maintaining an efficient organisational structure is essential for high-performing software development teams [27, 44, 22, 40, 49, 20]. This is especially the case in agile software development teams which, similar to Open Source Software teams [33], have a strong focus on self-organisation and organisational flexibility [46]. The key concept behind agile software engineering is a high level of adaptivity, continuous evolution, and flexibility to changes in requirements [5]. As a consequence, the real interaction structure of such teams changes over time to adapt to new challenges. In doing so, it drifts away from the team’s original organisational structure towards a new, unknown one [37, 36, 47].

Not knowing the team’s real interaction structure can have a broad range of negative consequences for the team. In the best case, it leads to reduced productivity or a decrease in software quality due to less well-managed and, therefore, less efficient information exchange [25, 45, 6]. In the worst case, however, it can result in the undetected emergence of developers possessing mission-critical unshared knowledge (e.g., lone wolves, bottlenecks, or organisational silos [45, 44, 34]), which can have a devastating impact when they leave the team [3].

Despite its importance, the question of how to quantitatively and efficiently derive and evaluate a team’s functional interaction structure based on real observed interactions remains open. Closing this gap, we analyse the interactions from a product team at genua GmbH, a German IT security company, and make the following contributions:

• Starting from the observed interactions of the team across three development platforms, we show that approaches merely counting the number of interactions between team members fail to detect the real interaction structure, as they cannot account for differences in team members’ activities.

• Instead, we propose a novel method based on a block-constrained configuration model (BCCM) [9] that accounts for each team member’s unique capacity to initiate and receive (directed) interactions. Yielding interaction preferences aggregated on the level of team member roles, our method allows us to quantify the team’s interaction structure on each development platform individually, as well as across all platforms.

• We validate the extracted interaction structure through semi-structured interviews with five team members from the product team at genua. Using the information obtained from the interviews, we further extend the extracted interaction structure with meta-information on each observed type of interaction. As a result, we obtain the team’s organigraph [32], visualising how different roles in a team functionally work together.

• Finally, we show how our block model approach can also be used to compare the knowledge diffusion in the observed interactions with two other hypothetical scenarios. We find that the team currently achieves knowledge diffusion in the upper third of the possible range. Our analysis further shows that extending the agile methods employed by the team is the most promising way to improve knowledge diffusion further.

2 Data

In this paper, we study the case of software development in a product team at genua. To this end, in Section 2.1, we first introduce the four roles all members of the product team are subdivided into. In Section 2.2, we then discuss how we mined team interactions from genua’s development platforms.

2.1 Roles

Based on their tasks and responsibilities, genua classifies the members of the product team into four roles:

1. Developers develop, review, and integrate code and changes to implement new features and fix bugs.

2. Documenters write and maintain the user manual and release notes of the product.

3. Product Owners coordinate the team and are responsible for scheduling and prioritising issues.

4. Stakeholders only have a peripheral role within the team. The majority are customer-facing, selling the product to new clients, maintaining it on their sites, or training their internal staff regarding its use. Others perform quality assurance and application testing. Finally, some work on other internal projects adjacent to the product.

To obtain the roles for all team members and years, we followed a two-step process. First, we created lists of all team members active within a given year. Then, we iterated through these lists with two long-term team members to identify each member’s correct role. In rare cases where the two team members were uncertain regarding a role, they contacted additional team members more familiar with the respective case. The resulting data set contains (i) team members’ IDs and (ii) their roles for (iii) each year. We provide summary statistics for this data set in Table 1.

2.2 Interactions

The product team uses three different platforms to track their work. An issue tracker is used to manage and discuss implementations of issues, i.e., bug fixes or feature requests. When new code to resolve an issue is developed, this is tracked on the team’s code review platform. Finally, the team employs Git as its version control system to collaborate on the codebase. As discussed in the following, for each platform, we mined pseudo-anonymised data capturing all actions performed by team members. In addition, we identified practices to extract the interactions corresponding to these actions. We provide summary statistics for the resulting interactions on all platforms in Table 2.

2.2.1 Issue Tracker

The team uses the tool Redmine [29] as their issue tracker. Similarly to an online forum, Redmine maintains separate discussion threads for all issues. In Fig. 1a, we show an example of a discussion thread in which two team members, a stakeholder $S$ and a developer $D$, create entries over time. The two team members interact when they read each other’s discussion entries. For each discussion entry, we collected (i) the ID of the team member creating it, (ii) the ID of the issue it belongs to, and (iii) the time of the entry’s creation. For reasons of confidentiality, we could not collect the content of the entries.

Together with three members of the product team, we further identified the following two practices (r1–r2) that allow us to obtain the interactions corresponding to the creation of each discussion entry:

r1. Before team members write their first entry in a thread, they read the thread’s first entry to read the issue’s description. Additionally, they read the two most recent messages to learn about the current context of the discussion that their entry will continue.

r2. For all subsequent entries, team members read the thread’s first entry to remind themselves of the issue. Additionally, they read every entry posted since (and including) their previous discussion entry.

Team member $D$ reading a discussion entry of member $S$ is equivalent to information flowing from $S$ to $D$ . Hence, we model all interactions derived from practices r1–r2 as directed links between the author and the reader of a discussion entry. We illustrate this in Fig. 1a, where, for clarity, only the extracted links for $D$ are shown.
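As a rough illustration of how practices r1–r2 translate into directed links, the following minimal Python sketch processes a single thread. The function name `reading_links`, the representation of a thread as a chronological list of author IDs, and the decision to skip a member's own entries are assumptions made for this example; time stamps are omitted for brevity.

```python
def reading_links(thread):
    """Derive directed (author, reader) links from one issue thread.

    `thread` is a list of author IDs in chronological order; index 0 is
    the entry that describes the issue.
    """
    links = []
    last_index = {}  # reader -> index of their previous entry in this thread
    for i, reader in enumerate(thread):
        if i == 0:
            # The thread's creator has nothing to read yet.
            last_index[reader] = 0
            continue
        if reader not in last_index:
            # r1: first entry plus the two most recent entries.
            read = {0} | set(range(max(0, i - 2), i))
        else:
            # r2: first entry plus every entry since (and including)
            # the reader's own previous entry.
            read = {0} | set(range(last_index[reader], i))
        for j in sorted(read):
            # Assumption: reading one's own entry creates no link.
            if thread[j] != reader:
                links.append((thread[j], reader))
        last_index[reader] = i
    return links


# Example thread with a stakeholder "S" and a developer "D".
print(reading_links(["S", "D", "S", "S", "D"]))
```

Each returned tuple points from the author of an entry to its reader, matching the direction of information flow described above.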

2.2.2 Code Review Platform

To resolve issues, team members need to develop, review, and ultimately integrate changes to the codebase. This process is tracked and managed on the code review platform Aegis [31]. For Aegis, we again mined all actions of team members related to all changes. Specifically, we extracted (i) the ID of the team member performing an action, (ii) the ID of the corresponding change, (iii) the time at which the action was performed, and (iv) the type of the action. The possible types of actions are development $D$ , review $R$ , and integration $I$ . The developer, reviewer, and integrator of a change must be different team members. In the ideal case, a change is first developed, then positively reviewed, and finally successfully integrated. However, both review and integration can fail, requiring further development and, hence, resulting in more complex action sequences (see Fig. 1b for an example).

The change development process requires extensive interactions between team members that can be derived from the recorded actions following practices r3–r7, visualised in Fig. 1b:

r3. A reviewer $R$ discusses the review’s outcome with the developer $D$ of the change.

r4. An integrator $I$ discusses the integration’s outcome with the developer $D$ of the change.

r5. If the integration fails, the integrator $I$ further discusses the detected problems with the reviewer $R$ that positively reviewed the change.

r6. A developer $D$ that continues development after a failed review or integration discusses with the corresponding previous reviewer $R$ or integrator $I$.

r7. If a developer $D_{2}$, reviewer $R_{2}$, or integrator $I_{2}$ takes over from a previous developer $D_{1}$, reviewer $R_{1}$, or integrator $I_{1}$, a handover discussion takes place.

As all discussions resulting from r3–r7 are bi-directional, we model them as undirected links between the involved team members.
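To make the link extraction more concrete, the sketch below covers practices r3–r5 only; r6 and r7 are deliberately left out. It assumes that each recorded action carries a success flag `ok`, which is a hypothetical field introduced for this example (the mined data described above only lists member, change, time, and action type). The function name `review_links` is likewise illustrative.

```python
from collections import Counter


def review_links(actions):
    """Count undirected links for one change under practices r3-r5.

    `actions` is a chronological list of (member, kind, ok) tuples with
    kind in {"D", "R", "I"}; `ok` is a hypothetical success flag.
    """
    links = Counter()
    developer = None  # member of the most recent development action
    reviewer = None   # member of the most recent positive review
    for member, kind, ok in actions:
        if kind == "D":
            developer = member
            reviewer = None  # new development invalidates earlier reviews
        elif kind == "R" and developer is not None:
            links[frozenset((member, developer))] += 1      # r3
            reviewer = member if ok else None
        elif kind == "I" and developer is not None:
            links[frozenset((member, developer))] += 1      # r4
            if not ok and reviewer is not None:
                links[frozenset((member, reviewer))] += 1   # r5
    return links


# A failed review, renewed development, a positive review, a failed integration.
sequence = [("a", "D", True), ("b", "R", False), ("a", "D", True),
            ("b", "R", True), ("c", "I", False)]
print(review_links(sequence))
```

Using `frozenset` pairs keeps the links undirected, and the `Counter` preserves repeated discussions between the same two members as multi-edges.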

2.2.3 Version Control System

From the Git-based version control system, we obtain interactions by extracting yearly co-editing networks using the Python package git2net [21]. Motivated by the finding that a significant proportion of coordination between developers occurs via the code base [7], specifically when editing the same code [13], co-editing networks link team members consecutively modifying the same line of code. The links are directed following the arrow of time and time-stamped according to the time of the edit. As for the other two platforms, team members are represented by their pseudo-anonymised ID.
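The idea of a co-editing link can be illustrated with a toy sketch that does not use git2net itself. It assumes that every edit is already given as a tuple of time, author, file, and a stable line identifier (git2net derives the corresponding information from diffs), that links point from the previous editor to the subsequent editor, and that consecutive edits by the same member create no link. All names are illustrative.

```python
def coediting_links(edits):
    """Link team members who consecutively modify the same line.

    `edits` is an iterable of (time, author, path, line) tuples in any
    order. Returns (previous_author, author, time) tuples.
    """
    last_author = {}  # (path, line) -> author of the most recent edit
    links = []
    for time, author, path, line in sorted(edits):
        key = (path, line)
        previous = last_author.get(key)
        # Assumption: editing one's own line is not an interaction.
        if previous is not None and previous != author:
            links.append((previous, author, time))
        last_author[key] = author
    return links


edits = [(2, "b", "f.py", 10), (1, "a", "f.py", 10),
         (3, "a", "f.py", 10), (4, "a", "f.py", 11)]
print(coediting_links(edits))
```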

3 Interaction Structure Detection

Having introduced genua’s product team, how can we detect its functional interaction structure, i.e., the interaction preferences of the different roles? As an intuitive approach, we could simply count the interactions between the different roles. In Section 3.1, we show the results of this approach. We then discuss its shortcomings in Section 3.2 and propose a more comprehensive block model approach allowing us to overcome them in Section 3.3. Finally, in Section 3.4, we use this approach to detect the functional interaction structure of the studied product team at genua.

3.1 Interaction counting approach

The functional interaction structure of a team describes the interaction preference members of each role have towards the members of other roles. Therefore, it appears natural to simply count for each pair of roles how often their members interact and then identify which pairs have many interactions. In Fig. 2a, we show these counts for the different development platforms used by the product team at genua. On each platform, there is one role whose members are inactive. Documenters do not use the issue tracker but instead track the changes in the documentation entirely via the code review platform. Similarly, Stakeholders interact on the issue tracker but do not appear on the code review and version control platforms. For all platforms, we observe interactions between all active roles. As indicated by the width of links, the interaction counts between roles are of the same order of magnitude and low compared to the number of interactions observed among Developers. These conclusions also hold when combining the interactions from the three platforms. As we see in Fig. 2b, Developers frequently interact among themselves, and the only missing interactions are between Stakeholders and Documenters as they are not active on the same platform.

3.2 From interaction counts to interaction preferences

The interaction counting approach only shows the activity of each role and disregards:

(i) how many interactions each role can initiate, and

(ii) how many interactions each role can receive.

To highlight the consequences of this, we consider the synthetic example with 6 individuals from 3 roles shown in Fig. 3a. The example assumes 2 Stakeholders interacting among themselves with moderate activity and 3 Developers interacting among themselves but with high activity. In addition, 1 Product Owner coordinates between the two groups. This Product Owner interacts 50 times with Developers but only 10 times with Stakeholders. Does this mean that the Product Owner has an interaction preference with Developers? While this seems to be the case according to the interaction counting approach, we argue that it is not. Developers appear in 1550 interactions in total, whereas Stakeholders appear in only 30 interactions. The Product Owner is involved in 10 of these 30 interactions, which suddenly seems like a lot.
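Putting the Product Owner's counts in relation to the total activity of each role makes the contrast explicit. The following shares are simple arithmetic on the numbers stated above and are meant as an illustration, not as the model introduced in the next section:

$$\frac{50}{1550}\approx 3.2\%\quad\text{(share of Developer interactions involving the Product Owner)},\qquad\frac{10}{30}\approx 33.3\%\quad\text{(share of Stakeholder interactions involving the Product Owner)}.$$

Relative to each role's activity, the Product Owner is thus roughly ten times more present among Stakeholders than among Developers.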

3.3 Block model approach

We control for the activity of roles in our computation of the interaction structure using a block-constrained configuration model (BCCM) [9], in which we define the blocks as the roles of the team members. We select the BCCM over the more established stochastic block model (SBM), as the SBM cannot correctly reproduce the empirically observed activities of the team members [9]. To fit the BCCM to observed interactions, these need to be given as an interaction network where the nodes are the team members and the (multi-)edges their (multiple) interactions. For an example, we refer to Fig. 3a. The BCCM introduces a term $\omega_{r_{1}r_{2}}$ that measures the propensity with which the roles $r_{1}$ and $r_{2}$ interact according to their in- and out-degrees. To determine $\omega_{r_{1}r_{2}}$ from a given interaction network, we use the function bccm in the R library ghypernet [10], which implements:

$$\omega_{r_{1}r_{2}}:=-\log\left(1-\frac{A_{r_{1}r_{2}}}{\Xi_{r_{1}r_{2}}}\right).\qquad(1)$$

Here, $r_{1}$ and $r_{2}$ are two roles, $A_{r_{1}r_{2}}$ is the number of interactions between team members with these roles, and $\Xi_{r_{1}r_{2}}$ is a normalisation term for the maximum number of possible interactions between $r_{1}$ and $r_{2}$ . For a detailed discussion of Eq. 1 we refer to [9].
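For a single pair of roles, Eq. 1 can be transcribed directly. The sketch below is not the ghypernet implementation, which fits the full model; it takes $A_{r_{1}r_{2}}$ and $\Xi_{r_{1}r_{2}}$ as given numbers, and the function name `propensity` and the example values are hypothetical.

```python
import math


def propensity(a, xi):
    """Evaluate Eq. 1: omega = -log(1 - a / xi).

    `a` is the observed number of interactions between two roles and
    `xi` the normalisation term; Eq. 1 is only defined for 0 <= a < xi.
    """
    if xi <= 0 or not 0 <= a < xi:
        raise ValueError("Eq. 1 requires 0 <= a < xi")
    # log1p(-x) computes log(1 - x) accurately for small x.
    return -math.log1p(-a / xi)


# Hypothetical values: the same count yields a larger omega when the
# normalisation term (i.e., the roles' capacity to interact) is smaller.
print(propensity(10, 1_000))
print(propensity(10, 100_000))
```

For small ratios $A_{r_{1}r_{2}}/\Xi_{r_{1}r_{2}}$, the propensity is approximately equal to the ratio itself, since $-\log(1-x)\approx x$.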

Fig. 3b shows the $\omega$ values between all pairs of roles for our example with the interaction counts shown in Fig. 3a. Despite the relatively low interaction count of 10 between the Product Owner and Stakeholders, the low activity of Stakeholders results in a comparatively large $\omega=0.164$ between the two roles. In contrast, the high activity of Developers yields a roughly ten times lower $\omega=0.016$ between Product Owner and Developers, despite 50 interactions taking place. Thus, the BCCM suggests that the Product Owner is about ten times as likely to interact with Stakeholders as with Developers when taking into account that Stakeholders are less active than Developers. This is also reflected in Fig. 3c, where we normalise the interaction propensities $\omega$ for each role individually.

If there were no interaction preferences, we would expect equal normalised interaction propensities of $33\%$ towards all three roles (or $50\%$ if we disregard possible interactions among Product Owners, as there usually is just one). The deviation from this expectation correctly reveals a strong positive interaction preference of the Product Owner towards Stakeholders and a strong negative interaction preference towards Developers.
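The normalisation and the comparison against the baseline can be sketched as follows. The two $\omega$ values are those reported for the example; the value of $0$ among Product Owners is an assumption (a single Product Owner has no interactions with other Product Owners), and dividing each propensity by the role's sum is one plausible reading of normalising "for each role individually".

```python
def normalise(omegas):
    """Rescale one role's propensities so that they sum to 1."""
    total = sum(omegas.values())
    return {role: value / total for role, value in omegas.items()}


# Propensities of the Product Owner in the synthetic example.
product_owner = {
    "Stakeholders": 0.164,
    "Developers": 0.016,
    "Product Owner": 0.0,  # assumption: no interactions among Product Owners
}

shares = normalise(product_owner)
baseline = 1 / len(product_owner)  # expectation without preferences: 1/3

for role, share in shares.items():
    label = "positive" if share > baseline else "negative"
    print(f"{role}: {share:.1%} ({label} preference)")
```

Under these assumptions, about $91\%$ of the Product Owner's normalised propensity points towards Stakeholders and about $9\%$ towards Developers.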

3.4 Detecting genua’s interaction structure

As shown in Fig. 4, we now apply our block model approach to detect the functional interaction structure of the studied product team at genua. Based on the interaction networks for multiple platforms and multiple years that we collected in Section 2, we first compute the interaction preferences separately for each platform and year before averaging them across the years. As our team contains members of four roles, we consider the threshold to distinguish between positive and negative preferences to be $25\%$. This threshold assumes that interactions can occur between members of the same role. This is intuitively true for Developers, Stakeholders, and Documenters as there are always multiple active members of these roles. The Product Owner is a special case. While in principle, there is only one Product Owner active at any point in time, our data contains multiple transitions between Product Owners, resulting in two Product Owners being recorded for a year. Therefore, we opted to treat Product Owners analogously to the other roles and did not introduce an exception. In Fig. 4b, we show the resulting positive and negative interaction preferences.

Positive interaction preferences

On the issue tracker, Stakeholders, the Product Owner, and Developers are active. The interaction preferences suggest that Stakeholders predominantly interact with themselves and the Product Owner. In turn, the Product Owner has an interaction preference towards Developers. Finally, Developers again preferentially interact among themselves.

The code review platform has activity from all roles other than Stakeholders. Here, the interaction preferences suggest interactions from the Product Owner towards both Developers and Documenters, who in turn show a preference to interact among themselves.

Finally, the version control system shows a pattern similar to that of the code review platform. Again, only the Product Owner, Developers, and Documenters are active. The Product Owner has an interaction preference towards Developers, and Developers and Documenters interact primarily among themselves. The only difference from the code review platform is the absence of a positive preference between the Product Owner and Documenters.

Negative interaction preferences

We visualise the negative interaction preferences in the right column of Fig. 4b. The positive preferences discussed above imply the existence of corresponding negative preferences towards the other roles. This means that the structure of the negative preferences, i.e., which links exist, is complementary to the structure of the positive preferences. Therefore, rather than their structure, we are particularly interested in the strength of the negative preferences displayed by each role.

Stakeholders are only active on the issue tracker, where they have positive interaction preferences with themselves and the Product Owner. As Documenters are not active on the issue tracker, Stakeholders only have a negative interaction preference towards Developers.

The Product Owner is active on all platforms. As indicated by the thin width of the links, the Product Owner does not show strong negative interaction preferences to other roles. However, we find a self-loop suggesting that Product Owners interact significantly less than expected among themselves. This is intuitive, as there is only one Product Owner active at any point in time. Therefore, if we observe more than one Product Owner in one of our yearly snapshots, this indicates a transition between the two at some point during the year. However, as they are active consecutively and not simultaneously, we find fewer interactions than their activity suggests.

Developers are also active on all platforms. As indicated by the similar width of all edges from Developers to all other roles, Developers do not show strong negative interaction preferences towards any role.

Finally, Documenters are active on the code review and version control platforms. For Documenters, we find a strong negative interaction preference towards Developers.

So far, we have discussed interaction preferences for each platform separately. We now combine them to obtain the functional interaction structure of the team across all platforms. As the positive and negative interaction preferences are complementary, both reveal the same interaction structure. However, positive interaction preferences are more natural to interpret. Therefore, we show the functional interaction structure obtained by combining the positive interaction preferences in Fig. 4c. Overall, we find that Stakeholders represent the input into the development team. Stakeholders interact primarily with the Product Owner who, in turn, has strong interaction preferences towards Developers and Documenters. These two roles represent sinks in the team, primarily interacting among their own role and not with each other.

4 From Interaction Structure To Organigraph

We now validate the detected interaction structure and enrich it with information on the function of these relations—yielding the team’s organigraph. To this end, we conducted interviews with five experienced members of the studied product team (Int1 – Int5). Our set of interviewees consists of two Developers, two Stakeholders, and one former Product Owner, ensuring that we get a broad range of first-hand perspectives into the product team’s development processes. All interviews were held online in a video chat in March 2021 and lasted for approx. 60 minutes, followed by a debriefing. We set up each interview in a semi-structured format, combining closed-ended survey questions with open-ended discussions [1]. The interviews were conducted without the aid of any additional material. As we summarise in the following, the interviews validated and explained all interaction preferences identified in our quantitative analysis (cf. Section 3). We show the organigraph enriching our quantitative results with the explanations from the interviews in Fig. 5.

4.1 The role of Stakeholders

As we discussed in Section 2.1, the majority of Stakeholders are customer-facing and often located at the customer’s sites. Consequently, they are the first to learn about new bugs, required features, or new use cases for which they forward feedback directly to the Product Owner.

“The customers’ wishes for new features are supposed to be assigned to the Product Owner.” [Statement by Int3]

“Stakeholders discuss new features always in direct coordination with the PO.” [Statement by Int4]

Simultaneously, the geographical distribution of Stakeholders explains their reduced interactions with Documenters and Developers.

“Stakeholders are not at the company’s [i.e., genua’s] site, and therefore can’t just go into a Developer’s office and ask.” [Statement by Int4]

Further, Stakeholders do not have access to the code review and version control platforms, explaining the observed lack of interactions there.

“Actually, normal Stakeholders have nothing to do with Aegis [the code review platform] and Git [the version control system].” [Statement by Int3]

Internally, Stakeholders update and advise each other on common problems and critical bugs, which they then champion to be prioritised in the team’s weekly bug meetings.

4.2 The role of the Product Owner

Collecting the information from Stakeholders, the Product Owner leads the weekly bug meetings and is responsible for scheduling and prioritising what is being worked on.

“In bug meetings, the Product Owner, some Stakeholders, and also a couple of Developers, who took care of the bugs, discuss prioritisation, and their initial analysis.” [Statement by Int1]

The Product Owner then coordinates and oversees the rest of the team. Thus, the Product Owner indeed acts as a fixed mediator for feedback from the Stakeholders to the Developers and Documenters, confirming the results of our quantitative analysis.

4.3 The roles of Developers and Documenters

Following the bug meeting, the Developers work on changes resolving the bugs or implementing the discussed features and Documenters update the product’s documentation accordingly.

“Based on the outcome of the bug meeting, the Developers develop. And the Documenters can, of course, also see what is written and then document this.” [Statement by Int1]

All interviewees agreed that these two processes occur mostly independently, explaining the infrequent interactions between Developers and Documenters.

“Documenters and Developers have their closed problem domains. The Developer tries to get a feature working from a technical perspective, and the Documenter tries to explain it to a user at the other end.” [Statement by Int5]

However, interactions within their own role still occur very frequently. One key reason for this is genua’s internal review process, which requires all changes—including changes to the documentation—to be developed, reviewed, and integrated by three different team members, which automatically sparks interactions between many different members.

“Whenever something is changed, someone has to look at it [i.e., review and integrate it into the codebase].” [Statement by Int5]

In conclusion, with our interviews, we could validate and explain all detected interaction preferences—both positive and negative—between roles. This validates our quantitative approach and shows that we can extract the functional interaction structure of teams accurately and in a computationally efficient manner.

5 Interaction Structure Optimisation

In Section 3, we started our analysis of the product team from the observed individual interactions between team members. We then grouped the team members according to their role in the team, yielding the functional interaction structure capturing the interaction preferences between members of different roles reported in Fig. 4c. Through this step, we aggregated all team members of a given role into a single representative node in the resulting role interaction network. Implicitly, this assumes that all members of a role are similar to the degree that they can be considered as interchangeable. This strong assumption is unlikely to be fully fulfilled in any real-world organisation (cf. Section 7 for a discussion). The role definitions from Section 2.1 already state that, e.g., Stakeholders take on multiple different functions and specialisations. Similarly, our interviews also suggest a degree of heterogeneity amongst Developers, both in terms of their experience and their knowledge of different parts of the codebase.

“I think there are still comfort zones where people make initial changes and whom you let do it [make changes in a specific area of the codebase].” [Statement by Int5]

However, by employing agile development methods, the team actively aims to promote and enhance knowledge diffusion among the Developers, to reduce the risk of knowledge loss when a member leaves the team.

“One of the philosophies of Scrum is that ‘everyone can do everything’ to address precisely the problems arising when the bus comes [referring to the truck factor, which is also known as the bus factor], or Google simply pays more. Thus we try to counteract exactly these problems in advance through XP [Extreme Programming] and pair programming [deliberate pairing of team members with different expertise].” [Statement by Int1]

This motivates a final experiment in which we use our extracted interaction preferences to assess where the team currently stands and to what extent further homogeneity among the members of roles could improve knowledge diffusion within the team.

To quantify knowledge diffusion, we use the measure potentiality (Pot) proposed in [53]. Potentiality utilises the notion of entropy to quantify the extent to which members distribute their interactions across the entire team rather than among a few specific collaborators:

$$\text{Pot}=\frac{H^{\text{observed}}}{H^{\text{max}}}$$

Here, $H^{\text{observed}}$ is the entropy of the observed interaction distribution, and $H^{\text{max}}$ is the highest possible entropy achieved when all team members interact with everyone else equally often. A potentiality close to $1$ indicates that most members interact with the entire team, whereas a potentiality close to $0$ indicates that many members only have a few interaction partners.
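To make the definition concrete, the following Python sketch computes an entropy ratio of this kind from a matrix of interaction counts. It rests on assumptions not stated in the text: the interaction distribution is taken over all ordered pairs of distinct members, self-interactions are ignored, and plain Shannon entropy is used. The measure defined in [53] may differ in detail, and the function name `potentiality` is chosen here for illustration only.

```python
import numpy as np


def potentiality(counts):
    """Ratio of observed to maximal Shannon entropy of an interaction matrix.

    counts[i, j] holds the number of interactions from member i to member j.
    Assumption: the distribution runs over ordered pairs (i, j) with i != j.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    if n < 2:
        raise ValueError("need at least two team members")
    # Keep only off-diagonal entries, i.e. interactions between distinct members.
    weights = counts[~np.eye(n, dtype=bool)]
    total = weights.sum()
    if total <= 0:
        raise ValueError("need at least one interaction")
    p = weights / total
    p = p[p > 0]  # empty pairs contribute nothing to the entropy
    h_observed = float(np.sum(p * np.log(1.0 / p)))
    h_max = float(np.log(n * (n - 1)))  # every ordered pair equally likely
    return h_observed / h_max


# Four members who all interact equally often with everyone else.
uniform = np.ones((4, 4)) - np.eye(4)
# Four members where only one pair ever interacts.
silo = np.zeros((4, 4))
silo[0, 1] = 10

print(potentiality(uniform))
print(potentiality(silo))
```

Under these assumptions, the uniform team reaches the upper end of the range, while the team in which a single pair accounts for all interactions sits at the lower end.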

As we discussed in Section 3.3, we encode the team’s interaction structure through the $\omega$ parameters of the corresponding BCCM (cf. Eq. 1). In contrast, potentiality is computed on interaction networks. Given an interaction structure—i.e., a specified BCCM—we obtain the distribution of likely interaction networks using the sampling approach implemented in [10]. Subsequently, we compute potentiality for all sampled networks obtaining a distribution of values capturing—and hence allowing us to compare—the team’s knowledge diffusion for different interaction structures.
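The sampling step can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the paper: the study draws networks with the sampler of [10], whereas the code below substitutes a plain multinomial draw. It further assumes that the propensity of a pair factors into the sender's out-activity, the receiver's in-activity, and the block-level preference $\omega$ of their roles. All function and parameter names are hypothetical.

```python
import numpy as np


def potentiality(counts):
    """Entropy ratio over ordered pairs of distinct members (simplified)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    p = counts[~np.eye(n, dtype=bool)]
    p = p / p.sum()
    p = p[p > 0]
    return float(np.sum(p * np.log(1.0 / p)) / np.log(n * (n - 1)))


def sample_potentialities(out_activity, in_activity, blocks, omega,
                          n_samples=200, seed=0):
    """Sample networks from a block structure and score each one.

    Assumption: a multinomial draw stands in for the sampler of [10].
    blocks[i] is the block index of member i; omega[a, b] is the
    preference for interactions from block a to block b.
    """
    rng = np.random.default_rng(seed)
    out_activity = np.asarray(out_activity, dtype=float)
    in_activity = np.asarray(in_activity, dtype=float)
    blocks = np.asarray(blocks, dtype=int)
    omega = np.asarray(omega, dtype=float)
    n = len(blocks)

    # Pair weight = activity-based possibilities times block preference.
    weights = np.outer(out_activity, in_activity) * omega[np.ix_(blocks, blocks)]
    np.fill_diagonal(weights, 0.0)  # no self-interactions
    probs = (weights / weights.sum()).ravel()

    n_interactions = int(out_activity.sum())
    draws = rng.multinomial(n_interactions, probs, size=n_samples)
    return np.array([potentiality(d.reshape(n, n)) for d in draws])


# Six equally active members in two blocks of three.
blocks = [0, 0, 0, 1, 1, 1]
activity = [100] * 6
no_preference = sample_potentialities(activity, activity, blocks,
                                      [[1.0, 1.0], [1.0, 1.0]])
within_block = sample_potentialities(activity, activity, blocks,
                                     [[1.0, 0.01], [0.01, 1.0]])
print(no_preference.mean(), within_block.mean())
```

Comparing the two resulting distributions mirrors the comparison described above: an interaction structure with a strong preference for within-block interactions yields lower potentiality than one without block preferences.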

We report our results in Fig. 6. We show the potentiality computed for the observed interactions (obsInt) over time and compare the knowledge diffusion in the observed case to two synthetically created benchmark scenarios suggested by genua. In the first scenario (ecdeDevs), we assume that the team achieves the stated aim that “everyone can do everything” (ecde) among Developers, effectively making them interchangeable. This corresponds to a BCCM where all Developers are aggregated into a single block, while all other team members are represented by individual blocks. Finally, in the second scenario (ecdeAll), we assume that “everyone can do everything” holds not only for Developers but for all roles. This corresponds to a BCCM where, analogous to the organigraph in Fig. 5, all team members are aggregated into four blocks corresponding to their role.
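The two benchmark scenarios differ only in how team members are assigned to blocks. The following sketch shows one way to derive such assignments from a list of roles. The function name, the role labels, and the integer encoding of blocks are illustrative assumptions and are not taken from the implementation used in the study.

```python
def scenario_blocks(roles, scenario):
    """Assign each team member to a block index for a benchmark scenario.

    roles: one role label per team member (labels here are illustrative).
    scenario: "ecdeDevs" puts all Developers into one block and every other
              member into a block of their own; "ecdeAll" uses one block
              per role.
    """
    if scenario == "ecdeAll":
        labels = list(roles)
    elif scenario == "ecdeDevs":
        labels = [
            role if role == "Developer" else f"member-{i}"
            for i, role in enumerate(roles)
        ]
    else:
        raise ValueError(f"unknown scenario: {scenario}")

    # Map labels to consecutive block indices in order of first appearance.
    index = {}
    return [index.setdefault(label, len(index)) for label in labels]


roles = ["Developer", "Developer", "Stakeholder", "Stakeholder", "Developer"]
print(scenario_blocks(roles, "ecdeDevs"))  # [0, 0, 1, 2, 0]
print(scenario_blocks(roles, "ecdeAll"))   # [0, 0, 1, 1, 0]
```

In the ecdeDevs assignment the three Developers share block 0 while each Stakeholder keeps a separate block. In the ecdeAll assignment the two Stakeholders are merged as well.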

For all platforms and all years, we observe $\text{Pot}(\textsc{ecdeAll})\geq\text{Pot}(\textsc{ecdeDevs})\geq\text{Pot}(\textsc{obsInt})$. This ordering aligns with our expectation that knowledge diffusion decreases as the heterogeneity among the members of a role increases.

Notably, the difference between ecdeDevs and ecdeAll is always vanishingly small. This means that almost all possible improvements in knowledge diffusion can already be achieved if “everyone can do everything” holds among Developers. As indicated by obsInt, the team currently achieves a knowledge diffusion corresponding to around 70% of the optimal case ecdeAll. Our analysis suggests that to improve this further, the team should target knowledge diffusion among Developers first.

Comparing the three platforms, we observe significantly higher knowledge diffusion on the code review platform. This indicates that genua’s efforts to promote interactions by requiring that at least three different team members contribute to all changes are successful. Finally, the code review platform is primarily used by Developers, explaining why the difference between obsInt and ecdeDevs is largest here.

In conclusion, we find that the studied team achieves knowledge diffusion in the upper third of the attainable range. Our analysis shows that almost the entire remaining gain can already be achieved by obtaining optimal knowledge diffusion among Developers (cf. ecdeDevs). Working towards this, the team at genua implements various agile methods, including Scrum, Extreme Programming, and pair programming.

6 Threats to Validity

Our study is subject to some threats to validity which we discuss in the following.

Internal validity

While we have taken great care to obtain complete and correct data on the interactions among all members of the product team, three limitations remain, which we discuss in the following.

First, for our study, we mined all actions logged in the complete databases of all three development platforms used by the analysed team. From our discussions with team members, we learned that no development occurs without generating entries on these platforms, as the team strictly requires that all bugs and feature requests are tracked and version controlled. That said, due to confidentiality concerns, we could not obtain or analyse any text data. Besides the content of the interactions on the three development platforms, this also means that we did not have access to any email or chat communication. Finally, interactions such as in-person discussions are not recorded. As a consequence, these interactions are missing from our data.

Second, as discussed in Section 2.2, the development platforms record actions instead of interactions between team members. In discussion with members of the product team, we identified a set of practices allowing us to extract the interactions corresponding to the recorded actions. However, we expect a degree of heterogeneity in the behaviour of team members, which is not covered by the practices. Furthermore, we expect team members to adapt their behaviour over time and based on the context of the situation, resulting in changes over time.

Finally, the three development platforms record different types of interactions (discussions, code reviews, and co-editing of code). In our interviews, we discussed the possibility of weighting the different types of interactions for our combined results. However, there was no consensus among our interviewees concerning which platform is most important for them with regard to knowledge exchange. Therefore, for our combined results, we treated all platforms as equal.

Construct validity

In the first part of our study, we aimed to extract the functional relations between roles in a product team. To this end, we studied the team’s interactions using a block model approach. The resulting relations match those identified in our subsequent interviews, confirming the usefulness of our approach. However, we cannot entirely rule out unlikely cases in which our approach missed relations that none of the five interviewed team members was aware of, as those would show up neither in our data nor the interviews.

In the second part, we used the resulting functional interaction structure to assess knowledge diffusion in the team. To this end, we used the entropy-based measure potentiality. While our results suggest that potentiality captures knowledge diffusion adequately, additional measures, e.g., also capturing the content of interactions, could further improve our analysis. Unfortunately, as we did not have access to any text data, we could not further explore this.

Finally, for our two hypothetical benchmark scenarios ecdeDevs and ecdeAll, we assumed perfect homogeneity among all team members of a role. We argue that the scenarios are helpful as optimal cases the team can work towards. However, different activity levels of team members, turnover, and differences in team members’ experience with the product will always cause the scenarios to remain purely theoretical. In addition, our analysis excluded the discussion of the benefits of heterogeneity, e.g., increased productivity and creativity [43, 23], which we will assess in future work.

External validity

Lastly, we performed our analysis in a case study for a single product team at genua, which raises questions regarding the external validity of our analysis. Our analysis approach relies solely on time-stamped interaction data and information regarding team members’ roles and makes no assumptions concerning their content. Therefore, we do not see concerns regarding the generalisability of our approach.

7 Related Work

In this paper, we introduced an approach to measure the functional interaction structure of a team. We also exemplified how our approach can be used to reveal community smells, i.e., flaws in the interaction structure. Finally, we compared the interaction structure against synthetic benchmark scenarios generating insights on how it can be improved. We discuss the related work for these three aspects separately.

Interaction structure detection

The interaction structure in teams has been studied and characterised in a broad range of empirical studies. Commonly, this is achieved via a network approach. Here, researchers represent individuals as nodes and their interactions as edges. They then compute various network measures to characterise their interaction structure [48, 54, 11]. For example, the betweenness centrality could reveal hubs in OSS teams who route the information flow from peripheral developers into the core team [47]. Entropy measures could show that humans interact with a broad range of peers in the early stages of group formation but narrow down their contacts as time proceeds [28]. Similarly, the potentiality, an entropy-based measure, was used to quantify the distribution of interactions across a team [53], thereby proxying the resilience to forming knowledge islands. Using the degree assortativity and clustering coefficient, the impact of the departure of a core developer on an Open Source team was measured [50]. In a similar approach, the authors of [6] studied how well various network measures predict the risk of introducing software defects. Using non-network approaches, the authors of [39] characterised the interaction structure spatially by detecting locations in an office building where employees frequently interact. In other works, the task redistribution between software developers was studied with agent-based models [30, 12]. The key difference to this work is that all approaches mentioned above study the interaction structure between individuals. Instead, we focus on the functional interaction structure derived from the interactions between members of different roles in a team.

Community smells

Flaws in the interaction structure of software development teams are often referred to as community smells. For a thorough review, we refer to [8]. Typically, community smells cause inefficiency by hindering the information exchange within a team. As collections of knowledge workers, software development teams require and encourage the creation of specialised knowledge [18] such as technical knowledge on the implementation of specific code areas [51] or long-term experience in a project [17]. This can lead to the emergence of small sub-groups of developers who alone possess specific knowledge, so-called organisational silos. If these structures are not detected and countered in a timely manner, they can put the team at risk, as this knowledge is lost if these members leave. Thus, community smells can even threaten an entire team’s existence [3]. The authors of [42] show that regaining lost knowledge can require months of sifting through old code, commit logs, etc., in which the team makes little to no progress. The risk behind organisational silos is amplified by the short employee tenure in IT companies, which is typically below two years [35]. The literature has proposed different measures to estimate the size of such organisational silos, e.g., the bus factor or truck number [3, 19, 26]. These works, however, focus on individuals and not on the group structure of a team. Our approach enables the detection of community smells affecting the group structure as given by the members’ roles. Thereby, our approach can, for example, be used to identify roles with low knowledge diffusion, whose members bear the risk of forming organisational silos.

Interaction structure optimisation

Finally, the literature shows that teams can eliminate community smells effectively if they are able to find them [see, e.g., 15]. Knowledge in software development teams is largely shared through interactions [38]. Therefore, community smells are typically countered by incentivising and increasing the number of interactions within the team to improve the distribution of knowledge [14]. To distribute interactions, agile development frameworks encourage periodic team meetings [24], pair-programming [16], where two developers sit together at one computer and program, or pair-rotation [51], where the pairs are reshuffled periodically. Other approaches pursuing similar aims are Scrum [41], Kanban [2], Extreme Programming [4], or DevOps [52]. These approaches have become popular in modern IT teams, but judging how effective they are in a given team is challenging. Closing this gap, our approach enables teams to assess and manage their functional interaction structure in real time. In addition, by enabling teams to evaluate their current interaction structure against alternative benchmark scenarios, our approach facilitates the identification of optimal interaction structures the team can work towards.

8 Conclusions

An efficient interaction structure facilitates knowledge diffusion, allowing the team to maintain its performance and retain its knowledge base even when team members leave. However, particularly in flexible, self-organised teams, the interaction structure evolves over time. While this allows the team to adapt to new challenges, it also bears the risk of undesirable outcomes, such as reduced software quality or the emergence of community smells.

This paper showed how the functional interaction structure, i.e., the preferences with which members of different roles interact with each other, can be directly inferred from the traces that team members create on their development platforms during their daily work. To this end, we first demonstrated that approaches merely counting the interactions between members belonging to different roles are insufficient, as they fail to account for the unique activity levels of the team members. Instead, we proposed an approach that considers how many interactions members of each role can initiate and receive using a block-constrained configuration model. This allows us to compute the interaction preferences between members of different roles.

In a case study, we mined comprehensive data tracking the development process of a product team at the German IT security company genua GmbH across three development platforms. We then applied our approach to extract the functional interaction structure of the team.

We conducted semi-structured interviews with five team members in which we validated the accuracy of the detected interaction structure. In addition, the interviews allowed us to enrich the detected interaction structure with information on the purpose of each interaction. This made the interaction structure interpretable and yielded the team’s organigraph.

During the interviews, we further learned that to prevent knowledge loss, genua strives for homogeneity among members of a role, i.e., “everyone can do everything” across members of a role. This motivated a final experiment in which we showed how our approach enables teams to compare themselves against synthetic benchmark scenarios. Specifically, we studied the knowledge diffusion in the development team and compared it to two scenarios suggested by genua. The first scenario assumed homogeneity only among Developers, while the second scenario assumed homogeneity for all roles. Our results demonstrated that the team currently reaches knowledge diffusion in the upper third of the attainable range. We further showed that reaching homogeneity for all roles in the team is not required. Instead, almost all possible gains in knowledge diffusion can already be achieved by further promoting interactions between Developers, which the team does by applying Scrum, Extreme Programming, and pair programming. Our approach is computationally efficient, allowing the team to track the results of their efforts and manage their interaction structure in real time, based solely on readily available development data.

Acknowledgements

We thank the five anonymous interviewees from genua for the valuable insights provided during the interviews. We further thank Giona Casiraghi for useful discussions about the BCCM and contributions to the R implementation of the potentiality function.

Bibliography

[1] Adams, W. C. (2015). Conducting semi-structured interviews. In: Handbook of Practical Program Evaluation. Hoboken, NJ, USA: John Wiley & Sons, Inc., pp. 492–505.
[2] Ahmad, M. O.; Markkula, J.; Oivo, M. (2013). Kanban in software development: A systematic literature review. In: 2013 39th Euromicro Conference on Software Engineering and Advanced Applications. IEEE, pp. 9–16.
[3] Avelino, G.; Passos, L.; Hora, A.; Valente, M. T. (2016). A novel approach for estimating truck factors. In: 2016 IEEE 24th International Conference on Program Comprehension (ICPC). IEEE, vol. 2016-July, pp. 1–10.
[4] Beck, K. (1999). Extreme Programming explained: Embrace change. USA: Addison-Wesley Longman Publishing Co., Inc.
[5] Beck, K.; Beedle, M.; Van Bennekum, A.; Cockburn, A.; Cunningham, W.; Fowler, M.; Grenning, J.; Highsmith, J.; Hunt, A.; Jeffries, R.; et al. (2001). Manifesto for agile software development. Agile Alliance.
[6] Bettenburg, N.; Hassan, A. E. (2010). Studying the Impact of Social Structures on Software Quality. In: 2010 IEEE 18th International Conference on Program Comprehension. IEEE, pp. 124–133.
[7] Bolici, F.; Howison, J.; Crowston, K. (2016). Stigmergic coordination in FLOSS development teams: Integrating explicit and implicit mechanisms. Cognitive Systems Research 38, 14–22.
[8] Caballero-Espinosa, E.; Carver, J. C.; Stowers, K. (2022). Community smells—The sources of social debt: A systematic literature review. Information and Software Technology, 107078.