RefDiff: Detecting Refactorings in Version Histories

Danilo Silva; Marco Tulio Valente

arXiv:1704.01544·cs.SE·August 7, 2018


TL;DR

RefDiff is an automated tool that accurately detects various refactoring operations between code revisions in git repositories, aiding understanding of software evolution.

Contribution

RefDiff introduces a novel combination of heuristics based on static analysis and code similarity to identify 13 refactoring types with high precision and recall.

Findings

01

Achieved 100% precision and 88% recall on an oracle of 448 refactorings.

02

Outperformed existing state-of-the-art refactoring detection tools.

03

Effective across seven Java projects.

Abstract

Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is valuable information for understanding software evolution, adapting software components, merging code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has higher precision and recall than existing state-of-the-art…

Tables

Table 1. TABLE I: Relationship types

Relationship	Condition
	$(t_{b}, t_{a}) \in T_{b} \times T_{a}$, such that:
Same Type	$\operatorname{name}(t_{b}) = \operatorname{name}(t_{a}) \land \pi(t_{b}) \sim \pi(t_{a})$
Rename Type	$\operatorname{name}(t_{b}) \neq \operatorname{name}(t_{a}) \land \pi(t_{b}) \sim \pi(t_{a}) \land \operatorname{sim}(t_{b}, t_{a}) > \tau$
Move Type	$\operatorname{name}(t_{b}) = \operatorname{name}(t_{a}) \land \pi(t_{b}) \nsim \pi(t_{a}) \land \operatorname{sim}(t_{b}, t_{a}) > \tau$
Move and Rename Type	$\operatorname{name}(t_{b}) \neq \operatorname{name}(t_{a}) \land \pi(t_{b}) \nsim \pi(t_{a}) \land \operatorname{sim}(t_{b}, t_{a}) > \tau$
Extract Supertype	$(\nexists x \in T_{b} \mid x \sim t_{a}) \land (\exists y \in T_{a} \mid t_{b} \sim y \land \operatorname{subtype}(y, t_{a})) \land \operatorname{sim}_{p}(t_{a}, t_{b}) > \tau$
	$(m_{b}, m_{a}) \in M_{b} \times M_{a}$, such that:
Same Method	$\operatorname{sig}(m_{b}) = \operatorname{sig}(m_{a}) \land \pi(m_{b}) \sim \pi(m_{a})$
Rename Method	$\operatorname{name}(m_{b}) \neq \operatorname{name}(m_{a}) \land \pi(m_{b}) \sim \pi(m_{a}) \land \operatorname{sim}(m_{b}, m_{a}) > \tau$
Change Method Signature	$\operatorname{name}(m_{b}) = \operatorname{name}(m_{a}) \land \operatorname{sig}(m_{b}) \neq \operatorname{sig}(m_{a}) \land \pi(m_{b}) \sim \pi(m_{a}) \land \operatorname{sim}(m_{b}, m_{a}) > \tau$
Pull Up Method	$\operatorname{sig}(m_{b}) = \operatorname{sig}(m_{a}) \land \operatorname{subtype}(\pi(m_{b})^{\sim}, \pi(m_{a})) \land \operatorname{sim}(m_{b}, m_{a}) > \tau$
Push Down Method	$\operatorname{sig}(m_{b}) = \operatorname{sig}(m_{a}) \land \operatorname{supertype}(\pi(m_{b})^{\sim}, \pi(m_{a})) \land \operatorname{sim}(m_{b}, m_{a}) > \tau$
Move Method	$\operatorname{name}(m_{b}) = \operatorname{name}(m_{a}) \land \pi(m_{b}) \nsim \pi(m_{a}) \land \neg \operatorname{subOrSuper}(\pi(m_{b})^{\sim}, \pi(m_{a})) \land \operatorname{sim}(m_{b}, m_{a}) > \tau$
Extract Method	$(\nexists x \in M_{b} \mid x \sim m_{a}) \land (\exists y \in M_{a} \mid m_{b} \sim y \land y \in \operatorname{callers}(m_{a})) \land \operatorname{sim}_{p}(m_{a}, m_{b}) > \tau$
Inline Method	$(\nexists x \in M_{a} \mid m_{b} \sim x) \land (\exists y \in M_{b} \mid y \sim m_{a} \land y \in \operatorname{callers}(m_{b})) \land \operatorname{sim}_{p}(m_{b}, m_{a}) > \tau$
	$(f_{b}, f_{a}) \in F_{b} \times F_{a}$, such that:
Same Field	$\operatorname{name}(f_{b}) = \operatorname{name}(f_{a}) \land \operatorname{type}(f_{b}) = \operatorname{type}(f_{a}) \land \pi(f_{b}) \sim \pi(f_{a})$
Pull Up Field	$\operatorname{name}(f_{b}) = \operatorname{name}(f_{a}) \land \operatorname{type}(f_{b}) = \operatorname{type}(f_{a}) \land \operatorname{subtype}(\pi(f_{b})^{\sim}, \pi(f_{a})) \land \operatorname{sim}(f_{b}, f_{a}) > \tau$
Push Down Field	$\operatorname{name}(f_{b}) = \operatorname{name}(f_{a}) \land \operatorname{type}(f_{b}) = \operatorname{type}(f_{a}) \land \operatorname{supertype}(\pi(f_{b})^{\sim}, \pi(f_{a})) \land \operatorname{sim}(f_{b}, f_{a}) > \tau$
Move Field	$\operatorname{name}(f_{b}) = \operatorname{name}(f_{a}) \land \operatorname{type}(f_{b}) = \operatorname{type}(f_{a}) \land \pi(f_{b}) \nsim \pi(f_{a}) \land \neg \operatorname{subOrSuper}(\pi(f_{b})^{\sim}, \pi(f_{a})) \land \operatorname{sim}(f_{b}, f_{a}) > \tau$

Table 2. TABLE II: Projects/commits used in the calibration

Repository URL	Commit
github.com/linkedin/rest.li	54fa890
github.com/droolsjbpm/jbpm	3815f29
github.com/gradle/gradle	44aab62
github.com/jenkinsci/workflow-plugin	d0e374c
github.com/spring-projects/spring-roo	0bb4cca
github.com/BuildCraft/BuildCraft	a5cdd8c
github.com/droolsjbpm/drools	1bf2875
github.com/jersey/jersey	d94ca2b
github.com/undertow-io/undertow	d5b2bb8
github.com/kuujo/copycat	19a49f8

Table 3. TABLE III: Thresholds calibration results

Ref. Type	#	$τ$	TP	FP	FN	Precision	Recall
Rename Type	2	0.4	2	0	0	1.000	1.000
Move Type	2	0.9	2	0	0	1.000	1.000
Extract Superclass	2	0.8	2	0	0	1.000	1.000
Rename Method	24	0.3	22	3	2	0.880	0.917
Pull Up Method	7	0.4	7	0	0	1.000	1.000
Push Down Method	2	0.6	2	0	0	1.000	1.000
Move Method	24	0.4	21	1	3	0.955	0.875
Extract Method	25	0.1	25	9	0	0.735	1.000
Inline Method	6	0.3	5	2	1	0.714	0.833
Pull Up Field	2	0.5	2	0	0	1.000	1.000
Push Down Field	5	0.3	5	0	0	1.000	1.000
Move Field	1	0.5	1	1	0	0.500	1.000
Total	102		96	16	6	0.857	0.941

Table 4. TABLE IV: Selected projects

Repository URL	Description	LOC
github.com/Atmosphere/atmosphere	The Asynchronous WebSocket/Comet Framework	65,841
github.com/clojure/clojure	The Clojure programming language	58,417
github.com/google/guava	Google Core Libraries for Java 6+	374,068
github.com/dropwizard/metrics	Capturing JVM- and application-level metrics, so you know what’s going on	24,242
github.com/orientechnologies/orientdb	An Open Source NoSQL DBMS with the features of both Document and Graph DBMSs	168,924
github.com/square/retrofit	Type-safe HTTP client for Android and Java by Square, Inc.	17,073
github.com/spring-projects/spring-boot	Spring Boot makes it easy to create Spring-powered, production-grade applications and services with absolute minimum fuss	39,190

Table 5. TABLE V: Refactoring types in the evaluation oracle

		Supported by
Ref. Type	#	RDiff	RMinr	RCraw	RFind
Rename Type	35	yes	yes	yes	no
Move Type	31	yes	yes	no	no
Extract Superclass	16	yes	yes	no	yes
Rename Method	70	yes	yes	yes	yes
Pull Up Method	15	yes	yes	yes	yes
Push Down Method	68	yes	yes	yes	yes
Move Method	31	yes	yes	yes	yes
Extract Method	29	yes	yes	no	yes
Inline Method	52	yes	yes	no	yes
Pull Up Field	33	yes	yes	no	yes
Push Down Field	42	yes	yes	no	yes
Move Field	26	yes	yes	no	yes
Total	448

Table 6. TABLE VI: Precision and recall by refactoring type

	RDiff		RMinr		RCraw		RFind
Ref. Type	Precision	Recall	Precision	Recall	Precision	Recall	Precision	Recall
Rename Type	1.000	1.000	1.000	1.000	0.750	0.429
Move Type	1.000	0.968	1.000	0.968
Extract Superclass	1.000	0.875	1.000	0.875			0.484	0.938
Rename Method	1.000	0.943	1.000	0.886	0.971	0.486	0.868	0.843
Pull Up Method	1.000	0.600	1.000	0.733	0.500	0.067	1.000	0.571
Push Down Method	1.000	0.971	1.000	0.176	1.000	0.265	1.000	0.491
Move Method	1.000	1.000	1.000	0.742	0.090	0.323	0.054	0.759
Extract Method	1.000	0.897	1.000	0.862			0.607	0.586
Inline Method	1.000	0.981	1.000	0.423			0.917	0.688
Pull Up Field	1.000	0.576	1.000	0.970			1.000	0.394
Push Down Field	1.000	0.929	1.000	0.929			1.000	0.333
Move Field	1.000	0.269	0.583	0.808			0.097	0.923

Table 7. TABLE VII: Overall precision and recall

Approach	TP	FP	FN	Precision	Recall
RDiff	393	0	55	1.000	0.877
RMinr	326	15	122	0.956	0.728
RCraw	78	108	141	0.419	0.356
RFind	231	645	129	0.264	0.642
RCraw*	78	56	141	0.582	0.356
RFind*	231	241	129	0.489	0.642
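The precision and recall columns of Table VII follow directly from the TP, FP, and FN counts. The sketch below recomputes them, using the precision and F1 formulas listed with the paper's equations and the standard definition of recall, tp / (tp + fn). The function names are ours, and the F1 values it prints are not part of the table.

```python
def precision(tp, fp):
    """Fraction of reported refactorings that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of oracle refactorings that were found (standard definition)."""
    return tp / (tp + fn)


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# (TP, FP, FN) counts copied from the first four rows of Table VII.
counts = {
    "RDiff": (393, 0, 55),
    "RMinr": (326, 15, 122),
    "RCraw": (78, 108, 141),
    "RFind": (231, 645, 129),
}

for name, (tp, fp, fn) in counts.items():
    p, r = precision(tp, fp), recall(tp, fn)
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1(p, r):.3f}")
```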

Table 8. TABLE VIII: Execution time

		RDiff execution time				RMinr execution time
Repository	Commits	Min. (ms)	Max. (ms)	Avg. (ms)	Total (s)	Min. (ms)	Max. (ms)	Avg. (ms)	Total (s)
androidannotations/androidannotations	29	1	4,956	451	13	1	1,753	211	6
bumptech/glide	41	1	3,349	594	24	2	8,992	466	19
elastic/elasticsearch	946	1	42,344	1,897	1,795	1	103,943	1,105	1,046
libgdx/libgdx	69	0	5,112	805	56	1	6,774	578	40
netty/netty	225	0	3,384	640	144	0	59,736	665	150
PhilJay/MPAndroidChart	14	1	816	245	3	1	310	79	1
ReactiveX/RxJava	120	1	810,744	10,475	1,257	1	17,369	538	65
spring-projects/spring-framework	478	1	15,019	1,205	576	1	6,133	920	440
square/okhttp	45	1	1,526	380	17	1	616	178	8
zxing/zxing	23	1	773	342	8	1	502	230	5
Total	1990	0	810,744	1,956	3,893	0	103,943	894	1,779

Equations

\operatorname{sim}(e_{1}, e_{2}) = \frac{\sum_{t \in U} \min(w(e_{1}, t), w(e_{2}, t))}{\sum_{t \in U} \max(w(e_{1}, t), w(e_{2}, t))}

w(e, t) = m_{e}(t) \times \operatorname{idf}(t)

\operatorname{idf}(t) = \log\left(1 + \frac{|E|}{n_{t}}\right)

\operatorname{idf}(y) = \log\left(1 + \frac{|E|}{n_{t}}\right) = \log\left(1 + \frac{3}{2}\right) = 0.398

\operatorname{idf}(else) = \log\left(1 + \frac{|E|}{n_{t}}\right) = \log\left(1 + \frac{3}{1}\right) = 0.602

\operatorname{sim}_{p}(e_{1}, e_{2}) = \frac{\sum_{t \in U} \min(w(e_{1}, t), w(e_{2}, t))}{\sum_{t \in U} w(e_{1}, t)}

F_{1} = 2 \times \frac{precision \times recall}{precision + recall}

precision = \frac{tp}{tp + fp}
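The similarity equations can be made concrete with a small sketch. It assumes that $m_{e}(t)$ is the number of occurrences of token $t$ in entity $e$, that $n_{t}$ is the number of entities containing $t$, and that the logarithm is base 10, which is what the worked values 0.398 and 0.602 imply. The three token lists are hypothetical and only serve to reproduce those two idf values; this is not RefDiff's actual implementation.

```python
import math
from collections import Counter


def idf(n_entities, n_containing):
    """idf(t) = log10(1 + |E| / n_t); base 10 is inferred from the worked examples."""
    return math.log10(1 + n_entities / n_containing)


def weights(tokens, doc_freq, n_entities):
    """w(e, t) = m_e(t) * idf(t) for every token t that occurs in entity e."""
    return {t: m * idf(n_entities, doc_freq[t]) for t, m in Counter(tokens).items()}


def sim(w1, w2):
    """Weighted Jaccard: sum of min weights over sum of max weights."""
    universe = set(w1) | set(w2)
    num = sum(min(w1.get(t, 0.0), w2.get(t, 0.0)) for t in universe)
    den = sum(max(w1.get(t, 0.0), w2.get(t, 0.0)) for t in universe)
    return num / den if den else 0.0


def sim_p(w1, w2):
    """Partial similarity: how much of e1's weight is also present in e2."""
    num = sum(min(w, w2.get(t, 0.0)) for t, w in w1.items())
    den = sum(w1.values())
    return num / den if den else 0.0


# Hypothetical token multisets for three entities (|E| = 3).
bodies = {"m1": ["y", "else", "y"], "m2": ["y", "x"], "m3": ["x"]}
# n_t: number of entities in which each token appears.
doc_freq = Counter(t for tokens in bodies.values() for t in set(tokens))
w = {name: weights(tokens, doc_freq, len(bodies)) for name, tokens in bodies.items()}

print(round(idf(len(bodies), doc_freq["y"]), 3))
print(round(idf(len(bodies), doc_freq["else"]), 3))
print(round(sim(w["m1"], w["m2"]), 3), round(sim_p(w["m2"], w["m1"]), 3))
```

Note that `sim` is symmetric while `sim_p` is not: it normalizes by the weight of its first argument only, which suits cases such as Extract Method where one entity is expected to be contained in the other.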


Full text

RefDiff: Detecting Refactorings in Version Histories

Danilo Silva, Marco Tulio Valente

Department of Computer Science

Universidade Federal de Minas Gerais

Belo Horizonte, Brazil

Email: [email protected], [email protected]

Abstract

Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is valuable information for understanding software evolution, adapting software components, merging code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has higher precision and recall than existing state-of-the-art approaches.

Index Terms:

refactoring; software evolution; software repositories; git.

I Introduction

Refactoring is a well-known technique to improve the design of a system and enable its evolution [1]. In fact, existing studies [2, 3, 4, 5, 6] present strong evidence that refactoring is frequently applied by development teams and that it is an important aspect of their software maintenance workflow.

Therefore, knowing about the refactoring activity in a code change is valuable information that helps researchers understand software evolution. For example, past studies have used such information to shed light on important aspects of refactoring practice, such as: how developers refactor [2], the usage of refactoring tools [7, 2], the motivations driving refactoring [4, 5, 6], the risks of refactoring [4, 5, 8, 9, 10], and the impact of refactoring on code quality metrics [4, 5]. Moreover, knowing which refactoring operations were applied in the version history of a system may help in several practical tasks. For example, in a study by Kim et al. [4], many developers mentioned the difficulties they face when reviewing or integrating code changes after large refactoring operations, which move or rename several code elements. As a result, developers feel discouraged from refactoring their code. If a tool is able to identify such refactoring operations, it can possibly resolve merge conflicts automatically. Moreover, diff visualization tools can also benefit from such information, presenting refactored code elements side-by-side with their corresponding version before the change. Another application for such information is adapting client code to a refactored version of an API it uses [11, 12]. If we are able to detect the refactorings that were applied to an API, we can replay them on the client code automatically.

Although there are approaches capable of detecting refactorings automatically, there are still some issues that hinder their application. Specifically, the precision and recall of such approaches still need improvement. In this paper, we try to fill this gap by proposing RefDiff, an automated approach that identifies refactorings performed in the version history of a system. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. When compared to existing approaches, RefDiff leverages existing techniques and also introduces some novel ideas, such as the adaptation of the classical TF-IDF similarity measure from information retrieval to compare refactored code elements, and a new strategy to compare the similarity of fields by taking into account the similarity of the statements that read from or write to them.

In the paper, we also describe in detail a study to evaluate the precision and recall of RefDiff and three existing refactoring detection approaches: Refactoring Miner [6], Refactoring Crawler [13], and Ref-Finder [14, 15]. In our study, RefDiff achieved precision of 100% and recall of 88%, which were the best results among the evaluated approaches.

In summary, the contributions we deliver in this work are:

• RefDiff, which is a new approach to detect refactoring in version histories. We provide a publicly available implementation of our approach that is capable of finding refactorings in Java code within git repositories in a fully automated way (RefDiff and all evaluation data are publicly available on GitHub: https://github.com/aserg-ufmg/RefDiff);

• a publicly available oracle of 448 known refactoring operations, applied to seven Java systems, that serves as an evaluation benchmark for refactoring detection approaches; and

• an evaluation of the precision and recall of RefDiff, comparing it with three state-of-the-art approaches.

The remainder of this paper is structured as follows. Section II describes related work, focusing on the three approaches we compare with RefDiff. Section III presents the proposed approach in detail. Section IV describes how we evaluated RefDiff and discusses the achieved results. Section V discusses threats to validity, and Section VI concludes the paper.

II Related Work

Empirical studies on refactoring rely on means to identify refactoring activity. Thus, many different techniques have been proposed and employed for this task. For example, Murphy-Hill et al. [2] collected refactoring usage data using a framework that monitors user actions in the Eclipse IDE, including calls to refactoring commands. Negara et al. [7] also used the strategy of instrumenting the IDE to infer refactorings from fine-grained code edits. Other studies use metadata from version control systems to identify refactoring changes. For example, Ratzinger et al. [16] search for a predefined set of terms in commit messages to classify them as refactoring changes. In specific scenarios, a branch may be created exclusively to refactor the code, as reported by Kim et al. [5]. Another strategy is employed by Soares et al. [17]. They propose an approach that identifies behavior-preserving changes by automatically generating and running test cases. While their approach is intended to guarantee the correct behavior of a system after refactoring, it may also be employed to classify commits as behavior-preserving. Moreover, many existing approaches are based on static analysis. This is the case of the approach proposed by Demeyer et al. [18], which finds refactored elements by observing changes in code metrics.

Static analysis is also frequently used to find differences in the source code [13, 19, 3, 14, 15]. Approaches based on comparing source code differences have the advantage of being able to identify each refactoring operation performed. As RefDiff is one of these approaches, it can be directly compared with others within this category. In the next sections, we describe three such approaches.

II-A Refactoring Miner

Refactoring Miner is an approach introduced by Tsantalis et al. [3] that was later extended by Silva et al. [6] to mine refactorings at large scale in git repositories. This tool is capable of identifying 14 high-level refactoring types: Rename Package/Class/Method, Move Class/Method/Field, Pull Up Method/Field, Push Down Method/Field, Extract Method, Inline Method, and Extract Superclass/Interface.

Refactoring Miner runs a lightweight algorithm, similar to the UMLDiff proposed by Xing and Stroulia [20], for differencing object-oriented models, inferring the set of classes, methods, and fields added, deleted or moved between two code revisions. First, the algorithm matches code entities in a top-down order (starting from the classes and going to the methods and fields) looking for exact matches on their names and signatures (in the case of methods). Next, the removed/added elements between the two models are matched based only on the equality of their names in order to find changes in the signatures of fields and methods. Third, the removed/added classes are matched based on the similarity of their members at signature level. Finally, a set of rules enforcing structural constraints is applied to identify specific types of refactorings.

In a first study, using the version histories of JUnit, HTTPCore, and HTTPClient, Tsantalis et al. [3] found 8 false positives for the Extract Method refactoring (96.4% precision) and 4 false positives for the Rename Class refactoring (97.6% precision). No false positives were found for the remaining refactorings. In a second study that mined refactorings in 285 GitHub hosted Java repositories, Silva et al. [6] found 1,030 false positives out of 2,441 refactorings (63% precision). However, the authors also evaluated Refactoring Miner using as a benchmark the dataset reported by Chaparro et al. [21], in which it achieved 93% precision and 98% recall.

II-B Refactoring Crawler

Refactoring Crawler, proposed by Dig et al. [13], is an approach capable of finding seven high-level refactoring types: Rename Package/Class/Method, Pull Up Method, Push Down Method, Move Method, and Change Method Signature. It uses a combination of a syntactic analysis to detect refactoring candidates and a more expensive reference graph analysis to refine the results.

First, Refactoring Crawler analyzes the abstract syntax tree of a program and produces a tree, in which each node represents a source code entity (package, class, method, or field). Then, it employs a technique known as shingles encoding to find similar pairs of entities, which are candidates for refactorings. Shingles are representations for strings with the following property: if a string changes slightly, then its shingles also change slightly. In a second phase, Refactoring Crawler applies specific strategies for detecting each refactoring type, and computes a more costly metric that determines the similarity of references among code entities in the two versions of the system. For example, two methods are similar if the sets of methods that call them are similar, and the sets of methods they call are also similar. The strategies to detect refactorings are repeated in a loop until no new refactorings are found. Therefore, the detection of a refactoring, such as a rename, may change the reference graph of code elements and enable the detection of new refactorings.
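The property that makes shingles useful for finding refactoring candidates can be illustrated with a generic sketch. It uses plain character k-grams compared with the Jaccard index, which is an assumption made for illustration; it is not Refactoring Crawler's actual encoding, and the function names, the value k = 4, and the code fragments are ours.

```python
def shingles(text, k=4):
    """Return the set of all k-character substrings (shingles) of text."""
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


original = shingles("return a + b;")
edited = shingles("return a + c;")          # one character changed
unrelated = shingles("while (x) { y(); }")

# A slight edit changes only the shingles that overlap the edited character,
# so the edited text stays much closer to the original than unrelated code.
print(jaccard(original, edited), jaccard(original, unrelated))
```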

The authors evaluated Refactoring Crawler comparing pairs of releases of three open source software components: Eclipse UI, Struts, and JHotDraw. Such components were chosen because they provided detailed release notes describing API changes. The authors relied on such information and on manual inspection to build an oracle of known refactorings in those releases, containing 131 refactorings in total. The reported results are: Eclipse UI (90% precision and 86% recall), Struts (100% precision and 86% recall), and JHotDraw (100% precision and 100% recall).

II-C Ref-Finder

Ref-Finder, proposed by Prete et al. [14, 15], is an approach based on logic programming capable of identifying 63 refactoring types from Fowler's catalog [1]. The authors express each refactoring type by defining structural constraints, before and after applying a refactoring to a program, in terms of template logic rules.

First, Ref-Finder traverses the abstract syntax tree of a program and extracts facts about code elements, structural dependencies, and the content of code elements, to represent the program in terms of a database of logic facts. Then, it uses a logic programming engine to infer concrete refactoring instances, by creating a logic query based on the constraints defined for each refactoring type. The definition of refactoring types also considers ordering dependencies among them. This way, lower-level refactorings may be queried to identify higher-level, composite refactorings. The detection of some types of refactoring requires a special logic predicate that indicates that the similarity between two methods is above a threshold. For this purpose, the authors implemented a block-level clone detection technique, which removes any leading and trailing parentheses, escape characters, white spaces, and return keywords, and computes word-level similarity between the two texts using the longest common subsequence algorithm.
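A word-level similarity based on the longest common subsequence (LCS) can be sketched as follows. The text above does not say how Ref-Finder normalizes the LCS length, so the ratio 2 × LCS / (total number of words) used here is an assumption, and the preprocessing step that strips parentheses, escape characters, and return keywords is omitted. Function names and the example statements are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def word_similarity(text1, text2):
    """Assumed normalization: 2 * LCS / (len(words1) + len(words2))."""
    a, b = text1.split(), text2.split()
    total = len(a) + len(b)
    return 2 * lcs_length(a, b) / total if total else 0.0


before = "int sum = a + b ;"
after = "int total = a + b ;"
print(lcs_length(before.split(), after.split()), round(word_similarity(before, after), 3))
```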

The authors evaluated Ref-Finder in two case studies. In the first one, they used code examples from Fowler's catalog to create instances of the 63 refactoring types. The authors reported 93.7% recall and 97.0% precision for this first study. In the second study, the authors used three open-source projects: Carol, jEdit, and Columba. In this case, Ref-Finder was executed on randomly selected pairs of versions. From the 774 refactoring instances found, the authors manually inspected a sample of 344 instances and found that 254 were correct (73.8% precision). However, in a study by Soares et al. [22] using a set of randomly selected versions of JHotDraw and Apache Commons Collections containing 81 refactoring instances in total, Ref-Finder achieved only 35% precision and 24% recall.

III Proposed Refactoring Detection Algorithm

RefDiff employs a combination of heuristics based on static analysis and code similarity to detect refactorings between two revisions of a system. Thus, RefDiff takes as input two versions of a system, and outputs a list of refactorings found.

The detection algorithm is divided into two main phases: Source Code Analysis and Relationship Analysis. In the first phase, the source code of the system is parsed and analyzed to build a model that represents each high-level source code entity, such as types, methods, and fields. Two models are built to represent the system before ( $E_{b}$ ) and after the changes ( $E_{a}$ ). For efficiency, only code entities that belong to modified source files (added, removed or edited) are analyzed. Each of these two models is a set of types, methods, and fields contained in the source code. Specifically, $E_{b}=(T_{b}\cup M_{b}\cup F_{b})$ , such that $T_{b}$ , $M_{b}$ , and $F_{b}$ are the sets of types, methods, and fields in the source code before the changes, and $E_{a}=(T_{a}\cup M_{a}\cup F_{a})$ , such that $T_{a}$ , $M_{a}$ , and $F_{a}$ are the sets of types, methods, and fields after the changes.

The second phase of the algorithm, Relationship Analysis, consists of finding relationships between source code entities before and after the code changes. Specifically, the algorithm builds a bipartite graph with two sets of vertices: code entities before ( $E_{b}$ ) and code entities after ( $E_{a}$ ). The edges of this graph are represented by the set of relationships $R$ between code entities. For example, a certain method $m_{1}\in M_{b}$ may correspond to a method $m_{2}\in M_{a}$ that was renamed by a developer. This would correspond to a Rename Method relationship between $m_{1}$ and $m_{2}$ and, consequently, to a Rename Method refactoring.
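One possible way to represent the two models and the relationship set $R$ is sketched below. The class and field names are hypothetical and chosen only to mirror the notation ($T$, $M$, $F$, $E$, $R$); RefDiff's actual data structures may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str            # "type", "method", or "field"
    qualified_name: str  # e.g. "pkg.Foo#bar()"


@dataclass
class Model:
    """One revision of the system: E = T ∪ M ∪ F."""
    types: set = field(default_factory=set)
    methods: set = field(default_factory=set)
    fields: set = field(default_factory=set)

    def entities(self):
        return self.types | self.methods | self.fields


@dataclass(frozen=True)
class Relationship:
    """An edge of the bipartite graph, from an entity in E_b to one in E_a."""
    kind: str
    before: Entity
    after: Entity


foo = Entity("type", "pkg.Foo")
m1 = Entity("method", "pkg.Foo#load()")
m2 = Entity("method", "pkg.Foo#read()")

E_b = Model(types={foo}, methods={m1})
E_a = Model(types={foo}, methods={m2})

# R holds the edges; here the type is unchanged and the method was renamed.
R = {
    Relationship("Same Type", foo, foo),
    Relationship("Rename Method", m1, m2),
}
print(sorted(r.kind for r in R))
```

Making `Entity` and `Relationship` frozen dataclasses keeps them hashable, so they can be stored in sets as the set-based notation suggests.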

Table I presents all relationships that RefDiff can identify between types, methods, or fields. We search for relationships between source code entities considering each relationship type in the order they are presented in the table. The following sections detail how such relationships are identified.

III-A Matching Relationships

Some kinds of relationships map code entities before the change to code entities after the change. For example, let $t_{1}\in T_{b}$ be a type in the version before the change. If our algorithm finds another type $t_{2}\in T_{a}$ with the same qualified name, it adds a relationship Same Type between $t_{1}$ and $t_{2}$ in $R$ . This is a matching relationship, because $t_{1}$ corresponds to $t_{2}$ after the change. Other examples of matching relationship are Move Type, Rename Type, and Pull Up Method. In contrast, suppose that our algorithm finds that $m_{2}$ is a method that was extracted from another method $m_{1}$ . In this case, there is an Extract Method relationship between $m_{1}$ and $m_{2}$ , but this is not a matching relationship, because $m_{1}$ does not correspond to $m_{2}$ after the change. From this point on, we use the notation ${e_{1}\sim e_{2}}$ to represent a matching relationship between $e_{1}$ and $e_{2}$ .

We distinguish matching relationships from non-matching relationships because all matching relationship types share a similar detection algorithm. For each matching relationship type, we find all pairs of entities $(e_{b},e_{a})\in E_{b}\times E_{a}$ that fall under the conditions specified in Table I. Each relationship type has its specific conditions. For example, as presented in Table I, the conditions for identifying a Rename Method between $m_{1}\in M_{b}$ and $m_{2}\in M_{a}$ are:

• the names of $m_{1}$ and $m_{2}$ should be different;

• there should exist a matching relationship between the container classes of $m_{1}$ and $m_{2}$ ; and

• the similarity index between $m_{1}$ and $m_{2}$ , denoted by $\operatorname{sim}(m_{1},m_{2})$ , should be greater than a threshold $\tau$ .

Whenever these conditions hold, we add the triple $(e_{b},e_{a},\operatorname{sim}(e_{b},e_{a}))$ to a list of potential Rename Method relationships.

The last step to find the actual relationships consists of selecting non-conflicting relationships from the list of potential relationships and adding them to the graph. For example, there may be in the list two potential Rename Method relationships: $(e_{1},e_{2},0.5)$ and $(e_{1},e_{3},0.8)$ . However, a code entity cannot be involved in more than one matching relationship. Thus, only one of them can be chosen, because $e_{1}$ could not be renamed to both $e_{2}$ and $e_{3}$ . The criterion we use is to choose the triple with the higher similarity index. This means that, in the aforementioned example, we would choose the triple $(e_{1},e_{3},0.8)$ and discard $(e_{1},e_{2},0.5)$ . In Section III-C we describe in detail how the similarity index is computed.
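The two steps described above, collecting candidate triples and then keeping only non-conflicting ones, can be sketched as follows. Methods are modeled as hypothetical (container, name) pairs, the similarity function is a stub, and resolving conflicts greedily in descending order of similarity is our reading of "choose the triple with the higher similarity index", not a confirmed detail of RefDiff. The threshold 0.3 is the Rename Method value from Table III.

```python
def rename_method_candidates(methods_before, methods_after, container_matches, sim, tau):
    """Collect (m_b, m_a, sim) triples satisfying the Rename Method conditions."""
    candidates = []
    for mb in methods_before:
        for ma in methods_after:
            if mb[1] == ma[1]:
                continue  # names must differ
            if (mb[0], ma[0]) not in container_matches:
                continue  # container classes must be in a matching relationship
            score = sim(mb, ma)
            if score > tau:
                candidates.append((mb, ma, score))
    return candidates


def select_non_conflicting(candidates):
    """Keep the highest-similarity triples so each entity is matched at most once."""
    chosen, used_before, used_after = [], set(), set()
    for before, after, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if before in used_before or after in used_after:
            continue  # conflicts with a stronger relationship already chosen
        chosen.append((before, after, score))
        used_before.add(before)
        used_after.add(after)
    return chosen


# Hypothetical example: Foo.load could have been renamed to Foo.read or Foo.fetch.
scores = {("load", "read"): 0.5, ("load", "fetch"): 0.8}
candidates = rename_method_candidates(
    methods_before=[("Foo", "load")],
    methods_after=[("Foo", "read"), ("Foo", "fetch")],
    container_matches={("Foo", "Foo")},
    sim=lambda mb, ma: scores[(mb[1], ma[1])],  # stub similarity
    tau=0.3,
)
chosen = select_non_conflicting(candidates)
print(chosen)
```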

III-B Non-matching Relationships

In the previous section, we discussed that an entity could not be involved in multiple matching relationships, but this property does not hold for non-matching relationships. For example, suppose that a developer extracted some code from a method $m_{1}$ into a new method $m_{2}$ , i.e., an Extract Method refactoring was applied. It is also possible that the developer extracted another part of $m_{1}$ into a new method $m_{3}$ .

Given that non-matching relationships do not conflict with each other, the algorithm to identify them is simpler. We just need to find all pairs of entities $(e_{b},e_{a})\in E_{b}\times E_{a}$ that fall under the conditions specified in Table I. For example, the conditions for identifying an Extract Method relationship between $m_{1}\in M_{b}$ and $m_{2}\in M_{a}$ are:

• there should not exist a method $x\in M_{b}$ such that $x\sim m_{2}$ (i.e., $m_{2}$ was added);

• there should exist a method $y\in M_{a}$ such that $m_{1}\sim y$ (i.e., $m_{1}$ was not removed);

• $y$ should call $m_{2}$ ; and

• the similarity index between $m_{2}$ and $m_{1}$ , denoted by $\operatorname{sim}_{p}(m_{2},m_{1})$ , should be greater than a threshold $\tau$ .

Besides Extract Method, our approach supports the detection of Inline Method and Extract Supertype relationships.
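The Extract Method conditions listed above translate almost directly into code. In this sketch the matching relation is a dictionary from before-methods to after-methods, `callers` maps each after-method to the set of methods that call it, and `sim_p` is passed in as a function; all of these representations, and the method names in the example, are assumptions made for illustration. The threshold 0.1 is the Extract Method value from Table III.

```python
def extract_method_relationships(methods_before, methods_after, matching, callers, sim_p, tau):
    """Return all (m1, m2) pairs where m2 was extracted from m1.

    matching: dict mapping each before-method to its matching after-method (the ~ relation)
    callers:  dict mapping each after-method to the set of after-methods that call it
    """
    matched_after = set(matching.values())
    found = []
    for m2 in methods_after:
        if m2 in matched_after:
            continue  # some x in M_b matches m2, so m2 was not added
        for m1 in methods_before:
            y = matching.get(m1)
            if y is None:
                continue  # m1 was removed
            if y not in callers.get(m2, set()):
                continue  # the new version of m1 must call m2
            if sim_p(m2, m1) > tau:
                found.append((m1, m2))
    return found


# Hypothetical example: A.helper was extracted from A.run and also from A.stop.
found = extract_method_relationships(
    methods_before=["A.run", "A.stop"],
    methods_after=["A.run", "A.stop", "A.helper"],
    matching={"A.run": "A.run", "A.stop": "A.stop"},
    callers={"A.helper": {"A.run", "A.stop"}},
    sim_p=lambda m2, m1: 0.9,  # stub similarity
    tau=0.1,
)
print(found)
```

Unlike the matching case, no conflict resolution step is needed here: the same extracted method may be related to several source methods, and one source method may give rise to several extracted methods.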

III-C Computing Similarity

Bibliography

[1] M. Fowler, Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.
[2] E. R. Murphy-Hill, C. Parnin, and A. P. Black, “How we refactor, and how we know it,” IEEE Transactions on Software Engineering, vol. 38, no. 1, pp. 5–18, 2012.
[3] N. Tsantalis, V. Guana, E. Stroulia, and A. Hindle, “A multidimensional empirical study on refactoring activity,” in Conference of the Centre for Advanced Studies on Collaborative Research (CASCON), 2013, pp. 132–146.
[4] M. Kim, T. Zimmermann, and N. Nagappan, “A field study of refactoring challenges and benefits,” in 20th Symposium on the Foundations of Software Engineering (FSE), 2012, pp. 50:1–50:11.
[5] M. Kim, T. Zimmermann, and N. Nagappan, “An empirical study of refactoring challenges and benefits at Microsoft,” IEEE Transactions on Software Engineering, vol. 40, no. 7, July 2014.
[6] D. Silva, N. Tsantalis, and M. T. Valente, “Why we refactor? Confessions of GitHub contributors,” in 24th Symposium on the Foundations of Software Engineering (FSE), 2016, pp. 858–870.
[7] S. Negara, N. Chen, M. Vakilian, R. E. Johnson, and D. Dig, “A comparative study of manual and automated refactorings,” in 27th European Conference on Object-Oriented Programming (ECOOP), 2013, pp. 552–576.
[8] M. Kim, D. Cai, and S. Kim, “An empirical investigation into the role of API-level refactorings during software evolution,” in 33rd International Conference on Software Engineering (ICSE), 2011, pp. 151–160.