emcat - extract method comparison analysis tool
There are plenty of automated refactoring systems that detect the "long method" code smell and fix it with the "extract method" refactoring, such as JDeodorant and SEMI. This tool provides a way to compare them on unlabeled data, i.e. the raw source code of any project.
Suppose we have two such systems, A and B, and a set of Java source code files J. To compare the systems we propose the following approach. First, we distinguish two functionalities such tools provide: detecting "long methods" and "extracting methods". These functionalities are independent of each other, and a system can support either of them or both. Without loss of generality, we assume that systems A and B support both.
The detection functionality is analyzed by calculating source code metrics (NCSS and Cyclomatic Complexity) for all detected methods. Let DetectedA(J) and DetectedB(J) be the sets of "long methods" detected among the files in J by systems A and B respectively. The distributions of the source code metrics over these sets may tell us which system targets larger and more complex methods. In addition, we can compare these distributions with the distribution of the same metrics over all methods in J.
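The detection comparison above can be sketched in Python as follows. This is only an illustration under stated assumptions: the method names, the metric values and the `summarize` helper are hypothetical and are not part of emcat, and the metrics are assumed to have been computed already by some external tool.

```python
from statistics import mean, median

# Hypothetical input: method id -> (NCSS, Cyclomatic Complexity).
# In practice these values would come from a metrics calculator run over J.
all_methods = {
    "Foo.parse": (42, 11),
    "Foo.close": (3, 1),
    "Bar.render": (30, 7),
    "Bar.size": (2, 1),
}
detected_a = {"Foo.parse", "Bar.render"}  # stands in for DetectedA(J)
detected_b = {"Foo.parse"}                # stands in for DetectedB(J)


def summarize(method_ids, metrics):
    """Return simple summary statistics of NCSS and CC over the given methods."""
    ncss = [metrics[m][0] for m in method_ids]
    cc = [metrics[m][1] for m in method_ids]
    return {
        "count": len(ncss),
        "ncss_mean": mean(ncss),
        "ncss_median": median(ncss),
        "cc_mean": mean(cc),
        "cc_median": median(cc),
    }


summary_all = summarize(all_methods, all_methods)  # baseline: every method in J
summary_a = summarize(detected_a, all_methods)
summary_b = summarize(detected_b, all_methods)
```

Comparing `summary_a` and `summary_b` against `summary_all` gives a first impression of how far each system's detections lie from the typical method in J.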
To compare the extraction functionality, each detected method must be annotated with the source code range marked for extraction. Given that range, we can perform the extraction. Let AfterExtractionA(J) be the set of methods after extraction and ExtractedA(J) the set of extracted methods themselves, and likewise for system B. Calculating the same source code metrics as above gives us the distribution of the decrease in Cyclomatic Complexity caused by extraction and the distribution of the ratio of the extracted block size to the whole method size, measured in non-commented source code statements (NCSS).
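The two per-method quantities described above can be sketched like this. The `Extraction` record and its field names are hypothetical, chosen for illustration only; the metric values are assumed to have been measured before and after performing the extraction.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """Metrics around one performed extract-method refactoring (hypothetical record)."""
    ncss_before: int     # NCSS of the original method before extraction
    cc_before: int       # Cyclomatic Complexity before extraction
    cc_after: int        # Cyclomatic Complexity of the original method after extraction
    ncss_extracted: int  # NCSS of the extracted block

    def cc_decrease(self) -> int:
        """Decrease in Cyclomatic Complexity of the original method."""
        return self.cc_before - self.cc_after

    def extracted_ratio(self) -> float:
        """Extracted block size relative to the whole method, in NCSS."""
        if self.ncss_before == 0:
            raise ValueError("original method has no statements")
        return self.ncss_extracted / self.ncss_before


# Made-up values for one method handled by system A.
example = Extraction(ncss_before=40, cc_before=10, cc_after=6, ncss_extracted=12)
decrease = example.cc_decrease()
ratio = example.extracted_ratio()
```

Collecting `cc_decrease()` and `extracted_ratio()` over all methods processed by a system yields the two distributions to compare between A and B.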
The whole analysis thus reduces to comparing distributions. This can be done in many different ways, for example by comparing means or computing a divergence, and is outside the scope of this tool.
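Since the comparison itself is left to the user, the following is just one possible sketch: a difference of means and a Kullback-Leibler divergence between two empirical distributions of an integer metric. The sample values, the function name and the additive smoothing are assumptions made for this example, not something emcat prescribes.

```python
from collections import Counter
from math import log
from statistics import mean


def kl_divergence(sample_p, sample_q, smoothing=1.0):
    """KL(P || Q) in nats between the empirical distributions of two integer samples.

    Additive smoothing (an assumption of this sketch) keeps every probability
    nonzero on the union of observed values, so the logarithm is always defined.
    """
    support = set(sample_p) | set(sample_q)
    count_p, count_q = Counter(sample_p), Counter(sample_q)
    total_p = len(sample_p) + smoothing * len(support)
    total_q = len(sample_q) + smoothing * len(support)
    divergence = 0.0
    for value in support:
        p = (count_p[value] + smoothing) / total_p
        q = (count_q[value] + smoothing) / total_q
        divergence += p * log(p / q)
    return divergence


# Made-up Cyclomatic Complexity values of methods detected by systems A and B.
cc_a = [11, 7, 9]
cc_b = [11, 4, 4]

mean_difference = mean(cc_a) - mean(cc_b)
divergence_ab = kl_divergence(cc_a, cc_b)
```

Note that KL divergence is not symmetric, so `kl_divergence(cc_a, cc_b)` and `kl_divergence(cc_b, cc_a)` generally differ; a symmetric measure may be preferable when neither system is the reference.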
In development
TBD
TBD
Licensed under GPL 3