Duplicate Code

There are many trade-offs when writing a duplicate code detection tool. Some of the conflicting goals are:

  • Fast execution
  • Low memory usage
  • Few false alarms
  • Support for multiple/arbitrary languages
  • Support for fuzzy matches (comments, whitespace, line breaks, variable renaming, etc.)

Note that there are brilliant commercial implementations of duplicate code detection tools. One that is particularly noteworthy is Simian from RedHill Consulting, Inc.

Simian is reasonably priced (free for noncommercial projects) and includes a Checkstyle plugin. We encourage all users of Checkstyle to evaluate Simian as an alternative to the Checks we offer in our distribution.

The following table summarizes the characteristics of the available Checkstyle plugins for duplicate code detection:

Name                | Speed     | Memory Usage | False Alarms               | Supported Languages                         | Fuzzy Matches
StrictDuplicateCode | Medium    | Very low     | Possible but very unlikely | Any language                                | No
Simian              | Very high | Low          | Possible but very unlikely | Many languages, including Java and C/C++/C# | Limited support

StrictDuplicateCode

Performs a line-by-line comparison of all code lines and reports duplicate code, i.e. a sequence of lines that differ only in indentation. All import statements in Java code are ignored; every other line, including Javadoc, whitespace-only lines between methods, etc., is considered (which is why the check is called strict).
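
The comparison described above can be illustrated with a naive sketch. This is not Checkstyle's actual implementation; the class name StrictDuplicateSketch, the method names, and the choice to compare exactly two files are assumptions made for illustration. Import lines are normalized to null so that they never match anything, which is one possible reading of "ignored".

```java
import java.util.ArrayList;
import java.util.List;

/** Naive illustration only; hypothetical names, not Checkstyle's implementation. */
final class StrictDuplicateSketch {

    /** Removes indentation; returns null for import statements so they never match. */
    static String normalize(String line) {
        String stripped = line.replaceFirst("^\\s+", "");
        return stripped.startsWith("import ") ? null : stripped;
    }

    /**
     * Reports maximal runs of at least {@code min} matching lines between two files.
     * Each result is {startA, startB, length}, with 0-based line indices.
     */
    static List<int[]> findDuplicates(List<String> a, List<String> b, int min) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            for (int j = 0; j < b.size(); j++) {
                // Skip positions that merely continue a run starting one line earlier.
                if (i > 0 && j > 0 && matches(a.get(i - 1), b.get(j - 1))) {
                    continue;
                }
                // Extend the run for as long as the normalized lines stay equal.
                int length = 0;
                while (i + length < a.size() && j + length < b.size()
                        && matches(a.get(i + length), b.get(j + length))) {
                    length++;
                }
                if (length >= min) {
                    result.add(new int[] {i, j, length});
                }
            }
        }
        return result;
    }

    /** Two lines match if they are equal after normalization and are not imports. */
    private static boolean matches(String x, String y) {
        String nx = normalize(x);
        return nx != null && nx.equals(normalize(y));
    }
}
```

Because every line other than an import takes part in the comparison, two blocks that differ in a single comment or blank line form separate, shorter runs in this sketch, and each run is reported only if it still reaches min lines. The quadratic loop is chosen for readability and says nothing about the speed or memory characteristics listed in the table above.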

Properties

name    | description                                               | type | default value
min     | how many lines must be equal to be considered a duplicate |      |
charset | name of the file charset                                  |      | System property "file.encoding"

Examples

To configure the check:

<module name="StrictDuplicateCode"/>

To configure the check so that only blocks of at least 15 equal lines are reported as duplicates:

<module name="StrictDuplicateCode">
    <property name="min" value="15"/>
</module>

To configure the check so that it handles files with the UTF-8 charset:

<module name="StrictDuplicateCode">
    <property name="charset" value="UTF-8"/>
</module>
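
The charset property matters because files must be decoded into lines before they can be compared. The following is a hedged sketch of how a configured charset name might fall back to the "file.encoding" system property; the class CharsetSketch and its methods are hypothetical and do not reflect Checkstyle's internal code.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper; illustrates the documented default, not Checkstyle internals. */
final class CharsetSketch {

    /** Uses the configured name if present, else the "file.encoding" system property. */
    static Charset resolveCharset(String configured) {
        String name = configured != null ? configured : System.getProperty("file.encoding");
        // Fall back to the JVM default if the system property is unset.
        return name != null ? Charset.forName(name) : Charset.defaultCharset();
    }

    /** Reads all lines of a file, decoding its bytes with the resolved charset. */
    static List<String> readLines(Path file, String configured) throws IOException {
        return Files.readAllLines(file, resolveCharset(configured));
    }
}
```

Setting the charset explicitly, as in the configuration above, makes the result independent of the platform default on the machine running the check.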

Package

com.puppycrawl.tools.checkstyle.checks.duplicates

Parent Module

Checker


Copyright © 2001-2004, Oliver Burn