CodeMatch compares thousands of source code files in multiple directories
and subdirectories to determine which files are the most highly
correlated. This can be used to significantly speed up the
work of finding source code plagiarism, because it can direct
the examiner to look closely at a small amount of code in
a handful of files rather than thousands of combinations.
CodeMatch is also useful for finding open source code within
proprietary code, determining common authorship of two different
programs, and discovering common, standard algorithms within source code.

CodeMatch compares every file in one directory with every file in another
directory, including all subdirectories if requested. CodeMatch
produces a database that can then be exported to an HTML basic
report that lists the most highly correlated pairs of files.
You can click on any particular pair listed in the HTML basic
report to see an HTML detailed report that shows the specific
items in the files (statements, comments and strings, identifiers, or
instruction sequences) that caused the high correlation.
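
The every-file-against-every-file comparison described above can be sketched in a few lines of Python. This is only an illustration of the directory walk and the ranking of pairs, not CodeMatch's implementation: the function names (`list_files`, `read_lines`, `rank_pairs`) are hypothetical, and the score is the generic line-similarity ratio from Python's standard `difflib` module, standing in for CodeMatch's own correlation measure.

```python
import difflib
import os
from itertools import product


def list_files(root, recursive=True):
    """Return the paths of all regular files under root."""
    if not recursive:
        return sorted(
            os.path.join(root, name)
            for name in os.listdir(root)
            if os.path.isfile(os.path.join(root, name))
        )
    paths = []
    for folder, _subdirs, names in os.walk(root):
        paths.extend(os.path.join(folder, name) for name in names)
    return sorted(paths)


def read_lines(path):
    """Read a file as a list of stripped, non-empty lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.strip() for line in handle if line.strip()]


def rank_pairs(dir_a, dir_b, recursive=True, top=10):
    """Score every file in dir_a against every file in dir_b."""
    lines_a = {p: read_lines(p) for p in list_files(dir_a, recursive)}
    lines_b = {p: read_lines(p) for p in list_files(dir_b, recursive)}
    scored = []
    for (path_a, a), (path_b, b) in product(lines_a.items(), lines_b.items()):
        # Stand-in score: difflib's ratio over lines, NOT CodeMatch's metric.
        score = difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()
        scored.append((score, path_a, path_b))
    # Highest-scoring pairs first, like the basic report's list of pairs.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]
```

Calling `rank_pairs("project_a", "project_b")` (two placeholder directory names) returns up to ten `(score, path_a, path_b)` tuples, highest score first. The number of comparisons grows with the product of the two file counts, which is why a ranked shortlist is more practical to review than the full set of combinations.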

CodeMatch currently supports the following programming languages. Supporting
new languages is simple and quick. If the language you need
is not in the list, let us know and we'll usually be able
to add it within days.

CodeMatch uses unique algorithms to find the different ways that
source code files are correlated. These algorithms can find
directly copied source code and even source code that has
been modified to avoid detection. For more information on
the CodeMatch algorithms, click here.
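
The text above does not say how the algorithms work, so the following is only a sketch of one simple idea, not CodeMatch's method: comparing a single kind of element, here identifiers, can still reveal a match after statements have been reordered. The function names, the keyword list, and the use of a Jaccard set overlap are all assumptions made for illustration.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Tiny illustrative keyword list; a real tool would need one per language.
KEYWORDS = {"int", "return", "for", "if", "else", "while", "void"}


def identifiers(source):
    """Collect the distinct non-keyword identifiers in a source string.

    Rough on purpose: words inside comments and strings are also matched.
    """
    return {name for name in IDENTIFIER.findall(source) if name not in KEYWORDS}


def identifier_overlap(source_a, source_b):
    """Jaccard overlap of the two identifier sets, from 0.0 to 1.0."""
    ids_a, ids_b = identifiers(source_a), identifiers(source_b)
    if not ids_a and not ids_b:
        return 0.0
    return len(ids_a & ids_b) / len(ids_a | ids_b)
```

Because the measure ignores statement order, two files that declare and use the same names in a different sequence still score highly, whereas a line-by-line comparison would score them lower. It would not, by itself, catch copying in which every identifier has been renamed, which is one reason to compare several kinds of elements rather than one.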