Is it truly unique? Search the Internet!
When CodeMatch turns up identical and nearly identical variable
and function names in two sets of source code, how do you know that
this represents copying? Lots of programmers use the names count,
index, and array, for example. Finding them in two sets of code doesn't
mean the code
was copied. How about MQADsPathToFormatName?
That seems pretty unique, right? Actually, it's a Windows API function, and
you could expect to find it in the source code for many different
programs that run on Windows. Search for it on Yahoo and it comes
up 10 times. Not a lot of times, but enough to suggest you do some
more digging and find out what it is. But suppose two programs contain
a variable called ThisWasCopiedIllegally, and the term doesn't show up
at all on Yahoo. That would make you suspicious: a variable shows up in
two programs, one accused of being copied from the other, and nowhere
else on the entire Internet.
This is the beauty of SourceDetective. CodeMatch may find hundreds
or thousands of identifier names that match in two different programs.
SourceDetective, part of the CodeSuite set of tools, automatically
searches for each one on the Internet and creates spreadsheets showing
the number of hits. You can then focus on those identifiers with
few hits, particularly those with 0 hits.
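
The idea can be illustrated with a short Python sketch. This is not how SourceDetective itself is implemented; it only shows the general workflow of counting hits per matching identifier, writing the counts out as spreadsheet-style rows, and sorting so the rarest names come first. The function names (rank_identifiers, to_csv, stub_count_hits) are hypothetical, and the search step is replaced by a stub that returns the two hit counts quoted above, since a real search-engine query would depend on whichever search service is used.

```python
import csv
import io
from typing import Callable, Iterable, List, Tuple


def rank_identifiers(
    identifiers: Iterable[str],
    count_hits: Callable[[str], int],
) -> List[Tuple[str, int]]:
    """Return (identifier, hits) pairs, fewest hits first."""
    unique_names = sorted(set(identifiers))
    rows = [(name, count_hits(name)) for name in unique_names]
    # Sort by hit count, then by name so the order is deterministic.
    rows.sort(key=lambda row: (row[1], row[0]))
    return rows


def to_csv(rows: List[Tuple[str, int]]) -> str:
    """Render the ranked rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["identifier", "hits"])
    writer.writerows(rows)
    return buffer.getvalue()


# Assumption: a stub stands in for a real Internet search. The counts are
# the two examples from the text, not live search results.
STUB_HITS = {
    "MQADsPathToFormatName": 10,
    "ThisWasCopiedIllegally": 0,
}


def stub_count_hits(name: str) -> int:
    """Hypothetical hit counter; a real one would query a search engine."""
    return STUB_HITS[name]


rows = rank_identifiers(STUB_HITS, stub_count_hits)
report = to_csv(rows)

# Identifiers found nowhere else deserve the closest look.
suspicious = [name for name, hits in rows if hits == 0]

print(report)
print("Zero-hit identifiers:", suspicious)
```

Passing the hit counter in as a function keeps the ranking logic separate from the search itself, so the stub could be swapped for a real lookup without touching the rest.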
SourceDetective is available with CodeSuite and can be downloaded
for free from the SAFE Corporation website.