lambdafu/similarity-tester

#	This file is part of the software similarity tester SIM.
#	Written by Dick Grune, Vrije Universiteit, Amsterdam.
#	$Id: README,v 2.13 2012-06-09 08:09:17 Gebruiker Exp $

These programs test for similar or equal stretches in one or more program
or text files and can be used to detect common code or plagiarism. See sim.pdf.
Checkers are available for C, Java, Pascal, Modula-2, Lisp, Miranda and
natural language text.

>>>> NEW, June 6, 2012:
     - greatly improved percentage computation
     - increased resolution, reducing false positives in sim_text
     - // comments in C recognized
     - characters 0200-0377 accepted in sim_text
     - s p a c e d   w o r d s recognized in sim_text
     - UNICODE file names accepted
     - manual page in PDF

>>>> NEW, March 11, 2009:
     - -R option to follow directories recursively

==== To install on UNIX/Linux, or on MSDOS if you have a C compiler:

Unpack the archive sim_2_*.zip

To compile and test, edit the Makefile to comment out MSDOS entries and/or
change directory names, as appropriate, and call
	make test
This will generate one executable called sim_c, the checker for C, and will
run two small tests to show sample output.

To install, examine the Makefile, edit BINDIR and MAN1DIR to sensible paths,
and call
	make install

To change the default run size or the page width, adjust the file settings.par
and recompile.

==== To install on MSDOS, if you don't have a C compiler:

The archive sim_exe_2_*.zip contains:

	SIM_C.EXE		similarity tester for C
	SIM_JAVA.EXE		similarity tester for Java
	SIM_PASC.EXE		similarity tester for Pascal
	SIM_M2.EXE		similarity tester for Modula-2
	SIM_LISP.EXE		similarity tester for Lisp
	SIM_MIRA.EXE		similarity tester for Miranda
	SIM_TEXT.EXE		similarity tester for text

==== To extend:

To add another language L, write a file Llang.l along the lines of clang.l
and the other *lang.l files, extend the Makefile and recompile.
All knowledge about a given language L is located in Llang.l; the rest of
the programs expect each token to be a 16-bit character.

Available at present:
	clang.l javalang.l pascallang.l m2lang.l lisplang.l miralang.l text.l
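
As an illustration of what "each token is a 16-bit character" means for the
language-independent part, here is a minimal sketch in C. It is NOT the
algorithm SIM uses; it is a naive quadratic search, written only to show how
two token streams can be compared once a *lang.l lexer has reduced the input
to 16-bit values. The names token_t, run_length and longest_common_run are
hypothetical and do not occur in the SIM sources; min_run plays the role of
the "run size" mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token type: one 16-bit value per token (an assumption
 * based on the README, not a type taken from the SIM sources). */
typedef uint16_t token_t;

/* Length of the common run of tokens starting at a[i] and b[j]. */
static size_t run_length(const token_t *a, size_t na, size_t i,
                         const token_t *b, size_t nb, size_t j)
{
    size_t len = 0;
    while (i + len < na && j + len < nb && a[i + len] == b[j + len])
        len++;
    return len;
}

/*
 * Naive search for the longest run of tokens common to a and b.
 * Runs shorter than min_run are ignored.  Returns the length of the
 * longest qualifying run, or 0 if there is none; on success the start
 * positions are stored in *pos_a and *pos_b.
 */
size_t longest_common_run(const token_t *a, size_t na,
                          const token_t *b, size_t nb,
                          size_t min_run, size_t *pos_a, size_t *pos_b)
{
    size_t best = 0;
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            size_t len = run_length(a, na, i, b, nb, j);
            if (len >= min_run && len > best) {
                best = len;
                *pos_a = i;
                *pos_b = j;
            }
        }
    }
    return best;
}
```

Because the comparison sees only token values, it needs no knowledge of the
source language; that knowledge stays in the lexer. Raising min_run suppresses
short accidental matches, lowering it reports more of them.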


					Dick Grune
					Vrije Universiteit
					de Boelelaan 1081
					1081 HV  Amsterdam
					the Netherlands
					http://www.dickgrune.com