Well, the piece of code I have is currently in C#, but Java is fine. I also see a couple of open source things on the page you sent, so I am just going to go browsing tonight and see which of these algorithms are suggested as best. I don't really want to have to earn a PhD to write this, so if some poor grad student has already done it, I sure am not too proud to use any LGPL or better libraries.

ken

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jared Wright
Sent: Tuesday, April 12, 2011 6:48 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: String Comparison

Of course thirty seconds after I send this I stumble on
http://staffwww.dcs.shef.ac.uk/people/S.Chapman/stringmetrics.html

Hope you like Java!

On 4/12/2011 6:45 AM, Jared Wright wrote:
> Maybe you've covered this already, but it seems
> http://en.wikipedia.org/wiki/String_metric is a great launchpad for the
> different methods you could use to accomplish this. I haven't found any
> specific libraries yet, sorry. Will keep looking as I have idle time today.
> On 4/12/2011 6:28 AM, Ken Perry wrote:
>> Sorry I was not clear enough. Let's take two paragraphs as the smallest
>> set of data. If I stick an extra sentence at the beginning, end,
>> and middle of one paragraph, I want the function to return how much of the
>> paragraphs are similar on a percentage basis. The same goes for if every
>> other word is similar. As for the language, it really doesn't matter; I write
>> in just about anything and everything. If there is a library that does this
>> in a language I don't know, I will learn it to use it.
>>
>> ken
>>
>> -----Original Message-----
>> From: programmingblind-bounce@xxxxxxxxxxxxx
>> [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Bill Gallik
>> Sent: Tuesday, April 12, 2011 1:26 AM
>> To: programmingblind@xxxxxxxxxxxxx
>> Subject: Re: String Comparison
>>
>> Ok, what programming language are you using for this?
>>
>> Also, this is a very open-ended question because you haven't specified any
>> properties to be compared. For instance, if one string is 10 characters
>> long and the other is half a MB, then what could the two possibly have in
>> common -- wouldn't this return a 0% similarity?
>>
>> If I were writing such a function, I would design it such that:
>>
>> 1) are the two strings identical -- if so, return 100%
>>
>> 2) if the only difference between the two strings is that some characters
>> are upper case and the same lower case letter occupies the identical slot in
>> the second string, the function would return a 90%+ value.
>>
>> From there, progressively lower similarity percentages would be returned
>> depending on the requirements of the program.
>> ----
>> Holland's Person, Bill
>> E-Mail: BillGallik@xxxxxxxxxxxxxx
>> - "Evil indeed is the man who has not one woman to mourn him."
>> - Sir Arthur Conan Doyle; -- Watson, on the death of Selden in "The Hound of
>> the Baskervilles"
>>
>> __________
>> View the list's information and change your settings at
>> //www.freelists.org/list/programmingblind
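
For reference, here is a minimal sketch of the kind of function discussed in this thread: a percentage score that stays high when extra sentences are added to one paragraph, and that returns 100% for identical strings as in Bill's step 1. It is only an illustration under stated assumptions, not code from any library linked above. It uses Python's standard-library difflib.SequenceMatcher on word lists, which is a swapped-in technique rather than one of the metrics from the linked pages. The function name similarity_percent and the case_sensitive parameter are made up for this example.

```python
from difflib import SequenceMatcher


def similarity_percent(a: str, b: str, case_sensitive: bool = False) -> float:
    """Return a rough 0-100 similarity score, comparing the texts word by word.

    Hypothetical helper for illustration only; the name and the
    case_sensitive flag are assumptions, not part of any existing library.
    """
    # Step 1 from the thread: identical strings score 100%.
    if a == b:
        return 100.0

    # Split on whitespace; by default, ignore upper/lower case differences.
    words_a = a.split() if case_sensitive else a.lower().split()
    words_b = b.split() if case_sensitive else b.lower().split()

    # Two texts that contain no words at all are treated as identical.
    if not words_a and not words_b:
        return 100.0

    # ratio() is 2 * (matched words) / (total words in both texts).
    # autojunk=False turns off the heuristic that ignores very common items.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return 100.0 * matcher.ratio()


if __name__ == "__main__":
    base = "the quick brown fox jumps over the lazy dog"
    padded = "hello there " + base + " good bye"
    print(round(similarity_percent(base, padded), 1))
```

With the default settings a case-only difference scores 100% rather than the 90%+ tier Bill suggests; passing case_sensitive=True makes differently cased words count as different. Comparing by words instead of characters is a design choice that suits paragraphs; for short strings a character-level metric from the pages linked earlier may fit better.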