Various soundex algorithm implementations in 10g?

  • From: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxxxxx>
  • To: "oracle-l" <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 10 Dec 2007 15:35:34 -0500

Hi,

I've just had a query from a developer about soundex algorithms in
Oracle 10gR2.  I know about the SOUNDEX() function, which is an
implementation of the Russell soundex (described by Knuth in TAOCP).
The developer seems to think that there are some gaps in the Oracle
implementation, though (I'm working on getting specifics of what he
means by gaps).  He's also asked about two other algorithms, namely
Daitch-Mokotoff Soundex and Double Metaphone.  Does anyone have
experience with either of these on Oracle, or a PL/SQL implementation
they'd be willing to share? :-)  For now, I'm looking to code this in
PL/SQL, and we won't be using any Oracle Text features.
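
For reference when pinning down what the "gaps" might be, here is a
minimal sketch of the Russell soundex rules as Knuth describes them. It
is written in Python rather than PL/SQL purely as a neutral yardstick;
it is not Oracle's code, and the function name soundex_ref is made up
for this illustration.

```python
# Reference sketch of Russell soundex (Knuth's description).
# Assumption: ASCII letters only; anything else is ignored.
def soundex_ref(name: str) -> str:
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]                # first letter is kept as-is
    prev = codes.get(letters[0], "")   # but its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        digit = codes.get(ch, "")      # vowels and Y have no code
        if digit and digit != prev:
            result += digit
            if len(result) == 4:
                break
        prev = digit                   # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]        # pad short codes with zeros


print(soundex_ref("Robert"))    # R163
print(soundex_ref("Ashcraft"))  # A261
print(soundex_ref("Tymczak"))   # T522
print(soundex_ref("Pfister"))   # P236
```

Running the same names through SELECT SOUNDEX('...') FROM dual and
comparing the results is one way to see whether the two actually differ.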

Finally, these are US Census names that we'll be searching, and from
what I read, there's an algorithm called Jaro-Winkler which was
specifically written for US Census name data, *and* Oracle has an
implementation of that one in UTL_MATCH on 10gR2.  So, I've asked the
developer about this algorithm and whether he's considered it.
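
To make the comparison concrete, the textbook Jaro-Winkler calculation
can be sketched as below. This is Python rather than PL/SQL, and it
assumes the usual published parameters (prefix scale 0.1, at most 4
prefix characters); I have not confirmed that UTL_MATCH uses exactly
these, so treat it as a yardstick and not a description of Oracle's
implementation. The function names are made up for this illustration.

```python
# Textbook Jaro and Jaro-Winkler similarity, both in the range 0.0 to 1.0.
def jaro(s1: str, s2: str) -> float:
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, scale: float = 0.1) -> float:
    j = jaro(s1, s2)
    # Winkler's adjustment rewards a shared prefix of up to 4 characters.
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * scale * (1.0 - j)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
print(round(jaro_winkler("DWAYNE", "DUANE"), 4))   # 0.84
```

The same pairs can be fed to UTL_MATCH.JARO_WINKLER on the database side
to check whether the scores agree.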

I guess I'm just looking for anyone who has experience implementing
any of the above algorithms: any thoughts or things to look out for,
pointers to public implementations of any of them, etc.

-Mark

--
Mark J. Bobak
Senior Database Administrator, System & Product Technologies
ProQuest
789 E. Eisenhower Parkway, P.O. Box 1346
Ann Arbor MI 48106-1346
+1.734.997.4059  or +1.800.521.0600 x 4059
mark.bobak@xxxxxxxxxxxxxxx
www.proquest.com
www.csa.com

ProQuest...Start here. 
