Ioan, Mica, and I went out for happy hour the other night and some how we got onto the topic of searching for names in a database when you weren’t sure of the spelling. This is a pretty easy thing to do using soundex, which is a simple and fairly effective algorithm.
If you aren’t familiar with soundex, then you might want to read up on this wikipedia article before going any further.
There are a couple of different variations of the soundex algorithm, so if you are going to use it you need to be aware of the differences. The original version discards vowels before removing duplicate letters, and the newer enhanced version of the algorithm removes the duplicated before discarding vowels. This has the effect that some names will have a different soundex code depending on which version of the algorithm is used.
Lets look at a couple of examples on the command line using PHP and MySQL (PHP uses the enhanced soundex algorithm and MySQL uses the original):
php -r "echo soundex('nemo');"
N500
mysql
mysql> select soundex("nemo");
+-----------------+
| soundex("nemo") |
+-----------------+
| N000 |
+-----------------+
1 row in set (0.02 sec)
As we can see, these return different results, so we can’t use them interchangeably. Since we need MySQL’s help here, we’re going to have to do the entire comparison in MySQL. MySQL supports a special SOUNDS LIKE sytax which is the same as saying SOUNDEX(expression1) = SOUNDEX(expression2).
PHP and MySQL
$name = "nemo";
$sql = "SELECT * FROM customers ";
$sql .= "WHERE first_name SOUNDS LIKE '{$name}'";
$result = mysql_query($sql);
Ruby on Rails
While we’re at it let’s look at how to do it in Rails with an ActiveRecord model. Assuming we have a Customer model:
name = "nemo"
customers = Customer.find_all :conditions => ["first_name SOUNDS LIKE ?", name]
Really simple, but helpful stuff.