Archive for December, 2005

Posted on December 29, 2005 at 4:31 pm

I’m routing all my feeds through Feedburner now, so tell me if you run into any problems with your subscriptions.

Posted on December 23, 2005 at 2:40 pm

I’ve found two implementations of the enhanced Soundex algorithm for Ruby. The first is written by Mike Stok, and is a port of his Perl implementation:

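def soundex(string)
  # Keep only the letters A-Z, upper-cased
  copy = string.upcase.tr '^A-Z', ''
  return nil if copy.empty?
  first_letter = copy[0, 1]
  # Translate letters to digits, squeezing runs that map to the same digit
  copy.tr_s! 'AEHIOUWYBFPVCGJKQSXZDTLMNR', '00000000111122222222334556'
  # Drop the first letter's own digit, then the vowel placeholders
  copy.sub!(/^(.)\1*/, '').gsub!(/0/, '')
  # Pad or truncate to three digits so the code is always four characters
  "#{first_letter}#{copy[0, 3].ljust(3, "0")}"
end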

The other is written by Michael Neumann and is available at the Ruby Application Archive.

I would recommend using Mike Stok’s implementation unless you need to be able to pass in an array. In my test it was roughly twice as fast as the one at the RAA site. Both implementations were timed over 10,000 iterations:

                   user     system      total        real
stok soundex:  0.290000   0.010000   0.300000 (  0.350133)
 raa soundex:  0.600000   0.010000   0.610000 (  0.708554)
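
If you want to repeat this kind of timing, here is a rough sketch of a harness built on Ruby’s standard Benchmark library. The two lambdas are stand-ins I made up so the block runs by itself, and the sample name is an assumption too; swap in the real implementations to get meaningful numbers.

```ruby
require "benchmark"

ITERATIONS = 10_000
SAMPLE_NAME = "nemo" # hypothetical input; any name will do

# Stand-in callables (assumptions, not the real algorithms). Replace them
# with something like method(:soundex) and the RAA equivalent.
candidates = {
  "stok soundex:" => lambda { |name| name.upcase },
  "raa soundex:"  => lambda { |name| name.upcase }
}

# Benchmark.bm prints a user/system/total/real table, one row per report.
results = Benchmark.bm(14) do |bm|
  candidates.each do |label, impl|
    bm.report(label) do
      ITERATIONS.times { impl.call(SAMPLE_NAME) }
    end
  end
end
```

The label width of 14 just leaves room for the longest label so the columns line up.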

Posted on at 1:06 pm

Ioan, Mica, and I went out for happy hour the other night and somehow we got onto the topic of searching for names in a database when you aren’t sure of the spelling. This is a pretty easy thing to do using soundex, which is a simple and fairly effective algorithm.

If you aren’t familiar with soundex, you might want to read the Wikipedia article on it before going any further.

There are a couple of different variations of the soundex algorithm, so if you are going to use it you need to be aware of the differences. The original version discards vowels before removing duplicate letters, while the newer enhanced version removes duplicates before discarding vowels. As a result, some names get a different soundex code depending on which version of the algorithm is used.
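
To make the ordering difference concrete, here is a minimal Ruby sketch of both variants. The names soundex_digit and soundex_sketch are hypothetical, and treating H, W, and Y like vowels is a simplifying assumption, so this illustrates the two orderings rather than reproducing the PHP or MySQL code.

```ruby
# Letter groups and their digits; "0" marks letters that get discarded.
SOUNDEX_DIGITS = {
  "AEHIOUWY" => "0", "BFPV" => "1", "CGJKQSXZ" => "2",
  "DT" => "3", "L" => "4", "MN" => "5", "R" => "6"
}

# Hypothetical helper: look up the digit for one upper-case letter.
def soundex_digit(letter)
  SOUNDEX_DIGITS.each { |letters, digit| return digit if letters.include?(letter) }
  nil
end

# Hypothetical helper: variant is :original or :enhanced.
def soundex_sketch(name, variant)
  letters = name.upcase.delete("^A-Z")
  return nil if letters.empty?
  digits = letters.each_char.map { |c| soundex_digit(c) }.join
  if variant == :original
    # Discard vowels first, then collapse adjacent duplicates.
    digits = digits.delete("0").squeeze
    # The first letter is kept as a letter, so drop its digit if it had one.
    digits = digits[1..-1] if soundex_digit(letters[0]) != "0"
  else
    # Collapse adjacent duplicates first, drop the first letter's digit,
    # then discard vowels.
    digits = digits.squeeze[1..-1].delete("0")
  end
  letters[0] + digits[0, 3].ljust(3, "0")
end

puts soundex_sketch("nemo", :original) # N000
puts soundex_sketch("nemo", :enhanced) # N500
```

In the original ordering the M ends up next to the N once the vowels are gone, so it is swallowed as a duplicate of the first letter; in the enhanced ordering the E still separates them when duplicates are removed.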

Let’s look at a couple of examples on the command line using PHP and MySQL (PHP uses the enhanced soundex algorithm and MySQL uses the original):

php -r "echo soundex('nemo');"
N500
mysql
mysql> select soundex("nemo");
+-----------------+
| soundex("nemo") |
+-----------------+
| N000            |
+-----------------+
1 row in set (0.02 sec)

As we can see, these return different results, so we can’t use them interchangeably. Since we need MySQL’s help here, we’re going to have to do the entire comparison in MySQL. MySQL supports a special SOUNDS LIKE syntax, which is the same as saying SOUNDEX(expression1) = SOUNDEX(expression2).

PHP and MySQL

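$name = "nemo";
// Escape the value before interpolating it into the query
$safe_name = mysql_real_escape_string($name);
$sql = "SELECT * FROM customers ";
$sql .= "WHERE first_name SOUNDS LIKE '{$safe_name}'";
$result = mysql_query($sql);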

Ruby on Rails
While we’re at it let’s look at how to do it in Rails with an ActiveRecord model. Assuming we have a Customer model:

name = "nemo"
customers = Customer.find(:all, :conditions => ["first_name SOUNDS LIKE ?", name])

Really simple, but helpful stuff.

Posted on December 8, 2005 at 5:13 pm

I just got confirmation that one of my photos is included in Texas State University’s latest Study Abroad brochure for the College of Liberal Arts. I was contacted by the University to get permission to use the photo a couple of months ago, but I had forgotten about it until yesterday when I received some copies of the brochure.

The photo is of the basilica in Guanajuato, Mexico and is located at the top center of page 2. You can see this photo better in my photo gallery.

Another one of my photos is due to be published in Splendid Pathways: A Tour Through the World’s Finest Botanical Gardens, although I haven’t got confirmation that it made it through the editing process.

It’s a small thing, but cool.

Posted on December 6, 2005 at 6:20 pm

Yesterday, DHH posted a link to his Pursuit of Beauty slides from the Snakes and Rubies event this weekend.

I went through each of the slides looking for new stuff and found several great new things. If you look at slide 14, you’ll see a new :through parameter on one of the associations. There’s no documentation on this yet, but I did a little experimentation by checking out the latest edge_rails.

To see how this new type of association works, let’s look at the traditional way to handle many-to-many relationships when we want to store additional attributes about the join.

Let’s take a simple example to illustrate how we can use the new functionality. Let’s say that we publish several newsletters and we let users sign up for as many of these newsletters as they want. We also need to track several things about each subscription, such as the email format the user would like to receive that newsletter in. What we want to do in this case is make the join table a model. Let’s call it Subscription. We would have three tables: users, subscriptions, and newsletters. The models would be set up like this:

class User < ActiveRecord::Base
  has_many :subscriptions
end

class Newsletter < ActiveRecord::Base
  has_many :subscriptions
end

class Subscription < ActiveRecord::Base
  belongs_to :user
  belongs_to :newsletter
end

Let’s say that we want to see a list of what newsletters John Doe has subscribed to:

@newsletters = []
User.find_by_name("John Doe").subscriptions.each do |s|
  @newsletters << s.newsletter
end

This works great, but it isn’t very elegant. It would be much nicer if we could just get all the newsletters without having to walk through the subscriptions.

Let’s add the :through associations to the models:

class User < ActiveRecord::Base
  has_many :subscriptions
  has_many :newsletters, :through => :subscriptions
end

class Newsletter < ActiveRecord::Base
  has_many :subscriptions
  has_many :users, :through => :subscriptions
end

class Subscription < ActiveRecord::Base
  belongs_to :user
  belongs_to :newsletter
end

Now we can just access the associated models directly:

@newsletters = User.find_by_name("John Doe").newsletters
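
If it helps to picture what the association is doing, here is a plain-Ruby sketch with no ActiveRecord or database involved. The Struct stand-ins, the subscribe method, and the newsletter titles are all hypothetical; the newsletters method only approximates what the :through association gives you.

```ruby
# Plain-Ruby stand-ins for the models (assumptions, not ActiveRecord classes).
Newsletter   = Struct.new(:title)
Subscription = Struct.new(:user, :newsletter, :email_format)

class User
  attr_reader :name, :subscriptions

  def initialize(name)
    @name = name
    @subscriptions = []
  end

  # Hypothetical helper: create the join record with its extra attribute.
  def subscribe(newsletter, email_format)
    @subscriptions << Subscription.new(self, newsletter, email_format)
  end

  # Roughly what has_many :newsletters, :through => :subscriptions exposes.
  def newsletters
    subscriptions.map { |s| s.newsletter }
  end
end

john = User.new("John Doe")
john.subscribe(Newsletter.new("Weekly Digest"), "html")
john.subscribe(Newsletter.new("Release Notes"), "text")

puts john.newsletters.map { |n| n.title }.inspect
puts john.subscriptions.map { |s| s.email_format }.inspect
```

The join records stay available through subscriptions, so the extra attributes such as the email format are still there when you need them.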

There truly is beauty in simplicity.

Update
I’ve updated this article with some better examples based on user feedback.