Kata 6

I took a swipe at implementing Dave Thomas’ Kata 6, an exercise dealing with anagrams. The goal is to parse a list of roughly 45,000 words and find all the words that are anagrams of other words in the file. Dave claims there are 2,530 sets of anagrams, but I only got 2,506. I’m not sure where the discrepancy comes from, but here’s my solution. I welcome any comments and critiques.

    words = IO.readlines("wordlist.txt")
    anagrams = Hash.new([])

    words.each do |word|
        # Use the non-bang forms: chomp! returns nil when there is nothing
        # to strip, so chaining .downcase! onto it can raise NoMethodError.
        word = word.chomp.downcase

        # The sorted letters are the key shared by every anagram of the word.
        # join is used because Array#to_s no longer concatenates on Ruby 1.9+.
        base = word.chars.sort.join

        # |= assigns a new array each time, so the shared default [] is
        # never mutated, and duplicate words are dropped.
        anagrams[base] |= [word]
    end

    # Find the anagrams by eliminating those with only one word
    anagrams.reject! {|k, v| v.length == 1}

    values = anagrams.values.sort do |a, b|
        b.length <=> a.length
    end

    File.open("anagrams.txt", "w") do |file|
        values.each do |line|
            file.puts(line.join(", "))
        end
    end

    largest = anagrams.keys.max do |a, b|
        a.length <=> b.length
    end

    puts "Total: #{anagrams.length}"
    # puts "Largest set of anagrams: #{values[0].inspect}"
    # print "Longest anagram: #{anagrams[largest].inspect} "
    # puts "at #{largest.length} characters each"
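For anyone who wants to poke at the approach without the full word list, here is a minimal sketch of the same sorted-letters idea using `group_by` on a small in-memory list. The `sample` array and the `anagram_sets` helper are made up for illustration; they are not part of the kata or of my solution above.

```ruby
# Hypothetical sample data standing in for wordlist.txt.
sample = %w[enlist Listen silent tinsel google banana inlets]

# Hypothetical helper: groups words by their sorted, lowercased letters
# and keeps only the groups that contain more than one word.
def anagram_sets(words)
  words
    .map { |w| w.chomp.downcase }        # normalize case and line endings
    .uniq                                # drop duplicate words
    .group_by { |w| w.chars.sort.join }  # the sorted letters are the key
    .values
    .select { |set| set.length > 1 }
end

sets = anagram_sets(sample)
puts "Total: #{sets.length}"
sets.each { |set| puts set.join(", ") }
```

How words are normalized (case folding, duplicate removal) changes which words end up in the same set, so it may be one place to look when totals differ. That is only a guess, not a diagnosis of the 2,506 versus 2,530 gap.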

Update: Of course, 10 seconds after uploading the code, I saw something I could change. To find the largest set of anagrams, instead of sorting the sets in descending order by array length, I could have done the following:

    longest = anagrams.to_a.max do |a, b|
        a[1].length <=> b[1].length
    end

This scans the entries once and returns the one with the largest set, without sorting anything. Each entry is a [key, words] pair: the key is at index 0 and the interesting array of anagrams is at index 1.
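As a small illustration of that pair structure, here is a sketch on a hand-built hash. The data is made up, and `max_by` is swapped in for `max` with a comparison block; it is just a shorter way to express the same idea, not what my code above does.

```ruby
# Hypothetical data: each key is the sorted letters, each value the anagram set.
anagrams = {
  "eilnst" => %w[enlist listen silent],
  "aet"    => %w[ate eat eta tea],
  "dgo"    => %w[dog god]
}

# max_by walks the hash once and returns the [key, words] pair
# whose word array is longest; destructuring splits the pair.
key, words = anagrams.max_by { |_key, set| set.length }

puts "#{key}: #{words.join(', ')}"
```

Calling `max_by` directly on the hash also skips the intermediate `to_a` array.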