I took a swipe at implementing Dave Thomas’ Kata 6, an assignment dealing with anagrams. The goal is to parse a list of roughly 45,000 words and find all the words that are anagrams of other words in the file. Dave claims there are 2,530 sets of anagrams, but I only got 2,506. I’m not sure where the discrepancy comes from, but here’s my solution. I welcome any comments and critiques.
    words = IO.readlines("wordlist.txt")
    anagrams = Hash.new([])
    words.each do |word|
      base = Array.new
      # Non-bang versions: chomp! returns nil when there is nothing to strip
      word = word.chomp.downcase
      word.each_byte do |byte|
        base << byte.chr
      end
      base.sort!
      # The sorted letters form the key shared by all anagrams of this word
      anagrams[base.join] |= [word]
    end

    # Find the anagrams by eliminating those with only one word
    anagrams.reject! {|k, v| v.length == 1}
    values = anagrams.values.sort do |a, b|
      b.length <=> a.length
    end
    File.open("anagrams.txt", "w") do |file|
      values.each do |line|
        file.puts(line.join(", "))
      end
    end
    largest = anagrams.keys.max do |a, b|
      a.length <=> b.length
    end
    puts "Total: #{anagrams.length}"
    # puts "Largest set of anagrams: #{values[0].inspect}"
    # print "Longest anagram: #{anagrams[largest].inspect} "
    # puts "at #{largest.length} characters each"
Update: Of course, 10 seconds after uploading the code, I see something I could change. Instead of sorting the anagram hash descending by array length, I could have done the following:
    longest = anagrams.to_a.max do |a, b|
      a[1].length <=> b[1].length
    end
This scans the key/array pairs once and returns the pair with the longest array, so no sort is needed. In the returned pair, the key is at index 0 and the interesting array is at index 1.
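For comparison, here is a more compact sketch of the same idea using `group_by` and `max_by`. It is not the solution above, just one way it might be condensed. The helper name `anagram_sets` and the small inline word list are made up for illustration; a real run would read the lines from wordlist.txt instead.

```ruby
# Hypothetical helper (not part of the original solution): groups words
# by their sorted-letter signature and keeps only groups of two or more.
def anagram_sets(words)
  words.map { |w| w.chomp.downcase }      # normalize each line
       .uniq                              # drop duplicates after downcasing
       .group_by { |w| w.chars.sort.join } # anagrams share the same signature
       .values
       .select { |set| set.length > 1 }   # a lone word is not an anagram set
end

# Assumed sample data standing in for the lines of wordlist.txt
sample = ["Listen\n", "silent\n", "enlist\n", "cat\n", "act\n", "dog\n"]

sets = anagram_sets(sample)
largest = sets.max_by { |set| set.length } # single pass, no sort

puts "Total: #{sets.length}"
puts "Largest set of anagrams: #{largest.inspect}"
```

Because `max_by` only needs the length of each set, it avoids both the full sort and the index juggling with key/array pairs.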