By Popular Demand, I Give You… Mixer!

“Popular demand,” yeah… right… That’s the ticket. Anyway, I saw this blog entry by Jamie this morning which caused me to write a Ruby program to mix up the letters of words, leaving the first and last letter as they were. I started out with a simple script, then I added better punctuation support, then I converted it to a class, then I wrote unit tests to run it. Anyway, somebody wanted to see it, so here it is. Is it great code? Probably not. Do I care? Not really.

    class Mixer
        private

        # Shuffle the interior letters, keeping the first and last in place.
        def randomize(str)
            if str.length < 4
                return str
            end

            letters = str.split(//)

            first = letters[0]
            mid = letters[1..(letters.length - 2)]
            last = letters[letters.length - 1]

            new_letters = ""

            while mid && mid.length > 0
                len = mid.length
                r = rand(len)
                new_letters << mid.delete_at(r)
            end

            first + new_letters + last
        end

        public

        def mix_file(filename)
            lines = IO.readlines(filename)

            mix_lines(lines)
        end

        def mix_string(str)
            new_str = ""

            mix_lines(str.split).each do |line|
                new_str << line
            end

            new_str
        end

        def mix_lines(lines)
            lines.collect! do |line|
                new_line = ""

                # Split on whitespace and punctuation; the delimiters are discarded.
                line.split(/\s+|,|\.|!|\(|\)|"|'/).each do |word|
                    new_line << randomize(word) << " "
                end

                new_line
            end

            lines
        end
    end

And the unit tests, which don’t actually test anything.

    require 'test/unit'
    require 'mix'

    class MixTest < Test::Unit::TestCase
        def setup()
            @mixer = Mixer.new
        end

        def test_word()
            x = @mixer.mix_string("testing")
            puts x
            assert_not_equal("testing", x, "String not randomized")
        end

        def test_string()
            x = @mixer.mix_string("this is a humongous test, dangit")
            puts x
        end

        def test_file()
            article = @mixer.mix_file("testfile.txt")
            #puts article
        end
    end
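For what it's worth, here is a rough sketch of a check that would actually assert something: whatever the shuffle produces, the first and last letters should stay put and the middle should be the same letters in some order. The names `scramble` and `valid_scramble?` are hypothetical and not part of the Mixer class; `scramble` is just a stand-in built on `Array#shuffle`.

```ruby
# Hypothetical stand-in for Mixer#randomize, built on Array#shuffle.
def scramble(word)
  return word if word.length < 4
  chars = word.chars
  chars[0] + chars[1..-2].shuffle.join + chars[-1]
end

# True when `mixed` keeps the first and last letters of `original`
# and its middle is a rearrangement of the original middle.
def valid_scramble?(original, mixed)
  return mixed == original if original.length < 4
  mixed.length == original.length &&
    mixed[0] == original[0] &&
    mixed[-1] == original[-1] &&
    mixed[1..-2].chars.sort == original[1..-2].chars.sort
end

%w[a the test testing humongous].each do |word|
  mixed = scramble(word)
  status = valid_scramble?(word, mixed) ? "ok" : "BROKEN"
  puts "#{word} -> #{mixed} (#{status})"
end
```

Checking the property instead of comparing against the input avoids the problem that a random shuffle can legitimately hand back the original word.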

I’m sure someone will find this useful… Again, “yeah, right.” Ah well, it was a mildly amusing diversion…

Simplicity and Consistency

Mike Clark has a bit of a nudge for Rael this morning to give Ruby a try. Mike makes the following statement, which I completely agree with:

The beauty of Ruby is its simplicity and consistency. With Ruby, I find myself writing code to get the job done rather than to appease the compiler.

So true! Since Ruby is a dynamic language, there are no variable types to declare, no static checking; variables are just slots. The number of lines of Ruby code to do something is far less than the equivalent Java code, and I would argue more readable. You don’t have to jump through hoops to make the compiler happy, you just write your code to do what you need done. That’s it. It’s a beautiful thing.

The fact that regular expressions are baked right into the language is also a giant plus. This is how Perl does it, and Matz basically lifted this approach when he created Ruby. Python’s regex support is not nearly as nice, since you have to create a regex object and call methods on it instead of using a regex literal and special variables to get the groups. Where baked-in regex support really shines is in not having to escape backslashed atoms in the regex. Regexen in Java are even more difficult to read than usual because every backslash is doubled to keep the Java string parser from barfing on unknown escapes.
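To make that concrete, here is a small sketch of a regex literal used with the match special variables. The helper name `parse_date` and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: pull year, month and day out of a string.
def parse_date(text)
  # Regex literal: \d is written once. Inside a Java string the same
  # pattern would need every backslash doubled.
  if text =~ /(\d{4})-(\d{2})-(\d{2})/
    # $1..$3 hold the captured groups from the last successful match.
    [$1.to_i, $2.to_i, $3.to_i]
  end
end

p parse_date("posted on 2004-03-17 at noon")  # prints [2004, 3, 17]
p parse_date("no date here")                  # prints nil
```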

Kata 6

I took a swipe at implementing Dave Thomas’ Kata 6 which is an assignment dealing with anagrams. The goal is to parse a list of 45000-ish words, finding all the words that are anagrams of other words in the file. Dave claims there are 2,530 sets of anagrams, but I only got 2,506. I’m not sure where the disconnect is, but here’s my solution. I welcome any comments and critiques.

    words = IO.readlines("wordlist.txt")

    anagrams = Hash.new([])

    words.each do |word|
        # chomp! returns nil when there is nothing to strip, so chaining
        # the bang methods can blow up; use the non-bang forms instead.
        word = word.chomp.downcase

        # The sorted letters are the key shared by every anagram of the word
        base = word.split(//).sort.join

        anagrams[base] |= [word]
    end

    # Find the anagrams by eliminating those with only one word
    anagrams.reject! {|k, v| v.length == 1}

    values = anagrams.values.sort do |a, b|
        b.length <=> a.length
    end

    File.open("anagrams.txt", "w") do |file|
        values.each do |line|
            file.puts(line.join(", "))
        end
    end

    largest = anagrams.keys.max do |a, b|
        a.length <=> b.length
    end

    puts "Total: #{anagrams.length}"
    # puts "Largest set of anagrams: #{values[0].inspect}"
    # print "Longest anagram: #{anagrams[largest].inspect} "
    # puts "at #{largest.length} characters each"

Update: Of course, 10 seconds after uploading the code, I see something I could change. Instead of sorting the anagram hash descending by array length, I could have done the following:

    longest = anagrams.to_a.max do |a, b|
        a[1].length <=> b[1].length
    end

This pulls out the pair with the largest array in a single pass, with no sort needed. Each pair is a two-element array: the key is in bucket 0 and the interesting array is in bucket 1.
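The same idea can be written a little more directly with `Enumerable#max_by`, which compares by whatever the block returns. This is only a sketch: the tiny hash below is a stand-in for the real anagram hash, which maps sorted letters to the words that share them.

```ruby
# Tiny stand-in for the anagram hash: sorted letters => words.
anagrams = {
  "opst"  => %w[post pots spot stop tops],
  "aet"   => %w[ate eat tea],
  "einst" => %w[inset stein tines]
}

# max_by yields each [key, value] pair and returns the winning pair,
# so there is no need to index into buckets 0 and 1 by hand.
key, words = anagrams.max_by { |_key, list| list.length }

puts "Largest set (#{words.length} words): #{words.join(', ')}"
```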

First Cut At Kata 8

Dave Thomas of the Pragmatic Programmers has started publishing programming problems, calling them Kata. He published Kata 8 this morning and I’ve had a go at a solution. The problem is to take a supplied list of words and go through it, finding all the six-letter words that are constructed from shorter words in the file. The full problem is to write the program in three different ways: one optimized for human consumption, one optimized for speed, and one that is highly extensible.

Presented below is my first cut at this kata. I think it is fairly readable, at 79 lines, so this probably will count as my “fit for human consumption” version. It’s relatively fast, completing in 11 seconds.

Comments? Critiques?

    #!/usr/local/bin/ruby

    start = Time.now

    # Arrays for each class of word
    fourLetters = Array.new
    threeLetters = Array.new
    twoLetters = Array.new
    sixLetters = Array.new

    # Loop over the word list, segregating the words
    #  to their respective array
    IO.foreach("wordlist.txt") do |line|
        # chomp! returns nil when there is nothing to strip,
        #  so use the non-bang forms here
        line = line.chomp.downcase

        case line.length
        when 2
            twoLetters << line
        when 3
            threeLetters << line
        when 4
            fourLetters << line
        when 6
            sixLetters << line
        end
    end

    candidates = Array.new

    # Build up all combinations of four letters + two letters
    #  and store them as candidates
    fourLetters.each do |four|
        twoLetters.each do |two|
            wc = four + two

            candidates << wc
        end
    end

    # Build up all combinations of three letters + three
    #  letters and store them as candidates
    threeLetters.each do |three|
        threeLetters.each do |otherThree|
            wc = three + otherThree
            candidates << wc
        end
    end

    # Finally, all combinations of two letters + two letters
    #  + two letters and store those as candidates
    twoLetters.each do |firstTwo|
        twoLetters.each do |secondTwo|
            twoLetters.each do |thirdTwo|
                wc = firstTwo + secondTwo + thirdTwo
                candidates << wc
            end
        end
    end

    # Now get rid of dups and sort. uniq! returns nil when there are
    #  no duplicates, so the bang methods cannot be chained safely.
    candidates = candidates.uniq.sort
    puts "Candidates = #{candidates.length}"

    # Intersect the two arrays, leaving only those words
    #  that appear in both lists
    matches = sixLetters & candidates

    # Now write the matches to a file
    File.open("matches.txt", "w") do |file|
        matches.each do |word|
            file.puts(word)
        end
    end

    finish = Time.now

    puts "Started at #{start}"
    puts "Finished at #{finish}"
    puts "Total time #{finish.to_i - start.to_i}"
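For the speed-oriented version the kata asks for, one possible direction is to go the other way around: rather than building every candidate, split each six-letter word at every position and look the two pieces up in a set. What follows is a sketch under assumptions, not a finished solution. The name `two_part_words` is hypothetical, the word list is passed in as an array instead of being read from wordlist.txt, and only two-part splits are checked (including 1 + 5 and 5 + 1), so the two + two + two case handled above is not covered.

```ruby
require 'set'

# Hypothetical helper: six-letter words made of two shorter words
# from the same list, returned as [word, left, right] triples.
def two_part_words(words)
  dictionary = Set.new(words.map { |w| w.chomp.downcase })

  results = []
  dictionary.each do |word|
    next unless word.length == 6

    # Try every split point: 1+5, 2+4, 3+3, 4+2, 5+1.
    (1..5).each do |i|
      left = word[0, i]
      right = word[i..-1]
      if dictionary.include?(left) && dictionary.include?(right)
        results << [word, left, right]
      end
    end
  end
  results
end

sample = %w[al bums albums bar ely barely be foul befoul con vex convex]
two_part_words(sample).each do |word, left, right|
  puts "#{left} + #{right} => #{word}"
end
```

Each six-letter word costs at most ten set lookups, so the work grows with the number of six-letter words instead of with the number of possible combinations.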