Scala Gets Operator Overloading Right

02/20/2010 Update: I took this blog post and turned it into a presentation to the Atlanta Scala User Group, which I gave in January 2010. The slides from that presentation are here.

05/31/2009 Update: As Mads Andersen pointed out, in the example below, I had the Complex multiplication wrong. As I was working through the example in the book, I typed it incorrectly. I could argue that this blog post was not about my mastery of complex numbers, and was actually meant to show how operator overloading works in Scala, but in the interest of all-around correctness, I have updated the code to contain the proper implementation of complex number multiplication, and I have worked out the examples by hand using the formula on the Wikipedia page to ensure their accuracy. Now, on to the original article.

I’ve been intrigued by Scala for a while now, and I’m finally taking the time to learn it. As I was reading in my book yesterday, I came to the section on operator overloading. Now, this is a topic that developers who’ve been exposed to it feel very strongly about. It’s not quite as rough a discussion as vi vs. Emacs, but it’s close. Some say that operator overloading makes for more elegant code. Some say it just confuses things. I’ve always been in the elegant camp. I think if you can provide operators for your own classes that work intuitively, you ought to be able to do it. In Java, think about how nice BigDecimal would be to use if you had + and – instead of add() and subtract(). Of course, as with anything of power, you have to be careful not to abuse it. It would make no sense to provide a + operator on a Date class, since adding two dates together makes no sense. You have to be smart about it, but having the ability to provide operators for your own classes performing intuitive functions is a good thing.

So as I’m reading the section on operator overloading, it was nice to see that even though you can override the standard mathematical operators, Scala still maintains the proper precedence that everyone is used to. By this I mean that a + b * c will execute the multiplication first, then add the product of b and c to a. The reason this is interesting is because another language that I still love, Smalltalk, does it wrong (or at least, completely differently), and not just for overloaded operators. Smalltalk has no precedence for mathematical operators at all, ever. So a + b * c will execute the + operation on a, passing in b, and then execute the * operation on that result, passing in c. Always. Thus, 2 + 3 * 5 = 25 in Smalltalk, even though it should equal 17. To get 17, you’d have to write the equation as 2 + (3 * 5). I always found that strange.

The canonical example for operator overloading is a class to represent Complex numbers. I will not break with tradition and will, in fact, steal borrow the one from the book. Here, then, is the definition of the Complex class

class Complex(val real:Int, val imaginary:Int) {
    def +(operand:Complex):Complex = {
        new Complex(real + operand.real, imaginary + operand.imaginary)
    }

    def *(operand:Complex):Complex = {
        new Complex(real * operand.real - imaginary * operand.imaginary,
            real * operand.imaginary + imaginary * operand.real)
    }

    override def toString() = {
        real + (if (imaginary < 0) "" else "+") + imaginary + "i"
    }
}

Notice that we have overridden both the + and * operators. They each take another instance of Complex as an argument, do the proper operations and return a new instance of Complex as their result. Just as you would expect. Now, to exercise these operators, we have this

val c1 = new Complex(1, 2)
val c2 = new Complex(2, -3)
val c3 = c1 + c2

val res = c1 + c2 * c3

printf("(%s) + (%s) * (%s) = %sn", c1, c2, c3, res)

val res1 = c1 + c2 * c3 + c1 * c2

printf("(%s) + (%s) * (%s) + (%s) * (%s) = %sn", c1, c2, c3, c1, c2, res1)

Lines 5 and 9 are the interesting parts. The result of running these statements looks like this

(1+2i) + (2-3i) * (3-1i) = 4-9i
(1+2i) + (2-3i) * (3-1i) + (1+2i) * (2-3i) = 12-8i

which is exactly what you’d expect.

Now, C++ people are probably saying, “But C++ has always done it right!” Indeed. But languages like Smalltalk and Scala are far more fun to work in.

Objective-C 2.0 Properties Are Needlessly Verbose

I’ve been working in Objective-C for a little while now; not quite two years, off and on. I was really excited when Apple announced that Objective-C 2.0 was going to have generated properties, but the syntax they gave us leaves me flat, as it is needlessly verbose.

For those who don’t know, in Objective-C 1.x, if you had an instance variable in a class that you wanted to expose, you had to provide getter and setter methods for it, just like you do in Java, C++ and several other OO languages. You would see something like this in MyClass.h:

@interface MyClass : NSObject {
    NSString *name;
}

- (void) setName: (NSString *) aName;
- (NSString *) name;
@end

and then in MyClass.m, you would see this:

#import "MyClass.h"

@implementation MyClass
- (void) setName: (NSString *) aName
{
    [aName retain];
    [name release];
    name = aName;
}

- (NSString *) name
{
    return name;
}
@end

Objective-C 2.0 promised to eliminate all that boilerplate code in your *.m files for getting and setting variables. But they did it in a strange way. Now, in MyClass.h, you would see this:

@interface MyClass : NSObject {
    NSString *name;
}

@property(nonatomic, retain) NSString *name;
@end

and then in MyClass.m, this

#import "MyClass.h"

@implementation MyClass

@synthesize name

@end

Now, it certainly cut out quite a bit of code for the getter and setter, but why do I have to declare the type of the property twice? You have to declare the instance variable as usual, but then you also have to specify the data type again when you add the @property declaration. There’s no reason I can think of that those two lines couldn’t have been combined into the variable declaration. Objective-C already has tokens that are ignored, such as IBOutlet, so it shouldn’t have been an issue with breaking the parser. And the @synthesize declaration in the *.m file is annoying, but I guess it was necessary to keep the properties from being auto-created in the wrong place.  In my opinion, this is what property declarations should look like

@interface MyClass : NSObject {
    @property(nonatomic, retain) NSString *name;
}

@end

That’s it. No duplication. Simple. Elegant.

Can anyone think of a good reason why they didn’t do it like that?

12/28/2008 15:13:23 Update: As Ahruman pointed out in his comment, I misspoke about IBOutlet. It is not actually ignored, but is used to tag an instance variable for use by Interface Builder. Sorry for the confusion. And be sure to read his comment below. It’s packed with good info that I didn’t know.

Grails Podcast Mentions My Closure Post

Like other bloggers with an ego, I have Google Alerts set up to let me know when someone mentions me or my blog anywhere that Google knows about. I got an alert yesterday letting me know that I’d been mentioned on the latest episode of the Grails Podcast. How cool is that? Specifically, they mentioned my [cref groovy-sql-closure-examples Groovy Sql Closure Examples] post. Thanks, Glen and Sven, for the podcast love. 🙂

I’ve been spending some time with Grails latest and have been really impressed with it. I spent a couple of hours on Saturday playing with it, seeing how much of my Rails knowledge was applicable to Grails. Quite a bit of it, actually. I really like what I’ve seen of Grails, so far. I’d probably have to use it on a real project to really get a feel for it, but it looks like it would be a nice environment to work in.

Dear Apple: Some Java Love, Please?

I love your machines. Truly, I do. Back in 1988 I bought a toaster-model Mac SE, with one megabyte of RAM, and I loved it. It only had a nine inch, black-and-white screen, and I loved it. For various reasons, I sort of lost the love for a while, until 2006. I acquired an iBook G4 in a hardware trade with a friend and I quickly became hooked on the sweet goodness that is OSX. That was in August, 2006. Two months later I bought a Mac Pro, which I love so much I sometimes feel the need to kiss it goodnight.

But there’s one thing about the Mac that bothers me: lousy Java support. Sun handles JDK releases for Windows and Sun machines and every Linux system on the planet. Yet, for some inscrutable reason, you have decided to handle Java for OSX yourself. And, not to be rude, but I just have to say that you suck at maintaining Java for the Mac!!! Let me ‘splain.

Sun released the first version of Java 6 for Windows, Linux and Solaris in December 2006. Two days ago, Sun released the tenth update for Java 6, again for Windows, Linux and Solaris. On September 24, 2008, you guys released Java 6_07, which was nice to finally get it, but it’s only for Leopard systems and it’s only for 64bit machines. My Mac Pro is 64bit and Leopard, but my iBook is 32bit and can’t run Leopard. And what about the tons of other developers out there who don’t meet these requirements? I can’t think of a good reason you have restricted Java 6 in this way, but I can think of a few bad reasons. Probably the easiest to come up with is that you’re trying to force Java developers to buy more expensive Apple machines.

What’s really funny about the crappy state of Java on the Mac is comments from Sir Steve himself, several years ago. I was at JavaOne in 2000. Sir Steve was the Mystery Date™ for the keynote speech on Day One of the conference. His Steveness trots on stage, clad all in black, and proclaims that he was going to make the Mac the ultimate platform for Java developers. Apple would be bundling Java 2 SE with OSX. And the crowd went wild. And he did make the Mac a great Java development platform. For a while. I can’t tell you how many conferences I went to after that, Java conferences, where the majority of developers were toting Mac laptops around. 

But then you started falling behind with the releases. And then you started restricting which of your users were worthy of getting updates. What gives, Apple? If Sun can release timely versions of Java that run on a ton of disparate systems, why can’t you release timely versions that run across your own hardware family? It’s absurd that you are only supporting 64bit Leopard system for the latest versions of Java, and even then you make us wait forever. 

So, how can we fix this? I think you should go back to Sun and say something like,

I’m sorry, Sun. We like to meticulously control everything, but in this case, that desire has caused us to hose down our customers. They’re not happy, and we can’t figure out a good way to appease them. Please, Sun, would you take over maintenance of the JDK/JRE for OSX? We’d really appreciate it.

Or something like that. Something needs to happen soon. Although the lastest version sounds like just another update to Java6, there are actually lots of new features that are going to really improve Java. Except those of us on the Mac have to wait for some unknown amount of time before you guys release your own version. And if we’re not 64bit Leopard, we’re screwed.

Please, Apple, help us out with some timely Java love, OK?

Sincerely,

Joey Gibson

Groovy Sql Closure Examples

My [cref why-i-love-closures post about closures] last week generated quite a bit of traffic and comments, both positive and negative. I decided to followup on that post with a few examples of how to add a method that I believe is missing from Groovy’s Sql class that will execute a closure, and will guarantee that the connection gets closed, no matter the outcome of the closure’s contents.

I’m going to discuss three ways to add a method that does what we want:

  1. Wrap the existing class
  2. Modify the MetaClass of the existing class
  3. Use a Category

All three of these are extremely simple. I think it mostly comes down to preference as to which one you might want to use. Before we get started, let me say that Groovy is optionally-typed. What this means is that you don’t have to declare variable types if you don’t want to. I like not having to declare variable types, so I have not done so in any of these example. You might hate that, and think it sloppy/heretical/evil. If so, and you choose to use any of the code I present, feel free to declare your variable types. So, with that out of the way, let’s start with the wrapping approach.

Wrap the existing class

First, we’ll create a class called ISql that wraps an instance of Sql.

import groovy.sql.Sql

public class ISql
{
   public static newInstance(url, user, pass, driver, closure)
   {
     def con

     try
     {
       con = Sql.newInstance(url, user, pass, driver)

       if (closure)
       {
         closure.call(con)
       }
     }
     finally
     {
       con.close()
     }
   }
}

You can see that we’ve got a class with a single static method called newInstance, to mimic the standard way of creating a Sql instance. It takes the same four arguments that the Sql class does, but it takes one extra argument: a Closure. All this method does is create an instance of Sql, executes the closure, passing in the Sql instance, and then ensures that the connection gets closed, through the call to con.close() in the finally block. To use this class, you can do this

ISql.newInstance(url, user, pass, driver) {con ->
  con.eachRow("select * from Foo") {row ->
    println "ID: $row.id"
  }
}

In this test, we call our special newInstance method, passing in the connection parameters, and then tacking on a closure at the end. Inside the closure, we get access to the connection through the con variable, and then we can do anything we would normally do with a Sql connection. In this case, we execute a query and print the value of each row’s Id column. Nothing too exciting, but it works. No matter what happens inside those closures, the connection is guaranteed to be closed at the end. 

Modify the MetaClass of the existing class

The second way to do this is to add a method to Groovy’s built-in Sql class by modifying its MetaClass. Here’s the code to do that:

Sql.metaClass.static.newInstance << {url, user, pass, driver, closure ->
  def con

  try
  {
    con = Sql.newInstance(url, user, pass, driver)

    if (closure)
    {
      closure.call(con)
    }
  }
  finally
  {
    con.close()
  }
}

Everything after line three is exactly like what we did in the wrapping approach. The magic occurs in line one. In that line, we get the Sql class’ MetaClass, and then grab the static property of the MetaClass. We then add another method called newInstance by using the << operator to append a closure taking the right number of arguments. To call it, we have code that looks almost identical to our last example, but instead of using the ISql class, we’re using Groovy’s built-in Sql class with our special method included.

Sql.newInstance(url, user, pass, driver) {con -&gt;
  con.eachRow(&quot;select * from Foo&quot;) {row -&gt;
    println &quot;ID: $row.id&quot;
  }
}

You can see that with the exception of the missing ‘I’, the code is exactly the same.

Use a Category

Groovy also provides something called Categories that allow you to add methods to existing classes, but they are only usable while the Category is in use. It’s somewhat confusing, and is my least favorite approach, but here’s how it works. You create a class with a static method taking the arguments you want to pass, plus an extra argument, usually called “self”, that will get passed the thing on which you’re calling this method. This special required argument must be the first in the list. Here’s our category called SqlHelper

import groovy.sql.Sql

public class SqlHelper
{
  def static newInstance(self, url, user, pass, driver, closure)
  {
    def con

    try
    {
      con = Sql.newInstance(url, user, pass, driver)

      if (closure)
      {
        closure.call(con)
      }
    }
    finally
    {
      con.close()
    }     
  }
}

Notice that the first parameter to newInstance is that special “self” variable. If you don’t include that argument in the method declaration, calling the method won’t work. I should add that you don’t actually pass any argument for “self” yourself when calling the method. This is handled by Groovy, in much the same way Python programs stuff values into a method’s “self” argument. So, to use the category, you have to wrap your operation in a “use” block

use(SqlHelper)
{
  Sql.newInstance(url, user, pass, driver) {con -&gt;
    con.eachRow(&quot;select * from Foo&quot;) {row -&gt;
      println &quot;ID: $row.id&quot;
    }
  }
}

Here, we declare that we want to use SqlHelper by using a “use” block that contains the category in parentheses. Within the curly brackets of the use (also a closure), Groovy will see our call to newInstance on the Sql class, and will figure out that we want to use the one in the category. It will then call that method, passing the Sql class as the self parameter, and all our other arguments as you would expect. With the exception of having to use the “use” block, this code looks just like the second example.

When Something Goes Wrong

I said that in each case, no matter what happens in the closure, the connection was guaranteed to get closed. Someone commented on the last post that he was concerned that the use of closures would somehow obscure where the problem occurred. It doesn’t. You still get the line number of where things went pear-shaped. For example,

try
{
  ISql.newInstance(url, user, pass, driver) {con -&gt;
    con.eachRow(&quot;select * from boingo&quot;) {row -&gt;
      println &quot;ID: $row.id&quot;
    }
  }
}
catch (Exception e)
{
  e.printStackTrace()
}

In this case, there is no table called “boingo,” and so when I execute this code I get an exception thrown, and from the stack trace, I can see where the problem occurred:

java.sql.SQLException: Invalid object name 'boingo'.
  ...
  at groovy.sql.Sql.eachRow(Sql.java:559)
  at groovy.sql.Sql.eachRow(Sql.java:541)
  ...
  at ISqlTest$_testBadness_closure2.doCall(ISqlTest.groovy:38)

I cut some of the stack dump out for brevity, but you can see that it was a java.sql.SQLException that was thrown, and it references line 38 in the code. That just as much information as you’d get from straight Java in this case, so you should be able to diagnose the problem.

Other Approaches

I should add that you should be able to subclass the Sql class, adding a static method called newInstance that accepts a closure in addition to the four connection arguments. I tried that, but it didn’t work. Actually, it partially worked. The newInstance method I added worked like a champ, but the original newInstance method was no longer visible. I don’t know why, but that’s the behavior I was seeing. It might just be that I’m not familiar enough with Groovy, but I couldn’t get it to work. If anyone knows why, let me know.

To Sum Up

So, those are three ways to add a closure-with-guaranteed-connection-closing method to Groovy. Which of these approaches should you use? Personally, I prefer the second way, adding a method to Sql through its metaclass, but it really comes down to preference. Whichever way you choose should be documented so your teammates understand what’s going on.

I hope that someone on the Groovy team will realize that they left this functionality out of the previous versions and decide to add it for a future version. It really strikes me as odd that this is not in there already, since it seems to fit so well with the language. The fact that various methods inside the Sql class take closures, but not the construction method, makes me think it was just an oversight.

Why I Love Closures

I’ve been a big fan of closures for years. I was first introduced to them in Smalltalk, where they were just called blocks. Ruby has them and also calls them blocks. Java does not have them, though there are proposals (such as this one) to add them to a future version of the language. Groovy has them now, and while they aren’t as shiny as those in Ruby, they do work.

So what’s so great about closures anyway? They are blocks of code that retain links to things that were in scope when they were created, but they also have access to things that are in scope when they execute. They can be passed around, usually to methods that will execute them at a later time, in a possibly different context. That doesn’t sound all that exciting, but what is exciting is when the language in question lets you pass in a closure to do some work, and then cleans up after you when the work is done. Here’s an example of some Ruby code that I have written in tons of scripts.

count = 0

DBI.connect(*CREDENTIALS) do |con|
  File.open(&quot;results.txt&quot;, &quot;w&quot;) do |file|
    con.select_all(sql) do |row|
      file.puts &quot;... interesting data from the query ...&quot;
      count += 1
    end
  end
end

puts &quot;Total records: #{count}&quot;

What that code does is declare a variable, count, that will be used to keep track of how many records we processed. It then connects to a database, creates a file called “results.txt,” executes a SQL select, writes some of the data from each row to the file and increments the count variable. At the end, we print out the count variable.

There are three closures in that bit of code. They begin on lines 3, 4 and 5, respectively. The first connects to the database. The second creates the output file and the third performs a SQL select, writes the output and bumps the counter. Rather concise code, don’t you think?

Do you notice anything missing from this code? Two very important things are not there: closing the file and closing the database connection. I said these bits are not there, but they really are. The block version of DBI.connect ensures that no matter how events unfold, either successfully or with an exception, the database connection will get closed. Similarly, the File.open ensures the file will get closed. This is one of the most beautiful aspects of languages that support closures.

As I said earlier, Groovy has closure support baked in. While I’m not completely thrilled with the syntax, it’s close enough to Ruby’s that it’s not bad. As I was writing this post, I was surprised to discover that Groovy’s SQL module doesn’t support a closure passed to the connection, which means you still have to worry about closing your connection when you’re done. Anyway, Groovy’s file-writing idiom looks like this

new FileWriter(&quot;results.txt&quot;).withWriter {writer -&gt;
  writer.write(&quot;... interesting data ...&quot;)
}

I don’t really like using the -> as the closure parameter delimiter; I’d rather use a | (pipe symbol) like Smalltalk and Ruby do. Regardless of syntax differences, that’s how you write the file, but what about the SQL stuff? Since we can’t use a closure, it’s a bit more involved.

con = Sql.newInstance(url, user, pass, driver)

try
{
  new FileWriter(&quot;output.txt&quot;).withWriter {writer -&gt;
    con.eachRow(&quot;select * from foo&quot;) {row -&gt;
      writer.write(&quot;Foo: ${row.id}n&quot;)
    }
  }
}
finally
{
  con.close()
}

You can see that even though we can’t use a closure to ensure the database connection gets closed, there are still two closures in use. The one beginning on line five opens the file, while the one beginning on line six writes out some of the data from each row returned by the query. Pretty nifty, eh? I’m disappointed that Groovy doesn’t support passing a closure to the database connection, but maybe we’ll get that in a future version.

For comparison, here’s how you would write this program in straight Java.

Connection con = null;
Statement stmt = null;
ResultSet rs = null;

try
{
  Class.forName(driver);
  con = DriverManager.getConnection(url, user, pass);
  stmt = con.createStatement();
	
  rs = stmt.executeQuery(&quot;select * from Foo&quot;);
	
  PrintWriter writer = new PrintWriter(new FileWriter(&quot;output.txt&quot;));

  try
  {
    while (rs.next())
    {
      writer.println(&quot;Foo: &quot; + rs.getString(&quot;Id&quot;));
    }
  }
  finally
  {
    writer.close();
  }			
}
catch (Exception e)
{
  e.printStackTrace();
}
finally
{
  try
  {
    if (rs != null)
    {
      rs.close();
    }
	
    if (stmt != null)
    {
      stmt.close();
    }
	
    if (con != null)
    {
      con.close();
    }
  }
  catch (Exception e)
  {
    e.printStackTrace();
  }
}

The majority of that code deals with cleaning up when the real work has been finished. I don’t know what Java’s closures will ultimately look like, or if we’ll get them at all. I’m hopeful, though, that we’ll get a decent implementation, and can then jettison code like you see above. (I know that using Hibernate or some other mapping tool can eliminate code like this, but not every situation needs that type of framework.)

When Does an HQL Typo *Not* Cause A Parse Error?

I found something interesting at work yesterday. One of our developers mentioned that when he called a certain method with various sets of parameters, he wasn’t getting back what he expected to get back, based on what he knew was in the database. I put on my sleuth hat and began my investigation.

We use Hibernate for our database layer, and we prefer to use the @NamedQuery annotation to store our queries with the entities they represent. This works out very well for us. But back to the problem. I quickly got to the appropriate .java file and inspected the query. (Obviously I’ve changed the class names, but this is essentially what I found in the file.)

select distinct f
from Foo f
left join fetch f.bar
left join fetch f.baz
lef join fetch f.plonk
where f.id = :fooId

Now, do you notice the typo? Check out line five. It says “lef” instead of “left.” When I first saw this, I thought to myself, “How can this even get parsed by Hibernate into SQL, let alone return any results.” We all work in a large room together, so I mused out loud about this problem. One of our other Really Smart Guys™ came over to have a look. He saw the typo, thought for a second and said, “‘lef’ is being treated as an alias for f.baz on the previous line.” And sure enough, he was right. Just as on the second line where you see “from Foo f,” that ‘f’ is an alias for the entity called Foo. We could have put aliases on each “left join” line, at each line’s end, had we wanted or needed to. By misspelling “left” on line five as “lef,” we unintentionally slapped an alias on the join that is on line four. Even though the query is split across multiple lines here, the HQL parser would see it as one continuous string, and after passing by “fetch f.baz” the next token it would see would be “lef,” which it would interpret as an alias for “f.baz.”

So, as far as parsing the query and translating it into SQL goes, everything is just fine. But there is still a problem caused by the misspelling. Since the parser decided that “lef” was actually an alias, the next bit that it sees is “join fetch f.plonk” which results in a regular inner join, instead of the outer join we really wanted. What this means is that for records in the Foo table who can’t be joined to records in the Plonk table, either because the key is null, or there just isn’t a record in the Plonk table that matches, those records will be excluded from the result set. That’s the behavior our developer was seeing. Changing “lef” to “left” made the whole thing work and the developer got the results he needed.