I’m Liking Scala’s XML Literals

How many times have you written a class that you needed to save and restore to/from XML? How did you do it? There are libraries that will do this for you, but I don’t know if any of them have taken the lead. Usually, you have to either have external XML templates that will be filled in, or you have to create the XML through code using DOM or JDOM or some other DOM. I think Scala has a better way.

Enter Scala’s XML literals. In Scala, you can embed XML right in the middle of regular Scala code. For example,

val num = 23

val xml = 
	<person>
		<name>Tom</name>
	</person>
	
println(num)
println(xml)

In this code, we assign to the val called num the Int value of 23. We then assign the xml val an XML Node, that begins with “person”. That’s neat, but not terribly useful. Here’s where it gets really interesting.

import scala.xml.{Node, XML}

class Person(val id: Int,
			 val firstName: String,
			 val lastName: String,
			 val title: String) {

	override def toString = {
		id + ", " + firstName + " " + lastName + ": " + title
	}
	
	def toXML = {
		<person>
			<id>{id}</id>
			<firstName>{firstName}</firstName>
			<lastName>{lastName}</lastName>
			<title>{title}</title>
		</person>
	}
}

object Person {
	def fromXML(node: Node): Person = {
		new Person(
			(node  "id").text.toInt,
			(node  "firstName").text,
			(node  "lastName").text,
			(node  "title").text
		)
	}
}

val p = new Person(23, "Tom", "Servo", "Robot")
println(p)

val xml = p.toXML

println(xml)

XML.saveFull("Person.xml", xml, "UTF-8", true, null)

val xmlFromFile = XML.loadFile("Person.xml")

val p1 = Person.fromXML(xmlFromFile)
println(p1)

That’s a longer example, but there’s a lot of interesting stuff in there. Lines 3 – 20 are defining a Scala class, but look at lines 12 – 19. Those lines define a method called toXML that returns the current instance of the class, serialized to XML. I’ve defined the XML structure that I want, and Scala will fill in the values by replacing the stuff between the curly braces with whatever those expressions evaluate to. In this case, it’s just the instance variables, but it could be anything you want, including more XML. So if you have a more complex data structure, it could call the toXML method on its instance variables, and they could handle their own serialization, which would then be placed where it should be in the parent XML document. Nifty, eh?

Now, look at lines 22 – 31. This bit of code is defining a companion object to the Person class, which defines a method to deserialize XML into an instance of class Person. Strictly speaking, I didn’t have to use a companion object; I could have just defined the fromXML method at the top level, but I think this makes the code cleaner. What this method does is take an XML node, tear it apart, and return an instance of class Person, with its instance variables populated.

Lines 33 – 45 exercise this code. The result of running this example look like this

box:~/tmp> scala person.scala 
23, Tom Servo: Robot
<person>
			<id>23</id>
			<firstName>Tom</firstName>
			<lastName>Servo</lastName>
			<title>Robot</title>
		</person>
23, Tom Servo: Robot
box:~/tmp>

Pretty neat, eh? In the exercise code, we created a Person object, printed it out, serialized it to XML, printed that out, then saved the XML to a file. We then read that file back in, and deserialized it into another instance of class Person. Using XML literals for de/serialization is a nice application. I can see using this quite a lot. I like how it keeps the XML format right there with the code it will represent, so that when you change the code, you’ll remember to change the XML, too.

Bogus Search Results From Sys-Con

Updated! Be sure to scroll down for the latest!

I’m writing a blog post dealing with Scala’s XML literal syntax and how to use it for object de/serialization and so I wanted to get a list of existing Java XML de/serialization libraries. I went to Google and searched for “java xml serialization” and here’s the first result that I got

googleres

Notice the date, which is April 23, 2009. That is, ostensibly, the publication date of the article, right? Now check what happens when you click the link and go to read the article

syscon

Again, notice the date. It’s six years ago! So, where did Google get the 2009 date? That’s the date of the latest comment on the article. I’m not sure how they are getting Google to display the latest comment date in such a way that it looks like the publication date, but it looks like they are. I can’t imagine that Google just picked that date from the article when it spidered the site. Am I reading too much into this, or is Sys-Con gaming Google for better placement?

09/24/2009 4:26PM Update: Just for grins, I did my Google search for “java xml serialization” again. The Sys-Con article is still the top hit, but notice what’s missing from the search result

google2

Notice, that there is no longer a date showing. Very interesting.