On Lisp -> Clojure, Chapter 9

  • Posted By Stuart Halloway on December 17, 2008

This article is part of a series describing a port of the samples from On Lisp (OL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 9, Variable Capture.

Macro Argument Capture

Macros and "normal" code are written in the same language, and can share access to names, data, and code. This is the source of their power, but it can also cause subtle bugs. What happens if a macro caller and a macro implementer both try to use the same name? The macro can "capture" the name, leading to unintended consequences.

OL begins with an example of argument capture, a broken definition of for. Here is a similar macro in Clojure:

(defmacro bad-for [[idx start stop] & body]
  `(loop [~idx ~start limit ~stop]
     (if (< ~idx limit)
       (do
         ~@body
         (recur (inc ~idx) limit)))))

The problem is the name limit introduced inside the macro. If you call bad-for after binding the name limit, strange things will happen. What if you try:

(let [limit 5] 
  (bad-for [i 1 10] 
    (if (> i limit) (print i))))

Presumably the intent here is to print some numbers greater than five. But in many Lisps, this would print nothing, because the bad-for macro invisibly binds limit to ten.

Clojure catches this problem early, and fails with a descriptive error:

(let [limit 5] (bad-for [i 1 10] (if (> i limit) (println i))))
-> java.lang.Exception: Can't let qualified name: ol.chap-09/limit

Clojure makes it difficult to accidentally capture limit, by resolving symbols into a namespace. The limit inside the macro resolves to ol.chap-09/limit, and there is no name collision.

Of course, you do not want your macros to use a shared global name either! What you really want is for macros to use their own guaranteed-unique names. Clojure provides this via auto-gensyms. Simply append # to limit in the bad-for example above, and you get good-for:

(defmacro good-for [[idx start stop] & body]
  `(loop [~idx ~start limit# ~stop]
     (if (< ~idx limit#)
       (do
         ~@body
         (recur (inc ~idx) limit#)))))

Now the macro will use a unique generated name like limit__395, and callers can use good-for as expected:

(let [limit 5] (good-for [i 1 10] (if (> i limit) (println i))))
6
7
8
9
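
The difference between the two macros can be seen at the REPL without defining a macro at all. The following is a minimal sketch; the names resolved and generated are mine, chosen for illustration. It syntax-quotes a plain symbol and an auto-gensym symbol, then inspects what comes back:

```clojure
;; Syntax-quote qualifies a plain symbol with the current namespace.
(def resolved `limit)

;; A trailing # asks syntax-quote for a fresh, unqualified symbol instead.
(def generated `limit#)

(println (namespace resolved))   ; the name of the current namespace
(println (namespace generated))  ; nil, because generated symbols are unqualified
(println (name generated))       ; a generated name that begins with limit__
```

Because the generated symbol has no namespace, it is legal as a loop or let binding, and because it is freshly generated it cannot collide with a caller's limit.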

Symbol Capture

Another form of unintended capture is symbol capture, where a symbol in the macro unintentionally refers to a local binding in the environment. OL demonstrates the problem with this example:

First, w is a global collection of warnings that have occurred when using a library. In Clojure:

(def w (ref []))

This is different from the OL implementation because in Clojure data structures are immutable, and mutable things must be wrapped in a reference type that has explicit concurrency semantics. In the code above the ref wraps the immutable [].

Second, the gripe macro adds a warning to w, and returns nil. gripe is intended to be used when bailing out of a function called with bad arguments. In Clojure:

(defmacro gripe [warning]
  `(do
     (dosync (alter w conj ~warning))
     nil))

Again, this is fairly different from OL because you must be explicit about mutable state. To update w you must use a transaction (dosync) and a specific kind of update function (such as alter).

Third, there is a library function sample-ratio that performs some kind of calculation, the details of which are irrelevant to the example. sample-ratio also uses gripe to warn and bail out for certain bad inputs. In Clojure:

(defn sample-ratio [v w]
  (let [vn (count v) wn (count w)]
    (if (or (< vn 2) (< wn 2))
      (gripe "sample < 2")
      (/ vn wn))))

This is practically identical to the OL version, since there is no mutable state to (directly) deal with.

Since we are talking about symbol capture, you can probably guess the problem: What happens when the global w for warnings collides with the local w argument in sample-ratio?

In Common Lisp, this sort of capture would cause the error message to be added to the wrong collection: the local samples w instead of the global warnings w.

In Clojure, this just works. The global w resolves into a namespace, and does not collide with the local one.
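
To check this, the three pieces can be loaded together and exercised. This is a sketch that repeats the definitions above so it runs on its own; the var result is a name I introduced for the example:

```clojure
(def w (ref []))                     ; global collection of warnings

(defmacro gripe [warning]
  `(do
     (dosync (alter w conj ~warning))
     nil))

(defn sample-ratio [v w]             ; the parameter w shadows the global w
  (let [vn (count v) wn (count w)]
    (if (or (< vn 2) (< wn 2))
      (gripe "sample < 2")
      (/ vn wn))))

;; v has only one element, so sample-ratio gripes and bails out.
(def result (sample-ratio [1] [1 2 3]))

(prn result)  ; nil
(prn @w)      ; ["sample < 2"]
```

The warning lands in the global ref even though a local named w is in scope at the call site.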

More Complex Macros

Clojure's namespaces and auto-gensyms take care of many common problems in macros, but what if you really want capture? You can capture symbols by unquoting them with the unquote character (~, a tilde) and then requoting them with a non-resolving quote character (', a single quote). For example, here is a bad version of gripe that goes out of its way to do the wrong thing and capture w:

(defmacro bad-gripe [warning]
  `(do
     (dosync (alter ~'w conj ~warning))
     nil))

I am not going to show more complex macros that really need this feature. My point here is to show that Clojure doesn't make macros safer by compromising their power. You can still do nasty things, you just have to be more deliberate about it.
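
One way to see the difference between gripe and bad-gripe is to compare their expansions. The sketch below assumes both macros are defined in the current namespace; alter-target, safe-target, and captured-target are hypothetical helper names that pull out the symbol handed to alter:

```clojure
(defmacro gripe [warning]
  `(do (dosync (alter w conj ~warning)) nil))

(defmacro bad-gripe [warning]
  `(do (dosync (alter ~'w conj ~warning)) nil))

;; The expansion has the shape (do (dosync (alter TARGET conj warning)) nil),
;; so three nested calls to second reach TARGET.
(defn alter-target [form]
  (-> (macroexpand-1 form) second second second))

(def safe-target (alter-target '(gripe "oops")))
(def captured-target (alter-target '(bad-gripe "oops")))

(prn (namespace safe-target))      ; the current namespace: w was resolved
(prn (namespace captured-target))  ; nil: a bare w, free to be captured
```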

Interestingly, Clojure protects you from bad-gripe, even after you go to the trouble of introducing inappropriate symbol capture. Here is a bad-sample-ratio that uses the buggy bad-gripe:

(defn bad-sample-ratio [v w]
  (let [vn (count v) wn (count w)]
    (if (or (< vn 2) (< wn 2))
      (bad-gripe "sample < 2")
      (/ vn wn))))

If you try to call bad-sample-ratio with bad inputs, bad-gripe will not be able to modify the wrong collection:

 (bad-sample-ratio [] [])
-> java.lang.ClassCastException: clojure.lang.PersistentVector cannot\
   be cast to clojure.lang.Ref

Now you see how having immutability as the default can protect you from bugs. The global w is an explicitly mutable reference. But the local w is an implicitly immutable vector. When bad-gripe tries to update the wrong collection, it is thwarted by the fact that the collection is immutable.

Wrapping up

Clojure makes simple macros easier and safer to write. The combination of namespace resolution and auto-gensyms prevents many irritating bugs.

Clojure still has the power to write more complex macros when you need it. With the right combination of unquoting and quoting, you can undo the safety net and write any kind of macro you want.

One final note: Because they are ported straight from Common Lisp, many of the examples here are not idiomatic Clojure. In Clojure most uses of imperative loops such as good-for would be replaced by a more functional style. A good example of this is Clojure's own for, which performs sequence comprehension.

Revision history

  • 2008/12/17: initial version

On Lisp -> Clojure

  • Posted By Stuart Halloway on December 12, 2008

I am porting the examples from the macro chapters of Paul Graham's On Lisp (OL) to Clojure.

My ground rules are simple:

  • I am not going to port everything, just the code samples that interest me as I re-read On Lisp.
  • Where Paul introduced macro features in a planned progression, I plan to use whatever Clojure features come to mind. So I may jump straight into more "advanced" topics.

Please do not assume that this port is a good introduction to Lisp! I am cherry-picking examples that are interesting to me from a Clojure perspective. If you want to learn Lisp, read OL. In fact, you should probably read the relevant chapters in OL first, no matter what.

The Series

Note: Fogus is also porting On Lisp to Clojure.

Talks

I am available to give conference talks on Clojure. Check the schedule for an event near you, or contact Relevance (info@thinkrelevance.com) to schedule an event.


On Lisp -> Clojure, Chapter 7

  • Posted By Stuart Halloway on December 12, 2008

This article is part of a series describing a port of the samples from On Lisp (OL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 7, Macros.

A Few Simple Macros

OL begins with a simple nil! macro that sets something to nil. nil! is implemented as a macro in Common Lisp (CL) because it needs to generate a special form. Clojure puts much more careful boundaries around mutable state, so most Clojure data structures are not set-able at all. The few things that can be set are reference types, each with an explicit API and concurrency semantics.

Because setters go through an explicit API instead of a special form, the Clojure nil! does not need to be a macro at all. Here is a nil! for Clojure atoms:

(defn nil! [at]
  (swap! at (fn [_] nil)))

The swap! function is specific to atoms. Usage for nil! looks like:

(def a (atom 10))
(nil! a)
@a
-> nil  
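
When the old value is irrelevant, as it is here, reset! expresses the same thing more directly than swap! with a function that ignores its argument. A hedged alternative follows; nil-reset! and b are names I made up, and this is not how OL or the code above does it:

```clojure
;; reset! replaces the atom's value without consulting the old one.
(defn nil-reset! [at]
  (reset! at nil))

(def b (atom 10))
(nil-reset! b)
(prn @b)  ; nil
```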

The next interesting macro in OL is nif, which demonstrates the use of backquoting. One way to implement Clojure nif is:

(use '[clojure.contrib.fcase :only (case)])
(defmacro nif [expr pos zer neg]
  `(case (Integer/signum ~expr) 
     -1 ~neg
     0 ~zer
     1 ~pos))

There are a few interesting differences from CL here:

  • Clojure unquoting uses ~ and ~@ instead of CL's , and ,@. This allows Clojure to treat commas as whitespace.
  • Clojure does not have a built-in signum, but it has access to all of Java, including Integer/signum.
  • Clojure's case is not part of core, and is provided by Clojure Contrib.
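
If pulling in a contrib library is not desirable, the same three-way branch can be sketched with nothing but core functions. This variant is my own illustration rather than the version used in this series; nif-cond is a hypothetical name, and the auto-gensym value# ensures that expr is evaluated only once:

```clojure
(defmacro nif-cond [expr pos zer neg]
  `(let [value# ~expr]            ; evaluate expr exactly once
     (cond
       (pos? value#)  ~pos
       (zero? value#) ~zer
       :else          ~neg)))

(prn (map #(nif-cond % :p :z :n) [5 0 -5]))  ; (:p :z :n)
```

Like the case-based nif, only the selected branch is evaluated, which is why this has to be a macro rather than a function.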

Defining Simple Macros

OL demonstrates the "fill in the blanks" approach to writing macros:

  • Write the desired expansion.
  • Write the desired macro invocation form.
  • Use backquoting to create a template based on the desired expansion.
  • Use unquoting to substitute forms from the macro invocation into the template.

As examples, OL uses our-when and our-while. The Clojure equivalents are:

(defmacro our-when [test & body]
  `(if ~test
     (do
       ~@body)))
(defmacro our-while [test & body]
  `(loop []
     (when ~test
       ~@body
       (recur))))

There is one interesting new thing here. Clojure's loop/recur is an explicit way to denote a self-tail-call so that Clojure can implement it with a non-stack-consuming iteration. (Clojure cannot optimize tail calls in a generic way due to limitations of the JVM.)

It is also worth noting that while loops are uncommon in Clojure. They rely on side effects that change the result of test, and most Clojure functions avoid side effects.
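
To make that concrete, here is a small usage sketch of our-while. The side effect lives in an atom named counter, which is my own choice for the example; without some mutable reference like it, the test could never change and the loop would not terminate:

```clojure
(defmacro our-while [test & body]
  `(loop []
     (when ~test
       ~@body
       (recur))))

(def counter (atom 0))

;; Each pass mutates the atom, which eventually makes the test false.
(our-while (< @counter 3)
  (swap! counter inc))

(prn @counter)  ; 3
```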

Destructuring in Macros

Both Clojure and CL support destructuring in macro definitions. The OL example of this is a when-bind macro. Here is a literal translation in Clojure:

(defmacro when-bind [bindings & body]
  (let [[form tst] bindings]
    `(let [~form ~tst]
       (when ~form
         ~@body))))

The [form tst] is a destructuring bind. The first element of bindings binds to form, and the second element to tst. Usage looks like this:

 (when-bind [a (+ 1 2)] (println "a is" a))
a is 3

Do not use the when-bind as defined above. Clojure provides a better version called when-let:

; from Clojure core
(defmacro when-let
  [bindings & body]
  (if (vector? bindings)
    (let [[form tst] bindings]
      `(let [temp# ~tst]
         (when temp#
           (let [~form temp#]
             ~@body))))
    (throw (IllegalArgumentException.
             "when-let now requires a vector for its binding"))))

when-let adds two features not present in when-bind:

  • when-let requires that the binding form be a vector. This leads to the "arguments in square brackets" style that distinguishes Clojure from many Lisps.
  • when-let introduces a temporary binding temp# using Clojure's auto-gensym feature.

The temporary binding of temp# keeps the binding form from being expanded directly into the when, because some binding forms are not legal for evaluation. The following output shows the difference:

(when-bind [[a & b] [1 2 3]] (println "b is" b))
-> java.lang.Exception: Unable to resolve symbol: & in this context
(when-let [[a & b] [1 2 3]] (println "b is" b))
-> b is (2 3)

If it is not clear to you why when-bind doesn't work, try calling macroexpand-1 on both the forms above.
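
For when-bind, that experiment might look like the sketch below, which repeats the macro so it runs on its own. The var name expansion is mine:

```clojure
(defmacro when-bind [bindings & body]
  (let [[form tst] bindings]
    `(let [~form ~tst]
       (when ~form
         ~@body))))

(def expansion
  (macroexpand-1 '(when-bind [[a & b] [1 2 3]] (println "b is" b))))

(prn expansion)
;; (clojure.core/let [[a & b] [1 2 3]]
;;   (clojure.core/when [a & b] (println "b is" b)))
```

The destructuring form [a & b] is fine as a let binding, but it also appears as the test of the when, where it is evaluated as an ordinary vector and the & cannot be resolved.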

Wrapping up

The concepts in OL Chapter 7 translate fairly directly from Common Lisp into Clojure. The bigger differences are choices of idiom. Many of the examples in Common Lisp presume mutable state. In the typical Clojure program these forms would be in the minority.

Revision history

  • 2008/12/12: initial version

Living Lazy, Without Variables

  • Posted By Stuart Halloway on December 01, 2008

Programmers coming to functional languages for the first time cannot imagine life without variables. I address this head-on in the Clojure book. In Section 2.7 (free download here), I port an imperative method from the Apache Commons Lang to Clojure. First the Java version:

// From Apache Commons Lang, http://commons.apache.org/lang/
public static int indexOfAny(String str, char[] searchChars) {
  if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
      return -1;
  }
  for (int i = 0; i < str.length(); i++) {
      char ch = str.charAt(i);
      for (int j = 0; j < searchChars.length; j++) {
        if (searchChars[j] == ch) {
            return i;
        } 
      }
  }
  return -1;
}

And now the Clojure code. I have shown the supporting function indexed as well:

(defn indexed [s] (map vector (iterate inc 0) s))
(defn index-of-any [s chars]
  (some (fn [[idx char]] (if (get chars char) idx)) 
          (indexed s)))

There are many things I like about the Clojure version, but I want to focus on something I didn't mention already in the book. A reader thought the Clojure version did too much work:

...the [Java] version can be seen as *more efficient* when a match is found because scanning stops right there, whereas "indexed" constructs the whole list of pairs, regardless of whether or not a match WILL be found....

The reader's assumption is reasonable, but incorrect. Clojure's sequence library functions are generally lazy. So the call to indexed is really just a promise to generate indexes if they are actually needed.

To see this, create a logging-seq that writes to stdout every time it actually yields an element:

(defn logging-seq [s]
  (if s
    (do (println "Iterating over " (first s))
        (lazy-cons (first s) (logging-seq (rest s))))))

Now, you can add logging-seq to indexed so that each element of indexed is of the form [index, element, logged-element].

(defn indexed [s] (map vector (iterate inc 0) s (logging-seq s)))

Test the modified indexed function at the Clojure REPL:

user=> (indexed "foo")
Iterating over  f
(Iterating over  o
[0 \f \f] Iterating over  o
[1 \o \o] [2 \o \o])

As you can see, the indexed sequence is only produced as needed. (At the REPL it is needed to print the return value.)

Finally, you can test index-of-any and see that Clojure only produces enough of the sequence to get an answer. For a match on the first character, it only goes to the first character:

(index-of-any "foo" #{\f})
Iterating over  f
0

If there is no match, index-of-any has to traverse the entire string:

(index-of-any "foo" #{\z})
Iterating over  f
Iterating over  o
Iterating over  o
nil
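
The same experiment can be sketched without console logging by counting realized elements in an atom. This version assumes a current Clojure, where lazy-seq takes the place of lazy-cons; realized, counting-seq, and answer are hypothetical names introduced for the example, and the destructured character is called ch here rather than char:

```clojure
(def realized (atom 0))

;; Wraps s in a lazy sequence that bumps the counter each time
;; one more element is actually produced.
(defn counting-seq [s]
  (lazy-seq
    (when-let [xs (seq s)]
      (swap! realized inc)
      (cons (first xs) (counting-seq (rest xs))))))

(defn indexed [s] (map vector (iterate inc 0) s))

(defn index-of-any [s chars]
  (some (fn [[idx ch]] (if (get chars ch) idx))
        (indexed s)))

(def answer (index-of-any (counting-seq "foo") #{\f}))

(prn answer @realized)  ; 0 1
```

A match on the first character realizes exactly one element of the input, which is the behavior the reader assumed only the Java version had.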

So give up on those variables, and live lazy!


Clojure Wins Again

  • Posted By Stuart Halloway on November 21, 2008

Steve Yegge's most recent post takes a right angle turn about a third of the way through, and begins a comparison of Emacs Lisp and JavaScript.

And the winner is ... Clojure!

OK, Steve didn't say that. What he did do was call out things he liked about JavaScript and Emacs Lisp.

For JavaScript:

  • momentum
  • (namespace) encapsulation
  • delegation (polymorphism?)
  • properties (by Steve's definition)
  • serialize to source

For Emacs Lisp:

  • Macros
  • S-Expressions

I first picked up Clojure looking for many of the same things that Steve wants. I found them. Clojure can do all the things on both lists above. (Serialize to source isn't formal yet, but check the mailing list. And of course, you will have to judge "momentum" for yourself.)

The scary thing is that Clojure wins the language war before you even learn about its signature features. When I started exploring Clojure, I quickly realized it had everything I wanted, which could be summarized as "Lisp that really embraces the Java platform."

Then Clojure changed the definition of what I wanted. Now I also want

If you have half an hour, watch a compelling vision of what software development will look like in 2010.


Clojure Beta Book Available

  • Posted By Stuart Halloway on November 05, 2008

The Clojure Beta book is now available. Here's the Table of Contents. (Chapters with an asterisk are included in this beta.)

  • Preface*
  • Getting Started*
  • Exploring Clojure*
  • Working with Java*
  • Unifying Data with Sequences*
  • Functional Programming
  • Concurrency*
  • Macros
  • Multimethods
  • Third-Party Libraries
  • Case Study

Because this is a Beta book, and Clojure is continuing to evolve, there will be errata. Please let me know any problems you find, and I will address them in the next Beta.


Concurrent Programming with Clojure

  • Posted By Stuart Halloway on October 10, 2008

Clojure is a dynamic language for the Java Virtual Machine with several powerful features for building concurrent applications. In this talk you will learn about:

  • Functional programming. Clojure's immutable data structures encourage side-effect free programming that can easily be shared across multiple processor cores.
  • Software Transactional Memory (STM). STM provides a mechanism for managing references and updates across threads that is easier to use and less error-prone than lock-based concurrency.
  • Direct access to Java. Clojure calls Java directly, and can emit the same byte code that a handcrafted Java program would. So, you can easily access the java.util.concurrent library.

PCL -> Clojure, Chapter 17

  • Posted By Stuart Halloway on September 25, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 17, Object Reorientation: Classes.

Creating structs

Common Lisp defines classes with defclass. In Clojure, I can define structs with defstruct:

(defstruct bank-account :customer-name :balance)

The bank-account struct has two basis keys: customer-name and balance. I can specify values for these keys, in the order they were declared, using struct:

user=> (struct bank-account "John Doe" 1000)
{:customer-name "John Doe", :balance 1000}

With struct, all the basis keys are optional:

user=> (struct bank-account)
{:customer-name nil, :balance nil}

If you prefer named parameters, you can use struct-map instead of struct:

user=> (struct-map bank-account :balance 10)
{:customer-name nil, :balance 10}

Very important: structs are still maps. I can specify additional keys that are not part of the basis:

user=> (struct-map bank-account  :balance 10
                                   :customer-name "Jane Doe"
                                   :status :gold)
{:customer-name "Jane Doe", :balance 10, :status :gold}

Accessing structs

The examples below assume an example-account:

(def example-account (struct bank-account "Example Customer" 1000))

Pedants call get to access a structure value:

user=> (get example-account :customer-name)
"Example Customer"

But that's way too much effort. Structures are functions of their keys:

user=> (example-account :customer-name)
"Example Customer"

If the struct keys are keywords, I can go the other way. Keywords are functions of structs:

user=> (:customer-name example-account)
"Example Customer"

Other than keywords, what else can be a structure key? Ah, sweet immutability. Since Clojure data structures are immutable, any of them can function as keys.
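
As a quick illustration with a plain map (lookup is a name I chose for the example), a vector and a set both work as keys, because their value and hash can never change after the map is built:

```clojure
(def lookup {[1 2]    :vector-key
             #{:a :b} :set-key})

(prn (get lookup [1 2]))   ; :vector-key
(prn (lookup #{:b :a}))    ; :set-key, since sets compare by content
```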

I can use assoc and dissoc to get a new map with a key added or removed:

user=> (assoc example-account :status :elite)
{:customer-name "Example Customer", :balance 1000, :status :elite}

user=> (dissoc {:a 1 :b 2} :a)
{:b 2}

But I can't dissoc from example-account because you can never remove a basis key:

user=> (dissoc example-account :customer-name)
java.lang.Exception: Can't remove struct key

Defaults and validation

Since structs are also maps, default values are easy: just merge them. The example below doesn't even use a struct. (Often duck typing is good enough.)

(def account-defaults {:balance 0})
(defn create-account [options]
  (merge account-defaults options))
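
A short usage sketch, repeating the two definitions so it runs on its own. Because merge lets later maps win, a caller-supplied :balance overrides the default, and a missing one falls back to it:

```clojure
(def account-defaults {:balance 0})

(defn create-account [options]
  (merge account-defaults options))

;; No :balance supplied, so the default of 0 is used.
(prn (create-account {:customer-name "John Doe"}))

;; A supplied :balance replaces the default.
(prn (create-account {:customer-name "Jane Doe" :balance 50}))
```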

If I want to validate fields, I can just write a validation function. Here is a validation that simply requires non-false values:

(defn validate-account [account]
  (or (every? account [:customer-name :balance])
      (throw (IllegalArgumentException. "Not a valid account"))))
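
Here is how the validation might behave in practice. The vars ok? and rejected? are hypothetical names for this sketch. Note that a balance of 0 passes, because in Clojure only nil and false are logically false:

```clojure
(defn validate-account [account]
  (or (every? account [:customer-name :balance])
      (throw (IllegalArgumentException. "Not a valid account"))))

;; Both keys present and non-false: validation returns true.
(def ok? (validate-account {:customer-name "John Doe" :balance 0}))

;; :customer-name is missing, so validation throws.
(def rejected?
  (try
    (validate-account {:balance 0})
    false
    (catch IllegalArgumentException _ true)))

(prn ok? rejected?)  ; true true
```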

Of course, if I wanted to create tons of different structs with similar validations, I could build some helpers. Macros + metadata would be one way to go.

Wrapping up

Clojure's structs fill some of the same roles as Common Lisp's classes. The examples above show how to create and access structs, and how to add default values and validation.

That said, Clojure's structs are not classes. They do not offer inheritance, polymorphism, etc. In Clojure, those kinds of jobs are handled by the incredibly flexible defmulti (see the previous article for details, especially the references at the end).


PCL -> Clojure, Chapter 16

  • Posted By Stuart Halloway on September 25, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 16, Object Reorientation: Generic Functions.

defmulti

In Common Lisp, a generic function defines an abstract operation and a parameter list. In Clojure, a multimethod takes a similar role:

(defmulti draw :shape)

The multimethod's name is draw, and :shape is a dispatch function used to select the actual concrete implementation. (Remember that keywords like :shape are also lookup functions.) Now, I can create one or more methods:

(defmethod draw :square [shape] "TBD: draw a square")
(defmethod draw :circle [shape] "TBD: draw a circle")

The first method will draw things with a :shape of :square, and the second method will draw things with a :shape of :circle:

user=> (draw {:shape :square, :length 10})
"TBD: draw a square"
user=> (draw {:shape :circle, :radius 8})
"TBD: draw a circle"  

The draw multimethod is emulating single dispatch on type, if you think of an object's :shape value as its type. But the multimethod mechanism is more general.

A more complete example

Let's say that I need to implement account withdrawals. Different kinds of accounts will have different rules:

  • Bank accounts are simple accounts. Withdrawals will work if there is enough money available.
  • Checking accounts attach an overdraft account which can be used to cover large withdrawals.

The multimethod for withdraw could look like this:

(defmulti withdraw :account-type)

The bank account implementation will do a simple withdraw.

(defmethod withdraw :bank [account amount]
  (raw-withdraw account amount))

PCL uses Common Lisp's method combination to share implementation code between the different account types. Clojure's dispatch is much more general, so a general method combination mechanism is not appropriate. I am taking a different approach, pulling the shared code into a helper function raw-withdraw:

(defn raw-withdraw [account amount]
  (when (< (:balance account) amount)
    (throw (IllegalArgumentException. "Account overdrawn")))
  (assoc account :balance (- (:balance account) amount)))

The withdrawal differs from the original PCL implementation in one other way. The original code mutated the account. Since mutation is a no-no, I am instead returning a new account object, associng in the changed balance. In the example below, I am using a let just to show that the original account is unchanged.

(let [original-state {:account-type :bank :balance 100}
      updated-state (withdraw original-state 50)]
  (println original-state updated-state)) 

{:balance 100, :account-type :bank} {:balance 50, :account-type :bank}
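
When loading this code from a file, raw-withdraw has to be defined before the method that calls it. The sketch below puts the bank-account pieces in a loadable order and also exercises the overdrawn path; original, updated, and overdrawn? are names introduced for the example:

```clojure
;; Defined first so the method body below can refer to it.
(defn raw-withdraw [account amount]
  (when (< (:balance account) amount)
    (throw (IllegalArgumentException. "Account overdrawn")))
  (assoc account :balance (- (:balance account) amount)))

(defmulti withdraw :account-type)

(defmethod withdraw :bank [account amount]
  (raw-withdraw account amount))

(def original {:account-type :bank :balance 100})
(def updated (withdraw original 50))

;; Asking for more than the balance throws instead of returning a new map.
(def overdrawn?
  (try
    (withdraw original 500)
    false
    (catch IllegalArgumentException _ true)))

(prn (:balance original) (:balance updated) overdrawn?)  ; 100 50 true
```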

The checking account is a little more complex. First, I have to shuttle money in from the overdraft account (if necessary), then raw-withdraw as before:

(defmethod withdraw :checking [account amount]
  (let [over-account (account :overdraft-account)
        over-amount (- amount (:balance account))
        withdrawal-account
        (if (> over-amount 0)
          (merge account
                 {:overdraft-account (withdraw over-account over-amount)
                  :balance amount})
          account)]
    (raw-withdraw withdrawal-account amount)))

Again, all the objects are immutable. The merge function returns a new account object (possibly with an overdraft), and the raw-withdraw returns another object:

(let [overdraft {:account-type :checking, :balance 1000}
      original-state {:account-type :checking
                      :balance 100
                      :overdraft-account overdraft}
      updated-state (withdraw original-state 500)]
  (println original-state)
  (println updated-state))

{:overdraft-account {:balance 1000, :account-type :checking}, 
 :balance 100, 
 :account-type :checking}
{:overdraft-account {:balance 600, :account-type :checking}, 
 :balance 0, 
 :account-type :checking}

Dispatching on more than one parameter

In languages like Java, methods are polymorphic on their first (implicit) parameter. Because multimethods dispatch on arbitrary functions, they can be polymorphic on all of their parameters.

For example, a music library might implement a beat method that is polymorphic on both the drum and the stick:

(defmulti beat (fn [d s] [(:drum d)(:stick s)]))
(defmethod beat [:snare-drum :brush] [drum stick] "snare drum and brush")
(defmethod beat [:snare-drum :soft-mallet] [drum stick] "snare drum and soft mallet")

The first beat method matches only snare drum + brush, etc.:

user=> (beat {:drum :snare-drum} {:stick :brush})
"snare drum and brush"
user=> (beat {:drum :snare-drum} {:stick :soft-mallet})
"snare drum and soft mallet"

If no methods match the dispatch value, Clojure throws an exception:

user=> (beat {:drum :bongo} {:stick :none})
java.lang.IllegalArgumentException: No method for dispatch value
... stack trace elided ...

Or, you can define a :default that will match if no other dispatch value matches:

(defmethod beat :default [drum stick] "default value, if you want one")

user=> (beat {:drum :bongo} {:stick :none})
  "default value, if you want one"

Wrapping up

The PCL chapter demonstrates dispatch based on one or more arguments to a function, and those examples are duplicated above. There are many other things you might do with defmulti, but since they are not covered in PCL I will declare them out of scope here, and point you to some other reading:

  • Clojure objects have metadata, so you could dispatch based on metadata values instead of data values. See mac's post on the mailing list for an example.
  • Dispatch can be based on the state of an object, rather than on some kind of type tag. This lets you treat a rectangle with equal width and height as a square, even if it was created as a rectangle. See my article on dispatch in the Java.next series for an example.
  • Clojure's defmulti allows you to create multiple taxonomies dynamically, and trivially dispatch based on isa relationships in a taxonomy. See Rich's mailing list post introducing this feature.
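
The last two ideas can be sketched briefly. These examples are mine and are not taken from the linked posts; shape-name, describe, and the keywords used are all hypothetical:

```clojure
;; Dispatch on state: the dispatch function inspects the data itself,
;; so a rectangle with equal sides is treated as a square.
(defmulti shape-name
  (fn [{:keys [width height]}]
    (if (= width height) :square :rectangle)))

(defmethod shape-name :square [_] "square")
(defmethod shape-name :rectangle [_] "rectangle")

;; Dispatch through a taxonomy: ::square is declared to be a kind of
;; ::shape, and multimethods match dispatch values with isa?.
(derive ::square ::shape)

(defmulti describe :kind)
(defmethod describe ::shape [_] "some shape")

(prn (shape-name {:width 2 :height 2}))  ; "square"
(prn (shape-name {:width 2 :height 3}))  ; "rectangle"
(prn (describe {:kind ::square}))        ; "some shape"
```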

Revision history

  • 2008/09/25: initial version
  • 2008/12/09: fixed withdraw erratum. Thanks Dean Ferreyra.

Java.next Overview

  • Posted By Stuart Halloway on September 24, 2008

As we reach the middle of our second decade of Java experience, the community has learned a lot about software development. Many of our best ideas on how to use a Java Virtual Machine (JVM) are now being baked into more advanced languages for the JVM. These languages tend to provide two significant advantages:

  • They reduce the amount of ceremony in your code, allowing you to focus on the essence of the problem you are solving.
  • They enable some degree of functional programming style. Think of it as a dash of verb-oriented programming to spice up your noun-oriented programming.

I have picked four "Java.next" languages to demonstrate these concepts: Clojure, Groovy, JRuby, and Scala. I have written a series of articles and conference talks describing how these languages can make teams more productive.

This page is the top-level table of contents for Java.next, and I will update the links below as new articles and talks become available.

Articles on Java.next

Conference talks on Java.next

Seeing a talk

If you are interested in hearing me speak on Java.next, check the event schedule, or contact Relevance (info@thinkrelevance.com) to schedule an event near you.


PCL -> Clojure, Chapter 9

  • Posted By Stuart Halloway on September 24, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 9, Practical: Building a Unit Test Framework.

Tests and reports

To build a minimal testing library, I need nothing more than tests and results. To keep reporting as simple as possible, I will start with console output. The report-result function tests a result, and prints pass or FAIL, plus a form with supporting detail:

(defn report-result [result form]
  (println (format "%s: %s" (if result "pass" "FAIL") (pr-str form))))

Now any function can be a test. The detail message can often be the same form that caused the error, so I will pass the same form twice: once for evaluation, and again (quoted!) for use in the detail message:

(defn test-+ []
  (report-result (= (+ 1 2) 3) '(= (+ 1 2) 3))
  (report-result (= (+ 1 2 3) 6) '(= (+ 1 2 3) 6))
  (report-result (= (+ -1 -3) -4) '(= (+ -1 -3) -4)))

The console output for test-+ looks like this:

user=> (test-+)
pass: (= (+ 1 2) 3)
pass: (= (+ 1 2 3) 6)
pass: (= (+ -1 -3) -4)

Inferring the detail message

The fact that I want to pass the same form twice, but with different evaluation semantics, just screams macro. Sure enough, I can clean up the code with a macro:

(defmacro check [form]
  `(report-result ~form '~form))

The macro expands the form twice, once for evaluation and once quoted for the detail message. Now I can replace calls to report-result with simpler calls to check:

(defn test-* []
  (check (= (* 2 4) 8))
  (check (= (* 3 3) 9)))

Hmm. The calls to check are cleaner than the calls to report-result in the earlier example, but the check itself still looks repetitive. Solution: a better check macro that can handle multiple forms:

(defmacro check [& forms]
  `(do
     ~@(map (fn [f] `(report-result ~f '~f))  forms)))

The quoting and unquoting is a little more complex--play around with macroexpand-1 to see how it works.
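
Such an experiment might look like the sketch below, which repeats report-result and the multi-form check so it runs on its own. The var name expansion is mine:

```clojure
(defn report-result [result form]
  (println (format "%s: %s" (if result "pass" "FAIL") (pr-str form))))

(defmacro check [& forms]
  `(do
     ~@(map (fn [f] `(report-result ~f '~f)) forms)))

(def expansion (macroexpand-1 '(check (= 1 1) (= 2 3))))

;; Prints a do form containing one report-result call per checked form,
;; with report-result qualified by the current namespace.
(prn expansion)
```

Each form shows up twice in its report-result call: once bare, to be evaluated, and once wrapped in quote, to be printed.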

With the better check in place, test functions are quite simple:

(defn test-rem []
  (check (= (rem 10 3) 1)
         (= (rem 6 2) 0)
         (= (rem 7 4) 3)))

Aggregating results

So far I have tests and console output. Next, I need some way to aggregate a set of checks into a single, top-level "checks passed" or "checks failed".

I would like to simply and together all the individual checks, but that does not quite work. As in many languages, Clojure's and short-circuits and stops evaluating when it encounters a logical false. That's no good here: Even if one test fails, I still want all the tests to run.

Since it is a question of optional evaluation, a macro is appropriate. The combine-results macro works like and, but it always evaluates all the forms:

(defmacro combine-results [& forms]
  `(every? identity (list ~@forms)))
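
To make the difference from and concrete, here is a small sketch. The noisy helper and the calls atom are hypothetical names introduced only to record which forms actually get evaluated.

```clojure
(defmacro combine-results [& forms]
  `(every? identity (list ~@forms)))

;; Hypothetical helper: records each evaluation, then returns value.
(def calls (atom []))
(defn noisy [label value]
  (swap! calls conj label)
  value)

;; and stops at the first logical false, so :b is never evaluated.
(def and-result (and (noisy :a false) (noisy :b true)))
(def and-calls @calls)            ;=> [:a]

(reset! calls [])

;; combine-results evaluates every form before combining them.
(def combined-result (combine-results (noisy :a false) (noisy :b true)))
(def combined-calls @calls)       ;=> [:a :b]
```

Both expressions return false, but only combine-results runs the second form.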

Now check can use combine-results instead of do.

(defmacro check [& forms]
  `(combine-results
    ~@(map (fn [f] `(report-result ~f '~f)) forms)))

All existing functionality still works, and now I can see a useful return value from a test.

user=> (test-*)
pass: (= (* 2 4) 8)
pass: (= (* 3 3) 9)
true

Capturing test names

Tests ought to have names. In fact, tests ought to support multiple names. You can imagine a test detail report saying:

Check math->addition->associative passed: ...

Where associative is the name of a check, addition is the name of a function, and math is the name of another function that called addition.

First, I need a variable to store a sequence of names:

(def *test-name* [])

Printing the variable as part of a result is easy:

(defn report-result [result form]
  (println (format "%s: %s %s" 
           (if result "pass" "fail") 
           (pr-str *test-name*) 
           (pr-str form)))
  result)

Now for the hard part: populating the collection of names. For this, I will introduce a deftest macro:

(defmacro deftest [name & forms]
  `(defn ~name []
     (binding [*test-name* (conj *test-name* (str '~name))]
       ~@forms)))

The macro expansion performed by deftest is nothing new: deftest turns around and defns a new function named name. The interesting part is the call to binding, which rebinds *test-name* to a new collection built from the old *test-name* plus the name of the current test.

The new binding of *test-name* is visible anywhere inside the dynamic scope of the binding form. The dynamic scope includes any function calls made inside the binding, and their function calls, and so on ad infinitum ... or until another binding performs the same trick again. This gives exactly the semantics we want:

  • The dynamic scope allows callers to influence callees without having to pass *test-name* as an argument all over the place. Nested functions "remember" a stack of their callers' names through *test-name*.
  • The unwinding of the dynamic scope protects readers of *test-name* outside a binding. Code after the binding will never see the values *test-name* takes during the binding.
  • Dynamic bindings are thread-local (and therefore thread-safe).
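
The three properties above can be sketched in isolation from the test framework. This is a minimal illustration, assuming Clojure 1.3 or later, where a var must be marked ^:dynamic before binding can rebind it. The functions current-names, inner, and outer are hypothetical names.

```clojure
;; ^:dynamic is required by Clojure 1.3+ before a var can be rebound
;; with binding (an assumption about your Clojure version).
(def ^:dynamic *test-name* [])

;; Reads the var without taking it as an argument.
(defn current-names [] *test-name*)

(defn inner []
  (binding [*test-name* (conj *test-name* "inner")]
    (current-names)))

(defn outer []
  (binding [*test-name* (conj *test-name* "outer")]
    (inner)))

;; Callees see the names pushed by their callers.
(def inside (outer))            ;=> ["outer" "inner"]

;; After the binding forms unwind, the root value is visible again.
(def outside (current-names))   ;=> []

;; A plain Java thread starts from the root value, not from this
;; thread's binding.
(def other-thread
  (binding [*test-name* ["main-only"]]
    (let [p (promise)]
      (.start (Thread. (fn [] (deliver p (current-names)))))
      @p)))                     ;=> []
```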

With deftest in place, I can define a hierarchy of nested tests:

(deftest test-*
  (check (= (* 2 4) 8)
         (= (* 3 3) 9)))

(deftest test-math
  ; TODO: test rest of math
  (test-*))

(deftest test-all-of-nature
  ; TODO: test rest of nature
  (test-math))

Calling test-all-of-nature will demonstrate multiple levels of nested names in a test report:

user=> (test-all-of-nature)
pass: ["test-all-of-nature" "test-math" "test-*"] (= (* 2 4) 8)
pass: ["test-all-of-nature" "test-math" "test-*"] (= (* 3 3) 9)
true

From here, better formatting of the console message is just mopping up.

Wrapping up

When I first read Practical Common Lisp, this was my favorite chapter. The testing library evolves quickly and naturally to a substantial feature set. (In case you didn't keep count, the entire "framework" is fewer than twenty lines of code.)

Try implementing the unit-testing example in your language of choice. Don't just implement the finished design. Work through each of the iterations described above:

  1. tests and results
  2. inferring the detail message
  3. aggregating results
  4. capturing test names

I would love to hear about your results, and I will link to them here.

Notes

PCL -> Clojure, Chapter 8

  • Posted By Stuart Halloway on September 23, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 8, Macros: Defining Your Own.

Rolling your own

Lisp macros gain their power by controlling argument evaluation. In a normal Lisp function call, all arguments are evaluated before the function is invoked. Consider this call to function foo:

(foo a b)

Arguments a and b are evaluated, and then passed to function foo. If foo were a macro, however, all bets would be off. Then foo's arguments might be evaluated in bizarre orders, or not at all.

This may seem a little crazy until you consider a simple if:

(if monday (wake-up) (sleep))

if cannot possibly be a normal Lisp function. If it were, you would always both wake-up and sleep, regardless of the value of monday.

As the if example suggests, control flow is an obvious use case for macros. PCL demonstrates custom macros by defining a new control flow macro named do-primes.

Preparing for do-primes

In order to implement do-primes, I will need a primeness test. For clarity, I will divide this into two functions. First, a simple helper to detect factors.

(defn divides? [candidate-divisor dividend]
  (zero? (rem dividend candidate-divisor)))

Now I can tell when one number divides another:

user=> (divides? 7 42)
true
user=> (divides? 11 42)
false

A prime is simply a number greater than one with no divisors other than one and itself. I am a busy guy, so I won't check all the natural numbers, only those from two up to the square root of the number being tested. Here is a simple primeness test:

; yes, I know there are faster ways.  
(defn prime? [num]
  (when (> num 1)
    (every? (fn [x] (not (divides? x num)))
        (range 2 (inc (int (Math/sqrt num)))))))

Sequences of primes

My eventual objective is to call do-primes like this:

(do-primes i 100 200 
  (print (format "%d " i)))

where i is the loop variable and runs over the primes from 100 to 200. Because Clojure has nice support for infinite sequences, I find it easier to begin by thinking in terms of the pure math. So, here is a function that returns the sequence of primes starting from a number:

(defn primes-from [number]
  (filter prime? (iterate inc number)))

(iterate inc number) returns an infinite sequence starting with number and then incrementing by one for each subsequent element. The filter then whittles this down to numbers that are prime.

This sequence is infinite, so don't try to view it from the console. Take your primes a few at a time:

user=> (take 5 (primes-from 1000))
(1009 1013 1019 1021 1031)

Now I need a simple helper that begins with primes-from, but cuts off the sequence at a chosen end:

(defn primes-in-range [start end]
  (for [x (primes-from start) :while (<= x end)] x))

The for is a list comprehension. It takes all the (primes-from start), but only while those numbers are still less than or equal to end.
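
For comparison, the same cut-off can be written with take-while. The sketch below repeats the helpers so it runs on its own. primes-in-range-tw is a hypothetical name for the alternative, and the argument of prime? is renamed to n purely for readability.

```clojure
(defn divides? [candidate-divisor dividend]
  (zero? (rem dividend candidate-divisor)))

(defn prime? [n]
  (when (> n 1)
    (every? (fn [x] (not (divides? x n)))
            (range 2 (inc (int (Math/sqrt n)))))))

(defn primes-from [number]
  (filter prime? (iterate inc number)))

;; Version from the article: a for comprehension with :while.
(defn primes-in-range [start end]
  (for [x (primes-from start) :while (<= x end)] x))

;; Hypothetical alternative: take-while stops at the first prime past end.
(defn primes-in-range-tw [start end]
  (take-while #(<= % end) (primes-from start)))

(primes-in-range 10 30)     ;=> (11 13 17 19 23 29)
(primes-in-range-tw 10 30)  ;=> (11 13 17 19 23 29)
```

Both versions are lazy, so neither walks the infinite sequence past the first prime greater than end.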

do-primes

Now I am finally ready to write the macro do-primes:

(defmacro do-primes [var start end & body]
  `(doseq [~var (primes-in-range ~start ~end)] ~@body))

Macros work in two steps: expansion followed by normal Lisp evaluation. The expansion phase is like a template substitution, but with the full power of Lisp at your disposal.

In the definition of do-primes above, the syntax-quote (`) identifies the static part of the template:

  • For symbols, syntax-quote resolves the name to a fully qualified symbol (with some exceptions we don't need to worry about in this example).
  • For lists, syntax-quote will recursively syntax-quote the contained forms.

The unquote (~) and splicing-unquote (~@) provide the dynamic part of the template by exempting their forms from syntax quoting rules.
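
These rules can be tried outside of any macro, since syntax-quote is just reader syntax. A minimal sketch, in which body, spliced, unspliced, and qualified are names made up for the illustration:

```clojure
(def body '((print 1) (print 2)))

;; ~@ splices the elements of body into the enclosing list.
(def spliced `(do ~@body))        ;=> (do (print 1) (print 2))

;; ~ inserts body as a single nested list instead.
(def unspliced `(do ~body))       ;=> (do ((print 1) (print 2)))

;; Symbols that are not unquoted are namespace-qualified, while the
;; unquoted expression is evaluated and its value inserted.
(def qualified `(inc ~(+ 1 2)))   ;=> (clojure.core/inc 3)
```

Note that do stays unqualified: it is a special form, which is one of the exceptions mentioned above.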

Your reaction at this point should be "That's a lot of ugly punctuation." Fear not, macroexpand-1 will ease the pain. macroexpand-1 will show you how Clojure expands the macro, without executing the expanded result. This gives you a chance to experiment with the rules for quoting and unquoting. Here is an example:

user=> (macroexpand-1 '(do-primes i 1 10 (print i)))
(clojure/doseq [i (pcl.chap_08/primes-in-range 1 10)] (print i))

Looking back at the definition of do-primes, here is what happened:

  • doseq expanded to the fully-qualified clojure/doseq. (I haven't covered namespaces yet, but the clojure namespace contains most of the Clojure core.)
  • i, 1, and 10 are direct expansions from the macro call.
  • primes-in-range is one of the helper functions I wrote earlier. In the sample repository, I have placed this in the pcl.chap_08 namespace, hence the expansion.
  • body contains a list of things I want to do with my primes, specifically ((print i)). That is almost what I need, except a few too many parens. The "splice" part of splicing unquote gets rid of the extra parens, splicing the list into the template. This is exactly what I need to match the doseq signature.

Now I can do-primes:

user=> (do-primes i 100 150 
  (print (format "%d " i)))
101 103 107 109 113 127 131 137 139 149

Wrapping up

The easiest way to write a macro is to work backwards. Write the form that you want the macro to expand into, and then test interactively with macroexpand-1 until you have a macro that expands correctly.

Macros are hard, and I have skipped some of the building blocks here. Check out the chapter in PCL.

Notes

The sample code is available at http://github.com/stuarthalloway/practical-cl-clojure.

Revision history

PCL -> Clojure, Chapter 11

  • Posted By Stuart Halloway on September 22, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 11, Collections.

Sequence basics

PCL describes a group of basic collection functions: count, find, position, remove, and substitute. Clojure supports count for a variety of list-like types:

user=> (count (quote (1 2 3)))
3
user=> (count [1 2 3])
3
user=> (count #{1 2 3})
3
user=> (count "characters")
10

These types, and any others that implement a basic first/rest protocol, are called sequences in Clojure. A sequence is logically a list, but may be implemented using other data structures.

In addition to generic sequence functions, some sequences have specific functions unique to their underlying data structure. Clojure defines find for maps to return the matching key/value pair:

user=> (find {:lname "Doe", :fname "John"} :fname)
[:fname "John"]

Or, you could just place the map itself in function position, and get back the matching value for a key:

user=> ({:lname "Doe", :fname "John"} :fname)
"John"

The Clojure core does not define find for other collection types. But the implementation is a one-liner using some. For example, to ask if a collection contains the number 2:

user=> (some #(= % 2) [1 2 3])
true

Clojure-contrib wraps the some idiom into a function named includes?.

The rest of the "basic" functions have similar stories: The Clojure core tends to support them directly where they are efficient (constant time) operations. Where they would take longer (e.g. linear time), the operations can be written as one-liners atop higher-order functions.
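
As a sketch of what such one-liners might look like, here are possible stand-ins for CL's find-if and position. The names find-first and position are hypothetical helpers, not part of the Clojure core.

```clojure
;; Hypothetical helper: first element satisfying pred, or nil.
(defn find-first [pred coll]
  (first (filter pred coll)))

;; Hypothetical helper: index of the first element equal to x, or nil.
(defn position [x coll]
  (first (keep-indexed (fn [i y] (when (= x y) i)) coll)))

(find-first even? [1 3 4 5])  ;=> 4
(position 3 [1 2 3])          ;=> 2
(position 9 [1 2 3])          ;=> nil
```

Both are linear-time scans, which fits the pattern described above: the core leaves them to be composed from filter and friends rather than providing them directly.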

Higher-order functions

CL includes higher order versions of the basic functions described above. These higher-order versions take an additional parameter, which is a function that acts as a filter. Here are some examples.

First, a collection of days for the examples to work against:

; for re-split
(use 'clojure.contrib.str-utils)
(def days (re-split #" " "Sun Mon Tues Wed Thurs Fri Sat"))

Now I can find the days that start with "S":

user=> (filter #(.startsWith % "S") days)
("Sun" "Sat")

Or simply count the days that start with "S":

user=> (count (filter #(.startsWith % "S") days)) 
2

In an immutable world, remove is the opposite of find. I can get a collection with all "S" days removed by reversing the previous filter with complement:

user=> (filter (complement #(.startsWith % "S")) days)
("Mon" "Tues" "Wed" "Thurs" "Fri")

To replace all "S" days with "Weekend!" I can use map:

user=> (map #(if (.startsWith % "S") "Weekend!" %) days)
("Weekend!" "Mon" "Tues" "Wed" "Thurs" "Fri" "Weekend!")

Sorting

Sorting is easy:

user=> (sort days)
("Fri" "Mon" "Sat" "Sun" "Thurs" "Tues" "Wed")

Sorting by criteria is also easy:

user=> (sort-by #(.length %) days)
("Sun" "Mon" "Wed" "Fri" "Sat" "Tues" "Thurs")

Combining sequences

The concat function concatenates sequences.

user=> (concat [1 2 3] [4 5 6])
(1 2 3 4 5 6)

Note that the resulting sequence is lazy. So, concat can return without walking each input sequence. In other words, the (take 5 ...) below does not have to wait (forever!) for all the powers of 2 to be generated:

user=> (take 5 (concat (quote (1/4 1/2)) powers-of-2))
(1/4 1/2 1 2 4)
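
The definition of powers-of-2 is not shown in this article. A minimal sketch, assuming it is an infinite lazy sequence built with iterate:

```clojure
;; Assumed definition: 1, 2, 4, 8, ... generated lazily by doubling.
(def powers-of-2 (iterate #(* 2 %) 1))

;; concat returns immediately; take forces only the first five elements.
(def first-five (take 5 (concat '(1/4 1/2) powers-of-2)))
;=> (1/4 1/2 1 2 4)
```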

What if one of the sequences passed to concat blows up instead of returning a sequence?

user=> (take 2 (concat '(1 2 3) (throw (Error. "Not a sequence"))))
java.lang.Error: Not a sequence

Here the call fails before concat can do anything: concat is a function, so its arguments are evaluated first, and evaluating the second argument throws. As it happens, I have an even lazier option than concat. The lazy-cat macro does not even look at each argument until it is forced to do so:

user=> (take 2 (lazy-cat '(1 2 3) (throw (Error. "Not a sequence"))))
(1 2)

Lazy sequences have many uses, but take some getting used to. One mistake to avoid is trying to inspect a lazy infinite sequence from the REPL. The REPL tries to print the entire sequence, which will take forever (literally). Hence the (take 2 ...) wrappers above.

Subsequences

It is often interesting to take subsequences from the beginning, middle, or end of a collection. Clojure supports this in a general way with take and drop. You have already seen take, which returns the first part of a collection:

user=> (take 2 days)
("Sun" "Mon")                                      

For the end of a collection, I can use drop:

user=> (drop 2 days)
("Tues" "Wed" "Thurs" "Fri" "Sat")

For the middle of a collection, I can use take and drop together:

user=> (take 5 (drop 1 days))
("Mon" "Tues" "Wed" "Thurs" "Fri")

The take-nth function takes only every nth item of a collection. To demonstrate take-nth, I will begin by defining a lazy collection of the natural-numbers:

(def natural-numbers (iterate inc 1))

The call to iterate produces a collection that starts with 1 and generates subsequent members by calling inc. You can verify that these are the natural numbers by taking a few of them.

user=> (take 10 natural-numbers)
(1 2 3 4 5 6 7 8 9 10)

Now I can write an intuitive definition for the even and odd numbers in terms of the natural numbers:

(def odd-numbers (take-nth 2 natural-numbers))
(def even-numbers (take-nth 2 (drop 1 natural-numbers)))
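
A quick check of these definitions, repeated here so the snippet runs on its own:

```clojure
(def natural-numbers (iterate inc 1))
(def odd-numbers (take-nth 2 natural-numbers))
(def even-numbers (take-nth 2 (drop 1 natural-numbers)))

;; take-nth keeps the first element and then every second one after it.
(take 5 odd-numbers)   ;=> (1 3 5 7 9)
(take 5 even-numbers)  ;=> (2 4 6 8 10)
```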

Predicates

Clojure provides a number of functions that test boolean predicates, including every?, not-any?, not-every?, and some. Here are a few examples, using the days collection defined above.

Does every day start with "S"?

user=> (every? #(.startsWith % "S") days)
false

Is there some day that starts with "M"?

user=> (some #(.startsWith % "M") days)
true

Map and reduce

map takes a function and one or more sequences. It returns a new sequence which is the result of applying the function to the item(s) in each sequence. So, to take the product of numbers from two sequences:

user=> (map * '(1 2 3 4 5) '(10 9 8 7 6))
(10 18 24 28 30)

If I want to control the type of collection returned, I can use into:

user=> (into [] (map * '(1 2 3 4 5) '(10 9 8 7 6)))
[10 18 24 28 30]

reduce walks down a collection, applying function f of two arguments to the first two elements, then applying f to the result of the first call and the next element, and so on. This is very useful for operations that process a sequence and return a single value. For example, I can sum a sequence:

user=> (reduce + [1 2 3 4 5])
15

Or find the max value of a sequence:

user=> (reduce max [1 2 3 4 5])
5
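
To watch reduce work step by step, the core function reductions returns every intermediate result instead of only the last one. A small sketch:

```clojure
;; Each element is the running result after one more application of +.
(reductions + [1 2 3 4 5])    ;=> (1 3 6 10 15)

;; The running maximum of a sequence.
(reductions max [3 1 4 1 5])  ;=> (3 3 4 4 5)

;; The last intermediate result is what reduce returns.
(= (last (reductions + [1 2 3 4 5]))
   (reduce + [1 2 3 4 5]))    ;=> true
```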

Maps

Maps (hash tables in CL) can be iterated just like any other sequence type, bearing in mind that the function you pass in should expect a key/value pair. Given the following map of names to scores:

user=> (def scores {:john 18, :jane 21, :jim 14})
#'user/scores

I can find all the people who scored above 15:

user=> (filter (fn [[k,v]] (> v 15)) scores)
([:jane 21] [:john 18])

Notice how the destructuring bind ([[k,v]]) makes it easy to bind k and v separately, without introducing a temporary variable pair that I don't really need.
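
The same destructuring works anywhere a binding form is accepted. The sketch below is one possible variation: high-scorers and high-scorer-names are hypothetical names, and into is used to turn the filtered pairs back into a map.

```clojure
(def scores {:john 18, :jane 21, :jim 14})

;; filter yields key/value pairs; into pours them back into a map.
(def high-scorers
  (into {} (filter (fn [[_ v]] (> v 15)) scores)))
;=> {:john 18, :jane 21}

;; Destructuring in a for comprehension, keeping only the keys.
(def high-scorer-names
  (set (for [[k v] scores :when (> v 15)] k)))
;=> #{:john :jane}
```

The order in which a map's entries are visited is not something to rely on, which is why the results here are collected into a map and a set.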

Wrapping up

Lisp excels at processing lists. Clojure offers similar capabilities, but generalized to sequences, which can be lists, vectors, maps, sets, or other list-like collections.

Clojure's support for lazy collections allows a different style for collection processing that I will continue to explore in later articles in this series.

Notes

Revision history

  • 2008/09/22: initial version
  • 2008/10/15: removed var-quoting per Douglas's comment. Thanks.
  • 2008/12/09: fixed filter erratum. Thanks Dean Ferreyra.

PCL -> Clojure, Chapter 10

  • Posted By Stuart Halloway on September 19, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 10, Numbers, Characters, and Strings.

Lisp with Java guts

Because Clojure is Java under the covers, you can always use Java's support for numbers, characters, and strings. The Java interop syntax is clean and simple, so it is idiomatic to call Java directly, rather than write wrappers to make code look more Lisp-like. A few examples follow:

user=> (Math/pow 3 3)
27.0

user=> (.compareTo "a" "b")
-1

user=> (Character/toLowerCase \A)
\a

Numbers are numbers

In Clojure, as in most Lisps, numbers are numbers. They don't do irritating things like overflow:

user=> (* 1000000 1000000 1000000)
1000000000000000000

Also, integer division is exact:

user=> (/ 10 3)
10/3

Under the covers, Clojure's numeric representation switches Java types as necessary to do the right thing.

user=> (class (* 1000 1000))
java.lang.Integer
user=> (class (* 1000000 1000000))
java.lang.Long
user=> (class (* 1000000 1000000 1000000))
java.math.BigInteger

Study the API docs

Clojure does a good job of balancing the purity of math (Lisp) and the practical reality of efficient representation (Java's primitives). But you still have to know your way around. Some things are wrapped in Lisp, and some things aren't. For example, numbers support the mathematical comparison operators, but Strings use Java's compareTo:

user=> (< 1 2)
true
user=> (< "a" "b")
java.lang.ClassCastException: java.lang.String
user=> (.compareTo "a" "b")
-1
user=> (.compareTo 1 2)
-1

If you aren't sure whether there is a Lispy wrapper for some functionality, you can check the API docs, or just try it in the REPL.
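
One wrapper worth knowing in this area is the core function compare, which accepts both numbers and strings. A short sketch of trying it at the REPL:

```clojure
;; compare returns a negative number, zero, or a positive number.
(compare 1 2)      ;=> -1
(compare "a" "b")  ;=> -1
(compare "b" "b")  ;=> 0

;; sort uses compare by default, so strings sort without any interop.
(sort ["pear" "apple" "fig"])  ;=> ("apple" "fig" "pear")
```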

Wrapping up

For numbers, characters, and strings, Clojure provides some of the trappings a Lisp programmer would expect, e.g. exact integer division. But under the covers, it's all Java. If you don't find what you need in the Clojure API, drop to Java using the interop syntax.

Notes

PCL -> Clojure, Chapter 7

  • Posted By Stuart Halloway on September 18, 2008

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 7, Macros: Standard Control Constructs.

Rolling your own

Common Lisp control constructs are generally part of the standard library, not the core language. Ditto for Clojure. If you don't find a control construct you want, you can always roll it yourself. For example, Clojure doesn't have an unless, so here goes:


(defmacro unless [condition & body]
  `(when (not ~condition)
     ~@body))
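
A quick way to convince yourself that unless behaves as intended is to call it and to expand it. A minimal sketch (the symbol xs in the expanded form is only a placeholder and is never evaluated):

```clojure
(defmacro unless [condition & body]
  `(when (not ~condition)
     ~@body))

;; The body runs only when the condition is logically false.
(unless false :ran)  ;=> :ran
(unless true :ran)   ;=> nil

;; Expanding once shows the when/not form that the macro produces.
(macroexpand-1 '(unless (empty? xs) (first xs)))
;=> (clojure.core/when (clojure.core/not (empty? xs)) (first xs))
```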

Clojure's defmacro differs from Common Lisp's in two important ways.

  • The argument list is a vector [...], not a list (...). (This is true for functions as well; I just hadn't mentioned it yet.) Clojure gives vectors, sets, and maps equal billing with lists by giving them their own literal syntax.
  • Clojure uses different reader macros for unquote and unquote-splicing. Where CL uses , and ,@, Clojure uses ~ and ~@.

The avoidance of commas in reader macros is a well-considered decision. Commas are whitespace in Clojure. This often results in an interface that is simultaneously human-friendly and list-y. The following two expressions are equivalent:


{:fn "John" :ln "Doe"}
{:fn "John", :ln "Doe"}

The latter form makes the map more readable, and more similar to other languages.

doseq

Common Lisp provides dolist for iterating a list. Clojure works in terms of sequences, which are collections that can be traversed in a list-like way. The Clojure analog to dolist is doseq. It can work with lists:


user=> (doseq [x '(1 2 3)] (println x))
1
2
3

doseq also works with maps. Note the destructuring bind since I care only about the values:


user=> (doseq [[_ v] {:fn "John" :ln "Doe"}] (println v))
John
Doe

In fact, doseq works with any kind of sequence (hence the name). (iterate inc 1) produces an infinite collection incrementing up from 1. (take 5 ...) pulls the first 5 elements from a collection.


user=> (doseq [x (take 5 (iterate inc 1))] (println x))
1
2
3
4
5

Don't try to doseq an infinite collection, and don't say I didn't warn you.

dotimes

Common Lisp provides dotimes for iteration with counting. Here is the Clojure version of PCL's multiplication table example:


user=> (dotimes [x 10]
         (dotimes [y 10]
           (print (format "%3d " (* (inc x) (inc y)))))
         (println))
  1   2   3   4   5   6   7   8   9  10 
  2   4   6   8  10  12  14  16  18  20 
  3   6   9  12  15  18  21  24  27  30 
  4   8  12  16  20  24  28  32  36  40 
  5  10  15  20  25  30  35  40  45  50 
  6  12  18  24  30  36  42  48  54  60 
  7  14  21  28  35  42  49  56  63  70 
  8  16  24  32  40  48  56  64  72  80 
  9  18  27  36  45  54  63  72  81  90 
 10  20  30  40  50  60  70  80  90 100 

CL do and loop

Common Lisp provides some more general control constructs: do and loop. Clojure's forms of the same name serve very different purposes. Clojure's do is equivalent to CL's progn, and Clojure's loop works with recur.

You could write Clojure macros to emulate CL's do and loop, but you probably won't want to. Instead, you can use list comprehensions or lazy sequences, which I will introduce later in this series.
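
To give a feel for the difference, here is a minimal sketch of Clojure's own do and loop, next to a list comprehension that computes the same values. The names squares-loop, squares-for, and do-result are made up for the illustration.

```clojure
;; loop establishes a recursion point; recur jumps back to it with new values.
(def squares-loop
  (loop [i 1 acc []]
    (if (<= i 5)
      (recur (inc i) (conj acc (* i i)))
      acc)))
;=> [1 4 9 16 25]

;; The same values as a list comprehension, with no explicit accumulator.
(def squares-for
  (for [i (range 1 6)] (* i i)))
;=> (1 4 9 16 25)

;; do evaluates its forms in order and returns the last value, like CL's progn.
(def do-result (do (+ 1 1) :last))
;=> :last
```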

Wrapping up

Like CL, Clojure defines control structures using macros. Also like CL, Clojure has control structures that are functional, plus some that are evaluated for their side effects. Clojure's control structures tend to use fewer parentheses.

Clojure does not duplicate CL's general purpose imperative control structures. Instead, you can often use list comprehensions and lazy sequences.

Notes

The sample code is available at http://github.com/stuarthalloway/practical-cl-clojure.

Revision history