Donnerstag, Mai 24, 2012

The root of polymorphism: The Anti-If Campaign

Ever heard about the Anti-If Campaign? The less if-statements there are in your code, the better -- argues Francesco Cirillo, the initiator of the Anti-If Campaign. There is a lot of truth behind his campaign. Too much if-statements in your code are evil, that's for sure. They'll make your code static, hard to maintain and no fun to read. It goes without saying that related concepts like switch-statements, conditionals and the like are subsumed under "if". They are just syntactic sugar for a certain arrangement of if-statements.

I asked myself: What if we abandon if-statements completely? What about being absolutely strict about the use of if and related concepts: No ifs in your code anywhere anyplace. You are not allowed to use if at all. It is like not having such a construct in your programming language.

Is it possible to write code without ifs? Yes, it is, as I'll show in this post. You'll end up with polymorphic functions instead -- and this is really cool stuff. Polymorphic functions are a result of having no ifs at your fingertips.

So let us proclaim the No-If Campaign for the sake of being extreme -- and see where it leads us to. In the following, I'll use Clojure for demonstration purposes. Clojure is a functional programming language and belongs to the family of Lisp languages. Don't worry if you don't know Clojure. The programs are simple to grasp and should let you follow the line of argumentation.

The Magic with Maps

A map, sometimes also called dictionary, associative array or hash map, is a data structure that stores pairs of key values and target values. A key value is uniquely associated with a target value. In Clojure, maps are written with curly braces. The example interaction at the console (called REPL in Clojure speak) associates so-called keywords with numbers; the ordering of the pairs is irrelevant.

user=> (def day {:mon 1 :tue 2 :wed 3 :thu 4 :fri 5})
#'user/day

It is a specialty of Lisp-like languages, that programs are written with parentheses. It's something one quickly gets used to. The above code defines day to be bound to a map with the given elements.

To query a map, the map and a key value must be given.

user=> (day :thu)
4
user=> (day :sat)
nil

If there is no target value associated with a key value, nil is returned. We can provide a default value as an extra argument to be returned instead.

user=> (day :thu 0)
4
user=> (day :sat 0)
0

The notion of equality is built into maps. Otherwise we couldn't query a map. That means: We can re-implement "equality" with the help of a map.

user=> (defn equal? [x y] ({y true} x false))
#'user/equal?

The code defines a function called equal? that expects two arguments, namely x and y. The body of the functions says: if x is not a key value in map {y true} return false; otherwise, if x equals y return true.

Let this use of a map sink in for a moment. Once you get the overall idea, it is easy to see how we can re-implement if.

How does if work in Clojure? In Clojure, if is a so-called special form. It expects at least two arguments, a third argument is optional. The evaluation of an if is as follows: If the evaluation of the first arguments results in false or nil, the third argument is evaluated and its result returned; if the third argument is missing, nil is returned. If the evaluation of the first argument is neither false nor nil, the second argument is evaluated and its result returned. It sounds more complicated than it actually is. Let us take an example

user=> (if (== 2 3) (+ 2 3) (- 2 3))
-1

Since the evaluation of (== 2 3) leads to false, the third argument to if is evaluated, which results in -1. Easy, isn't it?!

We could emulate this behavior with a map in the following way:

user=> (eval ({false '(- 2 3) nil '(- 2 3)} (== 2 3) '(+ 2 3)))
-1

In the map, there are key values for false and nil, which are both associated to the same expression. They are quoted (that's what the apostrophe is for) to suppress premature evaluation. The default expression '(+ 2 3) is quoted for the very same reason. Depending on the evaluation of the test-expression, here (== 2 3), either the default expression or the target value is taken. The resulting expression is evaluated by eval.

We can capture this use of a map as a coding pattern with the help of a macro. Let us call the macro IF -- for obvious reasons.

(defmacro IF
  ([test then]
    `(IF ~test ~then nil))
  ([test then else]
    `(eval ({false '~else, nil '~else} ~test '~then))))

Don't worry when you don't get all the details of macros and the special characters used inside. The point is that
  1. this macros behaves exactly like an if without using an if for its implementation. It is a proof-of-concept: We can live without if -- because we can re-built it from maps.
  2. this macro reads like a perfect specification of an if, which is a very interesting aspect of our approach.
user=> (IF (== 2 3) (+ 2 3) (- 2 3))
-1

(Note to experts: Because of eval, the macro does not respect lexical scope. Nothing to worry about in a proof-of-concept demonstration.)

To conclude: Yes, we don't need if as long as we have maps natively provided in our programming language. We can re-implement if.

This sounds ridiculous: We proclaim the No-If Campaign just in order to re-introduce if-statements? You are right, we need if-statements one way or the other. We can't seriously get rid of them. In programming, the concept of choice is essential and needs to be used. if-statements are just the most primitive binary choice construct you can think of -- and they have their meaning and use.

But there is another lesson to be learnt from re-implementing if with a map. With maps, we have a multiple choice construct, which can be instrumented for implementing polymorphic functions.

Polymorphism

Instead of choosing either something based on false/nil or choosing something else otherwise (binary choice), we generalize the idea of choice with maps. Based on any kind and any number of key values a map returns an associated target value (multiple choice). To get rid of the use of eval (see our IF macro), we demand that any value associated with a key value has to be a function.

Assumed we would like to calculate the area of a given geometric shape. The formulas are dependent on the type of shape. The area of a circle is computed in another way than the area of a rectangle. This knowledge can be encoded with a map.


(def shape2func
  {:circle    (fn [r]   (* r r Math/PI))
   :rectangle (fn [a b] (* a b))
   :triangle  (fn [a h] (* a h 0.5))})

Now, we define a function area which expects at least one argument. This very argument acts as a sort of identifier for choosing the right function from map shape2func. The function is applied on any further arguments supplied with function area.



(defn area [shape & args]
  (apply (shape2func shape) args))


Function area has become a polymorphic function! Based on the value of some argument -- here the very first argument identifying the kind or type of shape one refers to -- a proper implementation is chosen and called.

user=> (area :circle 2)
12.566370614359172
user=> (area :rectangle 2 3)
6

This kind of polymorphism is so much intriguing that it is worth being captured with a macro. The macro eases the definition of polymorphic functions. What we add to the macro is the idea of a dispatch function. The dispatch function determines the identifier used as a key value for looking up the function in the map. It gives us some additional flexibility.

(defmacro defnpoly
  ([name dispatch-fn map]
    `(defnpoly ~name ~dispatch-fn ~map nil))
  ([name dispatch-fn map default-fn]
    `(defn ~name [& args#]
      (apply
        (~map (apply ~dispatch-fn args#) ~default-fn)
        args#))))

Don't worry if you do not get the details of the macro. Using the macro is straight forward. It captures a pattern of a multiple choice construct -- and that is what we are after.

Using the macro, the definition of our polymorphic function area looks like this; note how the area for a square is implemented:

(defnpoly area (fn [shape & args] shape)
  {:circle (fn [self r] (* r r Math/PI))
   :rectangle (fn [self a b] (* a b))
   :square (fn [self a] (area :rectangle a a))
   :triangle (fn [self a b] (* a b 0.5))})

The dispatch function captures the previously mentioned strategy how a dispatch value is determined from a given set of arguments. It is the very first argument the dispatch is based on. Since the different functions implementing area are called with the very same arguments as the dispatch function is called with, we need to capture the first argument called self though it is ignored in the body of the functions. The use of self might trigger a revelation: object-orientation is nearby.

user=> (area :circle 2)
12.566370614359172
user=> (area :square 2)
4

As you can see, no if-statements are needed nor used in polymorphic functions. And that's exactly what the Anti-If Campaign is after: Use polymorphic functions, use the power of such a multiple choice construct whenever meaningful and possible. If-statements are no adequate means to emulate multiple choice. Beyond the need of binary choice, if-statements make code for multiple choice ugly, hard to maintain and less adaptable. As you can see, it is a breeze to add an implementation for :square to area -- no if is required.

One can do amazing things with polymorphic functions and a flexible dispatch mechanism. It embraces first argument type dispatch as is done in object-orientation. But there is more to it. Let us have a simple example for dispatching on the type of both arguments given to polymorphic function times

(defnpoly times (fn [x y] [(type x) (type y)])
  {[Long Long]   (fn [x y] (* x y))
   [Long String] (fn [n s] (reduce str (repeat n s)))
   [String Long] (fn [s n] (times n s))})

The program is clear and succinct -- because there are no ifs. Polymorphic functions have their own beauty!

user=> (times 4 3)
12
user=> (times 4 "hi")
"hihihihi"
user=> (times "hi" 0)
""

Polymorphic functions are so much useful that Clojure has them already included. In Clojure, polymorphic functions are called multimethods.

Multimethods in Clojure

Part of the fascination of Lisp-like languages such as Clojure is that they are extremely extensible. If you are missing polymorphic functions, you write yourself a short macro, and you have the feature and can use it. We demonstrated how easy that is.

Clojure follows the tradition of Lisp and favors a slightly different model of defining a polymorphic function in your code. The polymorphic function is declared with defmulti and its implementing functions are defined with defmethod. This way, Clojure abstracts away the use of a map for polymorphism. It adds a layer of abstraction and removes a little bit of syntactic overhead we had in our solution. Other than that, defmulti and defmethod behave exactly like defnpoly. See yourself, how close the definition of area is to our defnpoly approach.

(defmulti area (fn [shape & args] shape))
(defmethod area :circle [self r] (* r r Math/PI))
(defmethod area :rectangle [self a b] (* a b))
(defmethod area :square [self a] (area :rectangle a a))
(defmethod area :triangle [self a b] (* a b 0.5))

Multifunction area is used exactly like our polymorphic function definition of area.

user=> (area :circle 2)
12.566370614359172
user=> (area :square 2)
4

There is a lot more to say about multifunctions in Clojure, but I do not want to drift away from our starting point: the Anti-If Campaign.

Summary

In the data structure of a map the concepts of equality and choice are inherently included. The interesting part is that a map provides a mechanism for multiple choice. That means that binary choice (being another word for if, so to speak) is just a special case of multiple choice; that is why if can easily be re-implemented using maps. To have a binary choice construct like if at hand is useful and meaningful.

The fun part is to utilize and explore the power of multiple choice with maps. It takes a simple step and the introduction of a dispatch function and we have polymorphic functions. Polymorphic functions are a very powerful control construct. They are easy to use, make code highly readable and flexible to adapt. Same does object-orientation. But the flexibility given by a dispatch function goes beyond (single type-dispatch) object-orientation.

And that is the message of the Anti-If Campaign: Learn to think in and use multiple choice constructs -- be they provided in form of object-orientation or polymorphic functions (like in Clojure). If you use binary choice (i.e. if) for multiple choice, your code will be messy and ugly. So, don't use if to emulate multiple choice. Be the power of maps with you ;-)

Kommentare:

georgek hat gesagt…

Great post!

Anonym hat gesagt…

i see one usecase where if is nice: to break recursion or loop/recur. I think there an if is clearer and more concise.

However i like the post. I found it helpful. Thanks.

Havvy hat gesagt…

By the by, except for the macros, this is also possible using Javascript. Switch is actually more verbose than dispatch on an object (which is really just a map)

Anonym hat gesagt…

Having read this I thought it was very informative. I appreciate you finding the time and effort to put this article together.

I once again find myself spending way too much time both reading and leaving comments.
But so what, it was still worth it!
My page :: business backup service