NONLINEAR. Pedro Álvarez-Tabío

Programming (and thinking) the functional way

cover-pic

As a complete beginner with Haskell [1], I got immediately interested with it the time I saw the very elegant example of quicksort. Here it is:

quicksort :: Ord a => [a] -> [a]  
quicksort [] = []  
quicksort (p:xs) =  
    (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser = filter (< p) xs
        greater = filter (>= p) xs

I was perplex. How could such a simple and beautiful thing be correct? Yes, I know this is not a "true", in-place quicksort. But those are performance considerations that I don't intend to expose here [2]. The point I'm trying to make here is that pure functional programming is such a change in mentality, such a different mental model and approach to programming that it's about the same as learning to code all over again.

First, let's define the problem that we're trying to solve here. Basically, sorting an array in ascending order of its elements. For doing this, we use the world-changing and ingenious quicksort algorithm [3], which has a few basic rules:

  • Array with one element: return the array
  • More than one element: select a pivot randomly and make two groups. First one contains all values <p, second one all values >p. Do the same recursively on each of those two groups.

So how can we think in a functional way? My objective here is to follow a dialog between two instances of the same programmer that have a fundamental difference. The first instance thinks imperative, the same way it has thought since learning how to code. The second makes a conscious effort to change gears and remove all prejudices: thinks functional. Actually this programmer is me, right now at the moment of the writing. You'll see the difference, trust me.

Let's compare this to a simple example in Java [4]:

public class Quicksort  {  
  private int[] numbers;
  private int number;

  public void sort(int[] values) {
    if (values == null || values.length == 0){
      return;
    }
    this.numbers = values;
    number = values.length;
    quicksort(0, number - 1);
  }

  private void quicksort(int low, int high) {
    int i = low, j = high;
    int pivot = numbers[low + (high-low)/2];

    while (i <= j) {
      while (numbers[i] < pivot) {
        i++;
      }
      while (numbers[j] > pivot) {
        j--;
      }

      if (i <= j) {
        swap(i, j);
        i++;
        j--;
      }
    }
    if (low < j)
      quicksort(low, j);
    if (i < high)
      quicksort(i, high);
  }

  private void swap(int i, int j) {
    int temp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = temp;
  }
}

Wow. i's and j's all the way, what is this? Why is it so long compared to the Haskell example? This looks like comparing C and assembly 30 years ago! And in some respects, it is the same leap. [4]

Let's go through a mental dialog for both "instances".

JAVA:

Alright, so let's start by defining all the typical structure of a Java program. A class, with some fields for keeping our state. At least I think we should have an array of integers as the main component, the array that we are going to sort. Also a method called sort that takes an array of integers as an argument and sorts it.

public class Quicksort {  
    private int[] numbers;

    public void sort(int[] values) {

    }
}

HASKELL:

Alright, so there's no need for state, no attributes. Let's define the function as something that converts a list into another list. These lists need to have something in common, they need to be composed of elements that have an ordering. How can we generalize this? Aha! Typeclasses... There must be a typeclass that does this... Yes, Ord.

quicksort :: Ord a => [a] -> [a]  

JAVA:

Let's start by the trivial case, an empty array. If the array is empty, we'll return an empty array of course. But... damn, it's crashing whenever we have a null array. Let's add it to an if at the beginning of sort, to defend ourselves.

if (values.length == 0 || values == null) {  
    return;
}

HASKELL:

Trivial case, an empty list. For these cases there's pattern matching. Let's see if I can use it. Pretty cool!

quicksort [] = []  

JAVA:

OK so now let's kickstart the recursion for the general case. In this case we assign our parameters of the sort method to the state. We'll also need the length, so we just add a new field in Quicksort class.

public void sort(int[] values) {  
    if (values.length == 0 || values == null) {
        return;
    }
    this.numbers = values;
    this.length = values.length;
    quicksort(0, length - 1);
}

HASKELL:

Recursion is already embedded in the function. Nothing to do here!

No code. Nothing. Nada. That's good.  

JAVA:

So now we have to define the quicksort method according to the rules defined above. I will choose the pivot as the first element, no need to do anything fancy right now. So let's do two iterations per recursion, one going from beginning to end (i, the <p list) and the other from the end till the beginning (j, the >p list). Every time we find a value in the i iteration that is greater than the current one in the j iteration, we swap them. After i gets past j, we stop and make two recursive calls. The first part will be from the current beginning to j, the second from i to the current end.

private void quicksort(int low, int high) {  
    int i = low, j = high;
    int pivot = numbers[low];

    while (i <= j) {
        while (numbers[i] < pivot) {
           i++;
        }
        while (numbers[j] > pivot) {
            j--;
        }

        if (i <= j) {
            swap(i, j);
            i++;
            j--;
        }
    }

    if (low < j)
        quicksort(low, j);
    if (i < high)
        quicksort(i, high);
}

And the swap method:

private void swap(int i, int j) {  
    int temp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = temp;
}

Enter Haskell.

Let's define lesser and greater as the two parts of each iteration. Wait! We can use the standard head and tail notation to extract the pivot as the first element. Then I can use the two parts, and recurse in each one!

quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)  

Alright, so I declared lesser and greater as two lists, now let's define them using where, a very powerful Haskell keyword for declaring values (not variables) inside a function. We simply have to apply a filter function and since we already have the rest of elements as a value that we can call (xs), we'll get:

    where
        lesser = filter (< p) xs
        greater = filter (>= p) xs

I tried to explain the iteration + recursion for quicksort in Java as well as I possibly could. But what if in Java we miss an i++? What if we mess up with the conditions for the while loop? Alright, this is a relatively simple algorithm. But just imagine that you have to do this all the time, for all your stuff, or that sorting is just the first step in a very complicated data curation algorithm. Of course it can be done, but good luck dealing with all the low-level iteration bugs.

And now regarding the treatment of state. What if we have an array that, for some reason, becomes null and we apply quicksort in Java? We are keeping the previous array we sorted, is that supposed to be like that? How about synchronization? Sixteen threads wanting to access the Quicksort methods? Now we have to do a monitor, maybe? Or an instance per thread? It becomes a mess.

It all boils down to compilers. Compilers should be smart enough to "guess" what we are trying to do and optimize [5]. Let's not think about indexes and telling an array what to do. Let's think about the nature of our data, about what transformations it needs to become what we want it to become. You might think that functional programming adds complexity to thinking about algorithms and data treatment, but it does not. It's the imperative mindset that's widespread (for so many good reasons!) in the programming world. An empire that's starting to crumble I might say.

Actually, you don't need to completely stop programming in your favorite imperative language and start writing Haskell all the time. Haskell has its flaws, too [6]. You can also write better Java by adapting this mental model. You can become a better software engineer by learning functional programming.

What about this Java code?

public List<Comparable> sort(List<Comparable> elements) {  
    if (elements.size() == 0) return elements;

    Stream<Comparable> lesser = elements.stream()
    .filter(x -> x.compareTo(pivot) < 0)
    .collect(Collectors.toList());

    Stream<Comparable> greater = elements.stream()
    .filter(x -> x.compareTo(pivot) >= 0)
    .collect(Collectors.toList());

    List<Comparable> sorted = new ArrayList<Comparable>();
    sorted.addAll(quicksort(lesser));
    sorted.add(pivot);
    sorted.addAll(quicksort(greater));

    return sorted;

}

Isn't it similar to the Haskell one? Alright, this might not work right now with the Java you're using. It uses lambda functions, one of the very cool things Java 8 is adopting [7]. See? Functional languages not only help programmers be better, they can also make programming languages be better :)

Functional programming is a natural evolution of the trend in programming languages to reach higher levels of abstraction. In years to come, we will regard imperative languages the same way we now think that doing a web application in C is highly inefficient. It all is a developer time tradeoff. Why did many startups as well as established companies choose Ruby for their codebase instead of C++, for example? Because it makes development time shorter. Let's not get it wrong. Just equate a developer to a cloud computing unit. An hour of a developer's time is a lot more expensive than an hour of a high-performance AWS super-duper-cluster instance. By making more difficult to make mistakes, by making bug fixes less frequent, by using better abstractions, we can make sure us developers are more productive, more creative and valuable to the world.

Discuss on Hacker News

[1] Haskell from scratch courtesy of "Learn you a Haskell for Great Good!"

[2] This quicksort in Haskell that I am showing here is not in-place quicksort so it loses one of its properties, which is memory efficiency. The in-place version in Haskell would be more like:

import qualified Data.Vector.Generic as V  
import qualified Data.Vector.Generic.Mutable as M 

qsort :: (V.Vector v a, Ord a) => v a -> v a  
qsort = V.modify go where  
    go xs | M.length xs < 2 = return ()
          | otherwise = do
            p <- M.read xs (M.length xs `div` 2)
            j <- M.unstablePartition (< p) xs
            let (l, pr) = M.splitAt j xs 
            k <- M.unstablePartition (== p) pr
            go l; go $ M.drop k pr

Discussion here.

[3] This version of quicksort is simplified for illustration purposes. It's always good looking at the source. Boldly go and read this piece of History (with a capital H) by C.A.R. Hoare, "Quicksort".

[4] Taken from http://www.vogella.com/tutorials/JavaAlgorithmsQuicksort/article.html

[4] Will we consider uncontrolled state harmful the same way GOTO statement being considered harmful consolidated structured programming?

[5] This wiki has LOTS of architectural information about the amazing Glasgow Haskell Compiler, ghc. https://ghc.haskell.org/trac/ghc/wiki/Commentary

[6] A big question mark over time on functional programming languages has been the ability (or lack thereof) to effectively code User Interfaces. Don't despair though! There's this cool new thing called Functional Reactive Programming (FRP). Still performing babysteps, but there are already implementations out there. One that's gaining lots of momentum is ReactJS/Om/ClojureScript web app stack. Guess that might be a good follow-up post :)

[7] See http://zeroturnaround.com/rebellabs/java-8-explained-applying-lambdas-to-java-collections/