Transducers

One of the nice things that you’ll come across in Clojure is transducer. In this post I’ll go over what transducers are, how you can use them, how you can make one, and what transducible contexts are.

What are transducers?

In simple terms, transducers are composable transformation pipelines. A transducer does not care about where the input comes from or where the output will go; it simply cares about the transformation of the data that that flows through the pipeline. Let’s look at an example:

1
2
3
4
(let [xf (comp 
(map inc)
(filter even?))]
(sequence xf (range 10)))

Here xf (external function) is our transducer which will increment every number and then will keep only the even numbers. Calling sequence functions like map, filter, etc. with single arity returns a transducer which you can then compose. The transducer doesn’t know where it will be used - will it be used with a collection or with a channel? So, a transducer captures the essence of your transformation. sequence is responsible for providing the input to the transducer. This is the context in which the transducer will run.

Here’s how the same thing can be done using threading macro:

1
2
3
(->> (range 10)
(map inc)
(filter even?))

The difference here is that the 2-arity version of map and filter will create intermediate collections while the 1-artity versions won’t. Transducers are much more efficient than threading together sequence functions.

Chaining
Transducing

Source for images

Inside a transducer

Let’s look at the 1-arity version of map and see what makes a transducer.

1
2
3
4
5
6
7
8
9
10
11
(defn map
([f]
(fn [rf]
(fn
([] (rf))
([result] (rf result))
([result input]
(rf result (f input)))
([result input & inputs]
(rf result (apply f input inputs))))))
...)

When you call 1-arity version of map you get back a transducer which, as shown above, is a function. Functions like map, filter, etc. take a collection and return a collection. Transducers, on the otherhand, take one reducing function and return another. The function returned is expected to have three arities:

  • 0-arity (init): This kickstarts the transformation pipeline. The only thing you do here is call the reducing function `rf`.
  • 2-arity (step): This is where you'll perform the transformation. You get the result so far and the next input. In case of `map`, you call the reducung function `rf` by applying the function `f` to the input. How the value is going to be added to the result is the job of `rf`. If you don't want to add anything to the result, just return the result as-is. You may call `rf` once, multiple times, or not at all.
  • 1-arity (end): This is called when the transducer is terminating. Here you must call `rf` exactly once and call the 1-arity version. This results in the production of the final value.
  • So, the general form of a transducer is this:

    1
    2
    3
    4
    5
    (fn [rf]
    (fn
    ([] (rf))
    ([result] (rf result))
    ([result input] ... )))

    Using transducers

    You can use a transducer in a context. There’s four contexts which come out of the box — into, transduce, sequence, and educe.

    into

    The simplest way to use a transducer is to pass it to into. This will add your transformed elements to an already-existing collection after applying the transducer. In this example, we’re simply adding a range into a vector.

    1
    2
    3
    (let [xf (comp (map inc)
    (map dec))]
    (into [] xf (range 10)))

    Internally, into calls transduce.

    transduce

    transduce is similar to the standard reduce function but it also takes an additional xform as an argument.

    1
    2
    3
    (let [xf (comp (map inc)
    (map dec))]
    (transduce xf + (range 10)))

    sequence

    sequence lets you create a lazy sequence after applying a transducer. In contrast, into and transduce are eager.

    1
    2
    3
    (let [xf (comp (map inc)
    (map dec))]
    (sequence xf (range 10)))

    eduction

    eduction lets you capture applying a transducer to a collection. The value returned is an iterable application of the transducer on the collection items which can then be passed to, say, reduce.

    1
    2
    3
    4
    (let [xf (comp (map inc)
    (map dec))
    coll (eduction xf (range 10))]
    (reduce + 0 coll))

    Inside a transducible context

    As mentioned before, transducers run in transducible contexts. The ones that come as a part of clojure.core would suffice most real-world needs and you’ll rarely see yourself writing new ones. Let’s look at transduce.

    1
    2
    3
    4
    5
    6
    7
    8
    (defn transduce
    ([xform f coll] (transduce xform f (f) coll))
    ([xform f init coll]
    (let [f (xform f)
    ret (if (instance? clojure.lang.IReduceInit coll)
    (.reduce ^clojure.lang.IReduceInit coll f init)
    (clojure.core.protocols/coll-reduce coll f init))]
    (f ret))))

    transduce is just like reduce. The 3-arity version expects an initial value to be supplied by calling the 0-arity version of the supplied function. The 4-arity version is slightly more involved. IReduceInit is an interface implemented by collections to let them provide an initial value. It lets a collection reduce itself. If not, the call goes to coll-reduce which is a faster way to reduce a collection than using first/next recursion.

    Stateful transducers

    It’s possible for transducers to maintain reduction state.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    (defn multiply-xf
    []
    (fn [rf]
    (let [product (volatile! 1)]
    (fn
    ([] (rf))
    ([result] (rf result))
    ([result input]
    (let [new-product (vswap! product * input)]
    (rf result new-product)))))))

    Here’s a transducer which will multiply all the incoming numbers. We maintain state by using a Volatile. Whenever we get a new input we multiply it with the product and update the state of Volatile using vswap!. Let’s see this in action:

    1
    2
    (into [] (multiply-xf) [1 2 3])
    => [1 2 6]

    Early Termination

    The way the above transducer is written, it’ll process all the inputs even if one of the inputs is zero. We know that once we encounter a zero, we can safely end the reduction process and return a zero. reduced lets you return a reduced value and end the reduction. Let’s make a minor change to the above transducer and add in early termination.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    (defn multiply-xf
    []
    (fn [rf]
    (let [product (volatile! 1)]
    (fn
    ([] (rf))
    ([result] (rf result))
    ([result input]
    (let [new-product (vswap! product * input)]
    (if (zero? new-product)
    (reduced result)
    (rf result new-product))))))))

    In the 2-arity function, we check if the new-product is zero. If it is, we know we have a reduced value. We end the reduction by returning the result we have so far. Let’s see this in action:

    1
    2
    3
    4
    (into [] (multiply-xf) [1 2 3])
    => [1 2 6]
    (into [] (multiply-xf) [1 2 0 3])
    => [1 2]

    Conclusion

    Transducers can be a very useful tool in your Clojure toolkit that let you process large collections, channels, etc. effectively by letting you make composable transformation pipelines that process one element at a time. They require a little getting used-to but once you’re past the learning curve, performance galore!

    One Year as a Software Engineer - a Retrospective

    December 2017 marks my completing one year as a software engineer. This post is a restrospective where I list down the lessons I’ve learned, in no specific order of priority.

    Personal Life

    Learn to invest

    We’re having too good a time today. We ain’t thinking about tomorrow.
    — John Dillinger, Public Enemy

    It’s easy to get carried away because of the fat pay cheque your fancy developer job brings you at the end of the month. Think of all the bling you can buy. This hedonistic attitude, however, does not help you hedge against the volatility of the job market. In his book Out of our Minds: Learning to be Creative, Ken Robinson states that secure life-long employment in a single job is a thing of the past. This applies even more if you work in a startup environment where things change in the blink of an eye.

    Making proper investments and establishing a second flow of cash can help you get a grip on the situation when it gets rough. The simplest way to invest is to put your money in the stock market and let the dividends add to your monthly flow of cash. You do not have to be an active day trader managing your positions. You can very easily invest in mutual funds, index funds, etc. and they are not that difficult to begin with.

    One of the best books I’ve been recommended by a friend of mine is Trading and Exchanges: Market Microstructure for Practitioners by Larry Harris. This will give you a very good overview of the market and all that you need to become a confident investor.

    Be ready to interview

    Love your job but don’t love your company, because you may not know when your company stops loving you.
    — A. P. J. Abdul Kalam, 11th President of India

    The key takeaway of working in a startup environment is this: things change very rapidly and when the push comes to shove, you will be thrown overboard. Having seen this happen to people close to me, I’ve learned that you need to be ready to interview with other startups and/or companies as soon as you are fired or have resigned. This includes staying in touch with the fundamentals you’ve learned in your CS class like data structures and algorithms, and also knowing the technologies you’ve used in reasonable depth. Also, having a good network of developers goes a long way in easing your search for a new job. And don’t forget to sharpen the saw — do things that are not programming that make you a better programmer.

    Learn to negotiate

    Dude, it’s five minutes. Let’s un-suck your negotiation.
    — Patrick McKenzie

    Learning how to negotiate is one of the most important skills that is often overlooked. It is overlooked because it is perceived to be cheap to negotiate for salary, or just plain avoiding difficult conversation. Whatever the case, get over it and learn the skill. As Patrick McKenzie mentions in his brilliant blog post Salary Negotiation: Make More Money, Be More Valued, all it takes is five minutes to finalize your salary. These five minutes have a lasting impact for alteast an year to come.

    Read a lot

    Read, read, read.
    — William Faulkner

    I’ll admit that it is hard to take time out to read daily but keeping a dedicated slot of 30 minutes just for reading goes a long way. I make sure it’s a distraction-free slot with no dinging notifications and I try not to multitask. Another exercise that I do in conjunction with reading is trying to improve my retention of the material I’ve read by improving my recall memory. The best way to do this is to use Feynman technique where you eludicate what you’ve learned and pretend to teach it to a student.

    I prefer keeping one programming and one non-programming book as a part of my daily reading. In addition, there’s a list of blogs, and papers that I read from time to time. I’ll probably post a list as a separate post.

    Engineering

    Understand Peter principle

    The key to management is to get rid of the managers.
    — Ricardo Semler

    Laurence Peter came up with the concept that a person in a role keeps getting promoted based on their performance in their current role and not on the abilities that the role demands. It is quite possible to have a manager who doesn’t know how to do his job well i.e. he’s risen to his level of incompetence. This principle is important to understand as an engineer and is something that should make one reflect on one’s current skillset - do you possess the abilities to be in the role you are currently in or do you need to skill up? Again, this goes back to my point on sharpening the saw and doing non-programming things that make you a better developer.

    Tech debt is evil

    Simplicity is hard work. But, there’s a huge payoff. The person who has a genuinely simpler system - a system made out of genuinely simple parts, is going to be able to affect the greatest change with the least work. He’s going to kick your ass. He’s gonna spend more time simplifying things up front and in the long haul he’s gonna wipe the plate with you because he’ll have that ability to change things when you’re struggling to push elephants around.
    — Rich Hickey

    When Ward Cunningham came up with the tech debt metaphor, he was referring to writing code despite having a poor understanding of the requirements. As time passes, whatever little understanding there was of the requirement fades away and the code is taken for granted. The definition of tech debt has since then come to represent poorly written code that later nobody understands and it’s taken for granted - something Ward Cunningham disagrees with.

    The lethal combination is poorly written code for badly understood requirement(s) and you’ll come across this very often in startup environments. This comes with some pretty nasty ramifications like team in-fighting, and politics. In worst cases, it can bring the development of new features to a grinding halt.

    Avoiding tech debt has to be a top priority for any startup that wants to grow. A lot of it revolves around establishing some processes to convey requirements among teams and ensuring that the resulting system design is simple. Like the quote by Rich Hickey shows, it is hard work but it will pay off in the longer run.

    Centralize your Logs

    However, logging doesn’t seem to receive the same level of attention; consequently, developers find it hard to know the ‘what, when, and how’ of logging.
    — Colin Eberhardt

    Please stop asking developers to SSH into machines to read the logs. Centralize your logs by using ELK or if you want to avoid the hassle of setting up ELK, use a third party service like Fluentd or similar. A good centralized logging strategy will not only save you the pain of SSH-ing into multiple servers and grep-ing, it will also let you search through them easily. In addition, aggregating logs from various servers helps you identify patterns that may emerge by checking what’s happening on multiple servers in a specific time range.

    Unit Testing with Specs

    clojure.spec is a standard, expressive, powerful, and integrated system for specification and testing. It lets you define the shape of your data, and place contraints on it. Once the shape, and constraints are defined, clojure.spec can then generate sample data which you can use to test your functions. In this post I’ll walk you through how you can use clojure.spec in conjunction with other libraries to write unit tests.

    Motivation

    As developers, we are accustomed to writing example-based tests - we provide a known input, look at the resulting output, and assert that it matches our expectations. Although there is nothing wrong with this approach, there are a few drawbacks:

    1. It is expensive as it takes longer to complete.
    2. It is easier to miss out on the corner cases.
    3. It is more prone to pesticide paradox.[1]

    In contrast, clojure.spec allows you to do generative, property-based testing. Generative testing allows you to specify what kind of data you are looking for. This is done by using generators. A generator is a declarative description of possible inputs to a function.[2] Property-based testing allows you to specify how your program is supposed to behave, given an input. A property is a high-level specification of behavior that should hold for a range of inputs.[3]

    Setup

    Creating an App

    We’ll begin by creating an app using lein and defining the dependencies. So go ahead and execute the following to create your project:

    1
    lein new app clj-spec

    Adding Dependencies

    Next we’ll add a few dependencies. cd into clj-spec and open project.clj. Add the following to your :dependencies

    1
    2
    3
    4
    5
    [org.clojure/clojure "1.8.0"]
    [clojure-future-spec "1.9.0-beta4"]
    [org.clojure/test.check "0.9.0"]
    [circleci/bond "0.3.0"]
    [cloverage "1.0.9"]

    clojure.spec comes as a part of Clojure 1.9 which, as of writing, isn’t out yet. If you’re on Clojure 1.8, as I am, you can use clojure-future-spec which will give you the same APIs. circleci/bond is a stubbing library which we’ll use to stub IO, network calls, database calls, etc. cloverage is the tool we’ll use to see the coverage of our tests.

    Using clojure.spec

    Simple Specs

    Fire up a REPL by executing lein repl and require the required namespaces ;)

    1
    2
    3
    4
    5
    6
    clj-spec.core=> (require '[clojure.spec.alpha :as s])
    nil
    clj-spec.core=> (require '[clojure.spec.gen.alpha :as gen])
    nil
    clj-spec.core=> (require '[clojure.future :refer :all])
    nil

    spec will let us define the shape of our data, and constraints on it. gen will let us generate the sample data.

    Let’s write a simple spec which we can use to generate integers.

    1
    2
    clj-spec.core=> (s/def ::n int?)
    :clj-spec.core/n

    We’ve defined a spec ::n which will constrain the sample data to only be integers. Notice the use of double colons to create a namespace-qualified symbol; this is a requirement of the spec library. Now let’s generate some sample data.

    1
    2
    clj-spec.core=> (gen/generate (s/gen ::n))
    -29454

    s/gen takes a spec as an input and returns a generator which will produce conforming data. gen/generate exercises this generator to return a single sample value. You can produce multiple values by using gen/sample:

    1
    2
    clj-spec.core=> (gen/sample (s/gen ::n) 5)
    (0 0 1 -1 0)

    We could have done the same thing more succinctly by using the in-built functions as follows:

    1
    2
    clj-spec.core=> (gen/generate (gen/vector (gen/int) 5))
    [25 -29 -13 26 -8]

    Spec-ing Maps

    Let’s say we have a map which represents a person and looks like this:

    1
    2
    {:name "John Doe"
    :age 32}

    Let’s spec this.

    1
    2
    3
    (s/def ::name string?)
    (s/def ::age int?)
    (s/def ::person (s/keys :req-un [::name ::age]))

    We’ve defined ::name to be a string, and ::age to be an integer (positive or negative). You can make your specs as strict or as lenient as you choose. Finally, we define ::person to be a map which requires the keys ::name and ::age, albiet without namespace-qualification. Let’s see this in action:

    1
    2
    clj-spec.core=> (gen/generate (s/gen ::person))
    {:name "KXYXcbk", :age 107706}

    By now you must have a fair idea of how you can spec your data and have sample values generated that match those specs. Next we’ll look at how we can do property-based testing with specs.

    Using test.check

    test.check allows us to do property-based testing. Property-based tests make statements about the output of your code based on the input, and these statements are verified for many different possible inputs.[4]

    A Simple Function

    1
    2
    3
    4
    5
    (defn even-or-odd
    [coll]
    (map #(cond
    (even? %) :even
    :else :odd) coll))

    We’ll begin by testing the simple function even-or-odd. We know that for all even numbers we should get :even and for all odd numbers we should get :odd. Let’s express this as a property of the function. Begin by require-ing a couple more namespaces.

    1
    2
    3
    4
    clj-spec.core=> (require '[clojure.test.check :as tc])
    nil
    clj-spec.core=> (require '[clojure.test.check.properties :as prop])
    nil

    Now for the actual property.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    (def property
    (prop/for-all [v (gen/vector (gen/choose 0 1))]
    (let [result (even-or-odd v)
    zeroes-and-ones (group-by zero? v)
    zeroes (get zeroes-and-ones true)
    ones (get zeroes-and-ones false)
    evens-and-odds (group-by #(= :even %) result)
    evens (get zeroes-and-ones true)
    odds (get zeroes-and-ones false)]
    (and (= (count zeroes) (count evens))
    (= (count ones) (count odds))))))

    We have a generator which will create a vector of 0s and 1s only. We pass that vector as an input to our function. Additionally, we know that the number of 0s should equal the number of :evens returned and that the number of 1s should equal the number of :odds returned.

    Next, let’s test this property.

    1
    2
    clj-spec.core=> (tc/quick-check 100 property)
    {:result true, :num-tests 100, :seed 1510754879429}

    Awesome! We ran the test a 100 times and passed. The added benefit is that the input generated will be different every time you run the test.

    Using bond

    bond is a library which will let you stub side-effecting functions like database calls. We’ll require the namespce and modify our code to save even numbers to database.

    First, the namespace.

    1
    2
    clj-spec.core=> (require '[bond.james :as bond])
    nil

    Next, the code.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    (defn save
    [n]
    ;; let's assume there's a database call here
    nil)

    (defn even-or-odd
    [coll]
    (map #(cond
    (even? %) (do
    (save %) ;; save to database
    :even)
    :else :odd) coll))

    Now let’s update the property and stub save.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    (def property
    (prop/for-all [v (gen/vector (gen/choose 0 1))]
    (bond/with-stub [save]
    (let [result (even-or-odd v)
    zeroes-and-ones (group-by zero? v)
    zeroes (get zeroes-and-ones true)
    ones (get zeroes-and-ones false)
    evens-and-odds (group-by #(= :even %) result)
    evens (get zeroes-and-ones true)
    odds (get zeroes-and-ones false)]
    (and (= (count zeroes) (count evens))
    (= (count ones) (count odds))
    (= (count zeroes) (-> save bond/calls count)))))))

    Notice how we’re using bond/with-stub and telling it to stub save function which calls the database. Later, we assert that the number of times that the databse was called is equal to the number of evens in the vector. Let’s verify the property.

    1
    2
    clj-spec.core=> (tc/quick-check 10000 property)
    {:result true, :num-tests 10000, :seed 1510834022725}

    Voilà! It works!

    The last part of this post is about finding out test coverage using cloverage. For that, we’ll be moving our code to core.clj and writing test under the test directory.

    Using cloverage

    To see cloverage in action, we’ll need to add our functions to core.clj. Here’s what it’ll look like:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    (ns clj-spec.core
    (:gen-class))

    (defn save
    [n]
    ;; let's assume there's a database call here
    nil)

    (defn even-or-odd
    [coll]
    (map #(cond
    (even? %) (do
    (save %) ;; save to database
    :even)
    :else :odd) coll))

    (defn -main
    "I don't do a whole lot ... yet."
    [& args]
    (println "Hello, World!"))

    Update your clj-spec.core-test to the following:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    (ns clj-spec.core-test
    (:require [bond.james :as bond]
    [clojure.test.check.properties :as prop]
    [clojure.spec.gen.alpha :as gen]
    [clojure.test.check.clojure-test :refer :all]
    [clj-spec.core :refer :all]))

    (defspec even-odd-test
    100
    (prop/for-all [v (gen/vector (gen/choose 0 1))]
    (bond/with-stub [save]
    (let [result (even-or-odd v)
    zeroes-and-ones (group-by zero? v)
    zeroes (get zeroes-and-ones true)
    ones (get zeroes-and-ones false)
    evens-and-odds (group-by #(= :even %) result)
    evens (get zeroes-and-ones true)
    odds (get zeroes-and-ones false)]
    (and (= (count zeroes) (count evens))
    (= (count ones) (count odds))
    (= (count zeroes) (-> save bond/calls count)))))))

    Here we are using the defspec macro to run the same property-based test a 100 times only this time we’ll run the test via command-line using lein. Execute the following command to run the test and see the coverage.

    1
    lein run -m cloverage.coverage -t 'clj-spec.core-test' -n 'clj-spec.core'

    This will make use of cloverage to run the tests. -t denotes our test namespace and -n denotes the namespace for whom the tests are written. You’ll get an output like this:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    ...
    Produced output in /Users/fasih/Personal/clj-spec/target/coverage .
    HTML: file:///Users/fasih/Personal/clj-spec/target/coverage/index.html

    |---------------+---------+---------|
    | Namespace | % Forms | % Lines |
    |---------------+---------+---------|
    | clj-spec.core | 82.61 | 88.89 |
    |---------------+---------+---------|
    | ALL FILES | 82.61 | 88.89 |
    |---------------+---------+---------|

    Perfect!! Now we know how much coverage we have. The HTML file has a nice graphical representation of which lines we’ve covered with our tests.

    Conclusion

    This brings us to the end of the post on using clojure.spec to write generative, property-based tests in both the REPL and source files. Generative testing automates the task of having to come up with examples for your tests. Where to go from here? Each of these libraries is pretty powerful in itself and will provide you with the necessary tools to write powerful, robust, and expressive tests that require minimal effort. So, head over to the offical docs to learn more.

    Monad

    So far we’ve looked at monoids and functors. The next algebraic data structure we’ll cover is a monad. If you’ve wondered what a monad is but never really understood it, this is the post for you. I am sure that you’ve used it without realizing it. So let’s get to it.

    Definition

    A monad has more structure than a functor. This means that you can call map on it and that it obeys all the functor laws. In addition, a monad has a flatMap function which you can use to chain monads together. In essence, monads represent units of computation that you can chain together and the result of this chaining is also a monad.

    Let’s look at a few examples.

    Example

    1
    2
    3
    4
    5
    6
    7
    8
    val first = List(1, 2)
    val next = List(8, 9)

    for {
    i <- first
    j <- next
    }
    yield(i * j)

    The above code[1] uses a for comprehension to muliply elements of the list together. Under the hood, this gets translated to:

    1
    2
    3
    4
    5
    first flatMap {
    f => next map {
    n => f * n
    }
    }

    The compiler is making use of the List monad to chain operations together. Let’s break this down.

    1
    2
    3
    next map {
    n => f * n
    }

    This part of the code will return a List since that is what calling map on a List does. Since we have two elements in first list, the result of mapping will generate two lists of two elements each. This isn’t what we want. We want a single list that combines the results together.

    1
    2
    3
    first flatMap {
    ...
    }

    The flattening of results is what flatMap does - it takes the two lists and squishes them into one.

    Monad Laws

    For something to be a monad, it has to obey the monadic laws. There’s three monad laws:

    1. Left identity
    2. Right identity
    3. Associativity

    Left Identity

    This law means that if we take a value, put it into a monad, and then flatMap it with a function f, that’s the same as simply applying the function f to the original value. Let’s see this in code:

    1
    2
    3
    4
    5
    6
    scala> def f(x: Int): List[Int] = { List(x * 2) }
    f: (x: Int)List[Int]

    // left identity
    List(2).flatMap(f) == f(2)
    res5: Boolean = true

    Right Identity

    This law means that if we take a monad, flatMap it, and within that flatMap we try to create a monad out of it, then that’s the same as original monad. Let’s see this in code:

    1
    2
    3
    // right identity
    scala> List(1, 2, 3).flatMap({ x => List(x) }) == List(1, 2, 3)
    res6: Boolean = true

    Let’s walkthrough this. The function to flatMap gets the elements of the original list, List(1, 2, 3), one-by-one. The result is List(List(1), List(2), List(3)). This is then flattened to create List(1, 2, 3), which is the original list.

    Associativity

    This law states that if we apply a chain of functions to our monad, that’s the same as the composition of all the functions. Let’s see this in code:

    1
    2
    3
    4
    5
    6
    7
    8
    scala> def f(x: Int): List[Int] = { List(x + 1) }
    f: (x: Int)List[Int]

    scala> def g(x: Int): List[Int] = { List(x + 1) }
    g: (x: Int)List[Int]

    scala> List(1, 2, 3).flatMap(f).flatMap(g) == List(1, 2, 3).flatMap(x => f(x).flatMap(g))
    res8: Boolean = true

    Conclusion

    This brings us to the end of the post on monads and their laws. List isn’t the only monad in your arsenal. Options and Futures are monads, too. I suggest going ahead and constructing examples for monadic laws for them.

    Functor

    The next algebraic structure we’ll look at is a functor. In the introduction, we saw that a category consists of objects and arrows. As an example, we morphed a set of strings to another set which contained the reverse of those strings. In other words, we morphed an object to another object. What if we could morph an entire category to another category while preserving the structure? Well, that’s what a functor does.

    Formal Definition

    Let and be categories. A functor is a map taking each -object to a -object and each -arrow to a arrow , such that all -objects and composable -arrows and

    Example

    Say we have a set . From this set we create another set which contains finite lists of elements drawn from . The functor we want maps from set to set. Since we know that a category contains objects and arrows, becomes the object part. The arrow part takes a function to a function that given a list maps over elements of

    How does this translate to code? This actually translates fairly easily to code. Containers like lists, trees, etc. that you can call map on are functors.

    Let’s write some code. We’ll begin by creating a set .

    1
    2
    @ val S = Set(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
    S: Set[Int] = Set(0, 5, 1, 6, 9, 2, 7, 3, 8, 4)

    Next, we’ll create .

    1
    2
    @ def f(x: Int) = { x * 2 }
    defined function f

    Next, let’s create .

    1
    2
    @ val L = List(1, 2, 3)
    L: List[Int] = List(1, 2, 3)

    Next, we’ll create the function maplist.

    1
    2
    @ def maplist(f: Int => Int)(L: List[Int]) = L map f
    defined function maplist

    Finally, let’s see this in action:

    1
    2
    @ maplist(f)(L)
    res4: List[Int] = List(2, 4, 6)

    As we can see, maplist applied the function f on all elements of L. We did this by using the map method of a List instance.

    Functor Laws

    All functors are expected to obey the two laws that we saw in the formal definition. Let’s see how they translate to code.

    First Law

    The first law states that if we map the identity function over a functor, we’ll get back a functor which is the same as the original functor.

    1
    2
    @ List(1, 2, 3) map identity
    res5: List[Int] = List(1, 2, 3)

    As we can see, applying identity to the list gives back the same list.

    Second Law

    The second law states that if we map a functor using a composition of two functions, , it’s the same as first mapping the functor using the first function and then mapping the resulting functor using the second function, .

    We’ll begin by creating two functions f and g.

    1
    2
    3
    4
    @ def f(x: Int): Int = x + 1
    defined function f
    @ def g(x: Int): Int = x + 1
    defined function g

    Now let’s put the theory into practice.

    1
    2
    3
    4
    5
    6
    7
    // composition of two functions
    @ List(1, 2, 3) map { x => g(f(x)) }
    res8: List[Int] = List(3, 4, 5)

    // applying map twice
    @ List(1, 2, 3) map f map g
    res9: List[Int] = List(3, 4, 5)

    As we see, the two lists are the same.

    More Functor Examples

    Example 1

    Let’s consider a category where objects are integers. Arrows between objects indicates a “divided by” relationship. For example,

    This indicates that 10 can be divided by 5. To reiterate, objects are numbers and arrows represent a “divided by” relationship.

    Now let’s create a functor from the category to itself. This functor will multiply each object by 13. So, . Is this a valid functor? We have but is it true that ?

    The answer is yes. Our category has arrows that indicate a “divided by” relationship. So, will be an integer. Similarly, will also be an integer and maintain a “divided by” relationship. This shows that arrows do not always have to be functions. They can also indicate a relationship between their domain and codomain.

    Conclusion

    In this post we saw functors which map objects from one category to another. Containers like trees, lists, etc. are functors. All functors are required to obey the two functor laws.

    Monoid

    The first algebraic structure we’ll look at is a monoid. We’ve covered monoid previously in Scalaz Under the Hoods. In this post we’ll look at it again from a more abstract standpoint.

    Formal Definition

    A monoid is an underlying equipped with

    1. a binary operation from pairs of elements of into such that for all
    2. an element such that

    We’ve already translated this definition to code. Just to recap, here’s what we wrote previously:

    1
    2
    3
    4
    trait Monoid[A] {
    def mempty: A
    def mappend(a: A, b: A): A
    }

    mappend is the binary operation , and mempty is the element .

    More concretely, we wrote this:

    1
    2
    3
    4
    object IntMonoid extends Monoid[Int] {
    def mempty: Int = 0
    def mappend(a: Int, b: Int) = a + b
    }

    So, translates to the addition operation , and translates to . That way, where is any integer. That was fairly easy to understand.

    Monoid Homomorphism

    A monoid homomorphism from to is a function such that

    1. and
    2. .

    The composition of two monoid homomorphisms is the same as their composition as functions on sets.

    I know this is abstract so let’s have a look at a concrete example. Let’s write some code. We’ll be reusing the monoids that we previously wrote.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    @ trait Monoid[A] {
    def mempty: A
    def mappend(a: A, b: A): A
    }
    defined trait Monoid

    @ object Monoid {
    implicit object StringMonoid extends Monoid[String] {
    def mempty: String = ""
    def mappend(a: String, b: String): String = a + b
    }
    implicit object IntMonoid extends Monoid[Int] {
    def mempty: Int = 0
    def mappend(a: Int, b: Int) = a + b
    }
    }
    defined object Monoid

    Next, we’ll write a homomorphism

    1
    2
    @ def f(s: String): Int = s length
    defined function f

    Let’s see this in action. We’ll begin by testing the first rule.

    1
    2
    3
    4
    5
    6
    7
    // bring the monoids into scope
    @ import Monoid._
    import Monoid._

    // rule 1
    @ f(StringMonoid.mempty) == IntMonoid.mempty
    res6: Boolean = true

    So we see that the first rule is satisfied. Applying on the zero element of StringMonoid gives us the zero element of IntMonoid. Onto the second rule.

    1
    2
    3
    4
    5
    6
    7
    8
    @ val x = "apple"
    x: String = "apple"
    @ val y = "banana"
    y: String = "banana"

    // rule 2
    @ f( StringMonoid.mappend(x, y) ) == IntMonoid.mappend( f(x), f(y) )
    res9: Boolean = true

    And we see that the second rule is also satisfied. Therefore, is a homomorphism such that . To recap, a monoid homomorphism is a map between monoids that preserves the monoid operation and maps the identity element of the first monoid to that of the second monoid[1]. The monoid operation is still and the empty string is mapped to , which is the zero/identity element of IntMonoid.

    Category with One Object

    Suppose there’s a category with just one object in it. The identity arrow would point to itself. And the composition of this arrow with itself is , which satisfies the associativity law. A monoid may be represented as a category with a single object. The elements of M are represented as arrows from this object to itself, the identity element is represented as the identity arrow, and the operation is represented as composition of arrows.

    Any category with a single object is a monoid.

    Introduction to Category Theory

    We’ve covered a lot of topics in Scalaz but before moving forward, I’d like to cover functors, monoids, monads, etc. These form the basis of functional programming and are predicated in category theory. This post is intended to be an introduction to category theory.

    What is Category Theory?

    Category theory is a mathematical theory involving the study of categories. A category consists of a group of objects and transformations between them. Think of a category as a simple collection.[1]

    Formally, a category consists of the following:

    1. a collection of objects
    2. a collection of arrows (called morphisms)
    3. operations assigning each arrow an object , its domain, and an object , its codomain. We write this as
    4. a composition operator assigning each pair of arrows and , with a composite arrow , satisfying the associative law:
      for any arrows , , and (with , , , and not necessarily distinct),
    5. for each object , an identity arrow satisfying the identity law:
      for any arrow ,
      and

    The formal definition above is taken verbatim from Basic Category Theory for Computer Scientists.

    Simple Category

    Let’s relate the diagram above[2] to the formal definition that we have. This simple category has three objects , , and . There’s three identity arrows , , and . These identity arrows satisfy the identity law. For example, . Intuitively, if you were “standing” on and you first “walked along” the arrow and then “walked along” the arrow to reach , it’s as good as just “walking along” .

    A More Concrete Example

    Let’s consider a category whose objects are sets. We’ll translate this into code and hold it to the laws stated above.

    1. is a collection of sets i.e. each object is a set.
    2. an arrow is a morphism from set to set
    3. for each function , we have , and
    4. the composition of a function with is a function from to mapping each element to
    5. for each set , the identity function is a function with domain and codomain as .

    Code

    Let’s begin by creating our first object of category - a set .

    1
    2
    @ val A = Set("apples", "oranges")
    A: Set[String] = Set("apples", "oranges")

    Next, let’s define a function which morphs to .

    1
    2
    @ def f(a: Set[String]): Set[String] = a map { _.reverse }
    defined function f

    Next, let’s morph to by applying the function

    1
    2
    @ val B = f(A)
    B: Set[String] = Set("selppa", "segnaro")

    The domain of is the set where as codomain is the set of reversed strings, .

    Next, let’s define a function

    1
    2
    @ def g(b: Set[String]): Set[Int] = b map { _.length }
    defined function g

    Now let’s compose and

    1
    2
    @ val C = g(f(A))
    C: Set[Int] = Set(6, 7)

    And finally, let’s create an identity function

    1
    2
    @ def idA(a: Set[String]): Set[String] = a map identity
    defined function idA

    Let’s see this in action

    1
    2
    @ idA(A)
    res7: Set[String] = Set("apples", "oranges")

    This is how we translate a category to code. In the coming posts we’ll cover more category theory.

    Scalaz Memo

    In this post we’ll look at Memo which is a Scalaz goodie to add memoization to your program. We’ll recap what memoization is, write a recursive function to calculate Fibonacci numbers, and then add memoization to it using Memo.

    What is Memoization?

    In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.[1]

    Fibonacci (without Memo)

    1
    2
    3
    4
    5
    6
    7
    @ def fibo(n: Int): Int = {
    n match {
    case 0 | 1 => n
    case _: Int => fibo(n - 1) + fibo(n - 2)
    }
    }
    defined function fibo

    So here’s the non-memoized recursive fibo which calculates the nth Fibonacci number. The issue here is that it’ll recalculate the Fibonacci numbers a lot of times and therefore cannot be used for large values of n.

    Fibonacci (with Memo)

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    @ import scalaz._
    import scalaz._
    @ import Scalaz._
    import Scalaz._
    @ val fibo: Int => Int = Memo.immutableHashMapMemo {
    case 0 => 0
    case 1 => 1
    case n: Int => fibo(n - 1) + fibo(n - 2)
    }
    fibo: Int => Int = scalaz.Memo$$$Lambda$2233/1055106802@48649202

    Here’s the memoized version using Scalaz Memo. We are using an immutable, hash map-backed memo. The immutableHashMapMemo method takes a partial function defining how we construct the memo. In our case, if the value of n is not 0 or 1, we try looking up the value in the memo again. We recurse until we reach 0 or 1. Once that happens and our recursion returns, the resultant value is cached in an immutable.HashMap.

    Conclusion

    Memoization is a great way to optimize your programs. In this post we used an immutable hash map memo. There are other types of memos applying different strategies to cache their results. The Memo companion object is the place to look for an appropriate memo that suits your needs.

    Scalaz Lens

    In this post we’ll look at Lens which is a pure functional way of getting and setting data. Before we get into lenses, we’ll look at why we need lenses by looking at a simple example.

    Motivating Example

    1
    2
    3
    4
    @ case class Address(city: String, zip: Int)
    defined class Address
    @ case class Person(name: String, address: Address)
    defined class Person

    Say we have a class Person which has the name of the person and their address where the address is represented by Address. What we’d like to do is to change the address of the person. Let’s go ahead and create a Person.

    1
    2
    @ val p1 = Person("John Doe", Address("Doe Ville", 7))
    p1: Person = Person("John Doe", Address("Doe Ville", 7))

    Changing the Address while maintaining immutability is fairly easy.

    1
    2
    3
    4
    5
    @ val p2 = p1.copy(
    p1.name,
    Address("Foo City", 9)
    )
    p2: Person = Person("John Doe", Address("Foo City", 9))

    The problem arises when things begin to nest. Let’s create an Order class representing an order placed by a Person.

    1
    2
    3
    4
    @ case class Order(person: Person, items: List[String])
    defined class Order
    @ val o1 = Order(p1, List("shoes", "socks", "toothpaste"))
    o1: Order = Order(Person("John Doe", Address("Doe Ville", 7)), List("shoes", "socks", "toothpaste"))

    Now, the person would like to change the address to which the items are delivered.

    1
    2
    3
    4
    5
    6
    7
    8
    @ val o2 = o1.copy(
    o1.person.copy(
    o1.person.name,
    Address("Foo City", 9)
    ),
    o1.items
    )
    o2: Order = Order(Person("John Doe", Address("Foo City", 9)), List("shoes", "socks", "toothpaste"))

    So, the deeper we nest, the uglier it gets. Lenses provide a succinct, functional way to do this.

    Lens

    Lenses are a way of focusing on a specific part of a deep data structure. Think of them as fancy getters and setters for deep data structures. I’ll begin by demonstrating how we can create and use a lens and then explain the lens laws.

    Creating a Lens

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    @ import scalaz._
    import scalaz._
    @ import Scalaz._
    import Scalaz._

    // creating a lens
    @ val addressInPerson = Lens.lensu[Person, Address] (
    (p, address) => p.copy(address = address),
    _.address
    )
    addressInPerson: Lens[Person, Address] = scalaz.LensFunctions$$anon$5@4e32e2a2

    What we’ve done is create a lens that accepts a Person object and focuses on its Address field. lensu expects two functions - a setter and a getter. In the first function, the setter, we’re making a copy of the Person object passed to the lens and updating its address field with the new one. In the second function, the getter, we’re simply returning the address field. Lets see this in action by getting and setting values.

    Getting a Field

    1
    2
    @ addressInPerson.get(p1)
    res12: Address = Address("Doe Ville", 7)

    Once you create a lens, you get a get method which returns the address field in the Person object.

    Setting a Field

    1
    2
    @ val p3 = addressInPerson.set(p1, Address("Bar Town", 10))
    p3: Person = Person("John Doe", Address("Bar Town", 10))

    Similarly, there’s a set method which lets you set fields to specific values.

    Modifying a Field

    1
    2
    @ val p4 = addressInPerson.mod({ a => a.copy(city = s"${a.city}, NY") }, p1)
    p4: Person = Person("John Doe", Address("Doe Ville, NY", 7))

    mod lets you modify the field. It expects a function that maps Address to Address. In the example here, we’re appending “NY” to the name of the city.

    Lenses are Composable

    The true power of lenses is in composing them. You can compose two lenses together to look deeper into a data structure. For example, we’ll create a lens which lets us access the address field of the person in an Order. We’ll do this by composing two lenses.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    // creating the lens
    @ val personInOrder = Lens.lensu[cmd5.Order, Person] (
    (o, person) => o.copy(person = person),
    _.person
    )
    personInOrder: Lens[cmd5.Order, Person] = scalaz.LensFunctions$$anon$5@33d58abf

    // testing the lens
    @ personInOrder.get(o1)
    res16: Person = Person("John Doe", Address("Doe Ville", 7))

    Ignore the cmd. prefix to Order. That is just an Ammonite REPL quirk to avoid confusing with the Order trait from Scalaz. Next, we’ll combine the two lenses we have.

    1
    2
    @ val addressInOrder = personInOrder >=> addressInPerson
    addressInOrder: LensFamily[cmd5.Order, cmd5.Order, Address, Address] = scalaz.LensFamilyFunctions$$anon$4@73679b13

    >=> is the symbolic alias for andThen. The way you read what we’ve done is: get the person from the order AND THEN get the address from that person.

    This allows you to truly keep your code DRY. Now no matter within which data structure Person and Address are, you can reuse that lens to get and set those fields. It’s just a matter of creating another lens or few lenses to access the Person from a deep data structure.

    Similarly there’s also compose which has a symbolic alias <=< and works in the other direction. I personally find it easier to use andThen / >=>.

    Lens Laws

    Get-Put: If you get a value from a data structure and put it back in, the data structure stays unchanged.
    Put-Get: If you put a value into a data structure and get it back out, you get the most updated value back.
    Put-Put: If you put a value into a data structure and then you put another value in the data structure, it’s as if you only put the second value in.

    Lenses that obey all the three laws are called “very well-behaved lenses”. You should always ensure that your lenses obey these rules.

    Here’s how Scalaz represents these lens laws:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    trait LensLaw {
    def identity[A >: A2 <: A1, B >: B1 <: B2](a: A)(implicit A: Equal[A]): Boolean = {
    val c = run(a)
    A.equal(c.put(c.pos: B), a)
    }
    def retention[A >: A2 <: A1, B >: B1 <: B2](a: A, b: B)(implicit B: Equal[B]): Boolean =
    B.equal(run(run(a).put(b): A).pos, b)
    def doubleSet[A >: A2 <: A1, B >: B1 <: B2](a: A, b1: B, b2: B)(implicit A: Equal[A]): Boolean = {
    val r = run(a)
    A.equal(run(r.put(b1): A) put b2, r put b2)
    }
    }

    identity is get-put law, retention is put-get law, and doubleSet is put-put law.

    Lenses and State Monads

    Formally, a state monad looks like the following:

    1
    S => (S, A)

    Given a state S, it computes the resulting state S by making mutations to the existing state S and produces a resulting A. This is a bit abstract so let’s look at a scenario. Say we have a list of people whose addresses we’d like to update to Fancytown with zip code 3. Let’s do that using lenses.

    Creating a State

    1
    2
    3
    4
    @ val state = for {
    p <- addressInPerson %= { add => Address("Fancytown", 3) }
    } yield p
    state: IndexedStateT[Id, Person, Person, Address] = scalaz.IndexedStateT$$anon$11@25b51460

    Here we are creating a state using a for comprehension. The %= operator accepts a function which maps an Address to an Address. What we get back is a state monad. Now that we have a state monad, let’s use it to update the address.

    Updating the State

    Next, let’s make person p1 move to Fancytown.

    1
    2
    @ state(p1)
    res33: (Person, Address) = (Person("John Doe", Address("Fancytown", 3)), Address("Fancytown", 3))

    Here we are updating person p1‘s address. What we get back is a new state S, p1 but with Fancytown address, and the result A, the new Address. state(p1) is the same as state.apply(p1). In short, we’re applying that state to a Person object.

    Conclusion

    This brings us to the end the post on lenses. Lenses are a powerful way to get, set, and modify fields in your data structures. The best part about them is that they are reusable and can be composed to form lenses that focus deeper into the data structure.

    Scalaz NonEmptyList

    In this post we’ll look at NonEmptyList which is a singly-linked list guaranteed to be non-empty. Let’s begin by seeing how using a NEL (short for NonEmptyList) in appropriate places is better than using regular List.

    Contrasting List with NEL

    1
    2
    3
    4
    @ def email(body: String, recipients: List[String]): Unit = {
    // send email
    }
    defined function email

    Let’s say we’re writing an email function that sends email to a list of people. A naive assumption would be to always expect recipients to have at least one element. However, we may end up with an empty list and that’ll lead to painstaking debugging trying to find out why emails weren’t sent out.

    So, we’ll change our email function to this:

    1
    2
    3
    4
    5
    6
    7
    8
    @ def email(body: String, recipients: List[String]): Either[String, String] = {
    if(recipients.length == 0)
    Left("Empty list of recipients")
    else
    // send email
    Right(s"Email sent to ${recipients.length} recipient(s)")
    }
    defined function email

    Now although we’re handling the case of empty list, we’ve introduced unnecessary noise with an if-else. Here’s how we can get a guarantee on being non-empty and avoid the conditional:

    1
    2
    3
    4
    5
    6
    7
    8
    @ import scalaz._
    import scalaz._
    @ import Scalaz._
    import Scalaz._
    @ def email(body: String, recipients: NonEmptyList[String]): Unit = {
    // send email
    }
    defined function email

    Voila! Using a NEL gives us two advantages:

    • A guarantee on never receiving an empty list
    • Expressing our intent in the function signature

    With that said, let’s look at how we can create and use NELs.

    Creating a NEL

    1
    2
    @ NonEmptyList(1, 2, 3, 4)
    res2: NonEmptyList[Int] = NonEmpty[1,2,3,4]

    In the example above, we’re using the apply method to create a NEL. It looks like the following:

    1
    def apply[A](h: A, t: A*)

    The first argument is the head whereas the second argument is varargs which becomes the tail. Internally, the apply method calls the nels method. We can use that directly, too.

    1
    2
    @ NonEmptyList nels(1, 2, 3, 4)
    res3: NonEmptyList[Int] = NonEmpty[1,2,3,4]

    You can also create a NEL from a single value by using wrapNel.

    1
    2
    @ 1.wrapNel
    res4: NonEmptyList[Int] = NonEmpty[1]

    You can even create a NEL in the cons list way like you’d do with a List

    1
    2
    3
    4
    5
    6
    7
    // good ol' List
    @ 1 :: List(2, 3, 4)
    res5: List[Int] = List(1, 2, 3, 4)

    // NEL
    @ 1 <:: NonEmptyList(2, 3, 4)
    res6: NonEmptyList[Int] = NonEmpty[1,2,3,4]

    You can convert a List to a NEL by calling toNel. This will yield an Option of NEL since the List may be empty

    1
    2
    @ List(1, 2, 3) toNel
    res7: Option[NonEmptyList[Int]] = Some(NonEmpty[1,2,3])

    Appending NELs

    1
    2
    @ NonEmptyList(1, 2, 3) append NonEmptyList(4, 5, 6)
    res8: NonEmptyList[Int] = NonEmpty[1,2,3,4,5,6]

    You can also concatenate two NELs by using append.

    Head and Tail

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    @ NonEmptyList(1, 2, 3).head
    res13: Int = 1
    @ NonEmptyList(1, 2, 3).tail
    res14: IList[Int] = ICons(2, ICons(3, []))

    @ List().head
    java.util.NoSuchElementException: head of empty list
    scala.collection.immutable.Nil$.head(List.scala:428)
    scala.collection.immutable.Nil$.head(List.scala:425)
    ammonite.$sess.cmd15$.<init>(cmd15.sc:1)
    ammonite.$sess.cmd15$.<clinit>(cmd15.sc)

    head and tail return and head and tail of the NEL, respectively. Note that head is guaranteed to work without exceptions because we’ll always have one element.

    Iterating

    1
    2
    3
    4
    5
    6
    7
    @ NonEmptyList(1, 2, 3) foreach println
    1
    2
    3

    @ NonEmptyList(1, 2, 3) map { _ + 1 }
    res18: NonEmptyList[Int] = NonEmpty[2,3,4]

    Because NEL is also a list, you can perform familiar list operations like foreach, map, etc.

    Conclusion

    This post was intended to give you an introduction to Scalaz NonEmptyList and how it differs from the usual List. The guarantee on NEL being non-empty means that you can call head without having to worry about exceptions. This makes your code less cluttered and its intent more expressive.

    Scalaz Either

    In this post we’ll look at Scalaz Either. This is Scalaz’s version of the standard Scala Either. Before we look at Scalaz Either, we’ll look at Scala Either.

    Scala Either

    Let’s begin by going over the docs:[1]

    Represents a value of one of two possible types (a disjoint union.) Instances of Either are either an instance of Left or Right.

    Convention dictates that Left is used for failure and Right is used for success.

    With the definition out of the way, let’s look at some code. The example is modeled around the code in the official Scala Either docs.

    Creating an Either

    1
    2
    3
    4
    5
    6
    7
    8
    @ def parseInt(str: String): Either[String, Int] = {
    try {
    Right(str toInt)
    } catch {
    case e: Exception => Left(e getMessage)
    }
    }
    defined function parseInt

    Next, let’s create a case each of success and failure.

    1
    2
    3
    4
    @ val success = parseInt("2")
    success: Either[String, Int] = Right(2)
    @ val failure = parseInt("apple")
    failure: Either[String, Int] = Left("For input string: \"apple\"")

    Using a for Comprehension

    Scala Either is not a monad and so you cannot use it in a for comprehensions.

    1
    2
    3
    4
    @ for {
    n <- success
    } yield n
    res3: Either[String, Int] = Right(2)


    NOTE:

    Previously, Scala Either was not a monad so it couldn’t be used in for comprehensions. Now, it is a monad and can be used in for comprehensions.


    Scalaz Either

    A valid question to ask is why would one use Scalaz Either when Scala Either is a monad. The answer is that Scalaz Either is a lot more convenient and powerful compared to Scala Either. Let’s begin by refactoring parseInt to return a Scalaz Either.

    Creating an Either

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    @ import scalaz._
    import scalaz._
    @ import Scalaz._
    import Scalaz._
    @ def parseInt(str: String): String \/ Int = {
    import scala.util.{Try, Success, Failure}
    Try { str toInt } match {
    case Success(n) => n.right[String]
    case Failure(e) => e.getMessage.left[Int]
    }
    }
    defined function parseInt

    Next, let’s create a case each of success and failure.

    1
    2
    3
    4
    @ val success = parseInt("2")
    success: String \/ Int = \/-(2)
    @ val failure = parseInt("apple")
    failure: String \/ Int = -\/("For input string: \"apple\"")

    The return type of our function is indicated by String \/ Int. This means we may return a String on the left in case of failure and an Int on the right in case of success. We create right or left projections by calling right or left, respectively, and mentioning the type of the value that will be on the other side. For example, we call right[String] because the left side is a String. The right projection is indicated by \/- and left projection is indicated by -\/.

    Using a for Comprehension

    1
    2
    3
    4
    @ for {
    n <- success
    } yield n
    res12: String \/ Int = \/-(2)

    Because Scalaz Either is also a monad, it can be used in a for comprehension.

    Checking Left or Right

    1
    2
    3
    4
    @ success isRight
    res13: Boolean = true
    @ failure isLeft
    res14: Boolean = true

    Akin to Scala Either, Scalaz Either also lets you check for left or right by calling isLeft or isRight, respectively.

    Ternary Operator

    1
    2
    @ success ? "YES" | "NO"
    res15: String = "YES"

    Scalaz Either provides you with a getOrElse which you can use to as a ternary operator using its symbolic representation |.

    Folding an Either

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    @ success fold(
    left => -1,
    right => right + 1
    )
    res16: Int = 3
    @ failure fold(
    left => -1,
    right => right + 1
    )
    res17: Int = -1

    Both Scala and Scalaz Either provide you with a fold method which run the first function if we have a left, or the second function if we have a right.

    Converting to Validation

    The single biggest difference between Scala Either and Scalaz Either is that Scalaz Either can be converted to other types like Validation, etc. For example, converting an Either to a Validation allows you to accumulate errors. As the code comments state:

    A \/ B is also isomorphic to Validation[A, B]. The subtle but important difference is that Applicative instances for Validation accumulates errors (“lefts”)

    1
    2
    @ 1.right validation
    res65: Validation[Nothing, Int] = Success(1)

    We create a Validation by calling validation method on the Either instance. Depending on a left or right, we get either a Success or Failure.

    Conclusion

    Scalaz Either and Scala Either are pretty similar in the latest version of Scala (2.12, as of writing). Which one you decide to use depends upon your personal preference. My preference is to use Scalaz Either throughout my code if I am using other Scalaz features to maintain consistency.

    Scalaz Validation

    Introduction

    In this post we’ll look at Scalaz Validation which you can use to validate the data in your system. Data validation is a part and parcel of software development; you have to check the data that comes into your system or it may lead to unexpected behavior and / or cause your system to fail. Scalaz provides you with Validation to validate your data. Validation is an applicative functor. An applicative functor has more structure than a functor but less than a monad.[1]

    Motivating Example

    So let’s say we have a Version class representing the major and minor version of our software like so:[2]

    1
    case class Version(major: Int, minor: Int)

    Then, a negative value in either of major or minor would be invalid. We could ensure that we never get a negative value in either the major or minor by using require like so:[3]

    1
    2
    3
    4
    case class Version(major: Int, minor: Int) {
    require(major >= 0, "major must be >= 0: %s".format(major))
    require(minor >= 0, "minor must be >= 0: %s".format(minor))
    }

    The problem here is that we’ll have to handle exceptions and we don’t want side-effects. Let’s use Validation to do it in a more functional way.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    @ import scalaz._
    import scalaz._
    @ import Scalaz._
    import Scalaz._
    // paste mode in Ammonite REPL is { .. CODE HERE .. } and not :paste
    @ {
    case class Version(major: Int, minor: Int)
    object Version {
    def createNew(major: Int, minor: Int): Validation[IllegalArgumentException, Version] = {
    val isValidMajor = (major >= 0) ? true | false
    val isValidMinor = (minor >= 0) ? true | false

    (isValidMajor, isValidMinor) match {
    case (false, true) => new IllegalArgumentException("major < 0").failure
    case (true, false) => new IllegalArgumentException("minor < 0").failure
    case (false, false) => new IllegalArgumentException("major and minor < 0").failure
    case (true, true) => new Version(major, minor).success
    }
    }
    }
    }
    defined class Version
    defined object Version
    @ Version.createNew(-1, -1)
    res5: Validation[IllegalArgumentException, Version] = Failure(
    java.lang.IllegalArgumentException: major and minor < 0
    )
    @ Version.createNew(1, 1)
    res6: Validation[IllegalArgumentException, Version] = Success(Version(1, 1))

    What we’ve done here is add a createNew to the companion object of Version that returns a Validation. Validating the input can result in either a Success or Failure. We create a Success by calling success on the value and a Failure by calling failure on the value. Once we have a Validation, there are numerous ways in which we can deal with it.

    Providing Default Values

    1
    2
    3
    4
    5
    6
    @ val invalid = Version.createNew(-1, -1)
    invalid: Validation[IllegalArgumentException, Version] = Failure(
    java.lang.IllegalArgumentException: major and minor < 0
    )
    @ val default = invalid | new Version(1, 0)
    default: Version = Version(1, 0)

    There’s a convenient | operator (getOrElse) that lets you provide a default value if the result of the validation is a Failure. Here we are assigning the value 1 to major and 0 to minor.

    Folding a Validation

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    @ val valid = Version.createNew(1, 0)
    valid: Validation[IllegalArgumentException, Version] = Success(Version(1, 0))
    @ valid fold(
    f => s"Validation failed because ${f getMessage}".println,
    v => s"Version is $v".println
    )
    "Version is Version(1,0)"
    @ invalid fold(
    f => s"Validation failed because ${f getMessage}".println,
    v => s"Version is $v".println
    )
    "Validation failed because major and minor < 0"

    Akin to a disjunction, it’s possible to fold a Validation. Failure will be the first argument and Success will be the second. If you want to fold just on the Success part, you can use foldRight.


    NOTE:

    Both fold and foldRight should return values. Here I’ve just printed out the values to the console which is a side-effect. I’ve done this to keep the examples simple.


    Composing Validations

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    @ {
    case class Version(major: Int, minor: Int)
    object Version {
    def isValidMajor(major: Int) =
    (major >= 0) ?
    major.success[IllegalArgumentException] |
    new IllegalArgumentException("major < 0").failure

    def isValidMinor(minor: Int) =
    (minor >= 0) ?
    minor.success[IllegalArgumentException] |
    new IllegalArgumentException("minor < 0").failure

    def createNew(major: Int, minor: Int) = {
    (isValidMajor(major).toValidationNel |@| isValidMinor(minor).toValidationNel) {
    Version(_, _)
    }
    }
    }
    }
    defined class Version
    defined object Version

    When we started building the Version example, we used pattern matching to check if the major and minor versions were correct. However, there’s a more succinct way to collect all the errors. In the example above, we’ve turned isValidMajor and isValidMinor into methods that return a Validation instead of simply a Boolean.

    The magic is in createNew. Here we convert the Validation into a NonEmptyList. toValidationNel wraps the Failure in a NonEmptyList but keeps the Success as-is. The |@| is an ApplicativeBuilder which lets us combine the failures together. If we get all successes, we construct a Version out of it. Let’s see this in action:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    @ val bothInvalid = Version.createNew(-1, -1)
    bothInvalid: Validation[NonEmptyList[IllegalArgumentException], Version] = Failure(
    NonEmpty[java.lang.IllegalArgumentException: major < 0,java.lang.IllegalArgumentException: minor < 0]
    )

    @ val majorInvalid = Version.createNew(-1, 0)
    majorInvalid: Validation[NonEmptyList[IllegalArgumentException], Version] = Failure(
    NonEmpty[java.lang.IllegalArgumentException: major < 0]
    )

    @ val minorInvalid = Version.createNew(1, -1)
    minorInvalid: Validation[NonEmptyList[IllegalArgumentException], Version] = Failure(
    NonEmpty[java.lang.IllegalArgumentException: minor < 0]
    )

    @ val bothValid = Version.createNew(1, 1)
    bothValid: Validation[NonEmptyList[IllegalArgumentException], Version] = Success(Version(1, 1))

    So, in case of both the major and minor being invalid, we get a non-empty list with both the errors in it. This is very convenient and also very extensible. If you need to add an extra check, you can write a new isValidXXX method and make it return a Validation. You can then use the ApplicativeBuilder to your advantage. Using simply booleans would need checking a large number of possible cases.

    Mapping Success and Failure

    1
    2
    3
    4
    5
    @ val errMsgs = bothInvalid bimap(
    f => f map { _.getMessage },
    identity,
    )
    errMsgs: Validation[NonEmptyList[String], Version] = Failure(NonEmpty[major < 0,minor < 0])

    It’s possible to map over Success and Failure and apply any transformations you want. For example, we represent errors with IllegalArgumentExceptions and we may be coding a rest API and we’d like to send back the strings representing the errors. In the example above, I’ve used bimap to map Success and Failure. For every failure, I am extracting the string using getMessage and leaving the success as-is by using identity.

    1
    2
    3
    4
    5
    @ val theVersion = bothValid bimap(
    f => f map { _.getMessage },
    identity,
    )
    theVersion: Validation[NonEmptyList[String], Version] = Success(Version(1, 1))

    Similar to bimap is map which just maps the Success part of the Validation.

    1
    2
    @ val theValue = bothValid map { v => Version(v.major + 1, v.minor * 2) }
    theValue: Validation[NonEmptyList[IllegalArgumentException], Version] = Success(Version(2, 2))

    Remember that bimap and map return a Validation. They don’t extract values out of it. You’ll have to use fold, etc. to get the values out.

    Running a Side-Effect on Success

    1
    2
    @ bothValid foreach { v => s"Begin build process for version $v".println }
    "Begin build process for version Version(1,1)"

    foreach lets you run a side-effecting function if the result of the Validation is a Success. For example, using the Version, you could begin a build process or similar.

    Checking if it is a Success or Failure

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    @ val isSuccess = bothValid isSuccess
    isSuccess: Boolean = true

    @ val isFailure = bothInvalid isFailure
    isFailure: Boolean = true

    @ val isSuccessSatisfying = bothValid exists { _.major < 5 }
    isSuccessSatisfying: Boolean = true

    @ val isSuccessSatisfyingOrFailure = bothValid forall { _.major === 5 }
    isSuccessSatisfyingOrFailure: Boolean = false

    There are numerous ways in which we can check if the Validation resulted in a Success or a Failure. The simplest way is to use one of isSuccess or isFailure. Similarly, exists will return true if the validation is a success value satisfying the given predicate and forall will return true if the validation is a failure value or the success value satisfies the given predicate.

    Converting to Other Datatypes

    1
    2
    3
    4
    5
    6
    7
    8
    @ val list = bothValid toList
    list: List[Version] = List(Version(1, 1))

    @ val stream = bothValid toStream
    stream: Stream[Version] = Stream(Version(1, 1))

    @ val either = bothValid toEither
    either: Either[NonEmptyList[IllegalArgumentException], Version] = Right(Version(1, 1))

    Scalaz also provides convenience methods to convert the Validation to different datatypes like List, Stream, Either, etc.

    Conclusion

    There’s a lot more methods than can be covered in a post. Hopefully this post gave you an idea about what’s possible with Validations.