November 27, 2016

Property-based Tests for Coin Changer

While trying to better understand property-based testing, I was looking with a colleague for an example small enough to demonstrate the technique. The problem was that most small examples were too trivial, so that the property-based tests degenerated into simple reimplementations of the code under test. We eventually found a good example in the Coin Changer kata.

A coin changer should, for a given amount and a set of coin denominations, return the smallest number of coins that represents the given amount. For example, an amount of 42 cents in Euro denominations of #{200 100 50 20 10 5 2 1} should return '(20 20 2).

It was straightforward to derive two properties of a correct coin changer from the description of the kata. First, the coins returned should sum up to the given amount. We can express this as a test.check property in Clojure:

(defn amounts-up-to [n]
  (gen/one-of [gen/nat                   ; bounded by size parameter
               (gen/resize n gen/nat)])) ; bounded by n

(defspec sum-of-coins-equals-amount
  (prop/for-all [amount (amounts-up-to 1000)]
    (let [coins (make-change amount)]
      (= amount (apply + coins)))))

The property itself simply sums up the returned coins and compares them to the initial amount. test.check checks this for 100 randomly generated amounts per run. I am defining and using a custom generator amounts-up-to, because the generator for natural numbers is bounded by the number of the current test run. I need the property checked for amounts larger than 100 though. amounts-up-to generates a natural number that is half of the time bounded by the current run number and for the other half bounded by the specified maximum amount.

By the way, this first property falls into the "Hard to Solve, Easy to Verify" category of this really helpful post about property-based testing categories by Scott Wlaschin.

The second property of a correct coin changer is that the number of coins returned is the smallest possible:

(defn minimal?
  "For a given collection of coins (sorted by descending value), returns
  whether the number of coins is minimal for their amount."
  [coins]
  (every? (fn [coin-value]
            (let [smaller-coins (drop-while (partial <= coin-value) coins)]
              (> coin-value (apply + smaller-coins))))
          '(200 100 50 20 10 5 2 1)))

(defspec number-of-coins-is-minimal
  (prop/for-all [amount (amounts-up-to 1000)]
    (minimal? (sort > (make-change amount)))))

The predicate minimal? works by checking for each coin value whether the sum amount of all the smaller coins is still smaller than that coin value. If that were not the case, some of those smaller coins can be replaced by a single coin, yielding a smaller result.

This holds true for Euro cents, but does not generalize to arbitrary denominations. Coin Changer is notoriously tricky for more irregular cases like '(25 20 1), where my simple implementation of make-change would return the wrong result of '(25 25 25 1 1 1 1 1) for an amount of 80.

Why is that? Both minimal? and the greedy algorithm I implemented for make-change only work correctly for a subset of the Coin Changer problem space. They depend on the fact that always making the best choice locally (e.g. always using the largest coin) leads to the optimal global solution. But for some denominations, this greedy choice leads to a non-global optimum. And then we need a more sophisticated dynamic programming algorithm to calculate change correctly.

But how can we decide whether the solution is minimal for the general case? I have no idea. It's a decision problem which can be trivially solved by falling back to the known solution of the respective optimization problem, make-change:

(defn minimal? [coins]
  (= (count coins) (count (make-change (apply + coins)))))

But that would not be a useful test for make-change anymore. Once again, this is merely reimplementing the code under test in order to test it.

I learned two things about property-based testing here. Firstly, property-based testing does not replace example-based tests or REPL-based experimentation, two methods I heavily rely on when driving my code through tests. Secondly, property-based tests can be very helpful, even when they do not fully cover the problem space. I realized this while implementing the dynamic programming algorithm for make-change. While I could not rely on minimal? anymore, the sum-of-coins-equals-amount property still was helpful in catching mistakes.

Implementations

This is the greedy algorithm, I used initially to implement the two properties:

(defn make-change
  "Returns change for the specified amount, using the minimal number of coins."
  [amount]
  (loop [amount amount
         coins []
         current (first coin-values)
         values (rest coin-values)]
    (if (> amount 0)
      (if (< amount current)
        (recur amount coins (first values) (rest values))
        (recur (- amount current) (conj coins current) current values))
      coins)))

... and here is the dynamic programming algorithm, which I heavily based on the solution by Mark Engelberg:

(defn add-coin [change coin]
  (when change
    (conj change coin)))

(defn count-coins [change]
  (if change
    (count change)
    Integer/MAX_VALUE))

(defn make-change
  ([amount] (make-change coin-values amount))
  ([denominations amount]
   (cond
     (neg? amount) nil
     (zero? amount) '()
     (empty? denominations) nil
     :else
     (let [option1 (add-coin
                    (make-change denominations (- amount (first denominations)))
                    (first denominations))
           option2 (make-change (rest denominations) amount)]
       (min-key count-coins option1 option2)))))

(def make-change (memoize make-change))
Tags: Testing, Clojure, Feedback