How does filterM work in Haskell?
filterM is an interesting function. In one sense it’s very similar to filter which we all know and love. It’s much more powerful though as we shall soon see. Let’s start by having a look at the definition of filter:
filter :: (a -> Bool) -> [a] -> [a]
and then at filterM:
filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]
A side-by-side comparison:
filter :: (a -> Bool) -> [a] -> [a]
filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]
By comparing the type signatures of filter and filterM we can see that filterM is just filter where the conditional expression yields a Bool within a context m and where the matching results are aggregated in the m context.
The implementation of filterM in GCH base is as follows:
filterM :: (Applicative m) => (a -> m Bool) -> [a] -> m [a]
= foldr (\x -> liftA2 (\flg -> if flg then (x:) else id) (p x)) (pure []) filterM p
From the above definition it looks like whenever the monadic filter function (a -> m Bool)
returns a m True
, the value in the supplied list is prepended to an accumulator, and if it doesn’t match the existing accumulator is left unchanged.
Although this sound very simple, I found the usage of filterM to be somewhat difficult to understand - at least at first. Let’s start investigating its usage by looking at some example instances for m.
Maybe
Given a list of numbers:
numbers :: [Int]
= [1,2,3,4,5] numbers
and an isEven function:
isEven :: Int -> Bool
= n `mod` 2 == 0 isEven n
we can use filterM to filter the list of numbers that are even and return the results in a Maybe:
Just . isEven) numbers filterM (
which results in:
Just [2,4]
That seems pretty easy. Using filter on numbers:
filter isEven numbers
we get:
2,4] [
The only difference between the results being that the filterM variant has the results in the Maybe Monad.
What happens when filterM takes a function that can return Nothing in some instances?
Given the following function:
isDivisibleByThree :: Int -> Bool
= n `mod` 3 == 0 isDivisibleByThree n
Let’s filter our list of numbers so that they contain even numbers, but if we encounter a number that is divisible by three, we want to bail on the result:
-> if isDivisibleByThree n then Nothing else Just (isEven n)) numbers filterM (\n
this results in:
Nothing
Now, this might be a little surprising. What happened to all the matches until we encountered a three, such as two? Recall that the filterM implementation:
= foldr (\x -> liftA2 (\flg -> if flg then (x:) else id) (p x)) (pure []) filterM p
uses liftA2:
liftA2 :: (a -> b -> c) -> f a -> f b -> f c
to run a binary function over the Applicative instances. With two Maybe instances, the result is always Nothing, if one of them is Nothing as you can’t run the function without both inputs:
+) (Just 1) (Just 2) = Just 3
liftA2 (+) (Just 1) Nothing = Nothing
liftA2 (+) Nothing (Just 2) = Nothing
liftA2 (+) Nothing Nothing = Nothing liftA2 (
What this demonstrates is that if we ever receive a Nothing value while using filterM all results up until that point are discarded. This highlights one key difference between filter and filterM; in addition to filtering on the Bool result, filterM also combines the results using its Applicative properties.
Let’s run the filterM code once again, but this time, we’ll leave out any multiples of three:
-> if isDivisibleByThree n then Nothing else Just (isEven n)) [1,2,4,5,7,8] filterM (\n
and this time the answer is:
Just [2,4,8]
IO
Let’s try filtering only even numbers using the IO Monad:
pure . isEven) numbers
ioFilterM (= [2, 4] -- IO [Int]
That works as expected. Now let’s introduce a failure in IO Monad when a number is divisible by three:
-> if isDivisibleByThree n then ioError (userError "boom!") else pure (isEven n)) numbers
filterM (\n = *** Exception: user error (boom!) -- IO [Int]
The above discards any results collected once it reaches an IO error. This functionality is very similar to how the Maybe Monad filtered when it received a Nothing. This is quite useful when filtering only valid results and failing on the first failure.
And if we remove any numbers divisible by three:
-> if isDivisibleByThree n then ioError (userError "boom!") else pure (isEven n)) [1,2,4,5,7,8]
filterM (\n = [2,4,8] -- IO [Int]
we get back the expected results.
List
With List, things get more interesting. Consider the following:
-> [True, False]) numbers filterM (\n
What do you reckon the answer would be? Probably not a powerset:
1,2,3,4,5],[1,2,3,4],[1,2,3,5],[1,2,3],[1,2,4,5],[1,2,4],[1,2,5],[1,2],
[[1,3,4,5],[1,3,4],[1,3,5],[1,3],[1,4,5],[1,4],[1,5],[1],[2,3,4,5],[2,3,4]
[2,3,5],[2,3],[2,4,5],[2,4],[2,5],[2],[3,4,5],[3,4],[3,5],[3],[4,5],[4],
,[5],[]] [
Remember that filterM is defined as:
= foldr (\x -> liftA2 (\flg -> if flg then (x:) else id) (p x)) (pure []) filterM p
How does this work with List? If we use liftA2 with List:
+) [1,2,3] [4,5,6]
liftA2 (= [5,6,7,6,7,8,7,8,9]
we see that we get a Cartesian product of values (all combinations). List is a non-deterministic Monad and as such it produces results of every possible combination.
Let’s start by expanding out the point-free implementation of filterM:
=
filterM p foldr (\x acc -> liftA2 (\flg1 accx -> if flg1 then (x:accx) else accx) (p x) acc) (pure [])
accx is the accumulator value passed to liftA2. The values passed will be the Cartesian product of [True, False] and the accumulator of list acc, which is initially [[]].
There are two main expansions happening in the implementation of filterM:
- liftA2 is creating a Cartesian product of the flags [True, False] and the accumulator acc and combining them with supplied function, which prepends the current value of the list x to the accumulator accx if the flag is True or returns the existing accumulator accx if it is False.
- All the combinations returned from listA2 are then returned into foldr as the new value of the accumulator acc.
Because filterM is implemented using foldr the accumulated values are used from last to first.
Given the following legend:
x -- element in the list
acc -- value of accumulator
accx -- value of accumulator at current combination
flg1 -- value of flag at current combination
result -- value of accx after applying flg1
newacc -- value of acc returned to foldr
Let’s start from the end of the list at 5 and follow it up to 1.
For the value of 5:
x = 5
acc = [[]]
flags = [True, False]
--------------------
accx []
flg1 True
result = 5:[] => [5]
--------------------
accx []
flg1 False
result => []
--------------------
newacc = [[5], []]
For the value of 4:
x = 4
acc = [[5], []]
flags = [True, False]
--------------------
accx [5]
flg1 True
result = 4:[5] => [4,5]
--------------------
accx []
flg1 True
result = 4:[] => [4]
--------------------
accx [5]
flg1 False
result => [5]
--------------------
accx []
flg1 False
result => []
--------------------
newacc = [[4,5],[4],[5], []]
For the value of 3:
x = 3
acc = [[4,5],[4],[5], []]
flags = [True, False]
--------------------
accx [4,5]
flg1 True
result = 3:[4,5] => [3,4,5]
--------------------
accx [4]
flg1 True
result = 3:[4] => [3,4]
--------------------
accx [5]
flg1 True
result = 3:[5] => [3,5]
--------------------
accx []
flg1 True
result = 3:[] => [3]
--------------------
accx [4,5]
flg1 False
result => [4,5]
--------------------
accx [4]
flg1 False
result => [4]
--------------------
accx [5]
flg1 False
result => [5]
--------------------
accx []
flg1 False
result => []
--------------------
newacc = [[3,4,5],[3,4],[3,5],[3],[4,5],[4],[5],[]]
For the value of 2:
x = 2
acc = [[3,4,5],[3,4],[3,5],[3],[4,5],[4],[5],[]]
flags = [True, False]
--------------------
accx [3,4,5]
flg1 True
result = 2:[3,4,5] => [2,3,4,5]
--------------------
accx [3,4]
flg1 True
result = 2:[3,4] => [2,3,4]
--------------------
accx [3,5]
flg1 True
result = 2:[3,5] => [2,3,5]
--------------------
accx [3]
flg1 True
result = 2:[3] => [2,3]
--------------------
accx [4,5]
flg1 True
result = 2:[4,5] => [2,4,5]
--------------------
accx [4]
flg1 True
result = 2:[4] => [2,4]
--------------------
accx [5]
flg1 True
result = 2:[5] => [2,5]
--------------------
accx []
flg1 True
result = 2:[] => [2]
--------------------
accx [3,4,5]
flg1 False
result => [3,4,5]
--------------------
accx [3,4]
flg1 False
result => [3,4]
--------------------
accx [3,5]
flg1 False
result => [3,5]
--------------------
accx [3]
flg1 False
result => [3]
--------------------
accx [4,5]
flg1 False
result => [4,5]
--------------------
accx [4]
flg1 False
result => [4]
--------------------
accx [5]
flg1 False
result => [5]
--------------------
accx []
flg1 False
result => []
--------------------
newacc = [[2,3,4,5],[2,3,4],[2,3,5],[2,3],[2,4,5],[2,4],[2,5],[2],[3,4,5],[3,4],[3,5],[3],[4,5],[4],[5],[]]
For the value of 1:
x = 1
acc = [[2,3,4,5],[2,3,4],[2,3,5],[2,3],[2,4,5],[2,4],[2,5],[2],[3,4,5],[3,4],[3,5],[3],[4,5],[4],[5],[]]
flags = [True, False]
--------------------
accx [2,3,4,5]
flg1 True
result = 1:[2,3,4,5] => [1,2,3,4,5]
--------------------
accx [2,3,4]
flg1 True
result = 1:[2,3,4] => [1,2,3,4]
--------------------
accx [2,3,5]
flg1 True
result = 1:[2,3,5] => [1,2,3,5]
--------------------
accx [2,3]
flg1 True
result = 1:[2,3] => [1,2,3]
--------------------
accx [2,4,5]
flg1 True
result = 1:[2,4,5] => [1,2,4,5]
--------------------
accx [2,4]
flg1 True
result = 1:[2,4] => [1,2,4]
--------------------
accx [2,5]
flg1 True
result = 1:[2,5] => [1,2,5]
--------------------
accx [2]
flg1 True
result = 1:[2] => [1,2]
--------------------
accx [3,4,5]
flg1 True
result = 1:[3,4,5] => [1,3,4,5]
--------------------
accx [3,4]
flg1 True
result = 1:[3,4] => [1,3,4]
--------------------
accx [3,5]
flg1 True
result = 1:[3,5] => [1,3,5]
--------------------
accx [3]
flg1 True
result = 1:[3] => [1,3]
--------------------
accx [4,5]
flg1 True
result = 1:[4,5] => [1,4,5]
--------------------
accx [4]
flg1 True
result = 1:[4] => [1,4]
--------------------
accx [5]
flg1 True
result = 1:[5] => [1,5]
--------------------
accx []
flg1 True
result = 1:[] => [1]
-------------------- *
accx [2,3,4,5]
flg1 False
result => [2,3,4,5]
--------------------
accx [2,3,4]
flg1 False
result => [2,3,4]
--------------------
accx [2,3,5]
flg1 False
result => [2,3,5]
--------------------
accx [2,3]
flg1 False
result => [2,3]
--------------------
accx [2,4,5]
flg1 False
result => [2,4,5]
--------------------
accx [2,4]
flg1 False
result => [2,4]
--------------------
accx [2,5]
flg1 False
result => [2,5]
--------------------
accx [2]
flg1 False
result => [2]
--------------------
accx [3,4,5]
flg1 False
result => [3,4,5]
--------------------
accx [3,4]
flg1 False
result => [3,4]
--------------------
accx [3,5]
flg1 False
result => [3,5]
--------------------
accx [3]
flg1 False
result => [3]
--------------------
accx [4,5]
flg1 False
result => [4,5]
--------------------
accx [4]
flg1 False
result => [4]
--------------------
accx [5]
flg1 False
result => [5]
--------------------
accx []
flg1 False
result => []
--------------------
newacc = [[1,2,3,4,5],[1,2,3,4],[1,2,3,5],[1,2,3],[1,2,4,5],[1,2,4],[1,2,5],[1,2],[1,3,4,5],[1,3,4],[1,3,5],[1,3],[1,4,5],[1,4],[1,5],[1],[2,3,4,5],[2,3,4],[2,3,5],[2,3],[2,4,5],[2,4],[2,5],[2],[3,4,5],[3,4],[3,5],[3],[4,5],[4],[5],[]]
That was a bit harder than necessary!
Either
Using filterM with Either is pretty much the same as a Maybe:
let e1 = filterM (\x -> if x == 11 then Left "You gave me eleven" else Right (isEven x))
-- e1 :: :: Integral a => [a] -> Either [Char] [a]
1 .. 10]
e1 [= Right [2,4,6,8,10] -- only even numbers
1 .. 11]
e1 [= Left "You gave me eleven" -- drops all results on a Left
State
Now let’s use a Monad that has two type holes which are both used together. The State Monad allows us to return a value and thread through some state we are interested in at the same time. Let’s use our isEven method to filter in all the even inputs and use a list to record all the values inspected along the way:
let x1 = filterM (\x -> state (\s -> (isEven(x), s ++ [x]))) [1 .. 10]
-- x1 :: (Integral a, Monad m) => StateT [a] m [a]
-- get value
evalState x1 [] = [2,4,6,8,10] -- only even numbers
-- get state
execState x1 [] = [1,2,3,4,5,6,7,8,9,10] -- the state - all inspected values
The interesting thing to note is that given x1’s type:
x1 :: (Integral a, Monad m) => StateT [a] m [a]
The m in filterM:
filterM :: Applicative m => (a -> m Bool) -> [a] -> m [a]
is:
StateT [a] m
which is why we can return a Bool in the value position and have it filter the inputs for us.
Hopefully that was somewhat easier to understand. You can find alternate explanations to this problem here and here.