Random numbers are frequently used as an example of the State monad. Generating random numbers functionally without a monad or some other way to secure the state is a terrible idea: you open yourself up to the worst thing you can do with random numbers (reusing them, so they're not random!), while simultaneously making them slower and more difficult to use. As far as FP antipatterns go, this is as close to a disaster as it gets. State helps you avoid the danger, but the result is still harder to use in all but the simplest toy examples.
If literally everything else in your program is referentially transparent, and you want the simplicity of never having to consider that something might not be, it may be worth putting up with the hassles and possible dangers of State. But if requirements change, you may find yourself burdened by an awkward and mostly unhelpful mechanism. My recommendation is to choose an alternative: either a better, less clumsy state-hiding-and-utilizing mechanism, or, more simply, let a mutable class do the right thing for you.
The justification for this attitude is presented below.
TODO: link to What Is a Monad (after writing it)?
The State monad is conceptually very simple: it takes some representation of a state, and from it produces an output value and a new state. That sounds remarkably like a typical pseudorandom number generator. We'll use the cats implementation of State, but first let's examine a bare-bones version of State (which is monadic in what it produces, not in the state that it carries):
class State[S, A](private val advance: S => (S, A)) {
  def pure[B](b: B) = new State[S, B](s => (s, b))
  def flatMap[B](f: A => State[S, B]): State[S, B] = new State(s => {
    val (ss, a) = advance(s)
    f(a).advance(ss)
  })
}
object State {
  def apply[S, A](advance: S => (S, A)) = new State(advance)
}
This isn't super-useful alone, because we can't observe anything that happens or provide any initial state, but structurally it is a monad. To make it useful, we need two more things: a way to start off with an initial state, and a way to eventually do something with the result of the computation.
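For illustration, here's a minimal sketch of what those two additions might look like on the bare-bones version (the RunnableState name is hypothetical; this is not the cats API):
// The same bare-bones idea, plus a `run` that supplies an initial state
// and exposes the final state and result.
class RunnableState[S, A](private val advance: S => (S, A)) {
  def flatMap[B](f: A => RunnableState[S, B]): RunnableState[S, B] =
    new RunnableState(s => {
      val (ss, a) = advance(s)
      f(a).advance(ss)
    })
  // The two missing pieces: provide an initial state, and observe the outcome
  def run(initial: S): (S, A) = advance(initial)
}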
So we'll switch over to the implementation given in the cats documentation. We can simply write:
import cats.data.State
type RngState[A] = State[Long, A]
val nextLong: RngState[Long] = State(seed =>
(seed * 6364136223846793005L + 1442695040888963407L, seed)
)
(Those numbers are suggested by Knuth for a 64-bit linear congruential generator.)
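To see it in action, a quick check (the value names are just for illustration):
// Run once with seed 17: the result pairs the advanced seed with the output,
// which for this generator is the old seed (so firstValue is 17 here).
val (advancedSeed, firstValue) = nextLong.run(17L).value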
Let's compare to the stateful version:
class Rng(private[this] var seed: Long) {
  def nextLong = {
    val ans = seed
    seed = seed * 6364136223846793005L + 1442695040888963407L
    ans
  }
}
Since we got State for free, the FP version is easier, assuming we're willing to leave our state as a bare Long. It's important to understand the API of Rng: whenever you want a new random number, call nextLong; it will take care of it. If you want to reuse a number, it's your responsibility to store it. Not too hard to understand, but if you don't understand it you can't reason correctly.
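Concretely (variable names are just for illustration):
val r = new Rng(17)
val a = r.nextLong   // draws a fresh number (here 17, the initial seed) and advances the hidden state
val b = r.nextLong   // draws another, different number
val aAgain = a       // to reuse a number, store it yourself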
Let's suppose we want to generate the sum of three random numbers starting with a seed of 17. This should be easy enough:
def sum3mut(r: Rng) = r.nextLong + r.nextLong + r.nextLong
sum3mut(new Rng(17))
val sum3fp = for { a <- nextLong; b <- nextLong; c <- nextLong } yield (a + b + c)
sum3fp.runA(17).value
Both seem pretty straightforward. The syntax is a bit different (direct addition vs. marshalling the values inside for and then yielding the result), but with a little familiarity either way is easy enough. Let's try running it on 1,000 initial seeds. (I'm using my fast-and-only-moderately-accurate benchmarking library Thyme.)
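For concreteness, here's roughly the shape of what's being timed; this is a sketch, not the actual Thyme harness:
// Sum over 1,000 initial seeds: mutable version vs. State version (sketch only)
def mutRun(): Long = {
  var i = 0; var acc = 0L
  while (i < 1000) { acc += sum3mut(new Rng(i)); i += 1 }
  acc
}
def fpRun(): Long = {
  var i = 0; var acc = 0L
  while (i < 1000) { acc += sum3fp.runA(i.toLong).value; i += 1 }
  acc
}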
Benchmark comparison (in 2.509 s)
Significantly different (p ~= 0)
Time ratio: 513.69305 95% CI 494.09265 - 533.29345 (n=45)
First 682.6 ns 95% CI 664.1 ns - 701.0 ns
Second 350.6 us 95% CI 341.2 us - 360.1 us
If you think that you're reading it wrong because it can't possibly be 500x slower, no, you're reading it right. That's a 500x performance penalty. If you need a couple random numbers, maybe you don't care. If you're doing, say, Monte Carlo estimation of error intervals in data, this is pretty awful--the difference between happily running on one core on your laptop and requiring a giant Spark cluster.
This enormous speed penalty isn't inherent to FP, but it is a fact of life on the JVM for now, given the complexity of the optimization required to turn the above into something that performs well.
Sometimes the monadic trappings around a simple function merely get in the way and can safely be dropped. Not so with random numbers!
We might try to do away with the whole State machinery and, perhaps for somewhat better performance, simply write a plain old function (pof):
val nextLong = (l: Long) => (l * 6364136223846793005L + 1442695040888963407L, l)
This is an unmitigated disaster. Let's try to add three numbers:
val sum3pof_wrong = (l: Long) => {
  val (s, a) = nextLong(l)
  val (ss, b) = nextLong(s)
  val (sss, c) = nextLong(s)
  (sss, a + b + c)
}
This is terrible because we have exposed ourselves to accidentally reusing seeds, as in this buggy example (the third call should have been nextLong(ss)). Even worse, since we're using random numbers where the number stream and the seed have the same type, without extra machinery you can even get the order of the results mixed up:
val sum3pof_verywrong = (l: Long) => {
  val (a, s) = nextLong(l)
  val (b, ss) = nextLong(s)
  val (c, sss) = nextLong(ss)
  (a + b + c, sss)
}
Now we've mixed up our seed and our computation, and we generated and added the same number three times, all because the local context did not make it apparent which of the two results was which. (You can avoid this by wrapping the seed to make it unambiguously different from the random number's return type. But there's not much point, because the exposed states are still error-prone in the first way; we just shouldn't do this at all.)
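For concreteness, here is a minimal sketch of the seed-wrapping idea (the Seed wrapper and names are hypothetical, not from the code above). It rules out the second mistake, but the first mistake, seed reuse, still compiles:
case class Seed(value: Long)
val nextLongSafer = (s: Seed) =>
  (Seed(s.value * 6364136223846793005L + 1442695040888963407L), s.value)

val (s1, a) = nextLongSafer(Seed(17L))
val (s2, b) = nextLongSafer(s1)
// val (s3, c) = nextLongSafer(b)   // does not compile: b is a Long, not a Seed
val (s3, c) = nextLongSafer(s1)     // but accidentally reusing a seed still compiles fine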
For random numbers, where independence is (usually) everything, this is as bad as it gets. Avoid this pattern like the plague! If you must have referential transparency in your random numbers, use State or some other mechanism to enforce the discipline required to keep your state safely progressing!
Let's suppose we want to generate more than just Long numbers. Maybe we want to generate a string containing a random sequence of t and f. Let's give it a go each way.
First, for our mutable RNG: since we didn't bother to give it nextBool and nextString(length: Int) methods, let's add them. We have Array.fill to repeat an operation a certain number of times, so we can write:
implicit class MoreRandomStuff(private val r: Rng) extends AnyVal {
  def nextBool = r.nextLong < 0
  def nextTF = if (nextBool) 't' else 'f'
  def nextString(length: Int) = new String(Array.fill(length)(nextTF))
}
Not hard at all!
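A quick check (the rTest name is just for this example):
val rTest = new Rng(17)
rTest.nextString(10)   // a ten-character mix of 't' and 'f'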
Let's try with the monadic version. We can't just use a for comprehension, because a for comprehension has a fixed number of generators and can't produce an arbitrary number of nested calls to flatMap. Instead, we can turn to a method called replicateA that does something similar.
import cats.implicits._
val nextBool = nextLong.map(_ < 0)
val nextTF = nextBool.map(b => if (b) 't' else 'f')
def nextString(length: Int) = nextTF.replicateA(length).map(_.mkString)
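And to actually produce a string (assuming the definitions above):
nextString(10).runA(17L).value   // likewise, a ten-character mix of 't' and 'f'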
This wasn't terribly arduous, but it wasn't completely free. We needed a different method, and thus had more to learn to be productive (just because we have replicateA doesn't mean we can forget about fill!). But since we don't have to think about fill as well right now, this only slows learning, not understanding once you're familiar.
Note that all referential transparency is buying us here is that we don't have to understand the extremely simple rules for using Rng, and with State we're having to use a wider range of mechanics to compensate. It's not a big deal once you get used to it, but it's not helping us out.
Suppose we want to create some data structure using these two sources of random numbers. We won't care about why right now (maybe reproducibility, maybe to get a larger effective period, maybe one is faster but the other is more random, etc.). Specifically, we want to use different random number streams to create an ID and a string of "t"s and "f"s.
case class Foo(id: Long, tfs: String) {}
Let's say that it's super-important that we don't mess up which stream we use for what. We can create some marker traits and adorn our instances with them:
trait IdR {}
trait TfR {}
val r = new Rng(17) with IdR {}
val rr = new Rng(22) with TfR {}
Now we can safely specify in our API which number stream we are using:
def mkFooMut(rId: Rng with IdR, rTf: Rng with TfR) =
  Foo(rId.nextLong, rTf.nextString(10))
val myFooMut = mkFooMut(r, rr) // Works!
// val wrongFoo = mkFooMut(rr, r) // Nope
// val wrongFoo = mkFooMut(new Rng(7), rr) // Nope
Easy type-safety, and all our previous work on random numbers can be reused. We could go even farther to avoid error (e.g. here we can still call rId.nextString(10); the types only keep the arguments straight coming into mkFooMut), but we'll say this provides enough security.
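For illustration, one way of going farther (these wrapper classes are hypothetical, not part of the code above): give each stream its own narrow interface, so the ID stream can't be asked for strings at all.
final class IdRng(seed: Long) {
  private val r = new Rng(seed)
  def nextId: Long = r.nextLong
}
final class TfRng(seed: Long) {
  private val r = new Rng(seed)
  def nextTfString(length: Int): String = r.nextString(length)
}
def mkFooMut2(rId: IdRng, rTf: TfRng) = Foo(rId.nextId, rTf.nextTfString(10))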
Now let's try the same with RngState. Right off the bat, we have a problem: we need something new to store the state. There are a bunch of ways we could do this, but since we're trying to reuse our existing infrastructure, we'll store our combined state in a tuple and create State transformations to deal with each arm:
type Rng2State[A] = State[(Long, Long), A]
def nextFromOne[A](r: RngState[A]): Rng2State[A] = State { s =>
  val (nx, a) = r.run(s._1).value
  (s.copy(_1 = nx), a)
}
def nextFromTwo[A](r: RngState[A]): Rng2State[A] = State { s =>
  val (nx, a) = r.run(s._2).value
  (s.copy(_2 = nx), a)
}
Now we can create Foo given appropriate state:
val mkFooFp =
  for {
    id <- nextFromOne(nextLong)
    tfs <- nextFromTwo(nextString(10))
  } yield Foo(id, tfs)
val myFooFp = mkFooFp.runA((17L, 22L)).value
Except for referential transparency, this is unqualifiedly worse. Not only is there considerably more boilerplate that we have to write to thread the two states into individual states and back, but we have the fragile operation of exposing ourselves to the random number state and then safely putting it back again.
Now suppose we want to keep building up, creating these:
case class Bar(id: Long, b: Byte, z: Boolean) {}
case class Baz(id: Long, foo: Foo, bar: Bar) {}
But now suppose we want to use yet another source of random numbers that is the only one that can produce Bytes, and which we will also use to make booleans.
trait Rn3 extends Rng {
  def nextByte = (nextLong >>> 56).toByte
}
val rrr = new Rng(37) with Rn3 {}
No problem.
And now we just create the things we want taking the random number streams we ought to:
def mkBarMut(rId: Rng with IdR, rB: Rn3) =
  Bar(rId.nextLong, rB.nextByte, rB.nextBool)
def mkBazMut(rId: Rng with IdR, rTf: Rng with TfR, rB: Rn3) =
  Baz(rId.nextLong, mkFooMut(rId, rTf), mkBarMut(rId, rB))
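For instance, building one with the instances defined earlier:
val myBazMut = mkBazMut(r, rr, rrr)   // r: IdR stream, rr: TfR stream, rrr: Rn3 stream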
There's one short obvious way to do it, and any simple error is a compile error. (With the proviso, again, that we haven't made the streams only able to produce the data type we want.)
I'm going to post the State version without describing it blow-by-blow.
type Rng3State[A] = State[(Long, Long, Long), A]
def nextFromOneOf3[A](r: RngState[A]): Rng3State[A] = State { s =>
  val (sid2, a) = r.run(s._1).value
  (s.copy(_1 = sid2), a)
}
def nextFromFooish[A](r2: Rng2State[A]): Rng3State[A] = State { s =>
  val (nx, a) = r2.run((s._1, s._2)).value
  (s.copy(_1 = nx._1, _2 = nx._2), a)
}
def nextFromBarish[A](r2: Rng2State[A]): Rng3State[A] = State { s =>
  val (nx, a) = r2.run((s._1, s._3)).value
  (s.copy(_1 = nx._1, _3 = nx._2), a)
}
// We're not even enforcing stream-type-safety here!
val nextByte = nextLong.map(l => (l >>> 56).toByte)
val mkBarFp =
  for {
    id <- nextFromOne(nextLong)
    b <- nextFromTwo(nextByte)
    z <- nextFromTwo(nextBool)
  } yield Bar(id, b, z)
val mkBazFp =
  for {
    id <- nextFromOneOf3(nextLong)
    foo <- nextFromFooish(mkFooFp)
    bar <- nextFromBarish(mkBarFp)
  } yield Baz(id, foo, bar)
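Running it (with the same seeds as the mutable instances, for the sake of comparison):
val myBazFp = mkBazFp.runA((17L, 22L, 37L)).value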
Whew! It works, but to keep it even this short we've given up some parts of the type-safety of the streams (everyone can create Bytes now), we've got growing amounts of fiddly state packing and unpacking, and we're increasingly repeating ourselves in nextWhatever(makeWhatever) pairs.
And what do we gain for all this? Only that we allow ourselves to forget the distinction between producing and storing a value, and that Rng does the former. State is a clumsy, unscalable mechanism for carrying state along with a computation, but it does royally well at actually enforcing that the state is carried along. (Note: it is less clumsy in Haskell, for instance; the inherent issues remain, but expressing solutions is less onerous.)
In the case of random numbers, the problem is solved autonomously by a mutable class that takes care of the state on its own. Not all state problems are like this, but many, including random number generation, are. This separability, the ability to delegate authority for the process to a class that knows (locally) how to do the right thing, is a very powerful abstraction mechanism. Random numbers are one of the clearest examples, but counters and performance timers, among others, also tend to have this property.
Using State out of a desire for referential transparency actually breaks this abstraction mechanism by forcing you to care, repeatedly, about just how you are propagating your state around. State works a bit like a political state gone awry: you can't just take your kids outside to run and play and scream and otherwise mutate their environment; you need to write an application for permission to run, and one for permission to play, and one for permission to scream, and one for permission to combine the permits for running and playing, and... even though they eventually send a babysitter over to take care of all of it, it's not worth all the paperwork.
Bottom line: using State for random number generation is (usually) an antipattern. The costs are high, especially if you have multiple random number streams (or other sources of multiple State). If you're going to deal with random numbers this way, make sure the costs are worth it.
In contrast, a simple mutable approach is close to foolproof, with one proviso: you must maintain a mental distinction between methods that generate numbers and those that store numbers. In many cases, this is trivially easy. If it becomes difficult (e.g. if you use a myriad of caches and don't take care to split off and cache your random number stream), look for a different solution.
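As a tiny illustration of that proviso (method names are hypothetical):
val rng = new Rng(17)
def freshPair() = (rng.nextLong, rng.nextLong)        // generates: two different numbers
def storedPair() = { val x = rng.nextLong; (x, x) }   // stores one draw and reuses it deliberately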