Date:   Tue, 25 Jan 2000 20:15:41 -0500
From: "Theodore Y. Ts'o" <tytso@MIT.EDU>
Subject: Re: Intel 810 Random Number Generator
Newsgroups: fa.linux.kernel

There've been a lot of comments on the subject, so I waited until I was
completely caught up on e-mail to send one reply to the entire
thread....

First of all, /dev/random is supposed to be a very high quality random
number generator.  Its use is *supposed* to be for generating critical
cryptographic keys, particularly long-term keys, and for seeding
pseudo-random number generators.  It's not particularly optimized for
generating large numbers of random numbers for use in Monte Carlo
applications, for example, and in such applications you usually want a
statistically good PRNG, since being able to replicate your results is
*good*.  (As opposed to crypto key generation, where being able to
replicate your results is *bad*.  :-)

As such, using a true random number generator is a good thing.  If done
well, it can do a much better job than some of the sampling tricks
currently used in /dev/random.   The caveat is in the "if it works"...
Hardware random number generators are notoriously hard to get right.
All sorts of environmental inputs (60 Hz hum, harmonics for CPU clocks,
sampling artifacts, etc.) can potentially swamp the effects of the
quantum noise generated by the device.

For this reason, it's a really good idea to wash the outputs from the
hardware random number generator through some kind of real-time verifier
to make sure (to some level of confidence) that the hardware generator
hasn't suddenly started screwing up.  This could happen in any number of
ways --- a transistor in the RNG chip getting damaged by a power spike
or a cosmic ray, an NSA/FBI-executed black bag job, etc.

This verifier can be done in the kernel, but that limits the number of
statistical tests you can run, and you generally don't want to put too
much of that in the kernel.  This is especially true if you want to be
uber-paranoid and run spectral analysis over the data looking for
potential problems.  Even if you're just running the FIPS tests over
each batch of data before feeding it into /dev/random, that requires
enough CPU and involves enough complexity that simply shoving it into
the kernel probably isn't the right answer.

Of course, you can simply be very conservative about how many bits of
entropy credit you give the 810 RNG's output.  The Intel paper about the
810 RNG (by Jun and Kocher) claims that 1/2 bit of entropy per output
bit is a good conservative estimate.  People who like to be more careful
can simply give even less credit.
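
As a purely illustrative sketch of both points, a user-space verifier
might run a FIPS-style monobit count over a 20,000-bit batch from the
hardware RNG and, if the batch looks sane, credit it at the conservative
rate of half a bit of entropy per output bit.  (The 9654/10346 bounds
below are the FIPS 140-1 monobit limits as I remember them; treat them,
and the function name, as illustrative rather than normative.)

#include <stddef.h>

/*
 * Sanity-check a 20,000-bit (2500-byte) batch from the hardware RNG and
 * compute a conservative entropy credit for it.  Returns 0 if the batch
 * passes, -1 if it should be discarded.
 */
int check_and_credit(const unsigned char *batch, size_t nbytes,
		     int *entropy_bits)
{
	size_t i;
	long ones = 0;

	for (i = 0; i < nbytes; i++) {
		unsigned char b = batch[i];
		while (b) {			/* count the set bits */
			ones += b & 1;
			b >>= 1;
		}
	}

	/* FIPS-style monobit test: for a 20,000-bit batch, the number of
	 * ones should fall well inside these (illustrative) bounds. */
	if (nbytes == 2500 && (ones <= 9654 || ones >= 10346))
		return -1;

	/* credit only 1/2 bit of entropy per output bit */
	*entropy_bits = (int)(nbytes * 8 / 2);
	return 0;
}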

However, I think that past a certain point, it does make more sense to
have a user-level daemon which handles the verification and sampling
tests for the chip.  If it's done right, it can work across multiple
hardware RNGs, simply by reading the randomness from the device
drivers, doing the appropriate pre-processing, and then feeding it
into /dev/random with the appropriate entropy credits.  Perhaps there
should be a "simple mode" which simply takes inputs from the driver and
feeds them directly into /dev/random without the intervention of the
user-mode process, and it can simply be very conservative about how much
entropy credit it gives to the entropy count.  But there should be a way
for a user-mode process to disable this and take over control over
the post-RNG processing.

						- Ted

P.S.  If you look at the Jun and Kocher paper (thanks to Colin Plumb for
giving me a pointer to it):

	http://developer.intel.com/design/security/rng/CRIwp.htm

it's clear that there is a hardware whitener (a von Neumann bias
eliminator) to remove 0 vs 1 biases.   There are hints that it's
possible to turn off the whitener, so that you get access to the raw
stream of bits from the RNG before any whitening is done.
Unfortunately, how to actually do this (probably some kind of debug
mode) doesn't seem to be published anywhere.  If any one knows how to do
this, please let me know.  Ideally, if the software is going to be doing
real-time verification of the RNG's soundness, it should be doing so on
the pre-whitened data stream.   As an example, the following string of
numbers is anything but random:

	1 2 3 4 5 6 7 8 9 10 

However, if this is run through an MD5 or SHA whitener, the result would
*look* random, even though the source material is anything but random.
So you really want to look for patterns and do any analysis on the raw
data stream.
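
(For reference, a von Neumann bias eliminator is simple enough to sketch
in a few lines of C --- this is just the textbook construction, not
Intel's circuit: look at the raw bits in pairs, emit the first bit of
every 01 or 10 pair, and throw away the 00 and 11 pairs.)

#include <stddef.h>

/*
 * Textbook von Neumann debiaser: raw_bits[] holds one raw bit per byte
 * (0 or 1); out_bits[] receives the whitened bits.  Returns the number
 * of output bits produced (on average 1/4 of the input).
 */
size_t von_neumann_whiten(const unsigned char *raw_bits, size_t nbits,
			  unsigned char *out_bits)
{
	size_t i, nout = 0;

	for (i = 0; i + 1 < nbits; i += 2) {
		unsigned char a = raw_bits[i];
		unsigned char b = raw_bits[i + 1];

		if (a != b)
			out_bits[nout++] = a;	/* 01 -> 0, 10 -> 1 */
		/* 00 and 11 pairs are simply discarded */
	}
	return nout;
}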

So, if anyone can figure out (and tell me) how to turn off the 810's
hardware whitener circuits, that would be really useful.  Thanks!!








From: hpa@transmeta.com (H. Peter Anvin)
Subject: Re: Intel 810 Random Number Generator
Date:   26 Jan 2000 00:24:36 -0800
Newsgroups: fa.linux.kernel

Followup to:  <388EAA4F.6B553FB7@idb.hist.no>
By author:    Helge Hafting <helgehaf@idb.hist.no>
In newsgroup: linux.dev.kernel
>
> >Taking the least significant bits of a fast timer between keypresses is a
> >very good way of generating entropy.
> 
> Typing produces good entropy this way.  Holding down a key and
> getting auto-repeat from the keyboard chips is another story though.  Of
> course
> the user can only blame himself for doing that.  NSA will probably not
> send someone to hold down a key for you while you generate your pgp
> keys. :-)
> 

Very easy to work around.  A keyboard that's being pounded on
generates:

MAKE BREAK MAKE BREAK MAKE BREAK MAKE BREAK

A keyboard that's autorepeating generates:

MAKE MAKE MAKE MAKE MAKE MAKE MAKE MAKE MAKE

Therefore, you ignore a MAKE signal from any key that you already know
is down.  If you don't want to bother keeping track of all the keys
that are pressed (except for modifier keys, where you don't get a
choice) then the solution is simply to not account for any entropy
for the MAKE signal at all (but still mix its timing into the pool!),
but only for the BREAK signal.  Note that each BREAK will have at
least one MAKE associated with it, and the first such MAKE will be the
"real" one when the key is pressed, so the BREAK code can bump the
entropy counter for both one MAKE and one BREAK timing.
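
A sketch of that policy in C (with made-up helper names ---
mix_timing_into_pool() and credit_entropy_bits() stand in for whatever
the keyboard driver would actually call) might look like this:

#define MAX_KEYS	256
#define BITS_PER_EVENT	1	/* illustrative credit per event */

extern void mix_timing_into_pool(unsigned long timestamp);	/* hypothetical */
extern void credit_entropy_bits(int nbits);			/* hypothetical */

static unsigned char key_is_down[MAX_KEYS];

void keyboard_event(unsigned int keycode, int is_break,
		    unsigned long timestamp)
{
	mix_timing_into_pool(timestamp);	/* always mix; it never hurts */

	if (is_break) {
		key_is_down[keycode & (MAX_KEYS - 1)] = 0;
		/* A BREAK implies at least one genuine MAKE preceded it,
		 * so credit both the MAKE and the BREAK timing here. */
		credit_entropy_bits(2 * BITS_PER_EVENT);
	} else {
		if (key_is_down[keycode & (MAX_KEYS - 1)])
			return;			/* autorepeat MAKE: no credit */
		key_is_down[keycode & (MAX_KEYS - 1)] = 1;
		/* credit for this MAKE is deferred to its BREAK */
	}
}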

     -hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."


Date: 	Tue, 12 Sep 2000 07:44:49 -0400
From: "Theodore Y. Ts'o" <tytso@MIT.EDU>
Subject: Re: Using Yarrow in /dev/random
Newsgroups: fa.linux.kernel

   Date: 	Mon, 11 Sep 2000 13:08:59 +0000
   From: Pravir Chandra <pchandra@rstcorp.com>

   I've been working to change the implementation of /dev/random over to the
   Yarrow-160a algorithm created by Bruce Schneier and John Kelsey. We've been
   working on parallel development for Linux and NT so that the algorithms are
   matching. The Yarrow 160A algorithm is a variant of Yarrow-160 that has come
   about from discussions with John Kelsey. We've been in contact with him
   throughout our development effort.

I'm not a big fan of Yarrow, since it (in my opinion) places too much
faith in the crypto algorithms.  It uses a pathetically small entropy
pool, and assumes that the hash function will do the rest.  Which is fine,
but that makes it a pseudo-RNG, or a crypto-RNG, and not really an
entropy collector.  That's not necessarily a bad thing, but IMO that
should live outside the kernel.  There's no real point putting something
like Yarrow in kernel space.  Better that it collects entropy from a
true entropy collector that's in-kernel (like the current /dev/random),
and that it does the rest of the processing in user space.  After all,
the main reason why /dev/random is in the kernel is precisely because
that's the most efficient place to collect as much entropy as possible.

If you look at the current 2.4 /dev/random code, you'll see that there
already are two entropy pools in use; this is designed to allow
something like the "catastrophic reseed" feature which Bruce Schneier
is so fond of.  What's currently there isn't perfect, in that there's
still a bit too much leakage between the two pools, but I think that
approach is much better than the pure Yarrow approach.

						- Ted

Date: 	Tue, 12 Sep 2000 13:09:06 -0400
From: "Theodore Y. Ts'o" <tytso@MIT.EDU>
Subject: Re: Using Yarrow in /dev/random
Newsgroups: fa.linux.kernel

   Date: Tue, 12 Sep 2000 09:56:12 +0000
   From: Pravir Chandra <pchandra@rstcorp.com>

   i agree that the yarrow generator does place some faith on the crypto
   cipher and the accumulator uses a hash, but current /dev/random
   places faith on a crc and urandom uses a hash. 

No, not true.  The mixing into the entropy pool uses a twisted LFSR, but
all outputs from the pool (to either /dev/random or /dev/urandom) are
filtered through SHA-1 as a whitener.  The key here, though,
and what makes this fundamentally different from Yarrow, is that since
we're feeding the entire (large) entropy pool through SHA-1, even if the
SHA-1 algorithm is very badly broken (say as in what's been happening
with MD5), as long as there's sufficient entropy in the pool, the
adversary can derive only minimal information about the pool, since the
8192 -> 160 bit transform has to lose information by definition.
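
Conceptually, each output block is something like the following
(a deliberately simplified sketch, using OpenSSL's SHA1() purely for
illustration; the real extract_entropy() does more --- it also mixes the
hash back into the pool and folds the result --- and those details are
omitted here):

#include <openssl/sha.h>

/*
 * Simplified view of the output path: hash the *entire* 1024-byte
 * (8192-bit) pool down to a 160-bit digest.  By the pigeonhole
 * principle the transform must discard information about the pool.
 */
void extract_block(const unsigned char pool[1024],
		   unsigned char out[SHA_DIGEST_LENGTH])
{
	SHA1(pool, 1024, out);
}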

   i also agree that the entropy pools are small, but the nature of the
   hash preserves the amount of entropy that has been used to create the
   state of the pools. basically, if the pool size is 160 bits (hash
   output) its state can be built by more than 160 bits of entropy, its
   just that adding entropy after that increases the unguessability
   (conventional attacks) of the state but brute forcing the state is
   still 2^160.

Which is just another way of saying that Yarrow is only a cryptographic
random number generator whose maximum entropy storage is 160 bits.....

							- Ted

Date: 	Tue, 12 Sep 2000 20:32:26 -0400
From: "Theodore Y. Ts'o" <tytso@MIT.EDU>
Subject: Re: Using Yarrow in /dev/random
Newsgroups: fa.linux.kernel

   Date: 	Wed, 13 Sep 2000 01:23:30 +0200 (CEST)
   From: Igmar Palsenberg <maillist@chello.nl>

   > No, not true.  The mixing into the entropy pool uses a twisted LFSR, but
   > all outputs from the pool (to either /dev/random or /dev/urandom) are
   > filtered through SHA-1 as a whitener.  The key here, though,
   > and what makes this fundamentally different from Yarrow, is that since
   > we're feeding the entire (large) entropy pool through SHA-1, even if the
   > SHA-1 algorithm is very badly broken (say as in what's been happening
   > with MD5), as long as there's sufficient entropy in the pool, the
   > adversary can derive only minimal information about the pool, since the
   > 8192 -> 160 bit transform has to lose information by definition.

   Broken ?? Please explain.

This is old news.  Hans Dobbertin, in about 10 hours of CPU time on a
Pentium system, can generate hash collisions in MD5.  This is not
something which you're supposed to be able to do with crypto checksums,
so effectively MD5 is "broken", and shouldn't be used in new
applications where a crypto checksum is required.  His break is probably
not exploitable for existing uses of MD5, so some people might claim
that MD5 hasn't been "broken" yet, but merely severely bent.  At some
level this is just a matter of semantics.

Regardless, the use of MD5 for new protocols/applications is not
recommended.

						- Ted

Date: 	Thu, 19 Oct 2000 16:50:37 -0400
Subject: Re: use of add_interrupt_randomness in drivers missing in many drivers
From: tytso@mit.edu
Newsgroups: fa.linux.kernel

   From: daw@mozart.cs.berkeley.edu (David Wagner)
   Date: 	18 Oct 2000 20:29:33 GMT

   Adding more bits to the pool should never hurt; the cryptographic
   mixing ensures this.  What _can_ hurt is adding predictable bits but
   (erroneously) bumping up the entropy counter.

Yes; and writing to /dev/random only mixes the contents into the pool;
it does *not* bump the entropy counter.  Hence, writing to /dev/random
is always safe; it can't hurt, and can help.  For this reason, it's safe
to make /dev/random world-writable; some folks have been overly
paranoid, making /dev/random mode 444 or some such.

If you want to add random data to the pool and bump the estimate of the
entropy, you need to be root, and use a special ioctl which does this
atomically.  The intent is that a user-mode daemon would read data from
/dev/microphone, post-process it a lot (filter out 60 Hz hum, compress
it, whatever), get an estimate of the entropy in the sample (which may
not be the same as its size), and then call the ioctl to push that data
into the entropy pool.
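
On today's kernels the ioctl in question is RNDADDENTROPY (declared in
<linux/random.h>); a minimal sketch of such a daemon's "push" step,
with the entropy estimate supplied by the caller, could look like this
(the push_entropy() name is just for illustration):

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/random.h>

/*
 * Mix `len' bytes of post-processed sample data into the kernel pool and
 * credit it with `entropy_bits' bits of entropy (must run as root).
 * Returns 0 on success, -1 on error.
 */
int push_entropy(const unsigned char *buf, size_t len, int entropy_bits)
{
	struct rand_pool_info *info;
	int fd, ret = -1;

	info = malloc(sizeof(*info) + len);
	if (!info)
		return -1;
	info->entropy_count = entropy_bits;
	info->buf_size = (int)len;
	memcpy(info->buf, buf, len);

	fd = open("/dev/random", O_WRONLY);
	if (fd >= 0) {
		ret = ioctl(fd, RNDADDENTROPY, info);
		close(fd);
	}
	free(info);
	return ret < 0 ? -1 : 0;
}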

						- Ted


From: tytso@MIT.EDU (Theodore Y. Ts'o)
Newsgroups: linux.dev.kernel
Subject: Re: /dev/random blocks forever on 2.2.12 and 2.2.16
Date: 8 Aug 00 17:42:12 GMT

   Date: 	Tue, 8 Aug 2000 14:20:41 +0200
   From: Oscar Roozen <jor+lkml@xs4all.nl>

   On Mon, Aug 07, 2000 at 07:41:37PM +0000, Miquel van Smoorenburg wrote:
   > >Yes, I can put a monkey behind the keyboard, but if somebody knows a
   > >better solution, please don't be shy. ;)
   > Use /dev/urandom

   You mean I do "ln -sf /dev/urandom /dev/random" ?  Nah... ;)

   Several people sent me a private email suggesting to use urandom instead,
   but it's not what I use, it's about what all software on the machine uses.
   Should I patch SSH? Should I patch SSL? Or are they using urandom anyway?

My recommendation has always been that applications use /dev/random for
long-term public key generation, and to seed a pseudo-RNG for session
keys.  The reasoning behind this is that true, high-quality random
numbers are a very precious resource (even with a real, high-quality
random number generator, they are still a limited resource), and so it's
better to periodically get randomness from /dev/random, and then use
that as seeds for a cryptographically secure, pseudo-random number
generator.
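
In practice that pattern is just a short, possibly blocking read ---
something like this hedged sketch, which pulls a seed from /dev/random
once and leaves the choice of userspace CSPRNG to the application:

#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Read `len' bytes from /dev/random to seed a userspace CSPRNG.
 * The read may block until the kernel has collected enough entropy. */
int read_seed(unsigned char *seed, size_t len)
{
	int fd = open("/dev/random", O_RDONLY);
	size_t total = 0;

	if (fd < 0)
		return -1;
	while (total < len) {
		ssize_t n = read(fd, seed + total, len - total);
		if (n <= 0)
			break;
		total += (size_t)n;
	}
	close(fd);
	return total == len ? 0 : -1;
}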

						- Ted


Date: 	Mon, 20 Aug 2001 21:20:53 -0400
From: Theodore Tso <tytso@mit.edu>
Subject: Re: /dev/random in 2.4.6
Newsgroups: fa.linux.kernel

On Mon, Aug 20, 2001 at 04:00:30PM +0000, David Wagner wrote:
> 
> I don't see why not.  Apply this change, and use /dev/urandom.
> You'll never block, and the outputs should be thoroughly unpredictable.
> What's missing?

Absolutely.  And if /dev/urandom's output is predictable, that means
someone has broken SHA-1 in a pretty complete way, in which case it's
very likely that most of the users of the randomness are completely
screwed, since they probably depend on SHA-1 (or some other MAC which
is probably in pretty major danger if someone has indeed managed to
crack SHA-1).

> (I don't see why so many people use /dev/random rather than /dev/urandom.
> I harbor suspicions that this is a misunderstanding about the properties
> of pseudorandom number generation.)

Probably.  /dev/random is probably appropriate when you're trying to
get randomness for a long-term RSA/DSA key, but for session key
generation which is what most server boxes will be doing, /dev/urandom
will be just fine.

Of course, then you have the crazies who are doing Monte Carlo
simulations, and then send me mail asking why using /dev/urandom is so
slow, and how they can reseed /dev/urandom so they can get
repeatable, measurable results on their Monte Carlo simulations....

					- Ted



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208171923330.1310-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 02:27:43 GMT
Message-ID: <fa.m6uod2v.140i9os@ifi.uio.no>

On Sat, 17 Aug 2002, Oliver Xymoron wrote:
>
> Net effect: a typical box will claim to generate 2-5 _orders of magnitude_
> more entropy than it actually does.

On the other hand, if you are _too_ anal you won't consider _anything_
"truly random", and /dev/random becomes practically useless on things that
don't have special randomness hardware.

To me it sounds from your description that you may well be on the edge of
"too anal". Real life _has_ to be taken into account, and not accepting
entropy because of theoretical issues is _not_ a good idea.

Quite frankly, I'd rather have a usable /dev/random than one that runs out
so quickly that it's unreasonable to use it for things like generating
4096-bit host keys for sshd etc.

In particular, if a machine needs to generate a strong random number, and
/dev/random cannot give that more than once per day because it refuses to
use things like bits from the TSC on network packets, then /dev/random is
no longer practically useful.

Theory is theory, practice is practice. And theory should be used to
_analyze_ practice, but should never EVER be the overriding concern.

So please also do a writeup on whether your patches are _practical_. I
will not apply them otherwise.

			Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172006050.1491-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 03:06:01 GMT
Message-ID: <fa.m5uceiv.1402b8o@ifi.uio.no>

On Sat, 17 Aug 2002, Oliver Xymoron wrote:
>
> Let me clarify that 2-5 orders thing. The kernel trusts about 10 times
> as many samples as it should, and overestimates each samples' entropy
> by about a factor of 10 (on x86 with TSC) or 1.3 (using 1kHz jiffies).

Looking at the code, your _new_ code just throws samples away _entirely_
just because some random event hasn't happened (the first thing I noticed
was the context switch testing, there may be others there that I just
didn't react to).

In short, you seem to cut those random events to _zero_. And this happens
under what seems to be perfectly realistic loads. That's not trying to fix
things, that's just breaking them.

> The patches will be a nuisance for anyone who's currently using
> /dev/random to generate session keys on busy SSL servers.

No, it appears to be a nuisance even for people who have real issues, ie
just generating _occasional_ numbers on machines that just don't happen to
run many user space programs.

		Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172019130.1537-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 03:24:27 GMT
Message-ID: <fa.m6eadbv.14gc9gh@ifi.uio.no>

Hmm.. After more reading, it looks like (if I understood correctly), that
since network activity isn't considered trusted -at-all-, your average
router / firewall / xxx box will not _ever_ get any output from
/dev/random whatsoever.  Quite regardless of the context switch issue,
since that only triggers for trusted sources. So it was even more
draconian than I expected.

Are you seriously trying to say that a TSC running at a gigahertz cannot
be considered to contain any random information just because you think you
can time the network activity so well from the outside?

Oliver, I really think this patch (which otherwise looks perfectly fine)
is just unrealistic. There are _real_ reasons why a firewall box (ie one
that probably comes with a flash memory disk, and runs a small web-server
for configuration) would want to have strong random numbers (exactly for
things like generating host keys when asked to by the sysadmin), yet you
seem to say that such a user would have to use /dev/urandom.

If I read the patch correctly, you give such a box _zero_ "trusted"
sources of randomness, and thus zero bits of information in /dev/random.
It obviously won't have a keyboard or anything like that.

This is ludicrous.

		Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172151440.1829-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 04:51:07 GMT
Message-ID: <fa.m7uke2v.160q8on@ifi.uio.no>

On Sat, 17 Aug 2002, Oliver Xymoron wrote:
>
> But it will get data of _equal quality_ to the current approach from
> /dev/urandom.

So what?

If you make /dev/random useless ("but you can use /dev/urandom instead")
then we should not have it.

> > Are you seriously trying to say that a TSC running at a gigahertz cannot
> > be considered to contain any random information just because you think you
> > can time the network activity so well from the outside?
>
> Yes. The clock of interest is the PCI bus clock, which is not terribly
> fast next to a gigabit network analyzer.

Be realistic. This is what I ask of you. We want _real_world_ security,
not a completely made-up-example-for-the-NSA-that-is-useless-to-anybody-
else.

All your arguments seem to boil down to "people shouldn't use /dev/random
at all, they should use /dev/urandom".

Which is just ridiculous.

		Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172001530.1491-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 03:03:50 GMT
Message-ID: <fa.m8e8eiv.16geb8v@ifi.uio.no>

On Sat, 17 Aug 2002, Oliver Xymoron wrote:

>  To protect against back to back measurements and userspace
>  observation, we insist that at least one context switch has occurred
>  since we last sampled before we trust a sample.

This sounds particularly obnoxious, since it is entirely possible to have
an idle machine that is just waiting for more entropy, and this will mean
(for example) that such an idle machine will effectively never get any
more entropy simply because your context switch counting will not work.

This is particularly true on things like embedded routers, where the
machine usually doesn't actually _run_ much user-level software, but is
just shuffling packets back and forth. Your logic seems to make it not add
any entropy from those packets, which can be _deadly_ if then the router
is also used for occasionally generating some random numbers for other
things.

Explain to me why I should consider these kinds of draconian measures
acceptable? It seems to be just fascist and outright stupid: avoiding
randomness just because you think you can't prove that it is random is not
very productive.

We might as well get rid of /dev/random altogether if it is not useful.

		Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172058200.1640-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 03:58:47 GMT
Message-ID: <fa.m6u6eiv.150c98n@ifi.uio.no>

On 17 Aug 2002, Robert Love wrote:
>
> [1] this is why I wrote my netdev-random patches.  some machines just
>     have to take the entropy from the network card... there is nothing
>     else.

I suspect that Oliver is 100% correct in that the current code is just
_too_ trusting. And parts of his patches seem to be in the "obviously
good" category (ie the xor'ing of the buffers instead of overwriting).

So I think that if we just made the code be much less trusting (say,
consider the TSC information per interrupt to give only a single bit of
entropy, for example), and coupled that with making network devices always
be considered sources of entropy, we'd have a reasonable balance.

		Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208172141490.1829-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 04:49:05 GMT
Message-ID: <fa.m7uudqv.16008gn@ifi.uio.no>

On Sat, 17 Aug 2002, Oliver Xymoron wrote:
>
> > We might as well get rid of /dev/random altogether if it is not useful.
>
> If it's not accounting properly, it's not useful.

My point exactly.

And if it isn't useful, it might as well not be there.

And your accounting isn't "proper" either. It's not useful on a
network-only device. It's just swinging the error the _other_ way, but
that's still an error. The point of /dev/random was to have an estimate of
the amount of truly random data in the algorithm - and the important word
here is _estimate_. Not "minimum number", nor "maximum number".

And yes, it still mixes in the random data, but since it doesn't account
for the randomness, that only helps /dev/urandom.

And helping /dev/urandom is _fine_. Don't get me wrong. It just doesn't
make /dev/random any more useful - quite the reverse. Your patch will just
make more people say "/dev/random isn't useful, use /dev/urandom instead".

Do you not see the fallacy of that approach? You're trying to make
/dev/random safer, but what you are actually _doing_ is to make people not
use it, and use /dev/urandom instead. Which makes all of the estimation
code useless.

THIS is my argument. Randomness is like security: if you make it too hard
to use, then you're shooting yourself in the foot, since people end up
unable to practically use it.

The point of /dev/random was to make it _easy_ for people to get random
numbers that we can feel comfortable about. The point of the accounting is
not a theoretical argument, but a way to make us feel _comfortable_ with
the amount of true randomness we're seeding in. It was not meant as a
theoretical exercise.

		Linus




Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.44.0208180916560.10546-100000@home.transmeta.com>
Date: Sun, 18 Aug 2002 16:57:15 GMT
Message-ID: <fa.l5ddnqv.1lhe31k@ifi.uio.no>

On Sun, 18 Aug 2002, Oliver Xymoron wrote:
>
> The key word is actually conservative, as in conservative estimate.
> Conservative here means less than or equal to.

Your argument is that even with a gigahertz logic analyzer on the network
line, you should certainly see randomness that is worth considering.

I dare you to actually show perfect correlation from it: the interrupt may
be synchronized to the PCI clock, but the code executed thereafter
certainly will not. And even if the machine is 100% idle, and the whole
working set fits in the L1 cache, the DMA generated by the packet itself
will result in cache invalidations.

In other words, in order for you to actually be able to predict the TSC
from the outside, you'd have to not just have the gigahertz logic analyzer on
the network line, you'd also have to be able to correlate the ethernet
heartbeat to the PCI clock (which you probably could do by looking at the
timing of the reply packets from a ping flood, although it would be
"interesting" to say the least and probably depends on how the network card
generates the ethernet clock), _and_ you'd have to be able to do a cache
eviction analysis (which in turn requires knowing the initial memory
layout for the kernel data structures for networking).

And your argument that there is zero randomness in the TSC _depends_ on
your ability to perfectly estimate what the TSC is. If you cannot do it,
there is obviously at least one bit of randomness there. So I don't think
your "zero" is a good conservative estimate.

At some point being conservative turns into being useless [ insert
obligatory political joke here ].

[ Side note: the most common source of pseudo-random numbers is the old
  linear congruential generator, which really is a sampling of a "beat"
  between two frequencies that are supposed to be "close", but prime.

  That's a fairly simple and accepted pseudo-random generator _despite_
  the fact that the two frequencies are totally known, and there is zero
  noise inserted. I'll bet you'll see a _very_ hard-to-predict stream from
  something like the PCI clock / CPU clock thing, with noise inserted
  thanks to things like cache misses and shared bus interactions. Never
  mind the _real_ noise of having a work-load. ]
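
(For reference, the classic LCG is a one-liner; the Park-Miller
"minimal standard" constants below are used purely as an illustration:)

#include <stdint.h>

/* Classic linear congruential generator: x' = (a * x) mod m, with the
 * Park-Miller "minimal standard" constants a = 16807, m = 2^31 - 1.
 * The seed must be in the range 1 .. m-1. */
static uint32_t lcg_state = 1;

uint32_t lcg_next(void)
{
	/* 64-bit intermediate avoids overflowing a * x */
	lcg_state = (uint32_t)((16807ULL * lcg_state) % 2147483647ULL);
	return lcg_state;
}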

> No, it says /dev/random is primarily useful for generating large
> (>>160 bit) keys.

Which is exactly what something like sshd would want to use for generating
keys for the machine, right? That is _the_ primary reason to use
/dev/random.

Yet apparently our /dev/random has been too conservative to be actually
useful, because (as you point out somewhere else) even sshd uses
/dev/urandom for the host key generation by default.

That is really sad. That is the _one_ application that is common and that
should really have a reason to maybe care about /dev/random vs urandom.
And that application uses urandom. To me that says that /dev/random has
turned out to be less than useful in real life.

Is there anything that actually uses /dev/random at all (except for
clueless programs that really don't need to)?

Please realize that _this_ is my worry: making /dev/random so useless
that any practical program has no choice but to look elsewhere.

> Actually, half of the point here is in fact to make /dev/urandom safer
> too, by allowing mixing of untrusted data that would otherwise
> compromise /dev/random.

Now this I absolutely agree with. The xor'ing of the buffer data is
clearly a good idea. I agree 100% with this part. You'll see no arguments
against this part at all.

>		 99.9% of users aren't using network sampling
> currently, after these patches we can turn it on for everyone and
> still sleep well at night. See?

Oh, that's the _good_ part. Yes.

The bad part is that I think our current /dev/random is close to useless
already, and I'd like to reverse that trend.

> That is an interesting point. A counterpoint is if we account so much as
> 1 bit of entropy per network interrupt on a typical system, the system
> will basically _always_ feel comfortable (see /proc/interrupts). It will
> practically never block and thus it is again identical to /dev/urandom.

But what's the problem with that? The "/dev/random may block" is not the
intrinsic value of /dev/random - if people want to wait they are much
better off just using "sleep(1)" than trying to read from /dev/random.

My argument is that on a typical system there really _is_ so much
randomness that /dev/random is actually a useful thing. I think that you'd
have to _work_ at finding a system where /dev/random should block under
any normal use (where "normal use" also obviously means that only programs
that really need it would use it, ie ssh-keygen etc).

			Linus



Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <20020819042141.GA26519@think.thunk.org>
Date: Mon, 19 Aug 2002 04:25:54 GMT
Message-ID: <fa.h8ltdiv.mka102@ifi.uio.no>

On Sun, Aug 18, 2002 at 12:38:59AM -0500, Oliver Xymoron wrote:
> On Sat, Aug 17, 2002 at 09:01:20PM -0700, Linus Torvalds wrote:
> >
> > On 17 Aug 2002, Robert Love wrote:
> > >
> > > [1] this is why I wrote my netdev-random patches.  some machines just
> > >     have to take the entropy from the network card... there is nothing
> > >     else.
> >
> > I suspect that Oliver is 100% correct in that the current code is just
> > _too_ trusting. And parts of his patches seem to be in the "obviously
> > good" category (ie the xor'ing of the buffers instead of overwriting).
>
> Make sure you don't miss this bit, I should have sent it
> separately. This is a longstanding bug that manufactures about a
> thousand bits out of thin air when the pool runs dry.

There's a reason why I did what I did here, and it has to do with an
attack which Bruce Schneier describes in his Yarrow paper:

	http://www.counterpane.com/yarrow-notes.html

called the "iterative guessing attack".  Assume that the adversary has
somehow knows the current state of the pool.  This could because the
initial state was known to the attacker, either because it was in a
known, initialized state (this could happen if the distribution
doesn't save the state of the pool via an /etc/init.d/random script),
or because the attacker managed to capture the initial seed file used
by the /etc/init.d/random script.  Now what the attacker can do is
periodically sample the pool, and attempt to explore all possible
values which have been mixed into the pool that would result in the
value which he/she read out of /dev/random.

So in fact, by being more selective about which values get mixed into
the pool, you can actually help the iterative guessing attack!  That's
why the current code tries to mix in sufficient randomness to
completely reseed the secondary extraction pool, and not just enough
randomness for the number of bytes required.  This was a deliberate
design decision to try to get the benefits of Yarrow's "catastrophic
reseeding".

Your complaint in terms of "manufacturing about a thousand bits out of
thin air" is a fair one, but it depends on how you view things.  From
the point of view of absolute randomness, you're of course right.  If
the primary pool only has 100 bits of randomness, and
xfer_secondary_pool attempts to transfer 1100 bits of randomness, it
drains the primary pool down to 0, but credits the secondary pool with
1100 bits of randomness, and yes, we have "created" a thousand bits of
randomness.

That being said, though, the adversary only gets to see results
pulled out of the secondary pool, and the primary pool is completely
hidden from the adversary.  So when xfer_secondary_pool extracts a
large amount of randomness from the primary pool, it's doing so using
extract_entropy(), which uses SHA to extract randomness from the
primary pool.  Significant amounts of cryptographic analysis (which
would also as a side effect break the SHA hash) would be required in
order to figure out information in the primary pool based solely on
the outputs that are being fed into the secondary pool.

So is it legitimate to credit the secondary pool with 1100 bits of
randomness even though the primary pool only had 100 bits of
randomness in it?  Maybe.  It depends on whether you care more about
"absolute randomness", or "cryptographic randomness".  Yarrow relies
entirely on cryptographic randomness; the effective size of its
primary and secondary pools are 160 bits and 112 bits, respectively.

I tried to take a bit more of a moderate position between relying
solely on cryptographic randomness and a pure absolute randomness model.
So we use large pools for mixing, and a catastrophic reseeding policy.

From a pure theory point of view, I can see where this might be quite
bothersome.  On the other hand, practically, I think what we're doing
is justifiable, and not really a security problem.

That being said, if you really want to use your patch, please do it
differently.  In order to avoid the iterative guessing attack
described by Bruce Schneier, it is imperative that you extract
r->poolinfo.poolwords - r->entropy_count/32 words of randomness from
the primary pool, and mix it into the secondary.  However, if you want
to save the entropy count from the primary pool, and use that to cap
the amount of entropy which is credited into the secondary pool, so
that entropy credits aren't "manufactured", that's certainly
acceptable.  It would make /dev/random much more conservative about its
entropy count, which might not be a bad thing from the point of view
of encouraging people to use it only for the creation of long-term
keys, and not to use it for generation of session keys.

						- Ted

P.S.  /dev/urandom should probably also be changed to use an entirely
separate pool, which then periodically pulls a small amount of entropy
from the primary pool as necessary.  That would make /dev/urandom
slightly more dependent on the strength of SHA, while causing it to
not draw down as heavily on the entropy stored in /dev/random, which
would be a good thing.


Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <20020819054359.GB26519@think.thunk.org>
Date: Mon, 19 Aug 2002 05:44:54 GMT
Message-ID: <fa.h9mfe2v.nko0g0@ifi.uio.no>

On Sat, Aug 17, 2002 at 09:15:22PM -0500, Oliver Xymoron wrote:
>
> Non-Uniform Distribution Of Timing
>
>  Unfortunately, our sample source is far from uniform. For starters, each
>  interrupt has a finite time associated with it - the interrupt latency.
>  Back to back interrupts will result in samples that are periodically
>  spaced by a fixed interval.

This is true.  But that's also why /dev/random measures not just the
first order delta, but the second and third order deltas as well, and
takes the minimum of the three deltas, capped at 12 bits, as the
entropy estimate.  So for example, if the interrupts are happening as
fast as possible (back to back) and there is merely the fixed interval
caused by the interrupt latency, then delta_1 will be the interrupt
latency, but delta_2 will be zero!

The rest of your analysis also ignores the fact that the current
/dev/random code uses the second and third order deltas (think of
them as second and third order derivatives, except that we're in the
discrete rather than continuous domain).
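
Roughly, the estimate looks like the sketch below (modeled loosely on
add_timer_randomness() in random.c; the real code's bit-counting details
differ, so treat this as illustrative only):

struct timer_rand_state {
	long last_time;
	long last_delta;
	long last_delta2;
};

/* Estimate the entropy of one timing sample: compute the first, second,
 * and third order deltas, take the smallest of the three, and cap the
 * credit at 12 bits. */
static int estimate_bits(struct timer_rand_state *s, long now)
{
	long delta  = now - s->last_time;
	long delta2 = delta - s->last_delta;
	long delta3 = delta2 - s->last_delta2;
	long m;
	int bits = 0;

	s->last_time   = now;
	s->last_delta  = delta;
	s->last_delta2 = delta2;

	if (delta < 0)  delta  = -delta;
	if (delta2 < 0) delta2 = -delta2;
	if (delta3 < 0) delta3 = -delta3;

	m = delta;
	if (delta2 < m) m = delta2;
	if (delta3 < m) m = delta3;

	while (m > 0 && bits < 12) {	/* roughly log2(min delta), capped */
		m >>= 1;
		bits++;
	}
	return bits;
}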

>  Assuming the interrupt actually has a nice gamma-like distribution
>  (which is unlikely in practice), then this is indeed true. The
>  trouble is that Linux assumes that if a delta is 13 bits, it contains
>  12 bits of actual entropy. A moment of thought will reveal that
>  binary numbers of the form 1xxxx can contain at most 4 bits of
>  entropy - it's a tautology that all binary numbers start with 1 when
>  you take off the leading zeros. This is actually a degenerate case of
>  Benford's Law (http://mathworld.wolfram.com/BenfordsLaw.html), which
>  governs the distribution of leading digits in scale invariant
>  distributions.
>
>  What we're concerned with is the entropy contained in digits
>  following the leading 1, which we can derive with a simple extension
>  of Benford's Law (and some Python):

I'm not a statistician, and my last probability class was over 15
years ago, but I don't buy your extension of Benford's law.  Sure, if
we take street addresses numbering from 1 to 10000, the probability
that the leading digit will be 1 will be roughly 30%.  But the
distribution of the low two digits will be quite evenly distributed.
Put another way, by picking a house at random and looking at the low
two digits, you can very fairly choose a number between 0 and 99.

> Interrupt Timing Independence
> -----------------------------
>
>  Linux entropy estimate also wrongly assumes independence of different
>  interrupt sources. While SMP complicates the matter, this is
>  generally not the case. Low-priority interrupts must wait on high
>  priority ones and back to back interrupts on shared lines will
>  serialize themselves ABABABAB. Further system-wide CLI, cache flushes
>  and the like will skew -all- the timings and cause them to bunch up
>  in predictable fashion.
>
>  Furthermore, all this is observable from userspace in the same way
>  that worst-case latency is measured.
>
>  To protect against back to back measurements and userspace
>  observation, we insist that at least one context switch has occurred
>  since we last sampled before we trust a sample.

First of all, the second order delta already protects against
back-to-back measurements.

Secondly, what is observable from userspace is time which is not spent
in a particular process.  But whether that time was spent in the
system or in another process is not at all obvious, and it's also not
obvious whether that time is spent handling interrupts or processing
bottom half tasks (i.e., processing network packets, et. al).
Moreover, it is not observable to the user process when multiple
interrupts might be happening in this whole mess.

That being said, global CLIs are a problem in that they do mean that
multiple interrupts could be serviced at the same time, and while the
outside adversary won't know exactly when a global CLI/STI might have
happened, it does reduce the quality of the randomness.  The solution
to this, though, is to avoid global CLIs, not to throw away randomness
samples until after a context switch.

> Questionable Sources and Time Scales
> ------------------------------------
>
>  Due to the vagarities of computer architecture, things like keyboard
>  and mouse interrupts occur on their respective scanning or serial
>  clock edges, and are clocked relatively slowly. Worse, devices like
>  USB keyboards, mice, and disks tend to share interrupts and probably
>  line up on USB clock boundaries. Even PCI interrupts have a
>  granularity on the order of 33MHz (or worse, depending on the
>  particular adapter), which when timed by a fast processor's 2GHz
>  clock, make the low six bits of timing measurement predictable.

We are not mixing in solely the low order bits of the timing
measurement; we're mixing in the entire timestamp.  So the fact that
the low-order bits of the timing measurement might be predictable
isn't necessarily a problem.

We are looking at the low 12 bits of the first, second, and third
order deltas when estimating the entropy.  So predictability in the
low six bits of the timing measurement will tend to drag the entropy
estimate down, not up.

>  And as far as I can find, no one's tried to make a good model or
>  estimate of actual keyboard or mouse entropy.

Since a human is involved, and we're measuring to a fairly high level
of accuracy, I'm not particularly worried.

>  Randomness caused by
>  disk drive platter turbulence has actually been measured and is on
>  the order of 100bits/minute and is well correlated on timescales of
>  seconds - we're likely way overestimating it.

This has worried me.  The Davis paper was done a long time ago and I
suspect the turbulence values have gone down over time, not up.  But
let's be clear what the problem is.  If the adversary knows the exact
stream of requests sent to a disk, it can probably model the time
necessary to service those requests very accurately --- possibly
missing much less than the 100 bits/minute guesstimated by the Davis
paper, given modern disk drives.

So when we measure the disk drive completion interrupts, what we are
really measuring is the attacker's uncertainty about exactly *when*
those disk drive requests were made, and in what order the disk drive
requests were sent to it.

Can the adversary control this, or know this?  Sometimes, to a certain
extent.  But remember, we're using a very high resolution timer, and
while the adversary might know the rough time, to within a few
milliseconds, when a request was issued, it would be much harder for
the adversary to know what the exact timestamp clock value was at a
particular event.

> Trusting Predictable or Measurable Sources
> ------------------------------------------
>
>  What entropy can be measured from disk timings are very often leaked
>  by immediately relaying data to web, shell, or X clients.  Further,
>  patterns of drive head movement can be remotely controlled by clients
>  talking to file and web servers. Thus, while disk timing might be an
>  attractive source of entropy, it can't be used in a typical server
>  environment without great caution.

This is something to be concerned about, to be sure.  But generally a
client won't have complete control of the drive head movement ---
there are other clients involved --- and the adversary generally won't
have complete knowledge of the block allocation of files, for example,
so he/she would not be able to characterize the disk drive timings to
the degree of accuracy required.

Certainly a major concern that I've always had is measuring network
device interrupts, since packet arrival times can be measured by an
outsider.  Such an attack would require that the adversary have a
sophisticated device on the local LAN segment of the attack victim
(since the attacker needs to see all of the packets directed at the
victim in order to make guesses about the inter-packet arrival times).
So how practical this attack might be is certainly quite
debatable.

>  (Incidentally, tricks like Matt Blaze's truerand clock drift
>  technique probably don't work on most PCs these days as the
>  "realtime" clock source is often derived directly from the
>  bus/PCI/memory/CPU clock.)

Actually, many peripherals do have their own clock crystals and clock
circuitry (network cards, serial cards, IDE disks, etc.).....

> Batching
> --------
>
>  Samples to be mixed are batched into a 256 element ring
>  buffer. Because this ring isn't allowed to wrap, it's dangerous to
>  store untrusted samples as they might flood out trusted ones.

It's not dangerous in the sense that we might suffer a security breach
(assuming that our entropy estimation routines are accurate).  It's a
bad thing in that we might lose some good randomness, but that's not a
disaster.

That being said, if we are running out of space in the ring buffer, it
would be good to increase its size.

						- Ted


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
Original-Message-ID: <Pine.LNX.4.33.0208231311150.4148-100000@penguin.transmeta.com>
Date: Fri, 23 Aug 2002 20:16:30 GMT
Message-ID: <fa.oaq1dnv.j4m4b5@ifi.uio.no>

On Fri, 2 Nov 2001, Pavel Machek wrote:
>
> Actually, no. If something is not predictable it does not mean >= 1 bit.

I agree. It might be less than one bit. However, I suspect that you can
basically almost never predict the _exact_ TSC value, which implies that
there is one or more bits of true randomness there.

If you can predict it (exactly) a noticeable fraction of the time, there
is clearly less than one bit of randomness.

To some degree it _should_ be fairly easy to test this even without a GHz
scope - just put two totally idle machines, connect them directly with a
cross-over cable, and make one of them send packets as quickly as it can
using something like "ping -f" with no route back to the sender (ie make
the ethernet card on the sending machine be your "exact clock" for sending
purposes).

Then record the stream of TSC on the receiving machine, and if you can
generate a function to predict a noticeable percentage of them exactly,
you've shown that it's much less than 1 bit of information.

Maybe somebody would find this an interesting exercise.
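
A minimal user-space harness for that exercise might look something like
this sketch (assuming an x86 TSC, and using a UDP listener rather than
flood pings simply because it is easy to timestamp from user space):

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;
	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	char buf[2048];
	struct sockaddr_in addr;
	int s = socket(AF_INET, SOCK_DGRAM, 0);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(9999);		/* arbitrary test port */
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	if (s < 0 || bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;

	/* Record the TSC at each packet arrival; analyze the stream
	 * offline for how often it can be predicted exactly. */
	for (;;) {
		if (recv(s, buf, sizeof(buf), 0) < 0)
			break;
		printf("%llu\n", (unsigned long long)rdtsc());
	}
	return 0;
}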

		Linus


From: Theodore Ts'o <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Fortuna
Date: Thu, 14 Apr 2005 17:38:26 UTC
Message-ID: <fa.d78femm.1rl4tr6@ifi.uio.no>
Original-Message-ID: <20050414133336.GA16977@thunk.org>

On Thu, Apr 14, 2005 at 02:15:38PM -0000, linux@horizon.com wrote:
> I've heard a suggestion that something like /dev/urandom, but which blocks
> until it has received a minimum amount of seed material at least once,
> would be a nice thing.  So boot-time crypto initialization can stall
> until it has achieved a minimum level of security.

With a properly set up set of init scripts, /dev/random is initialized
with seed material for all but the initial boot (and even that problem
can be solved by having the distribution's install scripts set up
/var/lib/urandom/random-seed after installing the system).

If the init scripts aren't set up correctly, then you're screwed, yes,
but (a) all distributions that I know of do the init scripts correctly,
and the right way of doing this is documented in random.c, and (b) if
user space is screwed up in other ways, you can be even worse off.

What if someone accesses the seed file directly before the machine
boots?  Well, either (a) they have broken root already, or (b) have
direct access to the disk.  In either case, the attacker with such
powers can just as easily trojan an executable or a kernel, and
you've got far worse problems to worry about than the piddling worries
of (cue Gilbert and Sullivan overly dramatic music, <Oh, horror!>) the
dreaded state-extension attack.

> It tries to divide up the seed entropy into sub-pools and hold off
> re-seeding the main pool until the sub-pool has accumulated enough entropy
> to cause "catastrophic reseeding" of the main pool, adding enough entropy
> at once that someone who had captured the prior state of the main pool
> would not be able (due to computational limits and the one-way nature
> of the pool output function) to reverse-engineer the post-state.

Actually, if you check the current random.c code, you'll see that it
has catastrophic reseeding in its design already.

> So Fortuna would be helped by some better understanding of what exactly
> makes it fail, so the discussion could move to whether real-world
> seed sources have those properties.
>
> But until that understanding is gained, Fortuna is questionable.
>...
> Until this cloud is dissipated by further analysis, it's not possible to
> say "this is shiny and new and better; they's use it!" in good conscience.

My big concern with Fortuna is that it really is the result of
cryptographic masturbation.  It fundamentally assumes that crypto
primitives are secure (when the recent break of MD4, MD5, and now SHA1
should have been a warning that this is a Bad Idea (tm)), and makes
assumptions that have no real world significance (assume the attacker
has enough privileges that he might as well be superuser and can
trojan your system to a fare-thee-well ---- now, if you can't recover
from a state extension attack, your random number generator is fatally
flawed.)

In addition, Fortuna is profligate with entropy, and wastes it in
order to be able to make certain claims of being able to recover from a
state-extension attack.  Hopefully everyone agrees that entropy
collected from the hardware is precious (especially if you don't have
a special-purpose hardware RNG), and shouldn't be wasted.  Wasting
collected entropy for no benefit, only to protect against a largely
theoretical attack --- where if a bad guy has enough privileges to
compromise your RNG state, there are far easier ways to compromise
your entire system, not just the RNG --- is Just Stupid(tm).

						- Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 7/14] random: Remove SA_SAMPLE_RANDOM from network drivers
Date: Fri, 05 May 2006 19:13:19 UTC
Message-ID: <fa.h9jtJRM6Y2HQXRKbEKBHDJfeUGw@ifi.uio.no>
Original-Message-ID: <20060505191127.GA16076@thunk.org>

On Fri, May 05, 2006 at 12:24:26PM -0500, Matt Mackall wrote:
> I haven't seen such an analysis, scholarly or otherwise and my bias
> here is to lean towards the paranoid.
>
> Assuming a machine with no TSC and an otherwise quiescent ethernet
> (hackers burning the midnight oil), I think most of the
> hard-to-analyze bits above get pretty transparent.

As always, whether or not the packet arrival times could be guessable
and/or controlled by an attacker really depends on your threat model.
For someone who has an ethernet monitor attached directly to the
segment right next to your computer, it's very likely that they would
be successful in guessing the inputs into the entropy pool.  However,
an attacker with physical access to your machine could probably do all
sorts of other things, such as install a keyboard sniffer, etc.

For a remote attacker, life gets much more difficult.  Each switch,
router, and bridge effectively has a queue through which packets must
flow, and that is _not_ known to a remote attacker.  This is
especially true today, when most people don't even use repeaters, but
rather switches/bridges, which effectively make each ethernet connection
to each host its own separate collision domain (indeed that term
doesn't even apply for modern high-speed ethernets).

I've always thought the right answer is that whether or not network
packet arrival times should be used as entropy input should be
configurable, since depending on the environment, it might or might
not be safe, and for some hosts (particularly diskless servers), the
network might be the only source of entropy available to them.

						- Ted



From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 7/14] random: Remove SA_SAMPLE_RANDOM from network drivers
Date: Sat, 06 May 2006 11:55:50 UTC
Message-ID: <fa.zh0XETUrWu2aAddQBmdm7RMLoes@ifi.uio.no>
Original-Message-ID: <20060506115502.GB18880@thunk.org>

On Fri, May 05, 2006 at 03:34:37PM -0500, Matt Mackall wrote:
> Nonetheless, the current SA_SAMPLE_RANDOM scheme should go. A) it's in
> the IRQ fast path B) most of its users are bogus which strongly
> indicates it's a bad API.
>
> Instead (if we want network entropy) we should add an
> add_network_randomness call in some central location in the network
> stack (probably right next to netpoll's RX hooks) and probably have it
> compiled out by default.

I disagree.  It really wants to be a run-time controllable thing, and
probably on a per-interface/per-device driver basis, since it should
be up to the system administrator who is deploying the box, not the
developer or distribution compiling the kernel, to make that
determination.

Also, the entropy sampling *really* wants to be done in the hard IRQ
handling path, not some place higher in the stack since the scheduler
would smooth out the unpredictable timing information.  Moving it into
the interrupt routines is in fact a problem for CONFIG_PREEMPT_RT,
since the device driver's interrupt handlers become a schedulable
entity, since the IRQ handling is moved into a separate kernel thread.
So I would much prefer to see the entropy sampling stay in its current
location, since people using real-time deserve real randomness too.
(In fact, some of them may have a **much** stronger need for it.  :-)

As far as your reasons that you've given, (A) I find it hard to
believe a single conditional jump is really going to be measurable,
and (B) sure, fix the bad users, but the downside of screwing up
SA_SAMPLE_RANDOM is fairly limited; it only messes up the entropy
estimator, and in practice most users are using /dev/urandom anyway.
The random driver's algorithms are designed so that /dev/random can be
world writable, and an attacker can write arbitrary data, including all
zeros, without degrading the entropy in the pool.(*) Hence, while a
bad user of SA_SAMPLE_RANDOM should be fixed, it is hardly a
catastrophic failure.

						- Ted

(*)So if User A writes data which he knows into /dev/random, it
doesn't help User A guess what /dev/random or /dev/urandom might
produce next --- and if User B doesn't know what User A has written
into the pool, User B's job just got harder.  So if Users B, C, and D
all do the same thing, the net result is that effective
unpredictability of the random pool has increased for everyone.)


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: /dev/random on Linux
Date: Tue, 16 May 2006 20:18:32 UTC
Message-ID: <fa.fhP/LUbpEd4H0OAUmsMzkREjASg@ifi.uio.no>
Original-Message-ID: <20060516201749.GA10077@thunk.org>

On Tue, May 16, 2006 at 04:54:00PM +0300, Zvi Gutterman wrote:
>
> I did not get any answer from Matt and was sure that it was of no interest.
> This was my mistake, sorry for not sending it earlier to more people.

Note that Matt took over maintenance from me, but I was the original
designer of most of the /dev/random algorithms (with much help from
Colin Plumb).  I haven't had time to do a full analysis of the paper,
but let me give a couple comments to set the context.

First of all, the Linux /dev/random generator has been around for a
long time; I did the original implementation back in 1994, back when
the crypto iron curtain was still being maintained by the US
Government.  (As far as I know, the Linux /dev/random driver was the
first OS-based RNG and predates efforts present in other systems such
as OpenBSD, et. al.  Not being an academic trying to get tenure, I
never bothered to write a paper back in '94, so I guess that means I
don't get any credit, which is probably fair if you're playing the
academic game.  Publish or perish, after all.  :-)  In any case,
because of the U.S.  export controls on cryptographic algorithms at
that time, in particular anything relating to encryption algorithms, I
chose to use cryptographic hashes as its primary non-linear component,
since these were not problematic from an export control perspective.

In addition, the design philosophy is somewhat different from what has
been considered the general wisdom at least in academic circles.
First of all, as noted above, export controls were a major issue.  In
addition, in that time period, flaws in MD4 and MD5 were coming to
light, and it was clear that some cryptographic algorithms provided by
the US government (in particular, DSA) were deliberately designed to
prevent their usefulness in anything other than their released context
(i.e., there are many who believe that one of DSA's primary design
criteria was that it could only be used for digital signatures, and
*not* provide encryption functionality).  So one of the key design
criteria was that /dev/random use its cryptographic primitives in ways
that were non-brittle --- that is, even if the primary use of that
cryptographic primitive were compromised (e.g., the recent collision
attacks against SHA-1), the security of the system would not be
compromised.

One of the reasons why I did not choose a construction such as Bruce
Schneier's Yarrow, which encrypts an incrementing counter as its
output stage, is that repeated encryption of plaintexts that differ
from each other by a small Hamming distance is one of the
things that cryptographers look for when attacking a cipher.  Maybe
modern encryption systems are so good that we don't have to worry
about such things any more (and certainly most academic papers ignore
such engineering considerations), but I like to take a much more
humble attitude towards making such assumptions, since I *know* the
cryptographers at the NSA know more than I do, and underestimating
them or any other state-sponsored cryptanalytic organization is
generally not a good idea.  In any case, if AES is perfect, then yes,
using a key to encrypt a counter is probably good enough --- but
I'm not sure I want to make that assumption.
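To make the concern concrete: a Yarrow-style output stage produces its
i-th output block by encrypting the counter value i under a fixed key,
so consecutive plaintexts fed to the cipher differ in only a few bit
positions.  The toy program below (illustrative only; no cipher is
involved, and the popcount builtin is a GCC/Clang extension) just shows
how closely related those plaintexts are:

    #include <stdint.h>
    #include <stdio.h>

    /* Hamming distance between consecutive counter values, i.e. how
     * "related" the plaintexts given to a counter-mode output cipher are. */
    int main(void)
    {
            uint64_t ctr;

            for (ctr = 0; ctr < 8; ctr++) {
                    uint64_t diff = ctr ^ (ctr + 1);
                    printf("counter %llu -> %llu: %d bit(s) differ\n",
                           (unsigned long long)ctr,
                           (unsigned long long)(ctr + 1),
                           __builtin_popcountll(diff));   /* GCC/Clang builtin */
            }
            return 0;
    }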

Secondly, attacks that assume that the attacker can obtain the
contents of the pool were not a major consideration, mainly because
practically speaking, if an attacker can obtain the contents of the
internal state of the RNG, the attacker generally has enough
privileges that the entire system has been compromised, meaning that
the system can be rooted to a fare-thee-well --- all future session
keys and data to be encrypted can then be collected, etc.  So attacks
of the form, "assume that the attacker gains access to the pool, but has
no other illicit access to the system" are not particularly practical in
the real world, and are mostly of concern to academics.

This is particularly true of the criteria put forward by the designers
of the Yarrow RNG, where they were concerned about how quickly the RNG
could recover after the pool state got compromised.  My argument was
always that if an attacker had enough privileges to compromise the
internal state of the RNG, in the real world, recovery would require
re-installing from trusted media.

The desire for forward security is different, admittedly; here the
concern is that if the system has already been compromised, you don't
want to make it easy for the attacker to find previously generated session
keys or even long-term keys.  However, it should be noted that it
takes O(2**64) operations to go back a single turn of the crank --- but
that only helps you get the last 10 bytes (80 bits) that were last
generated.  In order to get the next 80 bits, you have to do another
O(2**64) amount of work, and sooner or later (within 18 turns of the
crank, best case) you will run into one of the "unlucky" indexes where
it will take you at least O(2**96) of work effort.  To put this in
perspective, generating a 1024 bit RSA key will require approximately
14 turns of the crank, so if you are lucky with the positioning of the
index *and* you penetrate the machine and capture the state of the
pool (which as I mentioned, probably means you've rooted the box and
the system will probably have to be reinstalled from trusted media
anyway), *and* a 1024-bit RSA key had just been generated, you would
be able to determine that 1024-bit RSA key with a work factor of
approximately O(2**68) if you are lucky (probability 1 in 8; fourteen
turns at O(2**64) apiece is roughly 2**68), and O(2**96) if you are
not.  Should we try to do better?  Sure, but this is
not the sort of thing where you have to keep a secret and only talk to
the official maintainer about it.

I would also note that one of the things Matt did after he took over
maintainership from me is that he took out a call to
add_timer_randomness() from the entropy extraction code.  This bit of
code would have also significantly complicated practical backtracking
attacks, since it mixes in the TSC cycle counter at the
time of the entropy extraction into the pool.  So even if you capture
the state of the pool, unless you know exactly *when* each of the
various entropy extractions took place, you wouldn't be able to easily
calculate the various candidate pools.
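Schematically, the idea looks like the userspace toy model below.  This
is an illustrative sketch only, not the kernel's code: the "pool" is a
tiny array, the clock_gettime() call stands in for the TSC, and a real
extraction would hash the whole pool rather than print a word.  The
point is simply that the extraction-time clock gets folded back in, so
capturing the pool alone is not enough; the attacker also has to know
when each extraction happened.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    static uint32_t pool[16];

    static void mix_into_pool(const void *buf, size_t len)
    {
            const unsigned char *p = buf;
            size_t i;

            for (i = 0; i < len; i++)
                    pool[i % 16] ^= (uint32_t)p[i] << ((i % 4) * 8);
    }

    static void extract(void)
    {
            struct timespec ts;

            clock_gettime(CLOCK_MONOTONIC, &ts);     /* stand-in for the TSC */
            mix_into_pool(&ts.tv_nsec, sizeof(ts.tv_nsec));

            /* A real extraction would hash the whole pool here; one word
             * is enough to show the output depends on the timestamp. */
            printf("extracted word: %08x\n", pool[0]);
    }

    int main(void)
    {
            const char *seed = "initial seed material";

            mix_into_pool(seed, strlen(seed));
            extract();
            extract();
            return 0;
    }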

More critically, the paper is seriously flawed in its analysis of
reversing the primary pool.  The only reason why the generic attack
takes only O(2**96) when the secondary and urandom pools are only 32
words long is that the entropy extraction routine only calls
add_entropy_words() ceil(n/16)+1 times.  However, the primary entropy
pool is 128 words long, which means that add_entropy_words() is called
9 times.  The paper recognizes this in Appendix B, which shows that j,
j-1, j-2, ... j-8 are modified when extracting entropy from the
primary pool (nine modified 32-bit words, or 288 unknown bits, instead
of three).  This makes the generic attack take work factor
before it can be carried out on the primary pool.  This also points
out that the simple expedient of doubling the size of the secondary
and urandom pools would effectively eliminate the practicality of
carrying out a forward attack.

In section 3.4, a comment is made about guessable passwords to seed
the entropy pool, and a recommendation was made to remove the value of
the keyboard from the input to the pool.  I would reject that
recommendation, since the entropy estimate is calculated based only on
the timing of the keystrokes, and the inter-key timing values are mixed
in *in* *addition* *to* the value of the keys typed.  If the user types a
known or guessable password, then yes, the value of what is typed is no
help in adding entropy to the pool --- but it doesn't hurt, either.  I
have no argument with the observation that the main source of entropy
added is the inter-key arrival time, and even there we are extremely
conservative on estimating the amount of entropy found since obviously
the "fist" of a typist could be analyzed and used to provide some
information about inter-key arrival times, particularly if a
high-resolution time source is not available.
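A toy sketch of that policy (illustrative only; the names, the credit
formula, and the millisecond timestamps are simplified stand-ins for
what random.c actually does): both the keycode and its timestamp are
mixed in, but only the inter-key timing delta earns any entropy credit,
and even that is capped conservatively.

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t pool[16];
    static uint64_t last_time;
    static int entropy_count;                 /* estimated entropy, in bits */

    static void mix_word(uint32_t w)
    {
            static int i;
            pool[i++ % 16] ^= w;
    }

    /* Simplified credit rule: roughly log2 of the timing delta, capped. */
    static int credit_from_delta(uint64_t delta)
    {
            int bits = 0;

            while (delta > 1 && bits < 12) {
                    delta >>= 1;
                    bits++;
            }
            return bits;
    }

    static void add_keyboard_event(uint32_t keycode, uint64_t now_ms)
    {
            uint64_t delta = now_ms - last_time;
            last_time = now_ms;

            mix_word(keycode);             /* key value: mixed, never credited */
            mix_word((uint32_t)now_ms);    /* timing: mixed ...               */
            entropy_count += credit_from_delta(delta);   /* ... and credited  */
    }

    int main(void)
    {
            /* Simulate typing a known password with varying inter-key gaps. */
            add_keyboard_event('p', 1000);
            add_keyboard_event('a', 1087);
            add_keyboard_event('s', 1152);
            add_keyboard_event('s', 1260);
            printf("entropy credited: %d bits\n", entropy_count);
            return 0;
    }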

Speaking generally about usage scenarios such as the OpenWRT Linux
distribution, if there is no hard disk and relatively little user
input, the amount of entropy that can be gathered will suffer.  That
however isn't the fault of the /dev/random driver, but how it is put
to use.  One of the things which I would strongly recommend to
security applications, such as those found in the OpenWRT Linux
distribution, is to calculate the cryptographic hash of data which is
to be sent out encrypted, and write that into /dev/random, thus mixing
that hash into the entropy pool.  This will not provide any entropy
credits (since at least one entity would have knowledge of the value
of the hash).  However, to attackers that do *not* have access to the
plaintext of the encrypted communications sent out by the application,
this unknown data will significantly complicate the life of attackers
attempting to analyze the state of the random number generator.  In
the absence of hard drive-generated entropy or human-generated timing
data from a keyboard or pointing device, it's certainly better than
nothing, and in practice is quite effective.
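A minimal sketch of that recommendation (the use of OpenSSL's SHA1() is
an assumption of convenience; any cryptographic hash would do, and no
entropy credit is requested or given):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <openssl/sha.h>

    /* Hash data that is about to be sent out encrypted and stir the hash
     * into the pool.  This only makes the pool state harder to guess for
     * anyone who does not have the plaintext. */
    static void stir_pool_with(const unsigned char *buf, size_t len)
    {
            unsigned char digest[SHA_DIGEST_LENGTH];
            int fd;

            SHA1(buf, len, digest);

            fd = open("/dev/random", O_WRONLY);
            if (fd >= 0) {
                    write(fd, digest, sizeof(digest));
                    close(fd);
            }
    }

    int main(void)
    {
            const char *msg = "payload about to be encrypted and sent";

            stir_pool_with((const unsigned char *)msg, strlen(msg));
            return 0;
    }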

All of this being said, I think the paper does make a number of good
points, and I would recommend making the following changes to Linux's
/dev/random:

1) Restore the add_timer_randomness() call to the entropy extraction
function.  This prevents the forward security attack unless the
attacker can know the exact time when each of the entropy extractions
took place, which in general is going to be quite impractical.

2) Double the size of the secondary and urandom pools from 128 to 256
bytes.

3) Investigate the possibility of adding quotas to /dev/random.  This
is actually much trickier than the paper suggests, since you want
to allow the user to be able to extract enough entropy to create a
2048 bit (or at least a 1024-bit) RSA key.  The problem is that's a
lot of entropy!  Maybe it would be OK to only allow a 1024-bit RSA key
to be generated every 12 or 24 hours, but suppose someone is
experimenting with GPG and screws up (say they forget their
passphrase); do you tell them that sorry, they can't generate another
key until tomorrow?  So now we have to have an interface so the root
user can reset the user's entropy quota....  And even with a 24-hour
limit, on a diskless system, you don't get a lot of entropy, so even a
1024-bit RSA key could seriously deplete your supply of entropy.

This last point is a good example of the concerns one faces when
trying to design a working system in the real world, as opposed to the
concerns of academicians, where the presence or lack of forward
security in the event of a pool compromise is an issue of massive urgency
and oh-my-goodness-we-can-only-tell-the-maintainer-because-it's-such-a-
critical-security-hole attitude.  Whereas my attitude is, "yeah, we
should fix it, but I doubt anyone has actually been harmed by this in
real life", which puts it in a different category than a buffer
overrun attack which is accessible from a publicly available network
service.

						- Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Entropy Pool Contents
Date: Fri, 24 Nov 2006 00:49:38 UTC
Message-ID: <fa./gcEIc9WbqoVsYVYZEVXRvM68yA@ifi.uio.no>

On Thu, Nov 23, 2006 at 01:10:08AM +0100, Jan Engelhardt wrote:
> Disk activities are "somewhat predictable", like network traffic, and
> hence are not (or should not - have not checked it) contribute to the
> pool. Note that urandom is the device which _always_ gives you data, and
> when the pool is exhausted, returns pseudorandom data.

Please read the following article before making such assertions:

	D. Davis, R. Ihaka, P.R. Fenstermacher, "Cryptographic
	Randomness from Air Turbulence in Disk Drives", in Advances in
	Cryptology -- CRYPTO '94 Conference Proceedings, edited by Yvo
	G. Desmedt, pp.114--120. Lecture Notes in Computer Science
	#839. Heidelberg: Springer-Verlag, 1994.
	http://world.std.com/~dtd/random/forward.ps

Regards,

						- Ted



From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Seeding /dev/random not working
Date: Tue, 29 May 2007 13:15:31 UTC
Message-ID: <fa.7SZxrn0+kR7XjCtxENf2H6okmRE@ifi.uio.no>

On Tue, May 29, 2007 at 12:53:10PM +0100, M Macnair wrote:
> I have two embedded boards (one ARM, one PowerPC), running two
> different versions of 2.6.  They have no hard drives, keyboards or
> mice.  They each have a NIC, but I understand these make no
> contribution to the entropy pool.
>
> 	if [ -f $random_seed ]; then
> 		cat $random_seed >/dev/urandom  # should seed the pool
> 	fi
> 	dd if=/dev/urandom of=$random_seed count=1 2>/dev/null # save some
> data from urandom for next boot
>
> I have rebooted my boards many times, and after each boot I read the
> contents of $random_seed.  Whilst it does not happen every time, the
> contents of $random_seed are /often the same/.  To give you a feel:
> rebooted 11 times, got a total of 3 different outputs.

Ok, so this is telling me a couple of things.  First of all, if you're
only getting three outputs, it means that you don't have any
peripherals feeding entropy into the system from the boot sequence.
Without any hard drives, keyboards or mice, and a NIC whose device
driver hasn't been configured to feed entropy, you're definitely
hosed.

Secondly, and more importantly, your boot scripts aren't set up
correctly.  You really want to make sure that you save some
entropy at shutdown, i.e., run at shutdown the same sequence that was
run at boot time.  (You don't really need the "cat $random_seed >
/dev/urandom", but it's harmless.)  Repeating this sequence at
shutdown is useful because it saves any entropy gathered from the
current boot session and saves it to be used at the next boot session.
I've noticed a number of distros have gotten this wrong, and I guess
I should have filed more bugs to get this fixed.  In your case, since
you don't have anything wired up to feed entropy into your pool, it
will make /dev/urandom effectively a pseudo-random number generator,
but at least you won't get repeating outputs.
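A minimal sketch of a seed save/restore helper (illustrative only; the
seed file path is arbitrary, and a real init script would also handle
permissions and error cases).  Run it with "load" early at boot, with
"save" right afterwards, and again with "save" at shutdown, so the next
boot always has a fresh seed available:

    /* seedtool.c -- illustrative only.  "seedtool load" writes the saved
     * seed into /dev/urandom (mixes it in, no entropy credit); "seedtool
     * save" stores fresh bytes from /dev/urandom for the next boot. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define SEED_FILE "/var/lib/random-seed"   /* arbitrary path for the sketch */
    #define SEED_SIZE 512

    static int copy(const char *from, const char *to)
    {
            unsigned char buf[SEED_SIZE];
            int in = open(from, O_RDONLY);
            int out = open(to, O_WRONLY | O_CREAT | O_TRUNC, 0600);
            ssize_t n;

            if (in < 0 || out < 0)
                    return -1;
            n = read(in, buf, sizeof(buf));
            if (n > 0)
                    write(out, buf, n);
            close(in);
            close(out);
            return 0;
    }

    int main(int argc, char **argv)
    {
            if (argc == 2 && strcmp(argv[1], "load") == 0)
                    return copy(SEED_FILE, "/dev/urandom");
            if (argc == 2 && strcmp(argv[1], "save") == 0)
                    return copy("/dev/urandom", SEED_FILE);
            fprintf(stderr, "usage: %s load|save\n", argv[0]);
            return 1;
    }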

Another thing which I noticed is that when Matt Mackall took over
maintainership of /dev/random, he apparently took out one of the
safeguards I had, which was that previously, when entropy was extracted
from the pool, the time stamp of the extraction was mixed back into
the pool.  The theory was that an external attacker might not know
when a program might be calling /dev/random, so mixing in the time at
which entropy was extracted wouldn't hurt, and might help.  I'll submit
a patch to add that support back in, which will help you a little.

HOWEVER, the more important thing to remember is that /dev/random
isn't magic.  If you don't have any entropy sources to mix in, and you
don't do anything at the application level to mix in data that might
not be known by external attackers, you will always be vulnerable,
especially if your threat model includes government funded
intelligence agencies.  So if you are planning on using /dev/urandom
for cryptographic purposes, while some of these fixes will avoid at
least the embarrassing result of repeating outputs from /dev/urandom,
it's not really going to result in high-quality unpredictable results
against a well-funded, determined adversary.

If that is your threat model, then you really should consider adding
a hardware random number generator to your product design.

Regards,

					- Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Seeding /dev/random not working
Date: Tue, 29 May 2007 16:31:23 UTC
Message-ID: <fa.CKB+oHdEv/zKg3/rQBukze31Reo@ifi.uio.no>

On Tue, May 29, 2007 at 02:14:56PM +0000, Pavel Machek wrote:
> Can we get at least time-of-boot from rtc clock to the pool? We really
> should not be getting identical outputs...

We are mixing the time-of-day at boot time into the pool already,
using ktime_get_real() in random.c:init_std_data().  The problem
though is if the board doesn't have a valid RTC clock, or its battery
is dead, and the time-of-day clock is getting set by userspace, then
unless userspace also mixes in the time once it's been set via ntpdate
or whatever, it won't help the entropy pool.
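A userspace sketch of that mix-in (illustrative only; it would be run
right after ntpdate or whatever has set the clock, and asks for no
entropy credit):

    #include <fcntl.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* After the clock has been set from the network, stir the now-valid
     * time of day into the pool, to make the pool state a bit less
     * predictable on boards without a working RTC. */
    int main(void)
    {
            struct timeval tv;
            int fd;

            gettimeofday(&tv, NULL);
            fd = open("/dev/urandom", O_WRONLY);
            if (fd >= 0) {
                    write(fd, &tv, sizeof(tv));
                    close(fd);
            }
            return 0;
    }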

						- Ted



From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Why does reading from /dev/urandom deplete entropy so much?
Date: Wed, 05 Dec 2007 13:33:53 UTC
Message-ID: <fa.PPfmnxpDbhntV44ZdCD6t9292+4@ifi.uio.no>

On Wed, Dec 05, 2007 at 01:29:12PM +0100, Marc Haber wrote:
> On Tue, Dec 04, 2007 at 05:18:11PM +0100, Adrian Bunk wrote:
> > On Tue, Dec 04, 2007 at 12:41:25PM +0100, Marc Haber wrote:
> > > While debugging Exim4's GnuTLS interface, I recently found out that
> > > reading from /dev/urandom depletes entropy as much as reading from
> > > /dev/random would. This has somehow surprised me since I have always
> > > believed that /dev/urandom has lower quality entropy than /dev/random,
> > > but lots of it.
> >
> > man 4 random
>
> Thanks for this pointer, I was not aware of the documentation. After
> reading this thread and the docs, I am now convinced that GnuTLS
> should seed a PRNG from /dev/(u)random instead of using the entropy
> directly. I will go filing a bug against GnuTLS.

BTW, note that it would be a polite thing for GnuTLS, when it is
encrypting data (which represents information that might not be
available to an adversary), to SHA1-hash it (out of paranoia) and feed
the hash to /dev/random.

This won't give any "credits" to the random entropy counter, but to
the extent that it is information that isn't available to the adversary,
it adds additional uncertainty to the random pool.

   		   	       	      	     - Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: Why does reading from /dev/urandom deplete entropy so much?
Date: Sat, 08 Dec 2007 17:50:49 UTC
Message-ID: <fa.Hr2j1F/zeIbEkcPNm8mnC4NDliI@ifi.uio.no>

On Sat, Dec 08, 2007 at 11:33:57AM -0600, Mike McGrath wrote:
>> Huh?  What's the concern?  All you are submitting is a list of
>> hardware devices in your system.  That's hardly anything sensitive....
>
> We actually had a very vocal minority about all of that which ended up
> putting us in the unfortunate position of generating a random UUID instead
> of using a hardware UUID from hal :-/

Tinfoil hat responses indeed!  Ok, if those folks are really that
crazy, my suggestion then would be to do a "ifconfig -a > /dev/random"
before generating the UUID, and/or waiting until you are just about to
send the first profile, and/or, if you don't yet have a UUID,
generating it at that very moment.  The first will mix in the MAC
address into the random pool, which will help guarantee uniqueness,
and waiting until just before you send the result will mean it is much
more likely that the random pool will have collected some entropy from
user I/O, thus making the random UUID not only unique, but also
unpredictable.

						- Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: entropy gathering (was Re: Why does reading from /dev/urandom
Date: Sat, 08 Dec 2007 20:32:59 UTC
Message-ID: <fa.hat9d+7bqEzDUFgPaac4oC3gGCs@ifi.uio.no>

On Sat, Dec 08, 2007 at 03:04:32PM -0500, Jeff Garzik wrote:
> That's a bit of a tangent on a tangent.  :)  Most people don't have a
> hardware RNG.

Actually, most business-class laptops from IBM/Lenovo, HP, Dell,
Fujitsu, and Sony *do* have TPM chips that, among other things,
contain a slow but (supposedly, if the TPM microprocessors are to be
believed) secure hardware random number generator for use as a session
key generator.  This is thanks to various US legal mandates, such as
HIPAA for the medical industry, and not just the paranoid ravings of
the MPAA and RIAA.  :-)

The problem is enabling the TPM isn't trivial, and life gets harder if
you want the TPM chip to simultaneously work on dual-boot machines for
both Windows and Linux, but it is certainly doable.

>> I think we should re-evaluate having an internal path from the hwrngs
>> to /dev/[u]random, which will reduce the need for userspace config
>> that can go wrong.

I think the userspace config problems were mainly due to the fact that
there wasn't a single official userspace utility package for the
random number subsystem.  Comments in drivers/char/random.c for how to
set up /etc/init.d/random are Just Not Enough.

If we had a single, official random number generator package that
contained the configuration, init.d script, as well as the daemon that
can do all sorts of different things that you really, Really, REALLY
want to do in userspace, including:

  * FIPS testing (as Jeff suggested --- making sure what you think is
    randomness isn't 60Hz hum is a Really Good Idea :-)
  * access to TPM (if available --- I have a vague memory that you may
    need access to the TPM key to access any of its functions, and the
    TPM key is stored in the filesystem)
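For concreteness, a stripped-down sketch of what such a daemon's core
loop might look like (illustrative only: /dev/hwrng is an assumption
about the device name, the "sanity test" shown is a trivial stand-in
for real FIPS tests, the entropy credit is deliberately conservative,
and RNDADDENTROPY requires root):

    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/random.h>      /* RNDADDENTROPY, struct rand_pool_info */

    #define CHUNK 32               /* bytes per batch */

    /* Trivial stand-in for a real FIPS-style test suite: reject a batch
     * that is all zeros or all ones.  A real daemon would run the full
     * monobit/poker/runs tests (or better) here. */
    static int looks_sane(const unsigned char *buf, size_t len)
    {
            size_t i, ones = 0;

            for (i = 0; i < len; i++)
                    ones += __builtin_popcount(buf[i]);   /* GCC/Clang builtin */
            return ones != 0 && ones != len * 8;
    }

    int main(void)
    {
            int hw = open("/dev/hwrng", O_RDONLY);
            int rnd = open("/dev/random", O_RDWR);
            struct rand_pool_info *info = malloc(sizeof(*info) + CHUNK);

            if (hw < 0 || rnd < 0 || !info)
                    return 1;

            while (read(hw, info->buf, CHUNK) == CHUNK) {
                    if (!looks_sane((unsigned char *)info->buf, CHUNK))
                            continue;              /* drop suspect batches */
                    info->buf_size = CHUNK;
                    info->entropy_count = CHUNK * 4;   /* conservative credit,
                                                          in bits */
                    ioctl(rnd, RNDADDENTROPY, info);
            }
            return 0;
    }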

So....  anyone interested in belling the metaphorical cat?   :-)

	       		     	     	 	      - Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: /dev/urandom uses uninit bytes, leaks user data
Date: Sat, 15 Dec 2007 04:33:10 UTC
Message-ID: <fa.+/pX2FM8RnUodPuNwH+iiP7/Ic8@ifi.uio.no>

On Fri, Dec 14, 2007 at 04:30:08PM -0800, John Reiser wrote:
> There is a path that goes from user data into the pool.  This path
> is subject to manipulation by an attacker, for both reading and
> writing.  Are you going to guarantee that in five years nobody
> will discover a way to take advantage of it?  Five years ago
> there were no public attacks against MD5 except brute force;
> now MD5 is on the "weak" list.

Yep, I'm confident about making such a guarantee.  Very confident.

First of all, keep in mind that the attacks on MD5 are about being
able to find hash collisions.  The way the cryptographic hash is being
used in /dev/random, merely being able to find hash collisions means
squat.  You need to be able to carry out a preimage attack, which no
one has been able to do yet.  And even if you could figure out how to
do a preimage attack (which none of the "successful attacks" on MD4,
MD5, SHA-0, HAVAL-128, RIPEMD, et al. have been able to accomplish), we
don't give you the entire hash output; instead, what you get is the
hash folded in half via XOR, so you only get the two halves of the SHA
hash XOR'ed together to form 80 bits.
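Concretely, the folding step looks something like the sketch below (an
illustration of the idea, not the literal extract code; the digest
value is made up): take the 160-bit SHA-1 result and XOR its two 80-bit
halves, so a full hash output is never exposed.

    #include <stdio.h>

    /* Fold a 20-byte (160-bit) SHA-1 digest in half: XOR byte i with
     * byte i+10, yielding 80 bits of output. */
    static void fold_digest(const unsigned char digest[20], unsigned char out[10])
    {
            int i;

            for (i = 0; i < 10; i++)
                    out[i] = digest[i] ^ digest[i + 10];
    }

    int main(void)
    {
            /* Made-up digest, standing in for SHA-1 of the pool contents. */
            unsigned char digest[20] = {
                    0xde, 0xad, 0xbe, 0xef, 0x01, 0x23, 0x45, 0x67, 0x89, 0xab,
                    0xcd, 0xef, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
            };
            unsigned char out[10];
            int i;

            fold_digest(digest, out);
            for (i = 0; i < 10; i++)
                    printf("%02x", out[i]);
            printf("\n");
            return 0;
    }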

So given one of these folded 80-bit hashes, you need to figure out
which of the large number of possible combinations of the 1024 bits
that were in the secondary entropy pool could have resulted in the
folded hash image.  And using the pigeon-hole principle and assuming
that SHA approximates a random function, you need to figure out which
one of the roughly 2**944 possible combinations of 1024 bits was the
correct pool preimage that generated those 80 bits.  That's a hard
problem.

But secondly, even *that's* not enough.  As I said earlier, the pool
is simply unavailable to the attacker; we never make it available,
except by revealing 80-bit hashes of the pool.  So you can't read the
initial or current state of the pool without first breaking root ---
and after 3 bytes of kernel stack are mixed into the pool, via an XOR
operation, there is no way to read out the pool.  And if you don't
know the initial contents of the pool --- funny thing, but UNKNOWN XOR
KNOWN == UNKNOWN.  So here I'm not even relying on cryptographers of
the future not being able to find preimage attacks.  I'm just relying
on simple logic.

						- Ted


From: Theodore Tso <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: /dev/urandom uses uninit bytes, leaks user data
Date: Tue, 18 Dec 2007 03:06:16 UTC
Message-ID: <fa.UCiG0CHgnkZeIHu+bd9zUAvAPhc@ifi.uio.no>

On Mon, Dec 17, 2007 at 07:52:53PM -0500, Andy Lutomirski wrote:
> It runs on a freshly booted machine (no
> DSA involved, so we're not automatically hosed), so an attacker knows the
> initial pool state.

Not just a freshly booted system.  The system has to be a freshly
booted, AND freshly installed system.  Normally you mix in a random
seed at boot time.  And during the boot sequence, the block I/O will
be mixing randomness into the entropy pool, and as the user logs in,
the keyboard and mouse will be mixing more entropy into the pool.  So
you'll have to assume that all entropy inputs have somehow been
disabled as well.

BUT --- if the pool state is totally known, you're really, REALLY,
REALLY hosed, since normally /dev/random might get used to initialize
a CRNG to *generate* the ephemeral key.  So the danger is not *3*
*bytes* of the ephemeral key accidentally getting mixed into the
entropy pool, followed by an attacker managing to crack the system so
bad that he or she has read access into kernel memory (without
managing to mix more entropy into the pool), and then doing
sophisticated cryptanalytic attacks with an O(2**24) order time, to
try to leak the *3* *bytes* of ephemeral key.  No, the problem is
that the attacker, with access to the known initial state of the pool,
will be able to find out *THE* *ENTIRE* *EPHEMERAL* *KEY*, since it was
probably generated via /dev/random --- and without needing to break
into the system with sufficient privs to be able to read kernel
memory.

So it is your argument which is absurd.  If you're going to assume a
completely known pool state, and then assume some program is using
/dev/random, the danger is not in the chance that some previous kernel
stack state might contain something secret that could theoretically be
revealed after an attacker manages to break into machine as root.  No,
the real danger is in what this presumably cryptographically sensitive
program did with the predictable data from /dev/random.

So the real answer is that there needs to be sufficient randomness
mixed into /dev/random.  For a typical Linux system, it's there.
There are some potential configurations (say, a pure diskless system
with NFS remote filesystems and no keyboard or mouse) that might be used
in some kind of security-sensitive setting.  Now, someone who wanted to
say, run a remote certificate signing server or IPSEC key management
system on such a system would arguably be performing security
malpractice, and the lack of entropy for /dev/random is probably the
least of such a system's security problems.

But in any case, instead of trying to worry about these sorts of
hypothetical worries, the much better use of time and energy is the
userspace daemon that can sample entropy from a sound device, or some
other hardware entropy available to the system.  The real issue is
being able to make sure the pool is *not* knowable to an attacker.

      	      	   	    	    - Ted

From: Theodore Tso <tytso@MIT.EDU>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH] drivers/net: remove network drivers' last few uses of
Date: Sun, 25 May 2008 23:29:10 UTC
Message-ID: <fa.NMB2G4FphbPafB9Nq8BZmbv7Xnw@ifi.uio.no>

On Mon, May 26, 2008 at 12:39:49AM +0930, Glen Turner wrote:
>
> For example, /dev/random has run out. So the output of /dev/urandom
> is now determined by previous values of /dev/random.  I then send in
> a stack of network packets at regular intervals. So the output of
> /dev/urandom is now greatly determined by those packets.  My search
> space for the resulting key is small since /dev/urandom appears to
> be random, but in fact is periodic.

That's not how it works.  Basically, as long as there is *some*
entropy in the system, even from the /var/lib/random-seed, or from
keyboard interrupts, or from mouse interrupts, which is unknown to the
attacker, in the worst case /dev/urandom will be no worse than a
cryptographic random number generator.

Even if you feed it huge amounts of known data, it won't allow you to
"influence" the cryptographic random number generator --- unless of
course SHA-1 is totally and thoroughly broken as a cryptographic hash
algorithm (invalidating all public key certificates and digital
signatures made using SHA-1 algorithm).

There is a reason why /dev/random is world-writable; it's perfectly
safe to write arbitrary amounts of data into /dev/random.  If the
attacker doesn't know what has been mixed into the entropy pool, his
life just got a lot harder.  If it is the attacker mixing known data
into the pool, it's no worse.

The problems with /dev/urandom only appear if *all* of the data
is known by the attacker --- so all of the keyboard interrupts, all of
the network interrupts, all of the mouse interrupts, the initial
random seed file --- everything.  In practice the time when this has
come up is very early in the initial install process, where there is
no random seed file, and before any interrupt entropy has had a chance
to be mixed into the pool, particularly if it is a headless (i.e., no
keyboard, no mouse, no monitor) install process.

And here there is no magic bullet.  If you are doing a headless
install, and there is no entropy, and you don't have a way of
accessing a real hardware random number generator, THIS IS NOT THE
RIGHT TIME TO BE GENERATING SSH HOST KEYS.

> I'll also note that there is a huge number of periodic packets seen by
> hosts on quiet networks -- such as a preparation VLAN where a system
> administrator might choose to run up a new machine.

If the attacker has the power to monitor your preparation/installation
network exactly when the machine is being installed, you probably have
worse problems on your hands --- for example, most distribution
installs off of CD include the RPMs, and then get on the network to
grab the security updates.  If you have an attacker on your
preparation/install VLAN, they might be able to attack your machine
and install a rootkit before you have a chance to install the security
errata RPMs.  That would be much simpler than trying to record all of
the network packet arrival times so you can try to guess the random
number generator!

		      	     	     	- Ted
