
Using /dev/urandom to generate every new ID is not optimal for a few reasons:

1) It's not better than the method I suggested, but it requires I/O for every generated ID, so you lose performance without gaining anything. In fact you could think of /dev/urandom as a more complex form of a hash function in "counter mode". Given that this application doesn't need cryptographic properties, only good distribution, the two methods are equivalent, but one is very fast and one is slower.

2) /dev/urandom in certain systems, notably OSX, at some point may block for some time; it is not guaranteed to always be able to generate a stream of random numbers. Certain operating systems may only produce a limited amount of output from /dev/urandom, based on the amount of entropy in their pool. If the seed cannot be refreshed with new entropy, the OS may block readers of /dev/urandom. I think this implementation does not make much sense, and /dev/urandom should always provide output as long as the original seed had enough entropy, but implementations vary, and you cannot ignore OSX.
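For reference, a "hash function in counter mode" ID generator along the lines described in point 1 might look like this (a sketch; the 20-byte seed size and hex output are my own assumptions, not taken from the actual code):

```python
import hashlib
import os

# One-time setup: read a random seed once at startup. After this,
# no further I/O is needed, unlike reading /dev/urandom per ID.
seed = os.urandom(20)
counter = 0

def next_id():
    """Generate an ID by hashing seed + counter (SHA1 in "counter mode").

    Distinct counter values give distinct hash inputs, so the outputs
    are well distributed even though no cryptographic guarantee is needed.
    """
    global counter
    counter += 1
    return hashlib.sha1(seed + str(counter).encode()).hexdigest()
```

Each call costs one SHA1 computation and no syscalls, which is the performance argument being made above.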



> /dev/urandom in certain systems, notably OSX, at some point may block for some time

This is not true. OS X has the same behavior as the BSDs: /dev/urandom is a compatibility symlink to /dev/random, and neither device blocks.

https://developer.apple.com/library/mac/documentation/Darwin...

"Yarrow is a fairly resilient algorithm, and is believed to be resistant to non-root attackers. The quality of its output is however dependent on regular addition of appropriate entropy. If the SecurityServer daemon fails for any reason, output quality will suffer over time without any explicit indication from the random device itself."

The only device node that's broken in this way is Linux's /dev/random.


I had to switch the implementation to use SHA1 in counter mode in the actual code, because /dev/urandom blocked on OSX. It did not require a huge amount of data to make it block.


Huh. Well that's a fairly serious bug that someone should track down and document widely. I'll see if I can reproduce it on my OS X machine.


> It's not better than the method I suggested, but it requires I/O for every generated ID, so you lose performance without gaining anything.

I highly doubt unique ID generation is going to be the bottleneck in your app. Even if it is, while your method is faster, it is not that much faster; for me in Python:

  In [13]: import hashlib, os, uuid

  In [14]: seed, dev_urand = os.urandom(20), open('/dev/urandom', 'rb')

  In [15]: %timeit hashlib.sha1(b'100'+seed).digest()
  1000000 loops, best of 3: 632 ns per loop

  In [17]: %timeit dev_urand.read(20)
  100000 loops, best of 3: 1.08 µs per loop

  In [22]: %timeit uuid.uuid4()
  10000 loops, best of 3: 15.7 µs per loop
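Outside IPython, roughly the same comparison can be scripted with the stdlib timeit module (a sketch; absolute numbers will of course vary by machine):

```python
import hashlib
import os
import timeit
import uuid

seed = os.urandom(20)

with open('/dev/urandom', 'rb') as dev_urand:
    n = 100000
    # Time each approach and report the average cost per call.
    for name, stmt in [
        ('sha1 counter', lambda: hashlib.sha1(b'100' + seed).digest()),
        ('urandom read', lambda: dev_urand.read(20)),
        ('uuid4', lambda: uuid.uuid4()),
    ]:
        t = timeit.timeit(stmt, number=n)
        print('%-12s %8.3f us/call' % (name, t / n * 1e6))
```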


In certain systems (OSX, certain *BSD), /dev/urandom is an alias for /dev/random so the generator will completely block waiting for entropy.


On FreeBSD it is the other way around: /dev/random will never block once seeded. /dev/urandom as a symlink will give the same behavior.

I haven't tried OS X, but given that it also uses a PRNG (Yarrow) for /dev/random, I would expect it to behave similarly. Are you able to reproduce the described behavior?


Do you have measurements that show that urandom is slower than a SHA1 hash on Linux? I don't have a good sense for how fast/slow the read call is (obviously you can cache the fd across reads).

Also, I can't find references to OS X or any other OS blocking on urandom reads. Can you point me to one?


Can you explain why you 'cannot ignore OSX'? I thought we were talking about server-side generation here. I... might be a bit harsh here, but I don't think anyone would _want_ to run a server on OSX. Why can't you just ignore OSX for these use cases?


Couldn't you request N blocks of bits from every call to read() and amortize the syscall cost?
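Yes; one sketch of that amortization (the buffer sizes and class name here are my own, and this assumes /dev/urandom returns full reads, as it does on Linux):

```python
import os

class BufferedURandom:
    """Read /dev/urandom in large chunks and slice small IDs out of the
    buffer, amortizing one syscall over many generated IDs."""

    def __init__(self, id_size=20, ids_per_chunk=4096):
        # Cache the fd across reads instead of reopening the device.
        self._fd = os.open('/dev/urandom', os.O_RDONLY)
        self._id_size = id_size
        self._chunk = id_size * ids_per_chunk
        self._buf = b''
        self._pos = 0

    def next_id(self):
        if self._pos >= len(self._buf):
            # One syscall refills the whole buffer.
            self._buf = os.read(self._fd, self._chunk)
            self._pos = 0
        rid = self._buf[self._pos:self._pos + self._id_size]
        self._pos += self._id_size
        return rid
```

The trade-off is that a chunk of random bytes sits in process memory, which may or may not matter depending on your threat model.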



