Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I would be very, very, careful about using PRNG to generate unique IDs.

If the ID indeed needs to be unique, I would stay as far away as possible from PRNG. That a PRNG is cyclic with some cycle size that's larger than the number of atoms in the universe does not prevent you from running into repeated sequences in a much higher probability, especially if you're reducing the space of values returned one way or another.

To some extent they're even wanted properties of a good PRNG. If you're using a PRNG whose cycle is larger than the space of distinct values it returns, there _will_ be repeated sequences of length > 1.

If your application cannot handle duplicated Ids, then use an ever incrementing counter concatenated with a serverId, add a checksum for the good cause if you want (it won't add any entropy or reduce the risk of collision)

If you really want to use a PRNG to generate unique ids, use an ever incrementing counter + a serverId + a random number generator. You can probably concatenate them any way you want, as long as the ever incrementing counter is stored in full, but the minute you truncate that one, I wouldn't expect any uniqueness anymore.

If you simply concatenate partial values coming from a PRNG, I would expect the probabilities for a collision to be much higher than cycle_length.

Using a hash function on a counter will still increase the risk of collision that simply using the counter. And using a hash function on a PRNG will also increase the risk of collision compared to the pure PRNG approach.

If your app is fine with duplicated IDs, then go ahead, use a PRNG. It's completely OK. Alternatively if you have a PRNG whose space of values is bigger than the space of UUIDs you're generating I would suspect should be fine. Same if you have a (non pseudo) RNG at hand.

EDIT : added the hash function paragraph.



Or use a cryptographically-secure PRNG (CSPRNG), which doesn't have the flaw of a cycle size that will be reached within our lifetimes. (Usually they return 128-bit or more outputs, and 2^128 is a very large number; by comparison, the standard Mersenne twister returns 32-bit outputs.)

For most applications, CSPRNGs are way, way more than fast enough. They're also readily available and well-tested on all current OSes and languages.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: