Technical Details

This page lists more technical details of the implementation of the site and exactly how it generates passwords.

What Platform and Environment is the Site Based On?

I'm a Windows .NET developer, so my platform choice reflects that:

Most of the site could be quite easily run on the Mono platform. However there are some quite windows centric calls when deriving entropy for the random number generator. Appropriate alternatives should be available on other platforms.

Where Can I Get the Source Code?

From BitBucket. Under the Apache License.

What Developer Tools do I need to Build the Site?

I'm developing using Visual Studio Express 2013 for Web. Dependencies are referenced by nuget or included as binary assets. There is no database behind the scenes.

What Random Number Generator are you Using?

The fundamental building block of any password generator is a cryptographic random number generator. The key requirements of which are:

  1. You can't determine the next result based on the current one (or after observing many results).
  2. You can't determine previous results based on the current one (or after observing many results).

The standard option on Windows .NET is to use RNGCryptoServiceProvider. But I don't want to put all my eggs in one basket, so I designed my own random number generator (which RNGCryptoServiceProvider is just one of many sources of entropy). Incidentally, I realised after designing my RNG that I'd made a more primitive version of Yarrow or Fortuna. I'd certainly have used one of them if I was starting from scratch, but it was a good learning exercise.

The implementation is in the RandomService class. An English description of the algorithm follows.

A random number generator is created per thread (IIS serves each request on a thread). The generator instance is never destroyed over the life time of the site. Although it is re-seeded periodically.

The RNG is seeded with entropy from two sources, one expensive but high quality, the other cheaper but lower quality. You can see the details of those two categories of entropy. Initialisation occurs in the RandomService constructor. The expensive entropy is passed in as an argument, while the cheap entropy is derived in the constructor. Both are combined using SHA256 to create the initial seed for the RNG.

The actual random data is based on the HMAC-SHA256 hash function. The secret key is derived from RNGCryptoServiceProvider in the RNG constructor. Each 32 byte block is produced from one iteration of HMAC-SHA256. If more than one block of 32 bytes is required to produce a password, HMAC-SHA256 is called several times to derive several blocks.

The input to each block is the previous block (or the initial seed), plus some timers and counters. At regular intervals, a new Guid is created an included as part of the input. At less regular intervals an entirely new expensive seed is obtained and incorporated into the hash function. The private method GenerateNextBlock() contains the implementation which produces each block.

At its core, the RandomService class produces 32 byte blocks. However, consuming code needs Int32s, Singles and Booleans. So there are some helper methods to convert raw bytes into integers and floats.

What Random Tests Have you Subjected Your RNG to?

Nothing other than informal testing and observation of generated passwords (the hex generation returns the raw RNG results).

I am aware of the diehard and dieharder tests used on rngs, but have not had the time to use them.

What Sources of Entropy do you use?

No matter what randomisation algorithms are used, without a good source of entropy you'll get the same numbers out the other end.

Major / Expensive Entropy

A 16kB buffer is initialised using RNGCryptoServiceProvider when the website starts. Each of the above sources are queried to retrieve some random numbers. Those numbers are written to the buffer in 64 byte blocks. Each time a block is written to the buffer, an SHA256 hash is taken of the whole buffer. This final SHA256 digest is added to the pool of seeds.

As most expensive entropy is the result of network requests, a pool is initialised at startup. This pool is refreshed with new data when it gets low. As such, expensive entropy is always available to the main RNG quickly (and if network derived data is not available RNGCryptoServiceProvider is used instead).

Minor / Cheap Entropy

The RandomSeedService class generates seeds based on the major / expensive entropy. The RandomService uses the cheap / minor sources.

What Algorithms do you use to Generate Passwords?

Each algorithm is implemented in MVC controller classes named (for example) ApiPassphraseV1Controller. Generally, they are table lookups. Where the table varies from style to style. By convention, a method SelectPhrases() / SelectPasswords() / SelectPINs() returns one or more passwords based on the parameters supplied.

More details can be found in the API documentation.

Hex Style

The hex passwords are taken directly from the random number generator without any modification. No lookups, tables or hash functions here. Just raw, unadulterated bytes.

Passphrase Style

Each word is chosen from the dictionary as a giant lookup. After the phrase is constructed, it may be rejected if it does not meet the length requirements. After 100 attempts without meeting length requirements, a null passphrase will be returned.

Pronouncable Style

Uses a lookup into vowel and consonant sounds. Some very basic logic is used to alternate between vowels and consonants, but I could do much better cleaning up double letters and so on.

Readable Passphrase Style

This calls out to a 3rd party library to generate passwords. Source code is available. After the phrase is constructed, it may be rejected if it does not meet the length requirements. After 100 attempts without meeting length requirements, a null passphrase will be returned.

PIN Style

PINs are constructed as a lookup into the digits 0..9. After a PIN is constructed, it may be rejected if it is on the blacklist.

Unicode Style

There are too many code points to do lookups into a full table of known characters. So generating Unicode passwords uses a different algorithm.

  1. A random Int32 is generated to select a candidate code point.
  2. High bits from the int is masked depending on if only code points from the basic multilingual plane are selected (or not)
  3. The candidate code point may be rejected if it lies in certain ranges (surrogate points, control codes, etc)
  4. The candidate code point must be classed as a particular Unicode category (this is how East Asian characters are excluded)