How can you use a UUID as a primary key if uniqueness is not guaranteed?


A UUID is a Universally Unique Identifier. It is a 128-bit number that has two useful properties:
  1. A UUID is unique for all practical purposes. The qualifier "for all practical purposes" is there because there is, in principle, some non-zero chance that two UUIDs somewhere are the same.

  2. UUIDs are generated without any central service handing them out.
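As a rough illustration of both properties, here is a minimal sketch using Python's standard uuid module. The choice of Python is an assumption made for the example; any language with a UUID library behaves the same way.

```python
import uuid

# Generate a random (version 4) UUID locally; no central service is contacted.
u = uuid.uuid4()

print(u)          # canonical text form: 32 hex digits plus 4 hyphens
print(u.version)  # 4

# The underlying value is a 128-bit integer.
fits_in_128_bits = u.int < 2 ** 128
print(fits_in_128_bits)  # True
```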

Why are UUIDs useful?

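Think about how you might generate unique identifiers for use in your database. You might create a counter in your application, start at 0, and increment it each time an id is needed. But you quickly run into problems: the counter has to survive a crash and restart without reissuing ids, it has to be safe when several threads ask for an id at the same time, and in a distributed system every node has to coordinate over the network so that no two nodes hand out the same value.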

For all of these reasons, it is nice to have an id generator that is already implemented, needs no crash recovery logic, is thread-safe, handles concurrency, and just works in a distributed system without coordination (so no network communication is required).

That's why UUIDs are nice.

Probability of seeing a duplicate UUID

There are different kinds of UUID. This discussion focuses on version 4, or random, UUIDs. A random UUID has 122 random bits (the other 6 bits are fixed: 4 mark the UUID as version 4 and 2 mark the variant). In other words, assuming a good random source, a UUID represents 122 unbiased coin tosses. If your application generates 103 trillion UUIDs, the odds that any two of them collide are about 1 in 1,000,000,000. (See the Wikipedia article on UUIDs for more on this topic.)
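The one-in-a-billion figure can be checked with the standard birthday approximation, p ≈ 1 - exp(-n² / (2 · 2^122)). The sketch below is only an illustration; the function name collision_probability is made up for this example.

```python
import math

RANDOM_BITS = 122  # random bits in a version 4 UUID


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one duplicate among n random values.

    Uses the birthday approximation p ~= 1 - exp(-n^2 / (2 * 2^bits)).
    """
    space = 2 ** bits
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-(n * n) / (2 * space))


p = collision_probability(103 * 10**12)  # 103 trillion UUIDs
print(f"{p:.2e}")  # on the order of 1e-9, i.e. about one in a billion
```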

You are far more likely to get struck by lightning twice in your lifetime than to observe a UUID duplicate in your application.

But if you are still worried about it ...

If, in spite of these odds, you are still worried about collisions, then the fix is easy: catch the exception raised when a duplicate occurs (e.g., your database reports a primary key violation), and get another UUID.
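A minimal sketch of that retry loop follows. It uses Python with an in-memory SQLite database as a stand-in so that it runs anywhere; with Postgres the same pattern applies, but the error you catch would be your driver's unique-violation error. The function name insert_with_retry and its parameters are hypothetical.

```python
import sqlite3
import uuid


def insert_with_retry(conn, new_id=uuid.uuid4, max_attempts=3):
    """Insert a row with a fresh UUID, retrying if the primary key already exists."""
    for _ in range(max_attempts):
        candidate = str(new_id())
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO t (id) VALUES (?)", (candidate,))
            return candidate
        except sqlite3.IntegrityError:
            continue  # duplicate key: draw another UUID and try again
    raise RuntimeError("could not generate a unique id")


# SQLite has no uuid type, so the id is stored as text in this stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT NOT NULL PRIMARY KEY)")
first = insert_with_retry(conn)
```

Passing the generator in as new_id makes it possible to force a collision in a test, which is the only realistic way to exercise the retry branch.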

How to use UUIDs

Postgres, for example, has a UUID datatype. You can declare a column to be of type UUID, and then use the built-in UUID functions to generate values.

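The uuid type itself is always available. To generate UUIDs in Postgres versions before 13, you have to install an optional module, e.g., sudo apt-get install postgresql-contrib (or use yum if that is correct for your version of Linux). From Postgres 13 on, gen_random_uuid() is built in, so the installation and the create extension step are not needed.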

Then in psql:
create extension "pgcrypto";

Once you've done this, you can do the following in psql:
create table t(id uuid not null primary key);
insert into t values(gen_random_uuid());
insert into t values(gen_random_uuid());
insert into t values(gen_random_uuid());
select * from t;
                  id
--------------------------------------
 bdb9eb79-95f9-489e-a22f-dbd377595bb7
 64850733-ef4a-46c6-804f-31d005c7fb0c
 9626a993-a264-43ce-a404-8f5638d21c74
(3 rows)