PII and UIDs – Joseph Zang

This should go without saying, but personally identifiable information (PII) should not, under any circumstances, be used as a unique identifier (UID). PII is grossly misused in every industry, from retail to banking to education. This post outlines several reasons why PII should never, to any extent, be used as a UID.

Many organizations argue that if they only use a subset of PII, say the last 4 digits of a Social Security Number (SSN), there is no harm and no foul, but that is simply wrong. That is like giving a hacker half your password because, by itself, half a password is worthless. They fail to realize that giving away any subset of the data sharply reduces the work needed to compromise the superset from which it came: every disclosed digit cuts the remaining search space by a factor of ten.
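To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes every 9-digit string is a possible SSN, which overstates the real space (some ranges are never issued), so treat the results as upper bounds. The function name is made up for this illustration.

```python
SSN_DIGITS = 9  # an SSN has nine decimal digits


def remaining_candidates(total_digits: int, known_digits: int) -> int:
    """Upper bound on the values left to try once known_digits are disclosed."""
    return 10 ** (total_digits - known_digits)


# Nothing disclosed: the attacker faces the whole space.
full_space = remaining_candidates(SSN_DIGITS, 0)

# Last four digits disclosed: only five unknown digits remain.
after_last4 = remaining_candidates(SSN_DIGITS, 4)

# How many times smaller the search became.
reduction = full_space // after_last4

print(f"{full_space:,} candidates without the last four digits")
print(f"{after_last4:,} candidates with them ({reduction:,}x less work)")
```

Under that assumption, four "harmless" digits take the search from a billion candidates down to a hundred thousand.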

Every database designer ever to use SSNs as primary keys failed in their design. It does not matter how normalized their data was; they failed through fundamental laziness. SSNs were never designed to be globally unique identifiers in IT systems. Yes, they do happen to work well for uniquely identifying Americans, but it is time to think globally. To treat an SSN as just another random unique number is absurd; it has other, much more sensitive and important uses. Using SSNs as primary keys is analogous to using a sledgehammer on a finishing nail; it is overkill to the point of stupidity.
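For contrast, here is one possible alternative, sketched with Python's standard sqlite3 and uuid modules: a random surrogate key that carries no meaning outside the database. The table names, column names, and helper functions are hypothetical, chosen only for this example, and a real system would have its own schema.

```python
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    # person_id is a surrogate key: random, meaningless, and safe to copy
    # into other tables, logs, and URLs. No SSN appears anywhere as a key.
    conn.execute(
        "CREATE TABLE person ("
        " person_id TEXT PRIMARY KEY,"
        " full_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE purchase ("
        " purchase_id TEXT PRIMARY KEY,"
        " person_id TEXT NOT NULL REFERENCES person(person_id),"
        " item TEXT NOT NULL)"
    )


def add_person(conn: sqlite3.Connection, full_name: str) -> str:
    person_id = str(uuid.uuid4())  # random version 4 UUID
    conn.execute(
        "INSERT INTO person (person_id, full_name) VALUES (?, ?)",
        (person_id, full_name),
    )
    return person_id


def add_purchase(conn: sqlite3.Connection, person_id: str, item: str) -> str:
    purchase_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO purchase (purchase_id, person_id, item) VALUES (?, ?, ?)",
        (purchase_id, person_id, item),
    )
    return purchase_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
pid = add_person(conn, "Example Person")
add_purchase(conn, pid, "finishing nails")
```

The key that spreads through foreign keys is now a value nobody can use for anything else. If an SSN must be stored at all, it can live in a single protected column rather than in every table that references a person.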

SSNs aren’t the only example, but they are the most abused because they are unique by design. The same poor uses of PII are common with phone numbers, street addresses, bank account numbers, and birthdays. Often these sources are used in combination to generate UIDs. This might seem to some like a good idea because it uses subsets from multiple sources, but in actuality it just discloses more sources to a hacker, providing a better starting point for gathering information before an attack.
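A small sketch shows why the combination approach backfires. The scheme below is invented for illustration (last four SSN digits, then birth date, then last four phone digits) and the input values are deliberately fake, but any UID assembled from PII fragments can be taken apart the same way by whoever sees it.

```python
def composite_uid(ssn: str, birth_date: str, phone: str) -> str:
    """Build a UID from PII fragments (the anti-pattern, shown on purpose)."""
    ssn_last4 = ssn.replace("-", "")[-4:]
    dob = birth_date.replace("-", "")  # YYYYMMDD
    phone_last4 = phone.replace("-", "")[-4:]
    return ssn_last4 + dob + phone_last4


def leak_from_uid(uid: str) -> dict:
    """What anyone who sees the UID can read straight out of it."""
    return {
        "ssn_last4": uid[:4],
        "birth_date": f"{uid[4:8]}-{uid[8:10]}-{uid[10:12]}",
        "phone_last4": uid[12:16],
    }


# Fake sample values, not a real person.
uid = composite_uid("000-00-1234", "1980-01-31", "202-555-0100")
leaked = leak_from_uid(uid)
print(uid)
print(leaked)
```

Every system, log line, and printout that shows this UID now discloses three pieces of PII instead of one.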