Musings on Auth: How do we authenticate?

Authentication is central to securing applications and enabling personalised websites. Without it we'd have no way to know if we are serving the correct data to the correct person. In today's post we will discuss the different forms of authentication used in software.

This post is part 2 in a series on authentication. It is assumed that you are familiar with the definitions and concepts described in the previous post.

Authenticating Actors

In software, when we talk about "authenticating actors" we are generally referring to authenticating that a person is the owner of some account in some application. The exact relationship between a real person and the account will depend on the application, for example:

A banking application will generally have one "user" account per person, with all banking products (e.g. savings accounts, loans, cards) associated with that user account. This might even include businesses to which you as a person have access.
Email providers authenticate that you are the person who created the email address, but will not restrict you to having one email address. In fact they may not care about your real-world identity at all, only that you can prove that you created the email address.
Many advanced systems may have an account per person acting in a specific role, with different authentication requirements depending on the role. In a software company you might have a "software engineer - read only" account and a "software engineer - edit" account. This way the system has not only checked that you are the owner of the account, but also knows in what capacity you are acting. It may do this so that it can make different decisions about what you are authorised to do.

For the purpose of this post we are not going to discuss how these accounts are created, only the different types of authentication.

Authentication Factors

There are three different things that are used for authentication. They are:

Something you have. A physical object that you possess.
For example: the key to a lock is something that you have.
Something you know. A piece of information.
For example: a passphrase.
Something you are. A physical attribute of you.
For example: your fingerprints.

These are generally referred to as "factors" - the very ones referred to in Two Factor or Multi-Factor Authentication (2FA, MFA). The actual meaning of 2FA and MFA is a little flexible. Purists will tell you that you must use different types of factors; however, in most applications, having multiple factors of the same type is acceptable. This is because
a) as we discussed in part 1, we can have varying levels of confidence in our authentication and more and different factors simply increase our confidence, and
b) actually identifying the factor in play is non-trivial and open to all kinds of weird transitive properties.
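
To make the two readings of "multi-factor" concrete, here is a minimal sketch. The names FactorType and is_multi_factor, and the strict flag, are made up for illustration and do not come from any standard or library.

```python
from enum import Enum
from typing import List


class FactorType(Enum):
    HAVE = "something you have"
    KNOW = "something you know"
    ARE = "something you are"


def is_multi_factor(presented: List[FactorType], strict: bool = False) -> bool:
    """Return True if the presented factors count as multi-factor.

    strict=True  is the "purist" reading: at least two different types.
    strict=False is the lenient reading: at least two factors of any type.
    """
    if strict:
        return len(set(presented)) >= 2
    return len(presented) >= 2
```

Under the lenient reading, a password plus a security question passes even though both are "something you know". Under the strict reading it does not.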

Transitive Properties

At first glance it seems trivial to identify which of these factors we are using to identify an actor. However in reality, most methods of authentication have weird transitive properties that muddy this. Let's run through some examples.

Biometrics

Despite their sci-fi origins, biometrics are now very common in authentication through fingerprint scanners and facial recognition. For any biometric authentication to work, the computer needs a sensor to generate a digital version of the feature, which it will then compare against a stored digital model.

However most sensors, like the ones used in your phone or computer, don't really have the ability to check that you're a real living person. They only have what they can perceive through their (price-constrained) hardware. How does the device know that it can see your face and not a picture of your face? If it's a computer, how does it know the webcam it's using is really a webcam and not a piece of hardware pretending to be a webcam?
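
The comparison step can be sketched roughly as below. This is only an illustration: it assumes the sensor output has already been reduced to a list of numbers, and the cosine similarity measure, the 0.95 threshold, and the function names are all assumptions rather than how any particular scanner works.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty reading matches nothing
    return dot / (norm_a * norm_b)


def matches(sample: List[float], enrolled: List[float],
            threshold: float = 0.95) -> bool:
    """Accept the sample if it is close enough to the enrolled model.

    The threshold is an assumed value; readings are never bit-identical,
    so the check has to be approximate.
    """
    return cosine_similarity(sample, enrolled) >= threshold
```

Notice that matches only ever sees numbers. Anything that makes the sensor produce the right numbers, whether a living face or a good photo of one, looks the same to this check.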

SMS

Receiving SMS messages is one of the most common forms of MFA today due to the ubiquity of mobile phones. You provide your phone number so that you can receive codes that you then type into the website. So SMS can be used to prove that we are in possession of our phone, right?

What happens when we drop our phone into a puddle? Apart from turning our expensive pocket computer into an expensive paperweight, we're going to go to our local mobile phone shop, purchase a new phone, pop in our SIM, and carry on looking at memes. Which says that perhaps what we are really doing is proving that we are in possession of the SIM card. Right?

Suppose instead of merely frying your phone you lose it by dropping it in the harbour. It's gone forever. You make your way to the mobile phone shop, but this time instead of just getting a new phone you also get a new SIM card to go with it. The salesperson tells you that in a couple of hours your new SIM will be activated and you can resume scrolling memes and receiving text messages on your new phone.

That's right. Your phone number isn't even tied to a particular SIM card. Instead your provider simply maps phone numbers to SIM cards.

This leaves us with a weird question: is SMS authentication really a second factor, or is it some form of delegated authentication? More on this later.
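
The number-to-SIM mapping described above can be pictured with a toy model. The Carrier class, the SIM identifiers, and the phone number are all hypothetical, and real mobile networks are far more involved than a dictionary lookup.

```python
from typing import Dict, Tuple


class Carrier:
    """Toy provider that routes messages by looking up number -> SIM."""

    def __init__(self) -> None:
        self._number_to_sim: Dict[str, str] = {}

    def activate(self, number: str, sim_id: str) -> None:
        # Re-activating a number silently replaces the previous SIM.
        self._number_to_sim[number] = sim_id

    def deliver_sms(self, number: str, text: str) -> Tuple[str, str]:
        # The sender only knows the number; the carrier picks the SIM.
        return (self._number_to_sim[number], text)
```

After carrier.activate(number, "SIM-NEW"), a code sent to the same number arrives on the new SIM. The website sending the code has no say in that decision, which is why the provider ends up as part of the authentication.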

Credit Cards and Security Keys

Our final example is credit cards and security keys - things that fall into the category of something you have. These devices work by having embedded chips that can do cryptographic operations. Once a device is registered with your provider (i.e. the bank or website), the provider can determine that the correct device is present based on the responses to the challenges it sends. By sending random challenges and/or binding them to contextual data like the current time, it becomes very difficult to fake these responses without having the actual device.

But once again, the question is what we are really testing. You see, this challenge-response mechanism is built on standard cryptographic algorithms. What makes it work is some data on the chip that never leaves the chip, but whose existence can be proved cryptographically. Which means what we are actually checking is that you "know" that secret material.

Of course most people can't remember multiple 600-digit numbers and do complicated maths in their head using said numbers. So it's not really feasible to claim that it's something you know. But if it's data, what stops it being copied to another device? It's the secret that matters, not the chip that it came on.

As such the creators of such devices spend a lot of effort making sure that the secret can't be extracted from the device without destroying it or at least leaving it noticeably tampered with. And unlike data stolen from a computer, you'd notice pretty quickly if the credit card you use every day went missing.
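
The challenge-response idea in this section can be sketched as follows. For brevity the sketch assumes a shared secret and HMAC-SHA256; real cards and security keys may use different schemes (often public-key ones). The Device and Verifier classes are illustrative names only.

```python
import hashlib
import hmac
import secrets


class Device:
    """Stands in for the chip: holds a secret and answers challenges."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret  # never exposed, only used to compute responses

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


class Verifier:
    """Stands in for the provider the device was registered with."""

    def __init__(self, registered_secret: bytes) -> None:
        self._secret = registered_secret

    def new_challenge(self) -> bytes:
        # A fresh random challenge makes old responses useless.
        return secrets.token_bytes(32)

    def check(self, challenge: bytes, response: bytes) -> bool:
        expected = hmac.new(self._secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)
```

A second Device built from the same secret passes check just as well as the first. Nothing in the exchange can tell the original chip from a copy, so the security rests on the secret being hard to extract.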

Delegated Authentication

You will find it unsurprising that I assert that authentication is hard - it takes a non-trivial amount of knowledge and resources to build and operate a good authentication system.

Instead you could use a third party you trust to do the authentication for you.

Delegated authentication is probably the most common form of authentication on the internet because HTTPS is a form of delegated authentication. In short:

We "choose" to trust a number of Certificate Authorities (CAs) - usually as a part of our operating system or web browser.
Website operators apply to a CA to receive a certificate for a particular domain.
If the CA deems that the website operator is indeed the owner of the domain, it issues the certificate.
When we connect to a website, we receive the certificate. We then validate that the domain on the certificate matches that of the website, and that the certificate was issued by one of our trusted CAs.
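
From the client's side, the last step might look like the sketch below, which uses Python's standard ssl module. The helper names make_client_context and connect_verified are made up for this post. Which CAs end up trusted depends on the platform and how Python was built.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    # The default context loads the platform's trusted CA certificates and
    # turns on both checks: chain validation and hostname matching.
    return ssl.create_default_context()


def connect_verified(host: str, port: int = 443) -> dict:
    """Connect to host and return its certificate details.

    The TLS handshake raises ssl.SSLCertVerificationError if the
    certificate was not issued by a trusted CA or does not match host.
    """
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

At no point does the client check the website operator directly. It only checks that a CA it already trusts has vouched for the domain.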

We can also delegate the authentication of actors to other third parties. Typically this takes one of two forms:

An outsourced provider that only keeps track of and authenticates our system's actors.
For example: Firebase, Okta, Ping Identity, Auth0
(This is not a recommendation or endorsement of these companies)
A provider that only keeps track of its own users, but lets you know about them. This is commonly referred to as "social login" because of the large social media companies that offer this ability.
For example: Apple, Google, Facebook, GitHub.

If we go back to the "bank checking identity documents" example from the previous post, we can see that the bank is actually using a form of delegated authentication. Banks will list the document types that can be used to prove your identity. This is effectively a list of organisations, typically various government organisations, which they trust to only provide the documents to legitimate persons. So once they confirm that you match the person on the document and that the document is legitimate, the bank concludes that you are who you say you are.

Authenticating Sessions

As discussed in the previous post, sessions are a form of authentication which allows us to authenticate requests without needing to complete a full authentication of an actor each time. In fact sessions can be used just to know that we are dealing with the same anonymous actor without knowing who that actor is. In the case of known actors (e.g. logging into a website), sessions become a form of delegated authentication.

Despite the many specific session mechanisms that exist, they generally fit into one of three broad categories:

Token based authentication relies on some static piece of information provided with each request.
Examples include: HTTP Basic Auth, a session ID, and JWT based schemes.
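
As a rough illustration of the token approach, here is a minimal in-memory session store. SessionStore and its methods are hypothetical names. A real system would also need expiry, revocation, and secure transport.

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """Maps opaque session IDs to the actor they were issued to."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def create(self, actor: str) -> str:
        # Issued once, after the actor has been fully authenticated.
        token = secrets.token_urlsafe(32)
        self._sessions[token] = actor
        return token

    def authenticate(self, token: str) -> Optional[str]:
        # Every later request just presents the same static token.
        return self._sessions.get(token)
```

Because the token is static, anyone who can observe a request can replay it. That is the main trade-off against the signature based approach.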

Signature based authentication relies on some registered secret and an algorithm to prove knowledge of the secret without revealing it.
Examples include: HTTP Digest Auth and AWS Signature Version 4.
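
A stripped-down sketch of the signature approach is below. It is not HTTP Digest or AWS Signature Version 4. The message layout, the 300 second window, and the function names are assumptions chosen to keep the example short.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, timestamp: int) -> str:
    """Prove knowledge of the secret by signing the request details."""
    message = f"{method}\n{path}\n{timestamp}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, timestamp: int,
                   signature: str, now: int, max_skew: int = 300) -> bool:
    """Recompute the signature server-side and compare."""
    if abs(now - timestamp) > max_skew:
        return False  # too old or too far in the future: likely a replay
    expected = sign_request(secret, method, path, timestamp)
    return hmac.compare_digest(expected, signature)
```

The secret itself never travels with the request. A captured signature is only good for that exact method, path, and time window.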

Stateful connections sit somewhere in between token and signature based authentication. In fact including them here may seem a bit odd, because the purpose of sessions is to deal with stateless connections yet we are now talking about stateful connections - which is kind of the point. These work by keeping track of data that is sent and encoding it in messages as either signatures or tokens. If this data becomes de-synchronised then we ignore the message or abort the connection.
Examples include: TCP, SSH, and TLS connections.
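
To show what "keeping track of data that is sent" can mean, here is a toy channel in which each side counts messages and binds the counter into an integrity tag. It is loosely inspired by how record-oriented protocols use sequence numbers, but it is not how TCP, SSH, or TLS is actually implemented. The Channel class is purely illustrative.

```python
import hashlib
import hmac
from typing import Tuple


class Channel:
    """One end of a toy stateful connection sharing a key with its peer."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._send_seq = 0  # how many messages we have sent
        self._recv_seq = 0  # how many messages we have accepted

    def _tag(self, seq: int, payload: bytes) -> bytes:
        data = seq.to_bytes(8, "big") + payload
        return hmac.new(self._key, data, hashlib.sha256).digest()

    def send(self, payload: bytes) -> Tuple[bytes, bytes]:
        tag = self._tag(self._send_seq, payload)
        self._send_seq += 1
        return payload, tag

    def receive(self, payload: bytes, tag: bytes) -> bytes:
        # The tag only verifies if both sides agree on the counter.
        expected = self._tag(self._recv_seq, payload)
        if not hmac.compare_digest(expected, tag):
            raise ConnectionAbortedError("state de-synchronised")
        self._recv_seq += 1
        return payload
```

A replayed, dropped, or reordered message leaves the two counters in disagreement, so the tag fails to verify and the receiver rejects it.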

Summary

Whilst this post has covered many of the "standard" mechanisms of authentication that are taught, we've also begun to take a more critical look at how these mechanisms work and the nuance that needs to be explored in order to fully evaluate them.

From here on out we will start diving into specific authentication mechanisms and evaluating them.