class: center, middle

# App Authentication

Dealing with user accounts.

---

# Agenda

1. [Overview](#overview)
1. [Terminology](#terminology)
1. [HTTP Basic Authentication](#http-auth)
1. [Sessions](#sessions)
1. [JSON Web Tokens](#jwt)
1. [OAuth](#oauth)
1. [Conclusions](#conclusions)

---

name: overview

# Overview

---

template: overview

## Concept

Apps that allow users to create accounts must authenticate those users in some way. There are several common techniques for app authentication:

--

- HTTP Basic Authentication (not commonly used by apps, but exists)

--

- Session-based authentication

--

- JSON Web Token-based authentication

--

- Authentication using "Single Sign-On" services, such as Facebook, Google, etc.

---

name: terminology

# Terminology

---

template: terminology

## Authentication vs authorization

While the two words are often used interchangeably in practice, we can differentiate between authentication and authorization:

- **Authentication** is the process of identifying who is attempting to access a system. Authorization can be performed only after successful authentication.
- **Authorization** determines the appropriate access level, i.e. which resources of the system the authenticated user is allowed to access.

---

template: terminology
name: encode-encrypt-hash

## Encoding vs. encryption vs. hashing

--

**Encoding** is transforming data from its raw state using a particular mapping scheme, where each part of the original data is replaced by its counterpart in the mapping. Encoding is used most often to make the data portable between systems. Encoding can easily be reversed if the mapping scheme is known. Encoding is most often done using well-known schemes, such as [base64](https://en.wikipedia.org/wiki/Base64).

![Base64 encoding example](../assets/authentication/encoding-example.png)

---

template: encode-encrypt-hash

**Encryption** is a process of more securely encoding data, where an attempt is made to keep the mapping scheme unique and private.
Encryption is used most often to keep data private. The exact encoding mapping is based on `keys` that are known only to the parties involved. For example, in [symmetric-key](https://en.wikipedia.org/wiki/Symmetric-key_algorithm) encryption, the same key is used to both encrypt and decrypt the data. In [public-key encryption](https://en.wikipedia.org/wiki/Public-key_cryptography), different keys must be used to encrypt and decrypt the data.

![Public-key encryption example](../assets/authentication/encryption-example.png)

---

template: encode-encrypt-hash

- Unlike encoding and encryption, **hashing** is a one-way function that transforms data into a fixed-length "digest" form. For a cryptographic hash function, there is no known practical way to reverse the process. Hashing is commonly used by apps to store user passwords. When a user creates an account, a hashed version of their password is stored. When the user logs in later, the hash of the password they enter is compared to the stored hashed version.

![SHA256 hashing example](../assets/authentication/hashing-example.png)

---

name: http-auth

# HTTP Basic Authentication

--

## Concept

While HTTP is a stateless protocol, it does come with a native ability to handle passing authentication credentials between a client and server. This is known as **HTTP Basic Authentication**.

--

- When an incoming request attempts to access a protected resource, the server looks for an HTTP request header called `Authorization` containing **encoded credentials**.

--

- If not present, the server responds with a special HTTP response header, `WWW-Authenticate`.

--

- This triggers a web browser to automatically pop open a log-in form for the user.

--

- Once the user submits it, the browser sends an `Authorization` request header to the server with the user's credentials.

--

- If those credentials are correct, the server responds with the requested content.
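---

template: http-auth

## Code sketch

The `Authorization` header used in the steps above can be sketched in a few lines of Python. This is only an illustration, using the standard library; the function names `build_basic_auth_header` and `parse_basic_auth_header` are hypothetical, and the sample credentials are the well-known example from the HTTP Basic Authentication specification. Note that the credentials are merely base64-encoded, not encrypted.

```python
import base64
from typing import Tuple


def build_basic_auth_header(username: str, password: str) -> str:
    """Client side: build the value of the Authorization request header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth_header(header_value: str) -> Tuple[str, str]:
    """Server side: recover (username, password) from the header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("credentials must have the form username:password")
    return username, password


header = build_basic_auth_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(parse_basic_auth_header(header))  # ('Aladdin', 'open sesame')
```

Anyone who intercepts the header can run the same decoding step, which is why Basic Authentication is only reasonable over an encrypted connection.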
---

template: http-auth

## UML sequence diagram

![HTTP Basic Authentication](../images/auth_http_sequence_diagram.png)

---

template: http-auth

## Considerations

HTTP Basic Authentication has some nice advantages:

--

- Natively supported and 'understood' by all web browsers and web servers.

--

- Log-in form is automatically generated by the browser.

--

- Easy to add and manage user access on the server side (e.g. managed in simple `.htaccess` and `.htpasswd` text files on Apache)

---

template: http-auth

## Considerations (continued)

HTTP Basic Authentication has some aspects that make it unsuitable for some common app use cases:

--

- Access control is managed by server settings files, while other app content is managed in databases.

--

- Log-in form is automatically generated by the browser and cannot be designed to match the app's design.

--

- Developers have no control over the encoding and encryption - passwords are sent in simple `base64` encoding, which is easily decoded.

--

- Credentials must be passed with every request, increasing the attack window.

---

name: sessions

# Session Authentication

--

## Concept

Because all web browsers support cookies, a web browser used by a logged-in user can identify itself to the server by sending a unique ID along with every request. This is called a session cookie.

--

- When an incoming request attempts to access a protected resource, the server looks for an HTTP request header called `Cookie` containing a unique **session identifier**.

--

- If not present, the server doesn't return the requested resource, but rather sends back a log-in page.

--

- The browser displays the log-in form to the user.

--

- Once the user submits it, the browser sends an `HTTP POST` request to the server with the user's credentials.

--

- If those credentials are correct, the server responds with the requested content and a `Set-Cookie` header containing a unique session ID.
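---

template: sessions

## Code sketch

The server side of the flow above might look roughly like the following Python sketch, which uses only the standard library. Everything named here is an assumption made for illustration: the in-memory `USERS` and `SESSIONS` dictionaries stand in for a real database, the cookie name `session_id` is arbitrary, and the sample user and password are made up. Passwords are stored as salted hashes, and the session ID is a random value with no meaning of its own.

```python
import hashlib
import hmac
import os
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional, Tuple


def hash_password(password: str, salt: bytes) -> bytes:
    """One-way, salted hash of a password (PBKDF2 with SHA-256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical user "database": username -> (salt, password hash).
_salt = os.urandom(16)
USERS: Dict[str, Tuple[bytes, bytes]] = {"alice": (_salt, hash_password("s3cret", _salt))}

# Server-side session state: session ID -> username.
SESSIONS: Dict[str, str] = {}


def log_in(username: str, password: str) -> Optional[str]:
    """Check credentials; on success return a Set-Cookie header value."""
    record = USERS.get(username)
    if record is None:
        return None
    salt, stored_hash = record
    if not hmac.compare_digest(hash_password(password, salt), stored_hash):
        return None
    session_id = secrets.token_urlsafe(32)  # unguessable random identifier
    SESSIONS[session_id] = username
    return f"session_id={session_id}; HttpOnly; Path=/"


def user_for_request(cookie_header: Optional[str]) -> Optional[str]:
    """Look up the logged-in user from the request's Cookie header, if any."""
    if not cookie_header:
        return None
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return SESSIONS.get(morsel.value) if morsel else None


def log_out(session_id: str) -> None:
    """Invalidate a session immediately by deleting its server-side state."""
    SESSIONS.pop(session_id, None)


set_cookie = log_in("alice", "s3cret")
cookie_header = set_cookie.split(";")[0]  # what a browser would send back
print(user_for_request(cookie_header))
```

The lookup in `SESSIONS` on every request is the server-side state that sessions depend on; deleting an entry logs the user out at once.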
---

template: sessions

## UML sequence diagram

![Session Authentication](../images/auth_sessions_sequence_diagram.png)

---

template: sessions

## Considerations

Session authentication mechanisms have a few features that make them more suitable for web applications.

--

- Access control can be managed in the same databases used to manage other app content.

--

- Passwords need only be sent once to the server, reducing the password attack window.

--

- Browsers automatically store session IDs and send them with every request to the server, requiring less custom code.

--

- Cookies automatically self-destruct at an expiration date set by the server, or the server can decide to invalidate the session ID, immediately logging out the user, if desired.

---

template: sessions

## Considerations (continued)

Despite having several advantages over HTTP Basic Authentication for app developers, sessions have their limitations as well.

--

- Sessions require the server to validate the session ID by hitting the database with every request.

--

- Cookies are supported by web browsers, but not necessarily by other kinds of clients like native mobile apps, desktop apps, bots, and others.

--

- Sessions _require the server to maintain state_ of the session... one more thing to worry about.

---

name: jwt

# JSON Web Token Authentication

--

## Concept

Today's apps often have multiple interfaces: web, native mobile app clients, desktop app clients, bots, etc., and monolithic servers are increasingly being replaced by cloud microservices. [JSON Web Tokens](https://en.wikipedia.org/wiki/JSON_Web_Token) are well-suited for this.

--

- When an incoming request attempts to access a protected resource, the server looks for an HTTP request header called `Authorization` containing a unique **token**.

--

- If not present, the server returns an `HTTP 401` response code indicating unauthorized access.

--

- The client logic asks the user for their credentials in any manner it deems appropriate.
--

- Once the user submits credentials, the client sends an `HTTP POST` request to the server with the user's credentials.

--

- If those credentials are correct, the server responds with a **token**. The client stores this token in any way it chooses.

---

template: jwt

## UML sequence diagram

![JSON Web Token Authentication](../images/auth_jwt_sequence_diagram.png)

---

template: jwt

## Considerations

JSON Web Tokens (JWT) offer a few benefits over other authentication mechanisms.

--

- As with sessions, user passwords need only be sent once to the server, reducing the attack window.

--

- While the token itself is encoded but not encrypted, tokens can contain arbitrary _payload_ data that can be encrypted however developers desire, allowing for strong encryption of sensitive content.

--

- Tokens are _cryptographically signed_ by the server with a secret key. Once signed, the token can be easily validated by the server with every request. The token includes data about what access privileges the user should be granted. There is _no need to maintain any session data or state_ on the server. This also allows a single token to be used across multiple servers or microservices that don't share a credentials database.

--

- Tokens are most often stored in browsers' local storage, but can be stored in cookies or any client-side data storage for other kinds of apps.

---

template: jwt

## Considerations (continued)

As with any authentication scheme, there are limitations and concerns in regard to JWT as well.

--

- As with other credentials, tokens must be stored and transported securely.

--

- Since tokens can be validated independent of any server state, they are more difficult to destroy than sessions.

--

- To be able to immediately invalidate a JWT, the server must maintain the current state of each token in a server-side database of some kind. This defeats one of the purported benefits of tokens: not requiring the server to maintain session state.
---

name: oauth

# OAuth

--

## Concept

OAuth 2.0 is an open standard for _delegated access control_, i.e. allowing one app or website to access resources contained within another app or website on behalf of a user.

--

- OAuth is an authorization protocol, _not_ an authentication protocol. However, OAuth is most typically used in the so-called [three-legged authentication flow](https://www.ibm.com/docs/en/datapower-gateway/10.5.x?topic=flows-three-legged-oauth-flow) where one service delegates user authentication to another service.

--

- For example, a university may allow students to sign in to university systems using a Google account. Thus, the student must grant permission to the university system to access the student's personal data located in the student's Google account. Note that for this to work, the university must have given student identity information to Google, who then controls access to that data. This arguably should raise [ethical questions](https://link.springer.com/article/10.1007/s11528-021-00599-4), but rarely does.

---

template: oauth

## Comparison to JWT

Like [JWT](#jwt), OAuth relies on token exchange between client and server for authorization. But whereas JWT is a token format - specifying which cryptographic algorithm is being used, what access the token grants, and including a cryptographic signature - OAuth does not specify how to create and use these tokens.

--

- In practice, JWT is most often used as the token format for OAuth.

--

- In contrast, OAuth is an entire protocol, from start to finish, specifying the steps a client and user must go through to gain access credentials to a protected resource and how these credentials must be passed back and forth.
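---

template: oauth

## Code sketch

As a small taste of the protocol steps, the sketch below covers only the first leg of the OAuth 2.0 authorization code flow: sending the user to the provider and reading the code that comes back. The query parameter names come from the OAuth 2.0 specification, but the endpoint URL, client ID, and function names are hypothetical, and the later step of exchanging the code for an access token is omitted.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical provider endpoint; a real provider documents its own URL.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth/authorize"


def build_authorization_url(client_id: str, redirect_uri: str, scope: str, state: str) -> str:
    """URL the app redirects the user to so they can grant access."""
    query = urlencode(
        {
            "response_type": "code",  # ask for an authorization code
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,  # random value echoed back by the provider
        }
    )
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def extract_authorization_code(callback_url: str, expected_state: str) -> Optional[str]:
    """Read the code from the provider's redirect, rejecting a mismatched state."""
    params = parse_qs(urlsplit(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        return None
    return params.get("code", [None])[0]


print(build_authorization_url("my-client-id", "https://app.example.com/callback", "profile email", "xyz123"))
print(extract_authorization_code("https://app.example.com/callback?code=abc&state=xyz123", "xyz123"))
```

Checking `state` ties the callback to the request the app actually started, which guards against forged redirects.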
---

template: oauth

## Example

For example, here is a [sequence diagram](/content/courses/software-engineering/slides/uml-diagrams/#30) showing how apps can use LinkedIn for delegated access control using OAuth:

![LinkedIn OAuth](../assets/authentication/linkedin-oauth-flow.png)

---

template: oauth

## Considerations

In practical terms of developing application authentication systems, OAuth is more complex and multilayered than JSON Web Tokens. It provides some flexibility in implementation, which leads to a wide variety of implementations.

--

- Covering OAuth in detail would require more dedication and example code than is possible in a simple slide deck.

--

- You are encouraged to [read more about it](https://stackoverflow.com/questions/32964774/oauth-or-jwt-which-one-to-use-and-why/48333725#48333725).

--

- To simply implement sign-in to your applications using a tech giant's OAuth authentication system, there are many detailed code tutorials available that do not require you to master all the details of the protocol.

---

name: conclusions

# Conclusions

--

This has been a short survey of common authentication methods.

--

- The article [Cookies vs. Tokens: The Definitive Guide](https://dzone.com/articles/cookies-vs-tokens-the-definitive-guide) provides a detailed comparison between the two.

--

- You are _encouraged to use JSON Web Token-based authentication_ for your own apps, since it provides the most flexibility.

--

- The YouTube series [API Authentication With Node](https://www.youtube.com/playlist?list=PLSpJkDDmpFZ7GowbJE-mvX09zY9zfYatI), by CodeWorkr, provides a very detailed code tutorial of sessions, JSON Web Tokens, and OAuth authentication code using [Express](/content/courses/agile-development-and-devops/slides/express), [React](/content/courses/agile-development-and-devops/slides/react-intro), and [MongoDB](/content/courses/database-design/slides/mongo-setup) with Mongoose. Highly recommended.
--

- [This example repository](https://github.com/nyu-software-engineering/data-storage-example-app) contains a full app (front-end and back-end) that includes code to do JWT authentication.

--

- Thank you. Bye.