Authentication
The Three A's Revisited
- The filesystem chapter introduced authentication, authorization, and access control
- This chapter focuses on authentication: how the system establishes who you are
- Authentication is always a claim followed by evidence
- "I am Ning" — but how does the system verify that?
- Three classic factors:
- Something you know (password, PIN)
- Something you have (a key, a hardware token, a phone)
- Something you are (fingerprint, face — biometrics)
- Real systems increasingly combine more than one (multi-factor authentication)
User Accounts on Unix
- Unix stores user account information in /etc/passwd
- Readable by everyone (programs need to look up user names)
- One line per account, colon-separated fields
$ grep tut /etc/passwd
tut:x:501:20:Tutorial User:/Users/tut:/bin/bash
- Fields in order: username, password placeholder, UID, GID, comment, home directory, shell
- The x in the password field means "look in /etc/shadow instead"
- Historically passwords were stored here, hence the name "passwd"
- Moving them to /etc/shadow was a security improvement
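The field layout above can be sketched in a few lines of Python. This is a minimal illustration, not how the system itself reads the file: the names PasswdEntry and parse_passwd_line are invented for this example, and on a real Unix system the standard pwd module already provides these lookups.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One /etc/passwd record (hypothetical helper type for this sketch)."""
    username: str
    password: str  # usually the placeholder "x"
    uid: int
    gid: int
    comment: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one colon-separated /etc/passwd line into its seven fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    username, password, uid, gid, comment, home, shell = fields
    return PasswdEntry(username, password, int(uid), int(gid), comment, home, shell)


entry = parse_passwd_line("tut:x:501:20:Tutorial User:/Users/tut:/bin/bash")
print(entry.username, entry.uid, entry.gid, entry.shell)
```

The UID and GID are converted to integers because the kernel identifies users by number; the username is only a label for people.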
Password Storage
- Passwords should never be stored as cleartext
- If a cleartext password file leaked, attackers would have everyone's password immediately
- Instead, the system stores a hash of the password
- A one-way function: easy to compute hash from password, infeasible to reverse
- Deterministic: same password always produces same hash
- When you log in:
- You type your password
- The system hashes what you typed
- It compares the result to the stored hash
- If they match, you are authenticated
- /etc/shadow stores the hashed passwords and is readable only by root
$ sudo cat /etc/shadow | grep tut
tut:$6$rounds=656000$salt$hashedvalue...:19831:0:99999:7:::
- The hash field encodes:
- $6$: hashing algorithm (6 = SHA-512)
- $rounds=656000$: work factor (how many iterations)
- $salt$: random value mixed in before hashing (see below)
- The actual hash
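To make the structure of the hash field concrete, here is a small sketch that splits it into its parts. The function name parse_shadow_hash and the returned dictionary keys are made up for illustration. The sketch assumes the $id$[rounds=N$]salt$hash layout shown above and treats the rounds part as optional.

```python
# Algorithm identifiers used by crypt-style hashes (only a few common ones)
ALGORITHMS = {"1": "MD5", "5": "SHA-256", "6": "SHA-512"}


def parse_shadow_hash(field: str) -> dict:
    """Split a $id$[rounds=N$]salt$hash field into named parts."""
    parts = field.split("$")
    # parts[0] is empty because the field starts with "$"
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not a $id$salt$hash style field")
    algorithm_id, rest = parts[1], parts[2:]
    rounds = None
    if rest[0].startswith("rounds="):
        rounds = int(rest[0][len("rounds="):])
        rest = rest[1:]
    if len(rest) != 2:
        raise ValueError("expected a salt and a hash")
    salt, digest = rest
    return {
        "algorithm": ALGORITHMS.get(algorithm_id, "unknown"),
        "rounds": rounds,
        "salt": salt,
        "hash": digest,
    }


parsed = parse_shadow_hash("$6$rounds=656000$salt$hashedvalue")
print(parsed)
```

Entries such as * (used for accounts that cannot log in with a password) do not follow this layout, so the sketch rejects them with ValueError.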
Salts and Rainbow Tables
- A salt is a random value stored alongside the hash
- Before hashing, concatenate the salt and the password
- Same password + different salt = different hash
- Why bother?
- Without salts, attackers can pre-compute a table of (password → hash) pairs
- A "rainbow table" is a space-efficient form of such a table
- With salts, the attacker must redo the computation for each account separately
- In Python, hashlib provides low-level hashing; use passlib or bcrypt for password storage in real applications
import hashlib
import os
# Low-level: do not use this directly for passwords
password = "correct horse battery staple"
salt = os.urandom(16)
hashed = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 260000)
print(f"salt: {salt.hex()}")
print(f"hash: {hashed.hex()}")
- pbkdf2_hmac is a "slow" hash function, intentionally expensive to compute
- Makes brute-force attacks much slower
- The 260000 is the iteration count; higher = slower to crack, also slower to verify
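The login steps described earlier (hash what was typed, compare with the stored hash) can be sketched with the same PBKDF2 call. This is an illustration only: hash_password and verify_password are hypothetical helper names, and a real application should still prefer a dedicated password-hashing library.

```python
import hashlib
import hmac
import os

ITERATIONS = 260_000  # same work factor as the example above


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both are stored, neither needs to be secret."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


# Two accounts that happen to share a password get different salts
salt_a, digest_a = hash_password("correct horse battery staple")
salt_b, digest_b = hash_password("correct horse battery staple")

print(f"same stored hash: {digest_a == digest_b}")
print(f"right password:   {verify_password('correct horse battery staple', salt_a, digest_a)}")
print(f"wrong password:   {verify_password('wrong guess', salt_a, digest_a)}")
```

Because each account has its own salt, the two stored digests differ even though the passwords are identical, so one precomputed table cannot cover both accounts.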
Exercise: Hash Properties
- Run the hash example above twice. Are the outputs the same? Why or why not?
- Replace os.urandom(16) with a fixed salt, then change the password by one character and re-run. How does the hash change? What property of hash functions does this illustrate?
sudo and su
- Principle of least privilege: run processes with only the permissions they need
- su username: substitute user; switches to another account until you exit that shell
- Requires the target user's password (root can switch without one)
- su - username starts a fresh login shell as that user
$ whoami
tut
$ su nobody
Password:
$ whoami
nobody
$ exit
$ whoami
tut
- sudo command: runs a single command as root (or another specified user)
- Configured in /etc/sudoers
- Uses your password, not root's
- Each command is logged (su typically records only the switch itself, not what is run afterwards)
$ cat /etc/shadow
cat: /etc/shadow: Permission denied
$ sudo cat /etc/shadow | head -n 3
root:*:19810:0:99999:7:::
daemon:*:19810:0:99999:7:::
tut:$6$rounds=656000$...
- /etc/sudoers controls who can run what as whom
# Allow tut to run any command as root (with password)
tut ALL=(ALL) ALL
# Allow deploy user to restart nginx without a password
deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx
- Use visudo to edit /etc/sudoers; it validates syntax before saving
Exercise: sudo Logging
- Run sudo ls /root and then look at the system log (try /var/log/auth.log on Linux or log show --predicate 'process == "sudo"' on macOS). What information is recorded?
- Why is logging important for sudo but less important for su?
Windows Authentication
- Windows stores local account credentials in the Security Account Manager (SAM) database
- Located at C:\Windows\System32\config\SAM
- Locked while Windows is running; readable only by the SYSTEM account
- Passwords are stored as NT hashes, the format used by NTLM (NT LAN Manager)
- The NT hash is an unsalted MD4 digest of the password (MD4 is now considered weak)
- Modern Windows also stores a more recent hash format
- Domain environments use Active Directory (AD)
- Kerberos protocol for authentication tickets
- Single sign-on across machines in the domain
- Windows equivalents of Unix tools:
| Unix | Windows equivalent | Purpose |
|---|---|---|
| su | runas /user:name cmd | Run command as another user |
| sudo | UAC elevation prompt | Temporarily raise privileges |
| id | whoami /all | Show current user and groups |
| passwd | net user name * | Change a user's password |
| useradd | net user name /add | Create a new user account |
- User Account Control (UAC) is Windows' version of least-privilege enforcement
- Prompts when a program requests elevated privileges
- Can be suppressed (but shouldn't be) for automation
SSH Key Pairs
- Password authentication has weaknesses
- Passwords can be guessed, phished, or intercepted
- Typing a password into a remote terminal exposes it to the remote machine
- Public key cryptography provides an alternative
- Generate a key pair: a private key (keep secret) and a public key (share freely)
- Something encrypted with the public key can only be decrypted with the private key
- Something signed with the private key can be verified with the public key
- For SSH: server challenges the client to prove it has the private key
- Without ever transmitting the private key
$ ssh-keygen -t ed25519 -C "tut@example.com"
Generating public/private ed25519 key pair.
Enter file in which to save the key (/Users/tut/.ssh/id_ed25519):
Enter passphrase (empty for no passphrase):
Your identification has been saved in /Users/tut/.ssh/id_ed25519
Your public key has been saved in /Users/tut/.ssh/id_ed25519.pub
- ed25519 is the recommended key type (newer than RSA, with much shorter keys and faster signing)
- Add a passphrase so the private key stays protected if the file is ever stolen
- Copy the public key to the remote server
$ ssh-copy-id -i ~/.ssh/id_ed25519.pub tut@remote.example.com
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s)
Number of key(s) added: 1
- This appends the public key to ~/.ssh/authorized_keys on the remote machine
- SSH agent avoids re-typing the passphrase in each session
$ eval $(ssh-agent)
Agent pid 12345
$ ssh-add ~/.ssh/id_ed25519
Enter passphrase for /Users/tut/.ssh/id_ed25519:
Identity added: /Users/tut/.ssh/id_ed25519
$ ssh tut@remote.example.com
# No passphrase prompt — agent handles it
- Configure SSH behaviour in ~/.ssh/config
Host remote
HostName remote.example.com
User tut
IdentityFile ~/.ssh/id_ed25519
ForwardAgent no
- After this, ssh remote is equivalent to ssh -i ~/.ssh/id_ed25519 tut@remote.example.com
Exercise: Key Inspection
- Generate a key pair with ssh-keygen. What files are created? What do the contents of the public key file look like?
- What does the ForwardAgent no setting in ~/.ssh/config do, and why might you want to keep it disabled on untrusted hosts?
Data Authentication: Hashing and HMAC
- Authentication is not only about who you are — it can also mean verifying data integrity
- "Did this file arrive without modification?"
- "Did this message really come from who it claims to?"
- Checksums let you verify a file hasn't changed (accidentally or maliciously)
$ sha256sum site/birds.csv
3f2a... site/birds.csv
$ sha256sum site/birds.csv
3f2a... site/birds.csv
$ echo "extra" >> site/birds.csv
$ sha256sum site/birds.csv
9b1c... site/birds.csv ← different hash
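The same check can be done from Python with hashlib. The sketch below is an illustration under stated assumptions: it works on a temporary file with made-up contents rather than the real site/birds.csv, and sha256_of_file is a hypothetical helper name. It reads the file in chunks so that large files do not have to fit in memory.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file incrementally and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "birds.csv"
    path.write_text("species,count\nrobin,3\n")  # placeholder data

    before = sha256_of_file(path)
    again = sha256_of_file(path)   # unchanged file, same digest

    with path.open("a") as handle:
        handle.write("extra\n")    # modify the file
    after = sha256_of_file(path)

print(f"unchanged file matches: {before == again}")
print(f"modified file matches:  {before == after}")
```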
- A plain hash proves the data is unchanged, but not who produced it
- An HMAC (Hash-based Message Authentication Code) uses a shared secret key
import hashlib
import hmac
key = b"shared-secret-key"
message = b"the data we are authenticating"
mac = hmac.new(key, message, hashlib.sha256).hexdigest()
print(f"HMAC: {mac}")
# Verify: recompute and compare
expected = hmac.new(key, message, hashlib.sha256).digest()
received = bytes.fromhex(mac)
print(f"valid: {hmac.compare_digest(expected, received)}")
- Use hmac.compare_digest rather than == to prevent timing attacks
- == short-circuits on the first mismatch, leaking timing information
- compare_digest takes time that does not depend on where the first mismatch occurs
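To show what the HMAC actually protects against, the following sketch wraps the two calls from the example above in a pair of helpers and then checks a genuine message, an altered message, and a wrong key. The helper names sign and verify are chosen for this illustration only.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it without leaking timing information."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret-key"
message = b"the data we are authenticating"
tag = sign(key, message)

print(f"genuine message: {verify(key, message, tag)}")
print(f"altered message: {verify(key, message + b'!', tag)}")
print(f"wrong key:       {verify(b'wrong-key', message, tag)}")
```

Only the first check succeeds. Anyone who lacks the shared key cannot produce a valid tag for a modified message, which is what a plain hash cannot guarantee.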
Multi-Factor Authentication
- Even strong passwords can be stolen; SSH keys can be copied
- Multi-factor authentication (MFA) requires two or more independent factors
- Something you know + something you have is most common
- TOTP (Time-Based One-Time Password) is the mechanism behind authenticator apps
- Server and app share a secret key
- Every 30 seconds, both compute HMAC(secret, current_time // 30)
- The app displays a short numeric code derived from the result; you type it in
- An intercepted code stops working shortly after its 30-second window ends
- Hardware security keys (e.g., YubiKey) are more phishing-resistant
- They verify the domain of the site before responding
import pyotp
# Server-side: generate a shared secret once and store it
secret = pyotp.random_base32()
print(f"secret (store securely): {secret}")
# App-side: generate a code using the secret and current time
totp = pyotp.TOTP(secret)
code = totp.now()
print(f"current code: {code}")
# Server-side: verify a submitted code
print(f"valid: {totp.verify(code)}")
- Never implement your own TOTP: use a well-tested library
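For understanding only, and not as a replacement for a library, here is a sketch of what the HMAC(secret, current_time // 30) step looks like when written out with the standard library. It follows the scheme as standardized in RFC 4226 and RFC 6238 (HMAC-SHA1 followed by "dynamic truncation" to a few digits). The function names hotp and totp are chosen for this example, and the secret is the published RFC test value given as raw bytes, whereas pyotp expects base32 text.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (the building block of TOTP)."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep only the last `digits` decimal digits, zero-padded
    return str(value % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    return hotp(secret, unix_time // step, digits)


secret = b"12345678901234567890"  # test secret from the RFCs, not a real key
print(totp(secret, 59))           # time 59 falls in step 1
print(totp(secret, 60))           # one second later is step 2: new code
```

With the RFC test secret, time 59 yields 287082, matching the published test vectors. A production implementation also has to handle clock drift, rate limiting, and secure storage of the secret, which is why a well-tested library is the right choice.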