Inspired from EECS 388 @ UMich
Security Mindset
Level-2 Problem: “Weakness” (Factors that predispose systems to vulnerability)
Level-1 Problem: “Vulnerability” (Specific things that can be exploited to cause an assault
Level-0 Problem: “Assault” (Actual attack on a specific flaw)
Cryptography
Goals:
- Message Integrity (Attacker can’t modify messages without that being detected)
- Want a kind of “hash”
- Pure random function (mapping), although secure, is impractical due to space limitations.
- Use a PRF (pseudo random function) instead.
- Secret k indexes into a family of pseudo random functions and uses that for hashing.
- Recall function here is just that, a function. It does not need to be 1-1 or invertible.
- PRF can be asserted as secure by asking if an attacker can win the "PRF Game”
- Integrity solved via a MAC (Message Authentication Code). e.g. HMAC-SHA256
- A MAC is a function that takes a key and provides the hash using a “hash function” (which is fixed)
- MAC is ~ PRF, although slightly different.
- Message Confidentiality (Keep message contents hidden from eavesdropper)
- Horrible Ciphers:
- Caesar Cipher - Uses a hidden secret k and shifts all values in the plaintext k places. Decryption works by reversing this.
- Can be broken by trivial frequency analysis.
- Vigenere Cipher - Uses a hidden secret k (which is a word, not a single value now). Shifts each letter in the plaintext by the corresponding letter in k. If |k| < |m|, essentially concatenate k to itself and continue.
- Kasiski Method:
- Can be broken by looking for repeating strings and getting the distance.
- Distance is likely a multiple of the key length.
- Once we know the key length n we can break the cipher text into n slices and solve each as a Caesar Cipher
- If |k| >= |m|, reduces to OTP (One Time Pad)
- Semi-Working Ciphers:
- One-time Pad (OTP)
- Generate a shared secret k of random bits, s.t. |k| = |m|.
- Encrypt: c = m xor k
- Decrypt m’ = c xor k
- Provably secure because we are essentially generating a random string which can contain no useful information to an attacker. Any string could be the result of this cipher text given some key k’.
- Impractical because it truly can only be used once.
- (a xor k) xor (b xor k) = (a xor b).
- Then be sliding common strings that we expect to be in either a or b and xor with our result, we will get the output of the other value.
- Stream Cipher
- Given a shared secret k, have Alice and Bob seed a PRG with k.
- Encrypt: To encrypt a message m of length l, generate l bits from the PRG, and xor those bits with m.
- Decrypt: To decrypt a message m of length l, generate l bits from the PRG, and xor those bits with m.
- Potential Flaws:
- Assuming Alice and Bob’s PRG is at the same state. If Alice encrypts using l bits from the PRG, Bob can only decrypt by generating EXACTLY the same l bits from his PRG. If two messages are encrypted simultaneously the whole system breaks down.
- Can never reuse the key or PRG output bits.
- e.g. Salsa20, RC2 (Broken), RC4 (Broken)
- Block Cipher (Best option)
- Encrypt message in fixed length blocks with a reusable key.
- Use a PRP (Pseudorandom Permutation) instead of a PRF because we need to be able to invert the function.
- e.g. AES (Fixed length input and output)
- Split k into ten sub keys. Perform a set of operations ten times, each time using a different sub key.
- Operations:
- Non-Linearize - Run each byte through a non-linear function
- Shift - Circular shift each row. ith row shifted by i
- Linear-mix - Treat each columns as a 4d vector. Multiply by a constant invertible matrix
- Key-Addition - xor each byte with corresponding byte of round sub key
- AES only works on fixed length messages
- To encrypt multiple blocks:
- Don’t use:
- Encrypted Codebook Mode - Encrypts each block independently. Can keep edge details.
- Cipher-Block Chaining (CBC) Mode - c_i = E_k(m_i xor c_i-1). Can suffer from padding oracle attacks.
- Can Use:
- Counter Mode - Uses block cipher as a PRG. msg_id = random data, m_i = E_k(msg_id || ctr)
- Authentication (Recipient can verify message was sent by some sender)
- Sometimes called a “digital signature”
- RSA is a good way to accomplish this.
- Confidentially and Integrity simultaneously
- Encrypt, then MAC.
- AEAD (Authenticated Encrypted and Associated Data)
- c, tag := Seal(k, m, associated_data)
- m’ = Unseal(k, c, associated_data, tag)
- AES_GCM (“Galois Counter Mode”)
- e.g. ChaCha-Poly1305, Salsa20-Poly1305
- Need two shared keys (shouldn’t reuse keys across methods. crypto detail, not mentioned here). Can use PRG to generate two shared keys via one shared key
- If there exists a reverse channel, use separate keys!
- Cryptographic Doom Principle: If you are required to perform any cryptographic operation before verifying the MAC on a message, it will probably lead to issues.
Hash Functions:
- Arbitrary length inputs. Fixed length outputs.
- Not random.
- e.g. SHA, SHA-256, etc.
- Desirable properties:
- Collision Resistant: Hard to find x, x’ s.t. H(x) = H(x’)
- Second Preimage Resistant: Given fixed x, hard to find x’ s.t. H(x) = H(x’)
- Preimage Resistant: Given y, hard to find x’ s.t. H(x’) = y
- Broken Hash Functions: MD5, SHA1
- Usually vulnerable to length extension attacks.
Length Extension Attacks:
- Performed on hash functions.
- Given z = H(m) for some m, find H(m || padding || v) for chosen v.
- This is possible on algorithms using the Merkle-Damgard Construction, which use intermittent states.
- Therefore, at some point during the calculation of H(m || padding || v), it will use H(m) as an initial state.
- So if we know H(m), we can compute H(m || padding || v) since the hash function is known to us.
Randomness:
- True randomness may not really exist.
- Use a PRG instead.
- Given small seed, generate sequence of ~random bits.
- Can be asserted as secure by asking if an attacker can win the “PRG Game” (Similar to the “PRF Game”).
- On Linux
- /dev/random gives “reliable” random bits, can block until these bits are available.
- /dev/urandom gives output of a PRF, doesn’t block, but is seeded from /dev/random (and so initially may not really be random).
Padding Oracles:
- Find a way to distinguish between invalid MAC and invalid padding.
- Can be enough to learn plaintext.
- e.g. Vaudenay padding oracle attack.
- Defences:
- Don’t use separate errors for MAC and padding (unrealistic)
- Always check integrity
- Constant-time code
- Limit branching
Key Exchange:
- Diffie-Hellman - Relies on discrete log problem
- Alice and Bob public choose modulus p and base g (primitive root modulo 23)
- Alice chooses secret a and sends Bob g^a mod p
- Bob chooses secret b and sends Alice g^b mod p
- Alice computes k = (g^b mod p)^a
- Bob computes k = (g^a mod p)^b
- MITM Possible. Use digital signatures to defend against.
- Forward Secrecy
- Learning old key shouldn’t help adversary learn new key
- Compromising an old session should not compromise future sessions.
- Compromising long-term key should not allow decryption of recorded cipher texts
- Public-Key Cryptography
- RSA - Relies on factoring being hard.
- Pick two large random primes p and q
- N := pq mod N
- Pick e to be relatively prime to (p-1)(q-1)
- Find d so that ed mod (p-1)(q-1) = 1
- Public Key: (e, N), Private Key: (d, N)
- E_e(x) = x^e mod N, D_d(x) = x^d mod N
Secure Channel:
- Establish shared secret via Diffie-Hellman
- Verify RSA Signatures on k.
Modern Cryptography
- Elliptic Curve Cryptography
- Relies on elliptic curve discrete log problem
- Smaller key sizes can be used
Web Architecture
Early Technologies: telnet, ftp, sftp, smtp, nfs
HTTP
- Four Phases:
- Open Connection
- Client Request
- Server response
- Close connection
Cookies
- Store state for identity
- Sent by server, stored by browser
Same Origin Policy
- Two pages have the same origin if the protocol, port, and host are all the same.
- Cross-origin embedding legal
- Cross-origin scripting illegal
Attacks
- Cross Site Request Forgery
- Sites may rely (incorrectly!) on cookies for authentication.
- A user can log into a site so obtain a session cookie, navigate to another site, and that site might try performing request expecting the cookie to exist.
- Defense Idea: Embed a random token into each action on the site and require that token to be sent with the request.
- If you’re on the website, the website can handle adding the hidden value.
- An attacker can’t access this however, so CSRF won’t work.
- SQL Injection
- If database inputs aren’t sanitized, an attacker may be able to mess with your database by exploiting
- Defense:
- Sanitize
- Use Prepared Statements
- XSS (Cross Site Scripting)
- Webpages may treat data as code.
- This can be exploited to do malicious things
HTTPS
- TLS
- Replaced SSL.
- Provides encryption between client and server.
- Client and Server themselves are NOT protected here. TLS provides protection against the network itself.
- TLS Handshake
- Client sends cipher suites it supports
- Server picks a cipher suite, sends its choice, CA certificate, and signed DH parameters.
- Client then sends its DH parameters to server and asks server to switch to encrypted communication based on the agreed upon cipher spec.
- Server sends a response if it agrees.
- Encrypted communication ensues.
- Attacks
- RC4-related Attacks
- RC4 stream cipher can possess biases that can be utilized to leak cookies and passwords.
- CBC-related Attacks
- e.g. BEAST and PODDLE
- exploit weaknesses in TLS’s use of CBC to leak data.
- Compression-related Attacks
- TLS and HTTP use compression, but TLS broadcasts length of plaintext.
- Using both TLS and HTTP can leak data.
- Export-related Attacks
- e.g. Freak, Logjam, DROWN
- Exploit weaknesses in 1990s-era “export-grade” cryptography.
- Certificates
- Say Bob has a web server (bob.com). He wants to obtain a certificate.
- Generates public/private key pair.
- Sends public key to a CA (Certificate Authority).
- CA verifies public key (not via crypto).
- CA signs a certificate with CA’s private key noting Bob’s public key and sends to bob.com
- bob.com keeps certificate.
- When Alice wants to load bob.com, she first checks the certificate against the CA’s public key.
- Mixed Content
- Don’t allow active HTTP content on an HTTPS page. Passive is okay.
- Strict Transport Security (HSTS)
- HTTP header indicating always use HTTPs
- Prevents downgrade attacks.
- Protects future sessions, but not first session.
- Preload Lists
- List of sites that are HTTPS only.
- Attacks
- Picture in Picture Attack
- Spoof user interface to stick user interface into thinking they’ve loaded a secure site.
- Semantic Attacks
- Invalid Certs
- Expired, != URL, unknown CA (self-signed)
- Users can and will skip through warnings.
- ssl_strip attack
- MITM Attack
- Victim <— HTTP —> Attacker <— HTTPS —> Some HTTPS Site
- Mixed Content Attack
- Page loads over HTTPs but loads content over HTTP
- Active attacker can tamper with the HTTP content.
- FREAK Attack
- MITM where it downgrades available cipher suites to include only export-grade RSA
- Export-grade RSA uses only 512-bit keys, which can be factored quickly.
- Worked on ~9% of HTTPS sites
- Browser bug
- Logjam Attack
- Similar to FREAK, but against export-grade Diffie-Hellman instead of RSA.
- Protocol flaw.
Networking
Internet Layers
- Application - Application Packet Data
- DNS, SSH, FTP, SMTP, NTP, HTTP
- Transport - Prepends TCP Header to packet
- Network - Prepends IP Header to packet
- Link - Prepends Frame Header and appends Frame Trailer to packet.
DNS (Domain Name System)
- Distributed system to map domain names to ip addresses (IPv4 32 bits, IPv6 128 bits)
- Implemented in “name servers”
- Application-layer protocol
TCP (Transmission Control Protocol)
- Connection-Oriented
- Setup needed between client and server
- Reliable-Transport
- Will keep sending requests until it gets a response
- Flow-Control
- Sender won’t overwhelm receiver
- Congestion-Control
- Throttles sender when network overloaded
- Doesn’t supply
- Timing
- Minimum Throughput
- Security
- When sending
- Send Side: Breaks application data into segments, passes to network layer
- Receive Side: Reassembles data into messages, passes to application layer
- Handshake
- Client —> SYN Packet —> Server
- Server —> SYNACK Packet —> Client
- Client —> ACK Packet —> Server
- Connection Established.
UDP (User Datagram Protocol)
- Message-oriented
- Unreliable-Transport
- Doesn’t supply
- Connection Setup
- Reliability
- Flow Control
- Congestion control
- timing
- throughput
- security
Internet Protocol (IP)
- Connectionless
- Individual packets completely independent from other packets
- Unreliable
- No guarantee packets get sent
- May be lost, corrupted, reordered, duplicated, etc.
- IP Addresses
- Divided into two parts
- Network: Routes packets
- Host: Individual host
Network Address Translation (NAT)
- IPv4 only contains ~4 billion addresses
- Workaround is to use NAT
- Routers give “private ip addresses” for local use only.
- Router must remember mappings, because it only sends out addresses from the actual IP of the router.
Link Layer
- Converts datagram from virtual to “physical” world.
- Connectionless
- Unreliable
- e.g. Ethernet
Address Resolution Protocol (ARP)
- Mechanism to find MAC addresses of other machines
- Broadcast ARP query (Who has IP ___, respond to A)
- Any node that recognizes the IP responses to A.
- Used in ARP Spoofing
- Simply respond loud enough that you recognize some IP, but just respond a wrong MAC.
Network Attacks
- Link Layer
- ARP Spoofing (Mentioned earlier)
- Only works on local network
- Network Layer
- IP Spoofing
- BGP Route Hijacking
- DNS Spoofing (DNS Cache Poisoning)
- Transport Layer
- TCP Injection
- DoS via TCP RST
- UDP Amplification
- Application
Exploits
- Virus - Exploit that attaches to a file or program
- Worm - Self-Propagating exploit
- Trojan - Exploit designed as something interesting
- Botnets - Compromised distributed systems
Application Security
Password Security
- Weak passwords can suffer from brute force attacks
- Storing Passwords in Plaintext
- Easy to develop
- Database leak reveals passwords to attackers. Site Ops can find passwords
- Storing Encrypted Passwords
- Leaked database probably will also leak key. Site Ops can still find passwords
- Store Hash of password
- Identical Passwords have identical hashes.
- Rainbow tables can be used to break these
- Store salted hashes
- Randomly generate a salt for each user’s password
- Store Hash(salt || password)
- Don’t use SHA or MD5! This is not the same as integrity
- bcrypt, scrypt, argon2 are better
- U2F
- Open standard developed by Google and Yahoo
- Browser gets username and password, verifies password.
- If passes, generate challenge, send to user’s device.
- Verify response
Control-Hijacking
- Buffer Overflow Attacks
- If a program is vulnerable to allow writing over the expected buffer, we can write over the stack and replace the return address.
- Stack Shellcode
- Load your own application into memory
- Buffer Overflow but jump to your application
- Return to libc
- Overwrite data that is sent to existing library command (execv)
- Buffer Over-read
- Don’t write over the buffer, but instead read more than the buffer is suppose to contain which can leak information
- e.g. Heartbleed vulnerability
- ROP-Chains (Return Oriented Programming)
- Return to libc without function calls
- Integer Overflow
- When integers overflow, they wrap around (modulus)
- This can cause exploits if the length supplied is very large (but doesn’t overflow)
- You can generate a small buffer via malloc (size * sizeof(type)) which can overflow and give you a small buffer
- You then write a very large amount of data.
Control-Hijacking Defenses
- Data Execution Prevention (DEP)
- Mark data pages as either R/W, executable,
- Requires hardware support
- Attacker cannot return to stack
- Stack Canaries
- On function call, add a secret value “canary” right above the return address.
- If we read the canary, throw error
- ASLR (Address Space Layout Randomization)
- Makes it hard to predict references.