Notes on Computer Security

Notes by Josiah Bruner (josiahbruner.com)

Inspired from EECS 388 @ UMich

Security Mindset

Level-2 Problem: “Weakness” (Factors that predispose systems to vulnerability)

Level-1 Problem: “Vulnerability” (Specific things that can be exploited to cause an assault

Level-0 Problem: “Assault” (Actual attack on a specific flaw)

Cryptography

Goals:

Message Integrity (Attacker can’t modify messages without that being detected)

Want a kind of “hash”
Pure random function (mapping), although secure, is impractical due to space limitations.
Use a PRF (pseudo random function) instead.

Secret k indexes into a family of pseudo random functions and uses that for hashing.
Recall function here is just that, a function. It does not need to be 1-1 or invertible.
PRF can be asserted as secure by asking if an attacker can win the "PRF Game”

Integrity solved via a MAC (Message Authentication Code). e.g. HMAC-SHA256

A MAC is a function that takes a key and provides the hash using a “hash function” (which is fixed)
MAC is ~ PRF, although slightly different.

Message Confidentiality (Keep message contents hidden from eavesdropper)

Horrible Ciphers:

Caesar Cipher - Uses a hidden secret k and shifts all values in the plaintext k places. Decryption works by reversing this.

Can be broken by trivial frequency analysis.

Vigenere Cipher - Uses a hidden secret k (which is a word, not a single value now). Shifts each letter in the plaintext by the corresponding letter in k. If |k| < |m|, essentially concatenate k to itself and continue.

Kasiski Method:

Can be broken by looking for repeating strings and getting the distance.
Distance is likely a multiple of the key length.
Once we know the key length n we can break the cipher text into n slices and solve each as a Caesar Cipher

If |k| >= |m|, reduces to OTP (One Time Pad)

Semi-Working Ciphers:

One-time Pad (OTP)

Generate a shared secret k of random bits, s.t. |k| = |m|.
Encrypt: c = m xor k
Decrypt m’ = c xor k
Provably secure because we are essentially generating a random string which can contain no useful information to an attacker. Any string could be the result of this cipher text given some key k’.
Impractical because it truly can only be used once.

(a xor k) xor (b xor k) = (a xor b).
Then be sliding common strings that we expect to be in either a or b and xor with our result, we will get the output of the other value.

Stream Cipher

Given a shared secret k, have Alice and Bob seed a PRG with k.
Encrypt: To encrypt a message m of length l, generate l bits from the PRG, and xor those bits with m.
Decrypt: To decrypt a message m of length l, generate l bits from the PRG, and xor those bits with m.
Potential Flaws:

Assuming Alice and Bob’s PRG is at the same state. If Alice encrypts using l bits from the PRG, Bob can only decrypt by generating EXACTLY the same l bits from his PRG. If two messages are encrypted simultaneously the whole system breaks down.
Can never reuse the key or PRG output bits.

e.g. Salsa20, RC2 (Broken), RC4 (Broken)

Block Cipher (Best option)

Encrypt message in fixed length blocks with a reusable key.
Use a PRP (Pseudorandom Permutation) instead of a PRF because we need to be able to invert the function.
e.g. AES (Fixed length input and output)

Split k into ten sub keys. Perform a set of operations ten times, each time using a different sub key.
Operations:

Non-Linearize - Run each byte through a non-linear function
Shift - Circular shift each row. ith row shifted by i
Linear-mix - Treat each columns as a 4d vector. Multiply by a constant invertible matrix
Key-Addition - xor each byte with corresponding byte of round sub key

AES only works on fixed length messages

To encrypt multiple blocks:

Don’t use:

Encrypted Codebook Mode - Encrypts each block independently. Can keep edge details.
Cipher-Block Chaining (CBC) Mode - c_i = E_k(m_i xor c_i-1). Can suffer from padding oracle attacks.

Can Use:

Counter Mode - Uses block cipher as a PRG. msg_id = random data, m_i = E_k(msg_id || ctr)

Authentication (Recipient can verify message was sent by some sender)

Sometimes called a “digital signature”
RSA is a good way to accomplish this.

Confidentially and Integrity simultaneously

Encrypt, then MAC.
AEAD (Authenticated Encrypted and Associated Data)

c, tag := Seal(k, m, associated_data)
m’ = Unseal(k, c, associated_data, tag)
AES_GCM (“Galois Counter Mode”)
e.g. ChaCha-Poly1305, Salsa20-Poly1305

Need two shared keys (shouldn’t reuse keys across methods. crypto detail, not mentioned here). Can use PRG to generate two shared keys via one shared key
If there exists a reverse channel, use separate keys!
Cryptographic Doom Principle: If you are required to perform any cryptographic operation before verifying the MAC on a message, it will probably lead to issues.

Hash Functions:

Arbitrary length inputs. Fixed length outputs.
Not random.
e.g. SHA, SHA-256, etc.
Desirable properties:

Collision Resistant: Hard to find x, x’ s.t. H(x) = H(x’)
Second Preimage Resistant: Given fixed x, hard to find x’ s.t. H(x) = H(x’)
Preimage Resistant: Given y, hard to find x’ s.t. H(x’) = y

Broken Hash Functions: MD5, SHA1
Usually vulnerable to length extension attacks.

Length Extension Attacks:

Performed on hash functions.
Given z = H(m) for some m, find H(m || padding || v) for chosen v.

This is possible on algorithms using the Merkle-Damgard Construction, which use intermittent states.
Therefore, at some point during the calculation of H(m || padding || v), it will use H(m) as an initial state.
So if we know H(m), we can compute H(m || padding || v) since the hash function is known to us.

Randomness:

True randomness may not really exist.
Use a PRG instead.

Given small seed, generate sequence of ~random bits.
Can be asserted as secure by asking if an attacker can win the “PRG Game” (Similar to the “PRF Game”).

On Linux

/dev/random gives “reliable” random bits, can block until these bits are available.
/dev/urandom gives output of a PRF, doesn’t block, but is seeded from /dev/random (and so initially may not really be random).

Padding Oracles:

Find a way to distinguish between invalid MAC and invalid padding.
Can be enough to learn plaintext.
e.g. Vaudenay padding oracle attack.
Defences:

Don’t use separate errors for MAC and padding (unrealistic)
Always check integrity
Constant-time code
Limit branching

Key Exchange:

Diffie-Hellman - Relies on discrete log problem

Alice and Bob public choose modulus p and base g (primitive root modulo 23)
Alice chooses secret a and sends Bob g^a mod p
Bob chooses secret b and sends Alice g^b mod p
Alice computes k = (g^b mod p)^a
Bob computes k = (g^a mod p)^b
MITM Possible. Use digital signatures to defend against.

Forward Secrecy

Learning old key shouldn’t help adversary learn new key
Compromising an old session should not compromise future sessions.
Compromising long-term key should not allow decryption of recorded cipher texts

Public-Key Cryptography

RSA - Relies on factoring being hard.

Pick two large random primes p and q
N := pq mod N
Pick e to be relatively prime to (p-1)(q-1)
Find d so that ed mod (p-1)(q-1) = 1
Public Key: (e, N), Private Key: (d, N)
E_e(x) = x^e mod N, D_d(x) = x^d mod N

Secure Channel:

Establish shared secret via Diffie-Hellman
Verify RSA Signatures on k.

Modern Cryptography

Elliptic Curve Cryptography

Relies on elliptic curve discrete log problem
Smaller key sizes can be used

Web Architecture

Early Technologies: telnet, ftp, sftp, smtp, nfs

HTTP

Four Phases:

Open Connection
Client Request
Server response
Close connection

Store state for identity
Sent by server, stored by browser

Same Origin Policy

Two pages have the same origin if the protocol, port, and host are all the same.
Cross-origin embedding legal
Cross-origin scripting illegal

Attacks

Cross Site Request Forgery

Sites may rely (incorrectly!) on cookies for authentication.
A user can log into a site so obtain a session cookie, navigate to another site, and that site might try performing request expecting the cookie to exist.
Defense Idea: Embed a random token into each action on the site and require that token to be sent with the request.

If you’re on the website, the website can handle adding the hidden value.
An attacker can’t access this however, so CSRF won’t work.

SQL Injection

If database inputs aren’t sanitized, an attacker may be able to mess with your database by exploiting
Defense:

Sanitize
Use Prepared Statements

XSS (Cross Site Scripting)

Webpages may treat data as code.
This can be exploited to do malicious things

HTTPS

Replaced SSL.
Provides encryption between client and server.

Client and Server themselves are NOT protected here. TLS provides protection against the network itself.

TLS Handshake

Client sends cipher suites it supports
Server picks a cipher suite, sends its choice, CA certificate, and signed DH parameters.
Client then sends its DH parameters to server and asks server to switch to encrypted communication based on the agreed upon cipher spec.
Server sends a response if it agrees.
Encrypted communication ensues.

Attacks

RC4-related Attacks

RC4 stream cipher can possess biases that can be utilized to leak cookies and passwords.

CBC-related Attacks

e.g. BEAST and PODDLE
exploit weaknesses in TLS’s use of CBC to leak data.

Compression-related Attacks

TLS and HTTP use compression, but TLS broadcasts length of plaintext.
Using both TLS and HTTP can leak data.

Export-related Attacks

e.g. Freak, Logjam, DROWN
Exploit weaknesses in 1990s-era “export-grade” cryptography.

Certificates

Say Bob has a web server (bob.com). He wants to obtain a certificate.

Generates public/private key pair.
Sends public key to a CA (Certificate Authority).
CA verifies public key (not via crypto).
CA signs a certificate with CA’s private key noting Bob’s public key and sends to bob.com
bob.com keeps certificate.

When Alice wants to load bob.com, she first checks the certificate against the CA’s public key.

Mixed Content

Don’t allow active HTTP content on an HTTPS page. Passive is okay.

Strict Transport Security (HSTS)

HTTP header indicating always use HTTPs
Prevents downgrade attacks.
Protects future sessions, but not first session.

Preload Lists

List of sites that are HTTPS only.

Attacks

Picture in Picture Attack

Spoof user interface to stick user interface into thinking they’ve loaded a secure site.

Semantic Attacks

Use “tricky” characters. e.g. microsoft.com vs micr0soft.com

Invalid Certs

Expired, != URL, unknown CA (self-signed)
Users can and will skip through warnings.

ssl_strip attack

MITM Attack
Victim <— HTTP —> Attacker <— HTTPS —> Some HTTPS Site

Mixed Content Attack

Page loads over HTTPs but loads content over HTTP
Active attacker can tamper with the HTTP content.

FREAK Attack

MITM where it downgrades available cipher suites to include only export-grade RSA

Export-grade RSA uses only 512-bit keys, which can be factored quickly.

Worked on ~9% of HTTPS sites
Browser bug

Logjam Attack

Similar to FREAK, but against export-grade Diffie-Hellman instead of RSA.
Protocol flaw.

Networking

Internet Layers

Application - Application Packet Data

DNS, SSH, FTP, SMTP, NTP, HTTP

Transport - Prepends TCP Header to packet

UDP, TCP

Network - Prepends IP Header to packet

Link - Prepends Frame Header and appends Frame Trailer to packet.

WiFi, LTE, Ethernet

DNS (Domain Name System)

Distributed system to map domain names to ip addresses (IPv4 32 bits, IPv6 128 bits)
Implemented in “name servers”
Application-layer protocol

TCP (Transmission Control Protocol)

Connection-Oriented

Setup needed between client and server

Reliable-Transport

Will keep sending requests until it gets a response

Flow-Control

Sender won’t overwhelm receiver

Congestion-Control

Throttles sender when network overloaded

Doesn’t supply

Timing
Minimum Throughput
Security

When sending

Send Side: Breaks application data into segments, passes to network layer
Receive Side: Reassembles data into messages, passes to application layer

Handshake

Client —> SYN Packet —> Server
Server —> SYNACK Packet —> Client
Client —> ACK Packet —> Server
Connection Established.

UDP (User Datagram Protocol)

Message-oriented
Unreliable-Transport
Doesn’t supply

Connection Setup
Reliability
Flow Control
Congestion control
timing
throughput
security

Internet Protocol (IP)

Connectionless

Individual packets completely independent from other packets

Unreliable

No guarantee packets get sent
May be lost, corrupted, reordered, duplicated, etc.

IP Addresses

Divided into two parts

Network: Routes packets
Host: Individual host

Network Address Translation (NAT)

IPv4 only contains ~4 billion addresses
Workaround is to use NAT

Routers give “private ip addresses” for local use only.
Router must remember mappings, because it only sends out addresses from the actual IP of the router.

Link Layer

Converts datagram from virtual to “physical” world.
Connectionless
Unreliable
e.g. Ethernet

Address Resolution Protocol (ARP)

Mechanism to find MAC addresses of other machines
Broadcast ARP query (Who has IP ___, respond to A)
Any node that recognizes the IP responses to A.
Used in ARP Spoofing

Simply respond loud enough that you recognize some IP, but just respond a wrong MAC.

Network Attacks

Link Layer

ARP Spoofing (Mentioned earlier)

Only works on local network

Network Layer

IP Spoofing
BGP Route Hijacking
DNS Spoofing (DNS Cache Poisoning)

Transport Layer

TCP Injection
DoS via TCP RST
UDP Amplification

Application

DNS maps
BGP maps

Exploits

Virus - Exploit that attaches to a file or program
Worm - Self-Propagating exploit
Trojan - Exploit designed as something interesting
Botnets - Compromised distributed systems

Application Security

Password Security

Weak passwords can suffer from brute force attacks
Storing Passwords in Plaintext

Easy to develop
Database leak reveals passwords to attackers. Site Ops can find passwords

Storing Encrypted Passwords

Leaked database probably will also leak key. Site Ops can still find passwords

Store Hash of password

Identical Passwords have identical hashes.
Rainbow tables can be used to break these

Store salted hashes

Randomly generate a salt for each user’s password
Store Hash(salt || password)
Don’t use SHA or MD5! This is not the same as integrity

bcrypt, scrypt, argon2 are better

Open standard developed by Google and Yahoo
Browser gets username and password, verifies password.
If passes, generate challenge, send to user’s device.
Verify response

Control-Hijacking

Buffer Overflow Attacks

If a program is vulnerable to allow writing over the expected buffer, we can write over the stack and replace the return address.

Stack Shellcode

Load your own application into memory
Buffer Overflow but jump to your application

Return to libc

Overwrite data that is sent to existing library command (execv)

Buffer Over-read

Don’t write over the buffer, but instead read more than the buffer is suppose to contain which can leak information

e.g. Heartbleed vulnerability

ROP-Chains (Return Oriented Programming)

Return to libc without function calls

Integer Overflow

When integers overflow, they wrap around (modulus)
This can cause exploits if the length supplied is very large (but doesn’t overflow)
You can generate a small buffer via malloc (size * sizeof(type)) which can overflow and give you a small buffer
You then write a very large amount of data.

Control-Hijacking Defenses

Data Execution Prevention (DEP)

Mark data pages as either R/W, executable,
Requires hardware support
Attacker cannot return to stack

Stack Canaries

On function call, add a secret value “canary” right above the return address.
If we read the canary, throw error

ASLR (Address Space Layout Randomization)

Makes it hard to predict references.