cs763/website/docs/resources/slides/lecture-langsec.md

---
author: Topics in Security and Privacy Technologies (CS 839)
title: Language-Based Security
date: November 21, 2018
---

# LangSec: Principles

## Programs are written in programming languages

## Security holes are bugs
1. Programmer writes some code
2. Programmer makes a mistake!
    - Forgot to check permissions
    - Mixed private and public data
    - Didn't allocate enough space
    - ...
3. Attacker exploits the security flaw

## Programming languages: <br> first line of defense
- Catching errors earlier is better
- Earliest possible time: when program is written
- Easier to reject program than try to protect against it

## Design languages <br> to reduce security flaws
- Make it easier for programmer to do right thing
- Make certain kinds of bugs impossible
- Limit damage caused by any bugs

## When are errors caught?
- When the program is running
    - Stop program when it does something unsafe
    - "Dynamic analysis"
- When the program is compiled
    - Reject bad program before it even runs
    - "Static analysis"

# Overall Strategy

## 1. Pick a language
- Real languages: Java, C, ...
    - Highly complex: tons of features
    - Hard to modify language
- Idealized "core" languages
    - Much simpler, small number of features
    - Model "essence" of real languages

## 2. Formalize what programs "do"
- Run it on a machine and find out?
    - Not very useful for proofs...
- Formalize behavior mathematically, on paper
    - Discard "unimportant" details
    - Describe how program "steps"

## 3. Describe how to check <br> a given program
- Must work *without* running the program
- Other desirable features:
    - Scales up to large programs
    - Runs in a reasonable amount of time
    - Doesn't reject too many correct programs

## 4. Prove correctness
- We want to prove two things:
    - Soundness: catch all buggy programs
    - Completeness: accept all correct programs

> Almost always: can't have both!

## Usually: pick soundness
- All buggy programs are rejected
    - If the check says "safe", then it is safe
- But: some safe programs might be rejected
    - Hopefully, not too many

# Imperative Languages

## Most familiar
- Basis of many popular languages
    - Java, C++, Python, ...
- Program executes sequence of instructions
- Can read/write to *variables*

## Keep essential features
- Assignments to variables
- Sequencing ("semicolon")
- Conditionals ("if-then-else")
- Loops

## Drop fancier features
- Memory management
- Jumps and gotos
- Function pointers
- Templates
- ...

## Example

# Functional Languages

## Maybe a bit less familiar
- No imperative features
    - Can't modify variables
    - Usually no looping command (instead: recursion)
- Instead: functions
    - Define functions
    - Call ("apply") functions on arguments

## Simpler to formalize
- Program includes "all the information"
    - Behavior doesn't depend on state of variables
- Program "runs" by changing the code itself
    - Simplifies all the way down to final answer

## Example

# Operational Semantics

## Execute by "stepping"
- Start with program
    - Imperative: plus variable setting
- In each step, perform update:
    - Functional: modify the program
    - Imperative: update variables
- Terminates when it stops stepping (is a "value")

## Example

## Different styles
- Big-step
    - Describe value a program eventually steps to
- Small-step
    - Describe one step of a program

# Type Systems

## Assign "types" to programs
- A type $\tau$ describes a class of programs
    - Usually: well-behaved in some way
- Can automatically check if program has type $\tau$
    - Type of program depends on types of components
    - Analysis scales to large programs

## Example

## Strengths
- Lightweight
    - Checking types is simple, automatic
    - Don't need to run program
- Natural and intuitive
    - Can't add a String to a Boolean
    - Programmers often think in terms of types
- Identify correct programs

## Weaknesses
- Programmer may need to add annotations
    - Extra hints for compiler
    - Common for more complex types
- Compiler sometimes rejects correct programs
    - Figuring out why can be very frustrating
    - May need to write program in less natural form

## Types can be complex
- Simpler types
    - String, Char, Bool, Int, function types, ...
- More complex types
    - Secret values/public values
    - Trusted values/untrusted values
    - Local data/remote data
    - Random values
    - ...

## Prove: soundness
- If program has a type, it should be well-behaved
    - Relate type system to operational behavior
    - "Soundness theorem"
- Many possible notions of "well-behaved"
    - Don't add Strings to Bools
    - Don't mix public and private data
    - Don't write past end of buffer
    - ...

> "Well-typed programs can't go wrong"