---
author: Topics in Security and Privacy Technologies (CS 839)
title: Language-Based Security
date: November 21, 2018
---
# LangSec: Principles
## Programs are written in programming languages
## Security holes are bugs
1. Programmer writes some code
2. Programmer makes a mistake!
    - Forgets to check permissions
    - Mixes private and public data
    - Doesn't allocate enough space
    - Reads from malicious input
    - ...
3. Attacker exploits the security flaw
## Programming languages: first line of defense
- Catching errors earlier is better
- Earliest possible time: when program is written
- Easier to reject program than try to defend against it
## Design languages to reduce security flaws
- Make it easier for programmer to do right thing
- Make certain kinds of bugs impossible
- Limit damage caused by any bugs
## When are errors caught?
- When the program is running
    - Stop program when it does something unsafe
    - "Dynamic analysis"
- When the program is compiled
    - Reject bad program before it even runs
    - "Static analysis"
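
A minimal sketch of the two options, assuming a hypothetical toy "program" that is just a list of `(index, value)` writes into a fixed-size buffer. The names `run_with_checks` and `check_before_running` are illustrative and do not come from any library.

```python
class UnsafeWrite(Exception):
    """Raised by the dynamic check when a write falls outside the buffer."""


def run_with_checks(program, size):
    """Dynamic analysis: execute, stopping at the first out-of-bounds write."""
    buffer = [0] * size
    for index, value in program:
        if not 0 <= index < size:
            raise UnsafeWrite(f"write to index {index} in buffer of size {size}")
        buffer[index] = value
    return buffer


def check_before_running(program, size):
    """Static analysis: inspect every instruction without executing any."""
    return all(0 <= index < size for index, _ in program)


good = [(0, 7), (1, 8)]
bad = [(0, 7), (4, 9)]  # second write is past the end of a 2-slot buffer
```

With the dynamic check, `bad` performs its first write before it is stopped; with the static check, it is rejected before anything runs.
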
# Overall Strategy
## 1. Pick a language
- Real languages: Java, C, ...
    - Highly complex: tons of features
    - Hard to modify language
- Idealized "core" languages
    - Much simpler, small number of features
    - Model "essence" of real languages
## 2. Formalize what programs "do"
- Run it on a machine and find out?
    - Not very useful for proofs...
- Formalize behavior mathematically, on paper
    - Discard "unimportant" details
    - Describe how program "steps"
## 3. Describe how to check a given program
- Ideally: works *without* running the program
- Other desirable features:
    - Scales up to large programs
    - Runs in a reasonable amount of time
    - Doesn't reject too many correct programs
## 4. Prove correctness
- We want to prove two things:
    - Soundness: catch all buggy programs
    - Completeness: accept all correct programs
> Almost always: can't have both!
## Usually: pick soundness
- All buggy programs are rejected
    - If the check says "safe", then it is safe
- But: some safe programs might be rejected
    - Hopefully, not too many
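
To illustrate a sound but incomplete check, here is a hedged sketch for a hypothetical expression language with only `add` and `div`. The bug to rule out is division by zero. The names `evaluate` and `is_safe` are assumptions made for this example.

```python
def evaluate(e):
    """Run the program; raises ZeroDivisionError on the bug we want to rule out."""
    if isinstance(e, int):
        return e
    op, left, right = e
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "add" else a // b


def is_safe(e):
    """Static check: accept only if every divisor is a nonzero literal."""
    if isinstance(e, int):
        return True
    op, left, right = e
    if op == "div" and not (isinstance(right, int) and right != 0):
        return False  # divisor is not obviously nonzero: reject, to stay sound
    return is_safe(left) and is_safe(right)


accepted = ("div", 10, 2)                       # safe and accepted
buggy = ("div", 10, 0)                          # buggy and rejected
safe_but_rejected = ("div", 10, ("add", 1, 1))  # safe, yet rejected
```

Every program `is_safe` accepts runs without dividing by zero, but `safe_but_rejected` shows the price: it evaluates fine and is still turned away.
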
# Imperative Languages
## Most familiar
- Basis of many popular languages
    - Java, C++, Python, ...
- Program executes sequence of instructions
    - Can read/write to *variables*
## Keep essential features
- Assignments to variables
- Sequencing ("semicolon")
- Conditionals ("if-then-else")
- Loops
## Drop fancier features
- Memory management
- Jumps and gotos
- Function pointers
- Templates
- ...
## Example
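
A small illustrative program, written in Python but restricted to the essential features listed above: assignment, sequencing, conditionals, and loops. The function `gcd` and its step counter are made up for this sketch, and it assumes both inputs are positive.

```python
def gcd(a, b):
    """Greatest common divisor by repeated subtraction (assumes a, b > 0)."""
    steps = 0                  # assignment
    while a != b:              # loop
        if a > b:              # conditional
            a = a - b
        else:
            b = b - a
        steps = steps + 1      # sequencing: runs after the conditional
    return a, steps


print(gcd(12, 18))  # prints (6, 2)
```
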
# Functional Languages
## Maybe a bit less familiar
- No imperative features
    - Can't modify variables
    - Usually no looping command (instead: recursion)
- Instead: functions
    - Define functions
    - Call ("apply") functions on arguments
## Simpler to formalize
- Program includes "all the information"
    - Behavior doesn't depend on state of variables
- Program "runs" by changing the code itself
    - Simplifies all the way down to final answer
## Example
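
A sketch in functional style, written in Python for convenience (Python is not a functional language, so the style is a self-imposed restriction). The illustrative function `gcd` uses no assignment and no loop, only function definition and application, and assumes positive inputs.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor by repeated subtraction (assumes a, b > 0)."""
    if a == b:
        return a
    if a > b:
        return gcd(a - b, b)   # recursion replaces the loop
    return gcd(a, b - a)


# Evaluation can be read as rewriting the expression itself:
#   gcd(12, 18)
#   gcd(12, 6)
#   gcd(6, 6)
#   6
print(gcd(12, 18))  # prints 6
```
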
# Operational Semantics
## Execute by "stepping"
- Start with program
    - Imperative: plus initial setting of variables
- In each step, perform update:
    - Functional: modify the program
    - Imperative: update variables
- Terminates when it stops stepping (is a "value")
## Example
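
A minimal sketch of stepping for a hypothetical functional expression language: integers are values, and `("add", e1, e2)` or `("mul", e1, e2)` are compound expressions. The encoding and the names `step`, `trace`, and `show` are assumptions made for this example. Each step rewrites the program, left operand first.

```python
def is_value(e):
    return isinstance(e, int)


def step(e):
    """Perform exactly one step on a non-value expression."""
    op, left, right = e
    if not is_value(left):
        return (op, step(left), right)    # step inside the left operand
    if not is_value(right):
        return (op, left, step(right))    # then inside the right operand
    return left + right if op == "add" else left * right


def trace(e):
    """Collect every intermediate program until a value is reached."""
    steps = [e]
    while not is_value(e):
        e = step(e)
        steps.append(e)
    return steps


def show(e):
    if is_value(e):
        return str(e)
    op, left, right = e
    symbol = "+" if op == "add" else "*"
    return f"({show(left)} {symbol} {show(right)})"


program = ("mul", ("add", 1, 2), ("add", 3, 4))
for e in trace(program):
    print(show(e))
# ((1 + 2) * (3 + 4))
# (3 * (3 + 4))
# (3 * 7)
# 21
```
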
## Different styles
- Big-step
    - Describe value a program eventually steps to
- Small-step
    - Describe one step of a program
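
The two styles can be sketched side by side for a hypothetical imperative core language. Commands are tuples, the state is a dictionary of variables, and expressions are plain Python functions of the state (a shortcut taken here to keep the sketch short). All names are illustrative.

```python
SKIP = ("skip",)  # the finished command


def big_step(cmd, state):
    """Big-step: return the final state reached by running cmd from state."""
    tag = cmd[0]
    if tag == "skip":
        return state
    if tag == "assign":
        _, var, expr = cmd
        return {**state, var: expr(state)}
    if tag == "seq":
        _, first, second = cmd
        return big_step(second, big_step(first, state))
    if tag == "if":
        _, cond, then_cmd, else_cmd = cmd
        return big_step(then_cmd if cond(state) else else_cmd, state)
    if tag == "while":
        _, cond, body = cmd
        return big_step(cmd, big_step(body, state)) if cond(state) else state
    raise ValueError(tag)


def small_step(cmd, state):
    """Small-step: return the next configuration (cmd', state')."""
    tag = cmd[0]
    if tag == "assign":
        _, var, expr = cmd
        return SKIP, {**state, var: expr(state)}
    if tag == "seq":
        _, first, second = cmd
        if first == SKIP:
            return second, state
        first_next, state_next = small_step(first, state)
        return ("seq", first_next, second), state_next
    if tag == "if":
        _, cond, then_cmd, else_cmd = cmd
        return (then_cmd if cond(state) else else_cmd), state
    if tag == "while":
        _, cond, body = cmd
        # Unroll once: while b do c  steps to  if b then (c; while b do c) else skip
        return ("if", cond, ("seq", body, cmd), SKIP), state
    raise ValueError(tag)  # "skip" has no step: the program is finished


def run_small(cmd, state):
    """Iterate small steps until the command is finished."""
    while cmd != SKIP:
        cmd, state = small_step(cmd, state)
    return state


# total := 0; while n > 0 do (total := total + n; n := n - 1)
program = ("seq",
           ("assign", "total", lambda s: 0),
           ("while", lambda s: s["n"] > 0,
            ("seq",
             ("assign", "total", lambda s: s["total"] + s["n"]),
             ("assign", "n", lambda s: s["n"] - 1))))
```

On this program both styles end in the same final state; the small-step version additionally exposes every intermediate configuration.
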
# Type Systems
## Assign "types" to programs
- A type $\tau$ describes a class of programs
    - Usually: well-behaved in some way
- Can automatically check if program has type $\tau$
    - Type of program depends on types of components
    - Analysis scales to large programs
## Example
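
A hedged sketch of a type checker for a hypothetical expression language with integers, booleans, `add`, and `if`. The type of each expression is computed from the types of its components, without running anything. The names `type_of` and `IllTyped` are assumptions made for this example.

```python
class IllTyped(Exception):
    """Raised when the checker rejects a program."""


def type_of(e):
    """Return "Int" or "Bool" for a well-typed expression, else raise IllTyped."""
    if isinstance(e, bool):      # test bool first: in Python, bool is a subclass of int
        return "Bool"
    if isinstance(e, int):
        return "Int"
    tag = e[0]
    if tag == "add":
        _, left, right = e
        if type_of(left) == "Int" and type_of(right) == "Int":
            return "Int"
        raise IllTyped("add expects two Ints")
    if tag == "if":
        _, cond, then_e, else_e = e
        if type_of(cond) != "Bool":
            raise IllTyped("condition must be a Bool")
        then_t, else_t = type_of(then_e), type_of(else_e)
        if then_t != else_t:
            raise IllTyped("branches must have the same type")
        return then_t
    raise IllTyped(f"unknown expression: {e!r}")


print(type_of(("add", 1, ("if", True, 2, 3))))  # prints Int
```

Under these rules `("add", 1, True)` is rejected, and so is `("if", True, 1, False)`, even though running the latter would harmlessly produce `1`.
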
## Strengths
- Lightweight
    - Checking types is simple, automatic
    - Don't need to run program
- Natural and intuitive
    - Can't add a String to a Boolean
    - Programmers often think in terms of types
- Identify correct programs
## Weaknesses
- Programmer may need to add annotations
    - Extra hints for compiler
    - Common for more complex types
- Compiler sometimes rejects correct programs
    - Figuring out why can be very frustrating
    - May need to write program in less natural form
## Types can be complex
- Simpler types
    - String, Char, Bool, Int, function types, ...
- More complex types
    - Secret values/public values
    - Trusted values/untrusted values
    - Local data/remote data
    - Random values
    - ...
## Prove: soundness
- If program has a type, it should be well-behaved
- Relate type system to operational behavior
    - "Soundness theorem"
- Many possible notions of "well-behaved"
    - Don't add Strings to Bools
    - Don't mix public and private data
    - Don't write past end of buffer
    - ...
> "Well-typed programs can't go wrong"
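
One common way to state such a theorem, assuming a typing judgment $\vdash e : \tau$ and a small-step relation $e \to e'$ (notation assumed here, not fixed by these slides), is as a pair of properties:

$$
\begin{aligned}
\textbf{Progress:} &\quad \text{if } \vdash e : \tau \text{, then } e \text{ is a value or } e \to e' \text{ for some } e'. \\
\textbf{Preservation:} &\quad \text{if } \vdash e : \tau \text{ and } e \to e' \text{, then } \vdash e' : \tau.
\end{aligned}
$$

Together they imply that a well-typed program never reaches a non-value that cannot step, which is one precise reading of "can't go wrong".
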