
title: Language-Based Security
author: Topics in Security and Privacy Technologies (CS 839)
date: November 21, 2018

LangSec: Principles

Programs are written in programming languages

Security holes are bugs

Programmer writes some code
Programmer makes a mistake!
- Forgets to check permissions
- Mixes private and public data
- Doesn't allocate enough space
- Reads from malicious input
- ...
Attacker exploits the security flaw

Programming languages: first line of defense

Catching errors earlier is better
Earliest possible time: when program is written
Easier to reject program than try to defend against it

Design languages to reduce security flaws

Make it easier for programmer to do right thing
Make certain kinds of bugs impossible
Limit damage caused by any bugs

When are errors caught?

When the program is running
- Stop program when it does something unsafe
- "Dynamic analysis"
When the program is compiled
- Reject bad program before it even runs
- "Static analysis"
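To make the two timings concrete, here is a minimal Python sketch on a toy language of assignments. The tuple encoding and the names `run` and `check` are assumptions made for illustration, not part of the course material. Both functions look for the same error, a use of an undefined variable: `run` stops the program at the unsafe step, while `check` rejects the program without running it.

```python
# Toy program: a list of assignments ("set", name, expr).
# An expr is an int, a variable name (str), or ("add", e1, e2).

def eval_expr(expr, env):
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        if expr not in env:
            # Dynamic analysis: stop the program when it does something unsafe.
            raise RuntimeError(f"undefined variable {expr!r}")
        return env[expr]
    _, left, right = expr
    return eval_expr(left, env) + eval_expr(right, env)

def run(program):
    env = {}
    for _, name, expr in program:
        env[name] = eval_expr(expr, env)
    return env

def vars_of(expr):
    """Collect the variable names that appear in an expression."""
    if isinstance(expr, int):
        return set()
    if isinstance(expr, str):
        return {expr}
    _, left, right = expr
    return vars_of(left) | vars_of(right)

def check(program):
    """Static analysis: reject the program before it runs."""
    defined = set()
    for _, name, expr in program:
        if not vars_of(expr) <= defined:
            return False
        defined.add(name)
    return True

good = [("set", "x", 1), ("set", "y", ("add", "x", 2))]
bad = [("set", "x", 1), ("set", "y", ("add", "z", 2))]
```

Here `check(bad)` returns `False` without executing anything, whereas `run(bad)` executes the first assignment and only fails once it reaches the use of `z`.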

Overall Strategy

1. Pick a language

Real languages: Java, C, ...
- Highly complex: tons of features
- Hard to modify language
Idealized "core" languages
- Much simpler, small number of features
- Model "essence" of real languages

2. Formalize what programs "do"

Run it on a machine and find out?
- Not very useful for proofs...
Formalize behavior mathematically, on paper
- Discard "unimportant" details
- Describe how program "steps"

3. Describe how to check a given program

Ideally: works without running the program
Other desirable features:
- Scales up to large programs
- Runs in a reasonable amount of time
- Doesn't reject too many correct programs

4. Prove correctness

We want to prove two things:
- Soundness: catch all buggy programs
- Completeness: accept all correct programs

Almost always: can't have both!
- Deciding exactly which programs are buggy is usually undecidable

Usually: pick soundness

All buggy programs are rejected
- If the check says "safe", then it is safe
But: some safe programs might be rejected
- Hopefully, not too many
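The trade-off can be sketched with a deliberately conservative check for division by zero. This is an illustrative toy, assuming a hypothetical tuple encoding of expressions; `is_safe` and `evaluate` are names invented here. The check is sound (every accepted expression only ever divides by a nonzero literal) but not complete (it rejects some expressions that would run fine).

```python
# Expressions: int | ("sub", e1, e2) | ("div", e1, e2)

def evaluate(e):
    if isinstance(e, int):
        return e
    op, a, b = e
    if op == "sub":
        return evaluate(a) - evaluate(b)
    return evaluate(a) // evaluate(b)   # may raise ZeroDivisionError

def is_safe(e):
    """Conservative check: a divisor must be a nonzero literal."""
    if isinstance(e, int):
        return True
    op, a, b = e
    if op == "div" and not (isinstance(b, int) and b != 0):
        return False
    return is_safe(a) and is_safe(b)

accepted = ("div", 6, 3)                       # safe, and accepted
buggy = ("div", 1, ("sub", 2, 2))              # divides by zero, rejected
safe_but_rejected = ("div", 1, ("sub", 2, 1))  # safe, but still rejected
```

The last expression is the price of soundness: the check cannot tell that `2 - 1` is nonzero without evaluating it, so it rejects the program.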

Imperative Languages

Most familiar

Basis of many popular languages
- Java, C++, Python, ...
Program executes sequence of instructions
Can read/write to variables

Keep essential features

Assignments to variables
Sequencing ("semicolon")
Conditionals ("if-then-else")
Loops

Drop fancier features

Memory management
Jumps and gotos
Function pointers
Templates
...

Example
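As one possible illustration, a core imperative language with only the essential features can be written down as a handful of Python dataclasses. The class names, the `show` printer, and the `:=` concrete syntax are assumptions of this sketch; expressions are kept as plain strings to keep it short.

```python
from dataclasses import dataclass

@dataclass
class Assign:          # x := e
    var: str
    expr: str          # expressions kept as plain text in this sketch

@dataclass
class Seq:             # c1 ; c2
    first: object
    second: object

@dataclass
class If:              # if e then c1 else c2
    cond: str
    then: object
    orelse: object

@dataclass
class While:           # while e do c
    cond: str
    body: object

def show(cmd, indent=0):
    """Print a command in a small concrete syntax."""
    pad = "  " * indent
    if isinstance(cmd, Assign):
        return f"{pad}{cmd.var} := {cmd.expr}"
    if isinstance(cmd, Seq):
        return show(cmd.first, indent) + ";\n" + show(cmd.second, indent)
    if isinstance(cmd, If):
        return (f"{pad}if {cmd.cond} then\n{show(cmd.then, indent + 1)}\n"
                f"{pad}else\n{show(cmd.orelse, indent + 1)}")
    if isinstance(cmd, While):
        return f"{pad}while {cmd.cond} do\n{show(cmd.body, indent + 1)}"
    raise TypeError(cmd)

# Sum the numbers n, n-1, ..., 1 into total.
program = Seq(
    Assign("total", "0"),
    While("n > 0", Seq(Assign("total", "total + n"), Assign("n", "n - 1"))),
)
print(show(program))
```

Four constructors are enough to express the whole program; nothing about memory management, jumps, or function pointers is needed.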

Functional Languages

Maybe a bit less familiar

No imperative features
- Can't modify variables
- Usually no looping command (instead: recursion)
Instead: functions
- Define functions
- Call ("apply") functions on arguments

Simpler to formalize

Program includes "all the information"
- Behavior doesn't depend on state of variables
Program "runs" by changing the code itself
- Simplifies all the way down to final answer

Example
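As one possible illustration, the sketch below runs a tiny functional program by rewriting the code itself until only the final answer is left. The tuple encoding and the names `subst`, `step`, and `trace` are assumptions of this sketch. It also assumes closed programs evaluated call-by-value, so substitution never has to rename variables.

```python
# Terms: int | ("var", x) | ("lam", x, body) | ("app", f, arg) | ("add", a, b)

def subst(term, name, value):
    """Replace free occurrences of variable `name` in `term` by `value`."""
    if isinstance(term, int):
        return term
    tag = term[0]
    if tag == "var":
        return value if term[1] == name else term
    if tag == "lam":
        _, param, body = term
        if param == name:              # inner binder shadows `name`
            return term
        return ("lam", param, subst(body, name, value))
    _, a, b = term                     # "app" or "add"
    return (tag, subst(a, name, value), subst(b, name, value))

def is_value(term):
    return isinstance(term, int) or term[0] == "lam"

def step(term):
    """Rewrite the program once (left to right, call-by-value)."""
    tag = term[0]
    if tag in ("add", "app"):
        _, a, b = term
        if not is_value(a):
            return (tag, step(a), b)
        if not is_value(b):
            return (tag, a, step(b))
        if tag == "add" and isinstance(a, int) and isinstance(b, int):
            return a + b
        if tag == "app" and not isinstance(a, int):
            _, param, body = a
            return subst(body, param, b)
    raise ValueError(f"stuck term: {term!r}")

def trace(term):
    """List every intermediate program, ending with the final answer."""
    steps = [term]
    while not is_value(term):
        term = step(term)
        steps.append(term)
    return steps

double = ("lam", "x", ("add", ("var", "x"), ("var", "x")))
program = ("app", double, ("add", 1, 2))
```

There is no separate variable state: `trace(program)` goes from the original program to `("app", double, 3)`, then `("add", 3, 3)`, then `6`, and each entry is itself a complete program.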

Operational Semantics

Execute by "stepping"

Start with program
- Imperative: plus the initial values of the variables
In each step, perform update:
- Functional: modify the program
- Imperative: update variables
Terminates when no more steps apply (program is a "value")

Example
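As one possible illustration of stepping an imperative program, the sketch below pairs the remaining command with the current variable values and updates both one step at a time. The tuple encoding and the names `step` and `run` are assumptions of this sketch. As a shortcut, expressions are Python functions of the store rather than syntax.

```python
SKIP = ("skip",)   # the finished program: nothing left to do

def step(cmd, store):
    """One small step: (command, store) -> (command', store')."""
    tag = cmd[0]
    if tag == "set":
        _, var, expr = cmd
        return SKIP, {**store, var: expr(store)}
    if tag == "seq":
        _, first, second = cmd
        if first == SKIP:
            return second, store
        first2, store2 = step(first, store)
        return ("seq", first2, second), store2
    if tag == "if":
        _, cond, then, orelse = cmd
        return (then if cond(store) else orelse), store
    if tag == "while":
        _, cond, body = cmd
        # Unfold once: while c do b  ->  if c then (b; while c do b) else skip
        return ("if", cond, ("seq", body, cmd), SKIP), store
    raise ValueError(f"cannot step {cmd!r}")

def run(cmd, store):
    """Keep stepping until the program stops."""
    while cmd != SKIP:
        cmd, store = step(cmd, store)
    return store

# total := 0; while n > 0 do (total := total + n; n := n - 1)
program = ("seq",
    ("set", "total", lambda s: 0),
    ("while", lambda s: s["n"] > 0,
        ("seq", ("set", "total", lambda s: s["total"] + s["n"]),
                ("set", "n", lambda s: s["n"] - 1))))
```

Starting from the variable setting `{"n": 3}`, `run(program, {"n": 3})` steps until only `SKIP` remains and returns a store with `n` equal to 0 and `total` equal to 6.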

Different styles

Big-step
- Describe value a program eventually steps to
Small-step
- Describe one step of a program
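The two styles can be compared side by side on arithmetic expressions. This is a minimal sketch under an assumed tuple encoding; `big_step`, `small_step`, and `run_small` are names chosen here for illustration.

```python
# Expressions: int | ("add", e1, e2) | ("mul", e1, e2)

def big_step(e):
    """Big-step: relate an expression directly to its final value."""
    if isinstance(e, int):
        return e
    op, a, b = e
    x, y = big_step(a), big_step(b)
    return x + y if op == "add" else x * y

def small_step(e):
    """Small-step: perform exactly one reduction, leftmost first."""
    op, a, b = e
    if not isinstance(a, int):
        return (op, small_step(a), b)
    if not isinstance(b, int):
        return (op, a, small_step(b))
    return a + b if op == "add" else a * b

def run_small(e):
    """Iterate small steps, recording every intermediate expression."""
    trace = [e]
    while not isinstance(e, int):
        e = small_step(e)
        trace.append(e)
    return trace

e = ("add", ("mul", 2, 3), ("add", 1, 1))
```

Both styles agree on the result, `big_step(e) == run_small(e)[-1]`, but only the small-step version exposes the intermediate expressions, which matters when reasoning about programs that get stuck or never terminate.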

Type Systems

Assign "types" to programs

A type $\tau$ describes a class of programs
- Usually: well-behaved in some way
Can automatically check if program has type $\tau$
- Type of program depends on types of components
- Analysis scales to large programs

Example
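As one possible illustration, here is a minimal type checker for a toy expression language. The encoding and the name `type_of` are assumptions of this sketch. The type of each expression is computed only from the types of its components, and the program is never run.

```python
# Expressions: int, bool and str literals,
#              ("add", e1, e2), ("if", cond, then, orelse)

def type_of(e):
    """Return "Int", "Bool" or "String"; raise TypeError if ill-typed."""
    if isinstance(e, bool):   # test bool first: bool is a subclass of int in Python
        return "Bool"
    if isinstance(e, int):
        return "Int"
    if isinstance(e, str):
        return "String"
    tag = e[0]
    if tag == "add":
        _, a, b = e
        if type_of(a) == "Int" and type_of(b) == "Int":
            return "Int"
        raise TypeError("add expects two Ints")
    if tag == "if":
        _, cond, then, orelse = e
        if type_of(cond) != "Bool":
            raise TypeError("condition must be a Bool")
        t = type_of(then)
        if t != type_of(orelse):
            raise TypeError("branches must have the same type")
        return t
    raise TypeError(f"unknown expression {e!r}")

well_typed = ("if", True, ("add", 1, 2), 0)
ill_typed = ("add", "a", True)     # adds a String to a Bool
```

`type_of(well_typed)` yields `"Int"`, while `type_of(ill_typed)` raises `TypeError` without evaluating anything.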

Strengths

Lightweight
- Checking types is simple, automatic
- Don't need to run program
Natural and intuitive
- Can't add a String to a Boolean
- Programmers often think in terms of types
Identify correct programs
- Well-typed programs are guaranteed to be well-behaved

Weaknesses

Programmer may need to add annotations
- Extra hints for compiler
- Common for more complex types
Compiler sometimes rejects correct programs
- Figuring out why can be very frustrating
- May need to write program in less natural form

Types can be complex

Simpler types
- String, Char, Bool, Int, function types, ...
More complex types
- Secret values/public values
- Trusted values/untrusted values
- Local data/remote data
- Random values
- ...
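A secret/public type can be sketched as a label attached to each variable, with a check on every assignment. The variable names, the `LEVELS` table, and the functions below are hypothetical, and the sketch tracks only direct flows through assignments; flows through conditionals are not handled.

```python
LEVELS = {"public": 0, "secret": 1}

def label_of(expr, env):
    """Label of an expression: the most secret label among its parts."""
    if isinstance(expr, int):
        return "public"                # constants reveal nothing
    if isinstance(expr, str):
        return env[expr]               # a variable carries its declared label
    _, a, b = expr                     # ("add", e1, e2)
    return max(label_of(a, env), label_of(b, env), key=LEVELS.get)

def check_assign(target, expr, env):
    """Allow `target := expr` only if no secret data flows into a public variable."""
    return LEVELS[label_of(expr, env)] <= LEVELS[env[target]]

env = {"password": "secret", "salt": "secret", "greeting": "public", "log": "public"}
```

With this environment, `check_assign("log", ("add", "greeting", 1), env)` is accepted, `check_assign("log", ("add", "password", 1), env)` is rejected, and writing public data into a secret variable, as in `check_assign("salt", "greeting", env)`, is accepted.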

Prove: soundness

If program has a type, it should be well-behaved
- Relate type system to operational behavior
- "Soundness theorem"
Many possible notions of "well-behaved"
- Don't add Strings to Bools
- Don't mix public and private data
- Don't write past end of buffer
- ...
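One common way to state a soundness theorem, given here as a sketch rather than as the formulation used in the course, splits it into two lemmas that relate typing to stepping. The judgment notation is an assumption of this sketch.

```latex
% Notation (assumed here, not taken from the slides):
%   \vdash e : \tau   means "program e has type \tau"
%   e \to e'          means "e takes one step to e'"
%   e \to^{*} e'      means "e reaches e' in zero or more steps"

\textbf{Progress.} If $\vdash e : \tau$, then either $e$ is a value
or there exists $e'$ such that $e \to e'$.

\textbf{Preservation.} If $\vdash e : \tau$ and $e \to e'$,
then $\vdash e' : \tau$.

\textbf{Soundness.} If $\vdash e : \tau$ and $e \to^{*} e'$,
then either $e'$ is a value or $e'$ can take another step.
```

Soundness follows by applying preservation along each step and progress at the end, so a well-typed program never reaches a stuck, non-value state.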

"Well-typed programs can't go wrong"