---
author: Topics in Security and Privacy Technologies (CS 839)
title: Language-Based Security
date: November 21, 2018
---

# LangSec: Principles

## Programs are written in programming languages

## Security holes are bugs

1. Programmer writes some code
2. Programmer makes a mistake!
    - Forgot to check permissions
    - Mixed private and public data
    - Didn't allocate enough space
    - ...
3. Attacker exploits the security flaw

## Programming languages: first line of defense

- Catching errors earlier is better
- Earliest possible time: when program is written
- Easier to reject program than try to protect against it

## Design languages to reduce security flaws

- Make it easier for programmer to do right thing
- Make certain kinds of bugs impossible
- Limit damage caused by any bugs

## When are errors caught?

- When the program is running
    - Stop program when it does something unsafe
    - "Dynamic analysis"
- When the program is compiled
    - Reject bad program before it even runs
    - "Static analysis"

# Overall Strategy

## 1. Pick a language

- Real languages: Java, C, ...
    - Highly complex: tons of features
    - Hard to modify language
- Idealized "core" languages
    - Much simpler, small number of features
    - Model "essence" of real languages

## 2. Formalize what programs "do"

- Run it on a machine and find out?
    - Not very useful for proofs...
- Formalize behavior mathematically, on paper
    - Discard "unimportant" details
    - Describe how program "steps"

## 3. Describe how to check a given program

- Must work *without* running the program
- Other desirable features:
    - Scales up to large programs
    - Runs in a reasonable amount of time
    - Doesn't reject too many correct programs

## 4. Prove correctness

- We want to prove two things:
    - Soundness: catch all buggy programs
    - Completeness: accept all correct programs

> Almost always: can't have both!

## Usually: pick soundness

- All buggy programs are rejected
    - If the check says "safe", then it is safe
- But: some safe programs might be rejected
    - Hopefully, not too many

# Imperative Languages

## Most familiar

- Basis of many popular languages
    - Java, C++, Python, ...
- Program executes sequence of instructions
- Can read/write to *variables*

## Keep essential features

- Assignments to variables
- Sequencing ("semicolon")
- Conditionals ("if-then-else")
- Loops

## Drop fancier features

- Memory management
- Jumps and gotos
- Function pointers
- Templates
- ...

## Example

# Functional Languages

## Maybe a bit less familiar

- No imperative features
    - Can't modify variables
    - Usually no looping command (instead: recursion)
- Instead: functions
    - Define functions
    - Call ("apply") functions on arguments

## Simpler to formalize

- Program includes "all the information"
    - Behavior doesn't depend on state of variables
- Program "runs" by changing the code itself
    - Simplifies all the way down to final answer

## Example

# Operational Semantics

## Execute by "stepping"

- Start with program
    - Imperative: plus variable setting
- In each step, perform update:
    - Functional: modify the program
    - Imperative: update variables
- Terminates when it stops stepping (is a "value")

## Example

## Different styles

- Big-step
    - Describe value a program eventually steps to
- Small-step
    - Describe one step of a program

# Type Systems

## Assign "types" to programs

- A type $\tau$ describes a class of programs
    - Usually: well-behaved in some way
- Can automatically check if program has type $\tau$
- Type of program depends on types of components
    - Analysis scales to large programs

## Example

## Strengths

- Lightweight
    - Checking types is simple, automatic
    - Don't need to run program
- Natural and intuitive
    - Can't add a String to a Boolean
    - Programmers often think in terms of types
- Identify correct programs

## Weaknesses

- Programmer may need to add annotations
    - Extra hints for compiler
    - Common for more complex types
- Compiler sometimes rejects correct programs
    - Figuring out why can be very frustrating
    - May need to write program in less natural form

## Types can be complex

- Simpler types
    - String, Char, Bool, Int, function types, ...
- More complex types
    - Secret values/public values
    - Trusted values/untrusted values
    - Local data/remote data
    - Random values
    - ...

## Prove: soundness

- If program has a type, it should be well-behaved
- Relate type system to operational behavior
    - "Soundness theorem"
- Many possible notions of "well-behaved"
    - Don't add Strings to Bools
    - Don't mix public and private data
    - Don't write past end of buffer
    - ...

> "Well-typed programs can't go wrong"
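
## Sketch: the strategy on a toy language

The four-step strategy can be tried out on a very small scale. The sketch below is illustrative only and is not taken from the lecture: the core language (integer and boolean literals, addition, if-then-else) and the names `Lit`, `Add`, `If`, `type_of`, `step`, and `evaluate` are all assumptions made for this example. It pairs a small-step semantics with a type checker, where "going wrong" means getting stuck on a term that is not a value.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical core language: literals, addition, if-then-else.
@dataclass(frozen=True)
class Lit:
    value: Union[int, bool]

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class If:
    cond: "Expr"
    then: "Expr"
    other: "Expr"

Expr = Union[Lit, Add, If]

def is_int(e) -> bool:
    return isinstance(e, Lit) and type(e.value) is int

def is_bool(e) -> bool:
    return isinstance(e, Lit) and type(e.value) is bool

def type_of(e) -> Optional[str]:
    """Static check: return "Int" or "Bool", or None to reject the program."""
    if isinstance(e, Lit):
        return "Bool" if is_bool(e) else "Int"
    if isinstance(e, Add):
        # The type of the whole depends only on the types of the components.
        if type_of(e.left) == "Int" and type_of(e.right) == "Int":
            return "Int"
        return None
    if isinstance(e, If):
        t = type_of(e.then)
        if type_of(e.cond) == "Bool" and t is not None and t == type_of(e.other):
            return t
        return None
    return None

def step(e) -> Optional[Expr]:
    """Small-step semantics: perform one step, or return None if no rule applies."""
    if isinstance(e, Add):
        if not isinstance(e.left, Lit):
            s = step(e.left)
            return None if s is None else Add(s, e.right)
        if not isinstance(e.right, Lit):
            s = step(e.right)
            return None if s is None else Add(e.left, s)
        if is_int(e.left) and is_int(e.right):
            return Lit(e.left.value + e.right.value)
        return None  # stuck: trying to add a boolean
    if isinstance(e, If):
        if not isinstance(e.cond, Lit):
            s = step(e.cond)
            return None if s is None else If(s, e.then, e.other)
        if is_bool(e.cond):
            return e.then if e.cond.value else e.other
        return None  # stuck: condition is not a boolean
    return None  # literals are values and do not step

def evaluate(e):
    """Step repeatedly; the result is either a value (Lit) or a stuck term."""
    while (nxt := step(e)) is not None:
        e = nxt
    return e

# Three example programs (also hypothetical).
good = If(Lit(True), Add(Lit(1), Lit(2)), Lit(0))
bad = Add(Lit(1), Lit(True))
safe_but_rejected = If(Lit(True), Lit(1), Lit(False))
```

In this sketch, `good` is accepted by `type_of` and steps down to a value, while `bad` is rejected and would get stuck if run. The program `safe_but_rejected` runs without getting stuck, yet the checker rejects it because its branches have different types: the checker aims for soundness and gives up completeness.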