Course Description

Objective:
Students would learn to analyze C-like programs to extract information for varied application domains such as security and parallelization.

Grading:
There would be programming assignments and a course project, midterm and final exams. The grading is being worked out.

Contents (how will I cover all of this!):


Basic Dataflow Analysis (DFA)
 - iterative worklist based, dead-code elimination, reaching definitions.
 - monotone frameworks, confluence operators, MFP/MOP solution.
 - analysis dimensions: flow-sensitivity, context-sensitivity, field-sensitivity, path-sensitivity.
Pointer Analysis (PTR)
 - introduction, applications, as a DFA problem.
 - design decisions for precision vs. efficiency trade-off, heap modeling, analysis dimensions, set implementation.
 - Andersens's and Steensgaard's analyses, common model.
 - as a graph problem, reachability, online cycle detection.
 - dominators, propagation orders, optimal ordering, heuristics.
 - applications to optimizations such as dead-code elimination, reaching definitions, parallelization, escape analysis.
 - parallelization of PTR, constraint-based formulation.
 - parallelization using replication, graph rewrite-rules.
Shape Analysis (SHA)
 - limitations of pointer analysis, identification of simple structures like lists.
 - identifying trees, DAGs and cyclic graphs.
 - identifying rotations, limitations. 
 - reversal and other transformations, limitations.
Dynamic Analysis (DYN)
 - limitations of static analysis, trade-offs, applications, introduction.
 - profiling techniques, intrusive and non-intrusive methods, limitations, applications.
 - finding invariants, equality invariants.
 - finding affine invariants and other applications.
 - dynamic type inferencing, limitations.
Polyhedral Model (POL)
 - iteration space, identifying inequalities, ILP.
 - transformations, tiling, unrolling. 
 - fusion, splitting.
 - skewing, interchange.
Parallelization (PAR)
 - motivation, simple auto-parallelization, dependencies.
 - using polyhedral model, limitations.
 - synchronization, atomics, barriers, locks, semaphores.
 - happens-before and may-happen-before analysis.
 - parallelizing irregular algorithms.
Program Slicing (SLI)
 - concept, applications, methodology.
 - static slicing, interprocedural analysis.
 - object-oriented codes, exception-handling constructs.
 - dynamic slicing, overheads, precision.
Security Analysis (SEC)
 - information flow, role of program analysis, applications.
 - identifying string vulnerabilities, array out-of bound checks.
 - null pointer analysis, dangling pointers, pointers beyond a range.
 - type-based security, secure constructs and operations.

Textbooks:
Advanced Compiler Design and Implementation, Muchnick, Morgan Kaufmann 1997.

Reference Books:

  1. Compilers: Principles, Techniques and Tools (2nd Edition), Aho, Lam, Sethi, Ullman, Addison Wesley 2006.
  2. Data Flow Analysis: Theory and Practice, Khedker, Sanyal, Karkare, CRC Press 2009.
  3. Principles of Program Analysis: Nielson, Nielson, Hankin, Springer 2004.

Reference Papers:
-- Listed in the schedule

course home page