Beyond Edge Coverage: Per-Task Data-Flow Extraction at Kernel Function Boundaries via LLVM
Yunseong Kim
Coverage-guided kernel fuzzers such as syzkaller rely on edge coverage (trace-pc) as their sole feedback signal. This context-blind approach cannot distinguish execution paths that differ only in argument values -- for example, two invocations of copy_from_user() with different size parameters hit identical basic blocks yet have vastly different security implications. I present TOOLNAME, an LLVM-based instrumentation framework that extends Linux KCOV with data-flow extraction of function arguments and return values. A compiler pass emits lightweight callbacks capturing structured tuples of program counter, argument metadata, and field values at function entry and return. Composite types are automatically decomposed via DWARF DICompositeType metadata with zero source annotation. A lock-free per-task ring buffer delivers records to user space with no interference to existing KCOV or syzkaller infrastructure. I demonstrate dual utility: (1) fuzzers gain state-aware feedback for mutation guidance into value-dependent state transitions, and (2) security analysts obtain deterministic argument records for root-cause analysis without printk or kprobe overhead. Two Rust instrumentation paths are provided: a post-compilation pipeline requiring no rustc modification, and native instrumentation via rustc built against the custom LLVM -- both the only runtime methods for capturing Rust function arguments given that drgn/vmcore fails under -O2 DWARF elision.
Read on ELI