Building reusable, composable, and shareable huntflows across different data sources and threat intel.
Overview
Cyberthreat Hunting
Cyberthreat hunting is the planning and development of threat discovery procedures against new and customized advanced persistent threats (APTs). It comprises several activities, such as:
Understanding the security measures in the target environment.
Thinking about potential threats escaping existing defenses.
Obtaining useful observations from system and network activities.
Developing threat hypotheses.
Revising threat hypotheses iteratively by repeating the previous two steps.
Confirming new threats.
Threat hunters create customized intrusion detection system (IDS) instances every day with a combination of data source queries, complex data processing, machine learning, threat intelligence enrichment, proprietary detection logic, and more. They use scripting languages, spreadsheets, whiteboards, and other tools to plan and execute their hunts. In traditional cyberthreat hunting, many pieces of a hunt are written against specific data sources and data types. The domain knowledge they encode is therefore not reusable, and hunters must express the same knowledge again and again for different hunts.
Kestrel in a Nutshell
Kestrel provides a layer of abstraction to stop the repetition involved in cyberthreat hunting.
Kestrel language: a threat hunting language for a human to express what to hunt.
expressing the knowledge of "what" in patterns, analytics, and hunt flows.
composing reusable hunting flows from individual hunting steps.
reasoning with a human-friendly, entity-based abstraction of the data.
thinking across heterogeneous data and threat intelligence sources.
applying existing public and proprietary detection logic as analytic hunt steps.
reusing and sharing individual hunting steps, hunt flows, and entire hunt books.
Kestrel runtime: a machine interpreter that deals with how to hunt.
compiling the "what" into instructions for specific hunting platforms.
executing the compiled code locally and remotely.
assembling raw logs and records into entities for entity-based reasoning.
caching intermediate data and related records for fast response.
prefetching related logs and records for link construction between entities.
defining extensible interfaces for data sources and analytics execution.
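The split between "what to hunt" and "how to hunt" can be illustrated with a small Python sketch. This is not Kestrel syntax or the Kestrel runtime API: the names ProcessPattern, ToySource, and ToyRuntime, the field names, and the sample records are all hypothetical and exist only for illustration. The sketch assumes each data source can translate a backend-neutral pattern into its own field names, and that a process entity is identified by its host and process ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessPattern:
    """The "what": a backend-neutral condition on process entities (hypothetical)."""
    attribute: str  # neutral attribute name, e.g. "name"
    value: object


class ToySource:
    """The "how" for one backend: an in-memory stand-in for a real data source."""

    def __init__(self, field_map, records):
        self.field_map = field_map  # neutral attribute -> native field name
        self.records = records

    def compile(self, pattern):
        # Translate the neutral pattern into this backend's native field name.
        return self.field_map[pattern.attribute], pattern.value

    def execute(self, query):
        field, value = query
        return [r for r in self.records if r.get(field) == value]

    def normalize(self, record):
        # Map native field names back to neutral attribute names.
        return {attr: record[field]
                for attr, field in self.field_map.items() if field in record}


class ToyRuntime:
    """Runs one pattern against every source, assembles entities, caches results."""

    def __init__(self, sources):
        self.sources = sources
        self._cache = {}
        self.backend_queries = 0

    def get(self, pattern):
        if pattern in self._cache:  # answer repeated hunts from the cache
            return self._cache[pattern]
        entities = {}
        for source in self.sources:
            self.backend_queries += 1
            for record in source.execute(source.compile(pattern)):
                attrs = source.normalize(record)
                # Assumption: (host, pid) identifies one process entity.
                key = (attrs["host"], attrs["pid"])
                entities.setdefault(key, {}).update(attrs)
        self._cache[pattern] = list(entities.values())
        return self._cache[pattern]


# Two sources that describe the same kind of activity with different field names.
source_a = ToySource(
    {"name": "proc_name", "pid": "proc_id", "host": "hostname"},
    [{"proc_name": "powershell.exe", "proc_id": 4242, "hostname": "ws01"},
     {"proc_name": "explorer.exe", "proc_id": 100, "hostname": "ws01"}],
)
source_b = ToySource(
    {"name": "image", "pid": "pid", "host": "computer", "command_line": "cmdline"},
    [{"image": "powershell.exe", "pid": 4242, "computer": "ws01",
      "cmdline": "powershell.exe -NoProfile"},
     {"image": "powershell.exe", "pid": 77, "computer": "ws02",
      "cmdline": "powershell.exe"}],
)

runtime = ToyRuntime([source_a, source_b])
pattern = ProcessPattern("name", "powershell.exe")  # written once, reused everywhere
procs = runtime.get(pattern)
for proc in sorted(procs, key=lambda p: p["pid"]):
    print(proc)
```

The pattern is written once and never mentions a backend field name. Supporting a new data source in this sketch means adding another ToySource with its own field map, while existing hunt steps stay unchanged.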