Compliance-as-Code for Cybersecurity Automation in Hybrid Cloud
- Vikas Agarwal
- Chris Butler
- et al.
- 2022
- CLOUD 2022
Security and compliance officers struggle to get a handle on their organization’s cybersecurity posture. This information is usually spread across various policy documents, spreadsheets and operational systems. As a result, gauging whether the existing security program is sufficient or not becomes a hard question to answer for these officers. This is critical since regulators often hold these officers personally responsible for any lapses resulting in data breaches.
The problem's complexity has grown manifold with the advent of hybrid cloud infrastructures. An organization's infrastructure can now span multiple clouds and on-premises environments. This area focuses on transforming an organization's security and compliance operations from a document-centric process to a data-centric one. It spans development-time source code scanning for security vulnerabilities and continuous runtime monitoring for compliance. Essentially, all security and compliance artifacts (security policies, the infrastructure to be secured, specific configurations, raw compliance measurements, and so on) are expressed as code using standardized, machine-processable representations. This enables the entire security and compliance operation to be automated with the help of AI while keeping human experts in the loop.
Automating cybersecurity processes in this way is non-trivial when sensitive workloads are deployed at large scale across regulated on-premises, private, and public cloud environments. Regulatory and standards bodies such as Payment Card Industry (PCI), Federal Financial Institutions Examination Council (FFIEC), International Organization for Standardization (ISO), and others govern the minimal set of cybersecurity controls that an organization must implement.
To meet such requirements while maintaining business agility, organizations need to modernize from manual, document-based compliance management to automated processes for continuous compliance. This modernized process is called compliance-as-code.
We have designed an architecture for compliance-as-code based on the NIST OSCAL (Open Security Controls Assessment Language) framework. It is essentially a system for manipulating compliance information in a standardized manner and a data interchange protocol for interoperable communication of compliance information. Specifically, we have architected OSCAL concepts into an open source software development kit (SDK) called Trestle. Trestle is an ensemble of tools that enable the creation, validation, and governance of documentation artifacts for compliance needs. This process for agile authoring of compliance artifacts is based upon the following key tenets:
Adopt a Git repository as the single source of truth for all compliance documents and artifacts. This enables an effective change management process.
Use a command-line interface (CLI) rather than a graphical user interface (GUI) to expose functionality. This lets compliance engineers work with the data directly and also allows a UI to be built on top of it for risk and compliance subject matter experts (SMEs).
Use a standardized data interchange format rather than a proprietary one. This keeps the platform extensible.
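To make the idea of machine-processable compliance artifacts concrete, here is a minimal sketch of the kind of check that could run on every commit to the Git repository. It uses only the Python standard library and is not Trestle's API. The `validate_catalog` function and the `SAMPLE` document are hypothetical, and the fields checked are a simplified subset of the OSCAL catalog layout (the full schema has many more requirements).

```python
import json
import uuid

# Assumption: a simplified subset of OSCAL catalog metadata fields.
REQUIRED_METADATA = ("title", "last-modified", "version", "oscal-version")


def validate_catalog(doc: dict) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    catalog = doc.get("catalog")
    if not isinstance(catalog, dict):
        return ["missing top-level 'catalog' object"]

    problems = []
    try:
        uuid.UUID(str(catalog.get("uuid")))
    except ValueError:
        problems.append("catalog.uuid is not a valid UUID")

    metadata = catalog.get("metadata") or {}
    for key in REQUIRED_METADATA:
        if key not in metadata:
            problems.append(f"metadata.{key} is missing")

    # Control ids must be present and unique across all groups.
    seen = set()
    for group in catalog.get("groups", []):
        for control in group.get("controls", []):
            cid = control.get("id")
            if not cid:
                problems.append(f"control without id in group {group.get('id')!r}")
            elif cid in seen:
                problems.append(f"duplicate control id {cid!r}")
            else:
                seen.add(cid)
    return problems


# Hypothetical sample document, for illustration only.
SAMPLE = json.loads("""
{
  "catalog": {
    "uuid": "8f4b1c2e-3d5a-4e6f-9a7b-0c1d2e3f4a5b",
    "metadata": {
      "title": "Example catalog",
      "last-modified": "2022-07-01T00:00:00Z",
      "version": "1.0",
      "oscal-version": "1.0.0"
    },
    "groups": [
      {"id": "ac", "title": "Access Control",
       "controls": [{"id": "ac-1", "title": "Policy and Procedures"}]}
    ]
  }
}
""")

if __name__ == "__main__":
    for problem in validate_catalog(SAMPLE) or ["no problems found"]:
        print(problem)
```

Because the artifact is plain JSON in a repository, a check like this can gate merges in a continuous integration pipeline, which is one way the change management tenet above could be enforced.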
Separately, to address data privacy, security and compliance concerns, cloud providers offer specialized clouds for heavily regulated industries. These clouds implement prescribed security standards. A critical step in the migration-to-cloud process is to ensure that the customer’s security requirements are fully met by the cloud provider.
With a few hundred services in a typical cloud provider’s infrastructure, this becomes a non-trivial task. Each applicable service exposes a few tens to hundreds of security checks, which need to be matched against several hundred to thousands of security controls from the customer. Mapping a customer’s controls to the cloud provider’s control set is done manually by experts, a process that often takes months to complete and needs to be repeated with every new customer. Moreover, these mappings have to be re-evaluated following regulatory or business changes, as well as cloud infrastructure upgrades.
Live Crosswalks is an AI-assisted system for mapping security controls, which drastically reduces the number of candidates a human expert needs to consider, allowing substantial speed-up of the mapping process. We employ hierarchical classification using fine-tuned Transformer networks to accomplish this task.
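The two-stage narrowing behind hierarchical classification can be sketched as follows. This is not the Live Crosswalks implementation: a simple token-overlap (Jaccard) score stands in for the fine-tuned Transformer networks, and the `shortlist` function, its parameters, and the `TARGET_CONTROLS` data are hypothetical. The control texts are short paraphrases written for this example, not official NIST wording.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphabetic word tokens of a control text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-overlap similarity; a stand-in for a learned model's score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Illustrative target controls grouped by family (paraphrased, not official text).
TARGET_CONTROLS = {
    "Access Control": {
        "AC-2": "Manage system accounts, including creating, enabling, modifying, and removing accounts.",
        "AC-7": "Enforce a limit of consecutive invalid logon attempts by a user.",
    },
    "Audit and Accountability": {
        "AU-2": "Identify the types of events that the system is capable of logging.",
        "AU-11": "Retain audit records for a defined time period.",
    },
}


def shortlist(source: str, targets: dict, top_families: int = 1, top_controls: int = 2) -> list[str]:
    """Return a ranked shortlist of target control ids for one source control."""
    src = tokens(source)

    # Stage 1: score each family against the combined text of its controls.
    family_scores = {
        family: jaccard(src, set().union(*(tokens(t) for t in controls.values())))
        for family, controls in targets.items()
    }
    best = sorted(family_scores, key=family_scores.get, reverse=True)[:top_families]

    # Stage 2: rank individual controls, but only within the selected families.
    scored = [
        (jaccard(src, tokens(text)), cid)
        for family in best
        for cid, text in targets[family].items()
    ]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:top_controls]]


if __name__ == "__main__":
    customer_control = "Audit records must be retained for a defined period of one year."
    print(shortlist(customer_control, TARGET_CONTROLS))
```

The point of the hierarchy is that the expert reviews only the shortlist from the selected families rather than every target control. A learned scorer would replace `jaccard` in both stages.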
The task of cloud security controls mapping using AI has several distinct features that make it particularly challenging:
Text complexity: both customer controls and NIST controls may be complex texts, spanning several sentences. Furthermore, control matches are often partial: only part of the source control may be matched, and only part of the target control may be relevant for the match.
Domain specificity: cloud controls are written in a domain-specific language, and their mapping requires domain expertise.
Data scarcity: annotation complexity makes data collection slow and expensive, which limits the amount of data available for training AI models.
Trestle: an opinionated tooling platform for managing compliance as code, using continuous integration and NIST's OSCAL standard.