Kubernetes is a cloud in and of itself, and DivvyCloud can help companies understand and identify excessive cloud risk. Organizations need to take a holistic approach to Kubernetes security and consider both the hosting environments and the Kubernetes clusters, as the two are fundamentally intertwined.
Kubernetes (commonly referred to as K8s) is an open-source container-orchestration system for automating deployment, scaling, and management of containerized applications. It was originally designed by Google. The company open-sourced the Kubernetes project in 2014, and it is now maintained by the Cloud Native Computing Foundation.
In July 2018, Google introduced commercial Kubernetes applications in its GCP Marketplace, and DivvyCloud was proud to be included as a launch partner. The partnership made it even easier for customers to deploy DivvyCloud to mitigate security and compliance risk while embracing the dynamic, self-service nature of Kubernetes. Customers can govern their container environments running on AWS Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft Azure Kubernetes Service (AKS), using DivvyCloud to monitor, apply policy, and take action on six resource types: Containers, Pods, Ingresses, Nodes, Deployments, and Services. For the first time, customers gain a holistic view of their cloud container infrastructure, enabling the application of policies across all related elements (e.g., IAM and underlying or related cloud infrastructure) and supporting the easier development of a roadmap for security and compliance.
“The availability of commercial Kubernetes applications from providers like DivvyCloud is a critical part of extending enterprise investments and can simplify adoption of container-based infrastructure no matter what environment they operate in, either on-premise or in the public cloud.”
Jennifer Lin, Director of Product Management, Google Cloud
Developing a Security and Compliance Roadmap
There are three keys to building a roadmap for security and compliance: culture, frameworks, and systems. Combining these three keys enables you to build cloud operations maturity through automation.
First, we must reject the “command and control” approach that was successful in the traditional data center world. Embracing the new “trust but verify” approach supports innovation derived from self-service access to the public cloud.
Second, incorporate PCI DSS, CSA CCM, and CIS Benchmarks (and GDPR as necessary) as the foundation of your cloud security and GRC strategy.
Third, identify and implement cloud-native systems to help you address the unique challenges of the public cloud and containers through automation. Fortunately, DivvyCloud helps companies achieve continuous security, compliance, and governance while embracing the dynamic, software-defined, self-service nature of public cloud and container infrastructure.
As companies embrace self-service in cloud and containers, it’s far easier to introduce misconfigurations and incur security risks that are commonly exploited. Misconfiguration is an issue that continues to plague software-defined infrastructure. These few recent examples show just how simple it is for misconfiguration to happen.
Misconfiguration Can Have Serious Consequences
In June 2018, Weight Watchers suffered a security breach after researchers found that the company forgot to set a password for the administration console of one of its Kubernetes instances. This oversight granted anyone (via port 10250) access to the servers without the need to enter a username and password. Though it remains unclear if another actor besides the researchers discovered the instance, the exposed data revealed an administrator’s root credentials, access keys for 102 of their domains, and 31 IAM users including users with administrative credentials and applications with programmatic access.
Also in 2018, Tesla’s cloud environment was exploited to mine cryptocurrencies through an unsecured administrative console for Kubernetes. Hackers infiltrated Tesla’s Kubernetes console, which was not password protected, and gained access to Tesla’s AWS environment, which contained sensitive data including vehicle telemetry. These intrusions were significant, but they could have been avoided with the proper technology and security precautions. Having the correct security in place would have enabled the organization to identify potential hazards early in the development process, as they were being created.
The Drive to Innovate Can Create Security Challenges
Many security and GRC (governance, risk, and compliance) professionals are being forced into the uncomfortable position of supporting the rapid adoption of new technologies such as cloud and containers. Cloud and container adoption is being driven by aggressive corporate innovation initiatives. These initiatives frequently go hand in hand with the trend of self-service provisioning as “the way” to create cloud computing resources.
The drive to innovate creates a difficult situation. Security and GRC professionals must rapidly understand how to secure these new technologies at scale. The accelerated rate of change produced by innovative automation can make it nearly impossible for any human or group of humans to truly understand the potential impact of computing environment changes. The only way to effectively monitor and secure an automation-based technology such as Kubernetes is to use automation itself.
DivvyCloud provides just this sort of automation technology. DivvyCloud is designed to effectively monitor and remediate orchestration frameworks at scale. Its technology is particularly useful when applied to digital infrastructures based on Kubernetes.
Improving Kubernetes Security Using CIS Benchmarks and DivvyCloud Technology
Kubernetes is becoming the container orchestration technology most prevalent in the enterprise. It’s a proven framework for deploying containers into the cloud. Kubernetes is supported by all the major cloud providers including, but not limited to, AWS, Google Cloud, and Azure. Also, there is a sizable user community and a growing number of certification authorities that provide examinations to qualify technical personnel as competent Kubernetes Administrators. The most notable authority is the Cloud Native Computing Foundation (CNCF), which is the vendor-neutral home for Kubernetes.
As Kubernetes has grown in popularity, so too have security concerns about the technology. The publication of the CIS Benchmarks for Kubernetes in 2017 by the Center for Internet Security was a major step toward establishing a formal approach to using Kubernetes securely. CIS Benchmarks are consensus-driven security guidelines defined by representatives from industry and government. These guidelines are intended for system and application administrators, security specialists, auditors, help desk, and platform deployment personnel.
The CIS Benchmarks for Kubernetes define a standard by which to determine the state of security in a Kubernetes cluster. In addition, the Benchmarks provide guidance for remediation when security shortcomings are identified. The Benchmarks are an important document that, given the widely acclaimed authority and expertise of the CIS, serves as the industry standard reference for securing Kubernetes deployments. The CIS Benchmarks for Kubernetes have been incorporated into DivvyCloud’s technology, thus allowing companies to use DivvyCloud automation technology with their Kubernetes clusters while ensuring CIS compliance.
In this guide, we provide an overview of how DivvyCloud applies the CIS Benchmarks for Kubernetes to help security and GRC professionals keep pace with their company’s demands. We’ll discuss the holistic approach that DivvyCloud takes when analyzing a company’s digital infrastructure. We’ll illustrate how DivvyCloud uses its unified data model to identify and remediate security risks. And, we’ll describe how best to adopt DivvyCloud technology, both in terms of technology and culture, in order to ensure a maximum return on investment.
Taking a Holistic Approach to Kubernetes Security
Monolithic application architecture is becoming a thing of the past. Today, companies understand that loosely coupled, component-based architectures offer a better way to meet the continuously growing demand for increasingly complex software. Component-based architectures are more flexible and can be updated quickly to meet new demands in the marketplace. However, a component-based architecture is only as good as the digital infrastructure in which it operates, and the software development processes used to create the hundreds (or maybe thousands) of independent parts that make up the system overall.
In a monolithic application, application code is tightly intertwined with the computing environment. Thus, all that’s needed to monitor and remediate the system, should a failure occur, is a single view that reports how the application is performing within the given computing environment.
Component-based applications are different. First, they are becoming more ephemeral. In an ephemeral system, all computing and application resources are expected to be temporary and constantly changing. Application components will move among a variety of computing resources automatically. With Kubernetes, clusters will typically be multi-region to provide fault tolerance. Thus, a holistic approach — one that observes both the host environment and the components operating in the host — is better suited for monitoring, analyzing, and remediating a component-based application such as those created using Kubernetes. (See Figure 1, below.)
What Is Ephemeral Computing?
Ephemeral computing is when an operational environment is created on demand to support the immediate needs of a given application. Kubernetes is a good example of ephemeral computing. Kubernetes allows application architects to segment an application’s components into a variety of containers. The containers are organized into a Kubernetes object called a pod. A pod can have one or many containers. In order to ensure high availability, an application architect will orchestrate Kubernetes to run a number of copies of a pod over a variety of virtual machines. Load balancing technology works with Kubernetes to route traffic to a given pod.
Should a pod fail or its host virtual machine go offline, Kubernetes has the ability to replenish the affected pod automatically. Should the number of calls being made to a pod start to affect system performance, the Kubernetes orchestration configuration can be changed to increase the number of pods running. If demand subsides, the number of pods can be reduced. The orchestration technology ensures that enough containers are always available to meet demand.
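The replenish-and-scale behavior described above can be sketched as a simple reconciliation loop. The `Pod` class and `reconcile` function below are illustrative stand-ins, not Kubernetes APIs:

```python
# Minimal sketch of the reconciliation idea behind Kubernetes pod replication.
# Pod and reconcile() are illustrative, not part of any real Kubernetes client.
from dataclasses import dataclass

@dataclass
class Pod:
    name: str
    healthy: bool = True

def reconcile(pods, desired_replicas):
    """Return a pod list converged to the desired count.

    Failed pods are discarded and replacements are created, mirroring how
    an orchestrator drives actual state toward desired state.
    """
    running = [p for p in pods if p.healthy]
    # Scale down: drop surplus pods when demand subsides.
    while len(running) > desired_replicas:
        running.pop()
    # Scale up / replace failures: create pods until the count matches.
    counter = 0
    while len(running) < desired_replicas:
        running.append(Pod(name=f"pod-{counter}"))
        counter += 1
    return running

# A pod fails; the next reconciliation pass restores the replica count.
pods = [Pod("a"), Pod("b", healthy=False), Pod("c")]
pods = reconcile(pods, desired_replicas=3)
assert len(pods) == 3 and all(p.healthy for p in pods)
```

The loop never repairs a failed pod in place; it simply replaces it, which is the essence of treating resources as ephemeral.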
In an ephemeral computing environment, resources are constantly being created and destroyed, and sometimes the lifetime of a given resource can be measured in seconds, maybe milliseconds. Thus, system architects need to create systems in which any and all application resources are expected to be temporary. Such systems require a more dynamic approach to design than one in which the computing environment is static and stable.
Figure 1: A holistic approach to Kubernetes security considers both the hosting environments and the Kubernetes clusters.
As mentioned above, a holistic approach is one that monitors and analyzes both the hosting environment as well as the applications being deployed. The benefit of a holistic approach is that it provides the complete picture necessary for accurate reporting and remediation.
For example, imagine a container that is part of an application designed to validate a credit card transaction. Let’s say that the container is subject to a regulation that requires all credit card processing to take place within a given national locale (for example, the United States). The application is deployed to a transnational cloud provider. Upon initial deployment, the container is hosted in the cloud provider’s data center in the central United States. However, as traffic demand grows, more containers are created and deployed. Only this time, one container makes its way to the cloud provider’s data center in central Europe. (See Figure 2.)
Figure 2: Scaled container deployment example
Without a holistic approach to monitoring the entire system state — technical as well as regulatory — automated orchestration technology can unwittingly commit regulatory violations.
This deployment is a clear regulatory violation. However, the container has no awareness that it’s in violation. All it knows is how to validate a credit card transaction. And the host environment has no idea that it’s committing a regulatory violation. The host environment is simply running a container as it’s intended to. It’s only when combining knowledge of the regulation with information about the container and the host that the violation can be discovered. Combining all aspects relevant to an operational scenario is the basic premise of a holistic view.
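The scenario can be expressed as a small compliance check. The workload names, regions, and policy table below are hypothetical, but they show how the violation only surfaces when knowledge of the regulation is combined with data about both container and host:

```python
# Hypothetical sketch of a holistic check: neither the container nor the host
# knows about the regulation; only combining both reveals the violation.
ALLOWED_REGIONS = {"card-validation": {"us-central", "us-east"}}  # illustrative policy

def find_violations(deployments):
    """deployments: list of dicts with 'workload' and 'region' keys."""
    violations = []
    for d in deployments:
        allowed = ALLOWED_REGIONS.get(d["workload"])
        # A workload with no locale policy is never flagged.
        if allowed is not None and d["region"] not in allowed:
            violations.append(d)
    return violations

deployments = [
    {"workload": "card-validation", "region": "us-central"},
    {"workload": "card-validation", "region": "eu-central"},  # scaled out of locale
]
assert find_violations(deployments) == [deployments[1]]
```

Neither input alone is sufficient: the region list comes from the regulation, and the deployment records come from the host environment.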
As you can see from the scenario described above, taking a holistic view of a cloud-based computing environment is essential in order to understand the true state of a system in terms of operational security and compliance. Without a holistic view, companies are flying blind. Risks become undetectable.
DivvyCloud understands that in order to provide a comprehensive solution that addresses all aspects of operational risk, having a holistic view into the entirety of an application as well as its computing environment is imperative. It’s core to DivvyCloud technology.
Understanding DivvyCloud Technology
DivvyCloud software is designed to be agentless and standalone. You can apply it to any computing environment — public cloud or private software-defined infrastructure. DivvyCloud interacts with the host environment and Kubernetes by way of their respective APIs, continuously gathering information about the state of the hosts and the Kubernetes clusters of interest. These hosts can be Google Cloud, AWS, Azure, or a private data center that can expose infrastructure information via an API.
Once DivvyCloud is set up and targeted at the relevant hosts and Kubernetes clusters, it starts pulling down data about the environments — servers, security groups, load balancers, network-attached storage, S3 buckets, and any resource that is exposed via an API. This information is then unified into a single data model that represents the infrastructure and its containment relationships holistically. (See Figure 3.)
Figure 3: DivvyCloud is an agentless technology that discovers and saves host and cluster state into a unified data model.
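The discovery flow can be sketched as follows. The fetcher functions are stubs standing in for real cloud and Kubernetes API calls, and the unified model is simplified to a dictionary keyed by provider and resource type:

```python
# Illustrative sketch of agentless discovery: poll each provider's API and
# merge the results into one unified model. The fetchers are stubs; a real
# implementation would call the cloud provider and Kubernetes APIs.
def fetch_cloud_resources():
    return [{"provider": "cloud", "type": "load_balancer", "id": "lb-1"}]

def fetch_kubernetes_resources():
    return [{"provider": "kubernetes", "type": "pod", "id": "web-0"}]

def build_unified_model(fetchers):
    """Run every fetcher and index all resources by (provider, type)."""
    model = {}
    for fetch in fetchers:
        for resource in fetch():
            model.setdefault((resource["provider"], resource["type"]), []).append(resource)
    return model

model = build_unified_model([fetch_cloud_resources, fetch_kubernetes_resources])
assert ("cloud", "load_balancer") in model
assert ("kubernetes", "pod") in model
```

Because host and cluster resources land in one model, a policy can reason across both — which is what makes the holistic checks described earlier possible.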
Once DivvyCloud is operational, it analyzes data for configuration and security issues according to policies defined by regulations such as PCI DSS, GDPR, and HIPAA, to name a few. Also included in the roster of supported regulations are the CIS Benchmarks, including those for Kubernetes. DivvyCloud supports over 220 policies out of the box. The CIS Benchmarks for Kubernetes are one of the latest additions. Support for the CIS Benchmarks for Kubernetes is a significant breakthrough in automated remediation, as we’ll demonstrate shortly.
Applying CIS Benchmarks for Kubernetes to Automated Remediation
The CIS Benchmarks for Kubernetes are a comprehensive set of prescriptive security guidelines intended to provide companies with a way to implement safe and reliable Kubernetes clusters. The CIS Benchmarks for Kubernetes guideline rules apply to all of the components that make up Kubernetes. (See Figure 4.)
Figure 4: The CIS Benchmarks for Kubernetes affect all components in a Kubernetes cluster.
Overview of the CIS Benchmarks for Kubernetes Guidelines
The CIS Benchmarks for Kubernetes define over 120 guidelines. These rules apply to master and worker nodes. They apply to the control plane components — Controller Manager, Scheduler, API Server, and etcd. In addition, the rules cover the components that are part of each worker node — kubelet, kube-proxy, cAdvisor, and container network interfaces.
There are rules that define configuration settings that must be in force in order to ensure the security of a particular component. Also, there are rules intended to ensure that certain environmental conditions are met — for example, requiring that a security context is applied to a pod or container. Table 1 below shows examples of CIS Benchmarks for Kubernetes guidelines, organized by Kubernetes component:
Table 1: Examples of guideline rules as defined in the CIS Benchmarks for Kubernetes
The CIS Benchmarks for Kubernetes are based on scoring. System administrators apply each guideline rule to the relevant asset in Kubernetes to which the given rule applies. Then, after all the rules have been applied, a score is determined. The resulting score describes the overall “health” of the system being inspected.
The CIS Benchmarks for Kubernetes rules are divided into two types: scored and not scored. A scored rule counts toward the target’s benchmark score, so compliance with it increases the overall score. A not-scored rule has no effect on the overall score, even if the target does not adhere to the conditions of the rule.
An example of a scored rule is:
- 1.1.4 Ensure that the --kubelet-https argument is set to true (Scored)
An example of a not-scored rule is:
- 1.5.7 Ensure that a unique Certificate Authority is used for etcd (Not Scored)
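The scoring model can be sketched as follows, assuming (purely for illustration) that the overall score is the pass percentage over scored rules only:

```python
# Sketch of the scoring model: only "scored" rules affect the benchmark score.
# The tuple representation and percentage formula are illustrative assumptions,
# not the CIS's official scoring method.
def benchmark_score(results):
    """results: list of (is_scored, passed) booleans per rule.

    Returns the pass percentage over scored rules only; not-scored rules
    are informational and never change the score.
    """
    scored = [passed for is_scored, passed in results if is_scored]
    if not scored:
        return 100.0
    return 100.0 * sum(scored) / len(scored)

results = [
    (True, True),    # e.g., 1.1.4 --kubelet-https set to true (Scored): pass
    (True, False),   # a scored rule that fails lowers the score
    (False, False),  # a not-scored rule that fails has no effect
]
assert benchmark_score(results) == 50.0
```

Note that removing the failing not-scored rule from the list would leave the score unchanged, which is exactly the distinction the Benchmarks draw.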
Defining Levels of Severity
The CIS Benchmarks for Kubernetes defines two levels of severity: Level 1 and Level 2. Level 1 rules are deemed to “be practical and prudent; provide a clear security benefit; and not inhibit the utility of the technology beyond acceptable means.”
Level 2 rules extend Level 1 for environments where security is paramount, and violating them indicates a more serious security exposure. According to the CIS Benchmarks for Kubernetes, such rules “may negatively inhibit the utility or performance of the technology.”
DivvyCloud has not only built the rules defined in the Benchmarks into its automation technology but also integrates the rules with regulations and best practices guidelines published by other regulatory organizations and government agencies. In addition, DivvyCloud allows enterprise personnel to define the degree of remediation behavior that is executed in response to a rule violation.
Creating a Security Strategy That Integrates Culture and Technology
DivvyCloud technology is well suited to address the security concerns of any company using Kubernetes in the cloud or in private data centers. However, implementing robust technology such as DivvyCloud is only part of a comprehensive security solution. More is needed. Companies that use DivvyCloud successfully not only embrace the technology, but also make the cultural and organizational changes necessary to realize the full benefit of securing the enterprise using an automated monitoring, analysis, and remediation tool.
To take full advantage of the cloud and containerized computing paradigm, companies need to have the right people, processes, and tools in place. Yet, many companies incur a great deal of expense hoping to achieve this goal and still come up short. These companies spend money on all kinds of software and training, but they overlook the cultural and process changes necessary to fully adopt containerized computing on the cloud.
Companies that have experienced success moving to containerized computing in the cloud understand that you can’t simply buy your way into a digital transformation. A successful digital transformation requires an investment of time and effort from a people perspective. It’s about moving from a command and control management style to one based on an operational theme of trust but verify. And, with DivvyCloud, it means creating remediation policies that work for your environment but don’t get in the way of innovation.
Moving from Command and Control to Trust But Verify
The introduction of cloud computing and containers has brought about a significant change in the way large enterprise customers approach information technology. They’ve gone from having a centralized IT department that’s focused on controlling everything — from user access to server, storage, and network provisioning — to a self-service model in which developers create the computing infrastructure as they need it.
This transformation has forced system administrators to move away from being the sole protectors of the IT infrastructure. Their new role is more akin to that of a systems management consultant concerned with ensuring maximum business value from the investment. Thus, while the operational sensibility in the past was Command and Control, today the watchwords are Trust But Verify. However, for many companies, making this transformation has not been easy.
The notion of letting developers provision environments independently is a hard pill for many system administrators to swallow. Some never make the transformation. But others see the value of making automated monitoring and remediation technology part of the IT infrastructure. Allowing developers to have more independence promotes the agility, speed, innovation, and sense of experimentation required for modern businesses to maintain a competitive advantage.
Providing a robust set of automated monitoring and remediation tools gives businesses the ability to ensure that developers are acting wisely and not creating risks that are preventable. Supporting a theme of Trust But Verify means having a culture that allows developers the freedom to experiment and innovate, while also giving systems personnel the tools they need to make sure that developers are working safely. As such, automated monitoring and remediation tools are indispensable. But, as with any tool, they must be used wisely — otherwise, the anticipated benefits of the technology can become unforeseen roadblocks. This is particularly true when it comes to configuring a remediation tool’s severity policies.
Creating Remediation Policies That Work
Automated remediation can be an effective tool for ensuring system security, provided remediation policies are configured in a way that is appropriate to a company’s software release process. There are few things more distressing than having a remediation tool that’s intended to avoid disaster inadvertently create one.
For example, imagine adding a new security policy to an automated remediation system that’s intended to restrict a container from having root access to its host. At first glance, this policy is a reasonable addition and one that is, in fact, documented as a guideline in the CIS Benchmarks for Kubernetes. But, after deploying the policy into a production environment, the result is that a number of preexisting containers that require root access are made inoperative. Thus, system failure occurs. What started as a good idea turned into an IT disaster.
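A check for the policy in question might look like the following sketch. The pod specs are simplified dictionaries rather than full Kubernetes manifests, and the root-defaulting logic is an illustrative assumption:

```python
# Hedged sketch of the policy in question: flag containers whose security
# context allows them to run as root. The pod specs are simplified dicts,
# not the full Kubernetes schema.
def runs_as_root(container):
    ctx = container.get("securityContext", {})
    # Assumption for this sketch: absent runAsNonRoot/runAsUser settings
    # are treated as defaulting to root, as many container images do.
    return not ctx.get("runAsNonRoot", False) and ctx.get("runAsUser", 0) == 0

def audit_pods(pods):
    """Return (pod, container) pairs that violate the no-root policy."""
    return [
        (pod["name"], c["name"])
        for pod in pods
        for c in pod["containers"]
        if runs_as_root(c)
    ]

pods = [
    {"name": "web", "containers": [{"name": "nginx",
        "securityContext": {"runAsNonRoot": True, "runAsUser": 1000}}]},
    {"name": "legacy", "containers": [{"name": "daemon", "securityContext": {}}]},
]
assert audit_pods(pods) == [("legacy", "daemon")]
```

Run against a staging cluster, a report like this surfaces the preexisting root-dependent containers before the blocking policy ever reaches production.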
Clearly, the scenario described above is one that needs to be avoided. The question is how? After all, the scenario’s policy rule is sound. The problem is that the rule was introduced too late in the development cycle. It should have been introduced earlier in the software development lifecycle — for example, in the testing or staging phases of the release process. Of course, introducing the remediation rule early on will still wreak havoc, but the failure will occur in a “safe” environment in which the problem can be exposed and fixes put into place. Then, once the troublesome behavior in the software is corrected, both the new policy and the new code can be moved in tandem into production.
While the scenario illustrated above is a bit dramatic, it provides a good example of the importance of establishing appropriate remediation policies throughout the entire software development process. A good set of remediation policies will react to security and best-practice violations according to both the degree of severity and the release phase in which the violation occurs. Draconian severity responses might be appropriate in testing and staging phases, yet completely unwarranted in production environments, and vice versa.
One of the benefits of DivvyCloud technology is that a response to a given violation is configurable according to the needs and maturity of the IT organization. Companies that are just starting out with automated remediation might do well to respond to problems by sending out emails or notifications in a Slack channel and leaving physical remediation actions in the hands of a developer or system administrator.
Other companies that are further along with automated remediation and are more trusting of the technology will impose more stringent remediation behavior in response to a policy violation — for example, gracefully stopping a build or safely removing a container from a cluster. Adopting automated remediation is not an all-or-nothing undertaking. It can be done in an incremental manner by introducing more powerful automation over time as companies become more skillful using the technology.
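Such graduated responses can be sketched as a lookup keyed on release phase and severity. The matrix below is a hypothetical configuration, not DivvyCloud's actual policy format:

```python
# Illustrative sketch of graduated remediation: the response to a violation
# depends on both its severity and the release phase. This phase/severity
# matrix is an assumption made for the example.
RESPONSES = {
    ("testing", "high"): "stop_build",     # strict early in the lifecycle
    ("testing", "low"): "stop_build",
    ("staging", "high"): "stop_build",
    ("staging", "low"): "notify",
    ("production", "high"): "notify",      # hands-off while trust is earned
    ("production", "low"): "notify",
}

def remediate(phase, severity):
    """Look up the configured action, defaulting to a safe notification."""
    return RESPONSES.get((phase, severity), "notify")

assert remediate("staging", "high") == "stop_build"
assert remediate("production", "high") == "notify"
```

As an organization matures, tightening the matrix — say, upgrading production high-severity responses from "notify" to an automated action — is a one-line configuration change rather than a process overhaul.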
Few companies get remediation automation right at the beginning. It takes time to establish a set of policies that work. The important first step for effectively implementing this approach is to make sure that all members of a company’s IT staff are committed to using automation. Once the commitment is made, a company can develop, in an iterative fashion, appropriate remediation policies that fit the needs of the enterprise’s day-to-day operations.
The world of ephemeral computing using the cloud, containers, and Kubernetes continues to evolve in ways that are both innovative and challenging. Change happens so fast it’s hard for Security and GRC professionals to keep up. But there is help available. Using CIS Benchmarks combined with the automation capabilities of DivvyCloud will help companies embrace Kubernetes while improving their overall security posture.
DivvyCloud automation allows developers to engage in more experimentation and innovation. It provides the trust and verification that system administrators need to ensure that work is being done according to industry-standard security guidelines and well-established best practices. Automated remediation technology is a powerful tool for companies that use Kubernetes to get quality software into the hands of customers at web scale. DivvyCloud and its holistic approach to supporting the CIS Benchmarks for Kubernetes provide an unequaled competitive advantage for companies that put Kubernetes at the forefront of their digital infrastructure.