Security
Applied Security Policies
kspacr uses Kyverno, a Kubernetes-native policy manager and admission controller, to validate every API server request against the following security policies. This ensures that requests are valid and secure, and blocks unauthorized or malicious actions.
Pod Security Standard (Strict)
Apply PSS Restricted Profile
Pod Security Standards define the fields, and the allowable options for those fields, that Pods must use to achieve certain security best practices. These are typically enforced as validation policies, which accept or reject workloads based on what they define. It is also possible to mutate incoming Pods to reach the desired PSS level rather than reject them. This policy sets all the fields necessary to pass the PSS restricted profile.
Policy Definition: Apply PSS Restricted Profile
Disallow Capabilities (Strict)
Adding capabilities other than NET_BIND_SERVICE is disallowed. In addition, all containers must explicitly drop ALL capabilities.
Policy Definition: Disallow Capabilities (Strict)
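The capability rule above can be sketched in a few lines of Python. This is only an illustration of the check against a Pod spec held as a plain dict, not the Kyverno policy kspacr applies; the function name `capability_violations` is hypothetical.

```python
# Illustrative sketch only; the real enforcement is a Kyverno policy.
ALLOWED_ADD = {"NET_BIND_SERVICE"}

def capability_violations(pod_spec):
    """Return readable violations for a Pod spec given as a dict."""
    violations = []
    containers = (
        (pod_spec.get("initContainers") or [])
        + (pod_spec.get("ephemeralContainers") or [])
        + (pod_spec.get("containers") or [])
    )
    for c in containers:
        caps = (c.get("securityContext") or {}).get("capabilities") or {}
        added = set(caps.get("add") or [])
        dropped = set(caps.get("drop") or [])
        extra = added - ALLOWED_ADD
        if extra:
            violations.append(f"{c['name']}: adds {sorted(extra)}")
        if "ALL" not in dropped:
            violations.append(f"{c['name']}: does not drop ALL")
    return violations
```

A container that drops ALL and adds only NET_BIND_SERVICE yields an empty list; anything else yields one entry per broken rule.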
Disallow Host Namespaces
Host namespaces (Process ID namespace, Inter-Process Communication namespace, and network namespace) allow access to shared information and can be used to elevate privileges. Pods should not be allowed access to host namespaces. This policy ensures fields which make use of these host namespaces are unset or set to false.
Policy Definition: Disallow Host Namespaces
Disallow hostPath
hostPath volumes let Pods use host directories and volumes in containers. Host resources can be used to access shared data or escalate privileges, so they should not be allowed. This policy ensures no hostPath volumes are in use.
Policy Definition: Disallow hostPath
Disallow hostPorts
Access to host ports allows potential snooping of network traffic and should not be allowed, or at minimum restricted to a known list. This policy ensures the hostPort field is unset or set to 0.
Policy Definition: Disallow hostPorts
Disallow hostProcess
Windows pods offer the ability to run hostProcess containers, which enable privileged access to the Windows node. Privileged access to the host is disallowed in the baseline policy. This policy ensures the hostProcess field, if present, is set to false.
Policy Definition: Disallow hostProcess
Disallow procMount
The default /proc masks are set up to reduce attack surface and should be required. This policy ensures nothing but the default procMount can be specified. Note that deviating from the Default procMount requires setting a feature gate at the API server.
Policy Definition: Disallow procMount
Disallow SELinux
SELinux options can be used to escalate privileges and should not be allowed. This policy ensures that the seLinuxOptions field is undefined.
Policy Definition: Disallow SELinux
Restrict AppArmor
On supported hosts, the 'runtime/default' AppArmor profile is applied by default. The policy should prevent overriding or disabling that default profile, or restrict overrides to an allowed set of profiles. This policy ensures Pods do not specify any AppArmor profiles other than runtime/default or localhost/*.
Policy Definition: Restrict AppArmor
Restrict sysctls
Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed "safe" subset. A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node. This policy ensures that only those "safe" subsets can be specified in a Pod.
Policy Definition: Restrict sysctls
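As a rough Python illustration of the allow-list idea, the check below flags any sysctl outside a "safe" set. The set shown is an assumption taken from the upstream Pod Security Standards; the list kspacr actually enforces may differ, and the function name `unsafe_sysctls` is hypothetical.

```python
# Assumed safe set (from the upstream Pod Security Standards); the list
# enforced by the actual policy may differ.
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.ip_unprivileged_port_start",
    "net.ipv4.tcp_syncookies",
    "net.ipv4.ping_group_range",
}

def unsafe_sysctls(pod_spec):
    """Return the names of sysctls in the Pod spec that are not in the safe set."""
    sysctls = (pod_spec.get("securityContext") or {}).get("sysctls") or []
    return [s["name"] for s in sysctls if s["name"] not in SAFE_SYSCTLS]
```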
Disallow Privileged Containers
Privileged mode disables most security mechanisms and must not be allowed. This policy ensures Pods do not call for privileged mode.
Policy Definition: Disallow Privileged Containers
Disallow Privilege Escalation
Privilege escalation, such as via set-user-ID or set-group-ID file mode, should not be allowed. This policy ensures the allowPrivilegeEscalation field is set to false.
Policy Definition: Disallow Privilege Escalation
Require Non-Root Groups
Containers should be forbidden from running with a root primary or supplementary GID. This policy ensures the runAsGroup, supplementalGroups, and fsGroup fields are set to a number greater than zero (i.e., non-root).
Policy Definition: Require Non-Root Groups
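A minimal Python sketch of the group check follows. It assumes that unset fields are tolerated and only values of zero are rejected, which the description above does not state explicitly; `root_group_violations` is a hypothetical name, and the real check is a Kyverno policy.

```python
def root_group_violations(pod_spec):
    """List the group fields in a Pod spec dict that resolve to the root GID."""
    violations = []
    pod_sc = pod_spec.get("securityContext") or {}
    # Pod-level primary group and filesystem group.
    for field in ("runAsGroup", "fsGroup"):
        if field in pod_sc and pod_sc[field] <= 0:
            violations.append(f"spec.securityContext.{field}")
    # Pod-level supplementary groups.
    if any(gid <= 0 for gid in pod_sc.get("supplementalGroups") or []):
        violations.append("spec.securityContext.supplementalGroups")
    # Container-level primary group overrides.
    for c in pod_spec.get("containers") or []:
        sc = c.get("securityContext") or {}
        if "runAsGroup" in sc and sc["runAsGroup"] <= 0:
            violations.append(f"{c['name']}.securityContext.runAsGroup")
    return violations
```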
Require Run As Non-Root User
Containers must be required to run as non-root users. This policy ensures runAsUser is either unset or set to a number greater than zero.
Policy Definition: Require Run As Non-Root User
Require runAsNonRoot
Containers must be required to run as non-root users. This policy ensures runAsNonRoot is set to true.
Policy Definition: Require runAsNonRoot
Restrict Seccomp (Strict)
In the Restricted group, the seccomp profile must not be explicitly set to Unconfined, and it must not be left unset either. This policy, requiring Kubernetes v1.19 or later, ensures that seccomp is set to RuntimeDefault or Localhost.
Policy Definition: Restrict Seccomp (Strict)
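The difference from a non-strict check is that an unset profile also fails. The Python sketch below illustrates that, assuming the usual Kubernetes behavior where a container-level seccompProfile overrides the Pod-level one; the function name `seccomp_violations` is hypothetical and this is not the Kyverno policy itself.

```python
ALLOWED_SECCOMP = {"RuntimeDefault", "Localhost"}

def _seccomp_type(obj):
    """Read securityContext.seccompProfile.type from a Pod or container dict."""
    return ((obj.get("securityContext") or {}).get("seccompProfile") or {}).get("type")

def seccomp_violations(pod_spec):
    """Flag containers whose effective seccomp type is unset or not allowed."""
    pod_type = _seccomp_type(pod_spec)
    violations = []
    for c in pod_spec.get("containers") or []:
        own = _seccomp_type(c)
        # A container-level value takes precedence over the Pod-level value.
        effective = own if own is not None else pod_type
        if effective not in ALLOWED_SECCOMP:
            violations.append(f"{c['name']}: seccomp type is {effective or 'unset'}")
    return violations
```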
Restrict Volume Types
In addition to restricting HostPath volumes, the restricted pod security profile limits usage of non-core volume types to those defined through PersistentVolumes. This policy blocks any volume type that is not in the allow list.
Policy Definition: Restrict Volume Types
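For illustration, an allow-list check over a Pod's volumes might look like the Python sketch below. The allowed set is an assumption based on the upstream restricted profile and may not match the list in kspacr's policy; `disallowed_volumes` is a hypothetical name.

```python
# Assumed allow list, taken from the upstream restricted Pod Security profile.
ALLOWED_VOLUME_TYPES = {
    "configMap", "csi", "downwardAPI", "emptyDir",
    "ephemeral", "persistentVolumeClaim", "projected", "secret",
}

def disallowed_volumes(pod_spec):
    """Return the names of volumes whose source type is not in the allow list."""
    bad = []
    for vol in pod_spec.get("volumes") or []:
        # Every key other than "name" identifies the volume source type.
        source_types = set(vol) - {"name"}
        if not source_types <= ALLOWED_VOLUME_TYPES:
            bad.append(vol["name"])
    return bad
```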
Containers
Add Default Resources
Pods which don't specify at least resource requests are assigned a QoS class of BestEffort, which can hog resources from other Pods on the same Node. At a minimum, all Pods should specify resource requests in order to be labeled as the QoS class Burstable. This policy mutates any container in a Pod which doesn't specify memory or cpu requests to apply some sane defaults.
Policy Definition: Add Default Resources
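The mutation can be pictured with the Python sketch below, which fills in only the requests a container is missing. The default values (100Mi and 100m) and the function name `add_default_requests` are hypothetical placeholders, since the text does not state the defaults kspacr applies.

```python
import copy

# Hypothetical defaults; the values used by the actual policy are not stated.
DEFAULT_REQUESTS = {"memory": "100Mi", "cpu": "100m"}

def add_default_requests(pod_spec):
    """Return a copy of the Pod spec with missing memory/cpu requests filled in."""
    mutated = copy.deepcopy(pod_spec)
    for c in mutated.get("containers") or []:
        requests = c.setdefault("resources", {}).setdefault("requests", {})
        for resource, value in DEFAULT_REQUESTS.items():
            # Existing requests are kept; only absent ones get a default.
            requests.setdefault(resource, value)
    return mutated
```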
Add TTL to Jobs
User-created Jobs can often pile up and consume excess space in the cluster. As of Kubernetes 1.23, the TTL-after-finished controller is stable and will automatically clean up these Jobs if ttlSecondsAfterFinished is specified. This policy adds the ttlSecondsAfterFinished field, if not already specified, to any Job that does not have an ownerReference set.
Policy Definition: Add TTL to Jobs
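A small Python sketch of the same decision logic is given below. The TTL value of 900 seconds and the function name `add_job_ttl` are hypothetical; the value applied by kspacr's policy is not given in the text.

```python
import copy

DEFAULT_TTL_SECONDS = 900  # hypothetical value, not taken from the policy

def add_job_ttl(job):
    """Return a copy of the Job with a TTL added when the rule applies."""
    mutated = copy.deepcopy(job)
    if (mutated.get("metadata") or {}).get("ownerReferences"):
        # Jobs owned by another object (for example a CronJob) are left alone.
        return mutated
    # setdefault keeps a TTL the user already specified.
    mutated.setdefault("spec", {}).setdefault("ttlSecondsAfterFinished", DEFAULT_TTL_SECONDS)
    return mutated
```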
Disallow CRI socket mounts
Container daemon socket bind mounts allow access to the container engine on the node. This access can be used for privilege escalation and to manage containers outside of Kubernetes, and hence should not be allowed. This policy validates that the sockets used for the CRI engines Docker, Containerd, and CRI-O are not mounted.
Policy Definition: Disallow CRI socket mounts
Disallow Default Namespace
Kubernetes Namespaces provide a way to segment and isolate cluster resources across multiple applications and users. As a best practice, workloads should be isolated with Namespaces. Namespaces should be required and the default (empty) Namespace should not be used. This policy validates that Pods specify a Namespace name other than default. Rule auto-generation is disabled here because Pod controllers need to specify the namespace field under the top-level metadata object and not at the Pod template level.
Policy Definition: Disallow Default Namespace
Require Requests and Limits for emptyDir
Pods which mount emptyDir volumes may overrun the medium backing the emptyDir volume. This policy ensures that any initContainers or containers mounting an emptyDir volume have ephemeral-storage requests and limits set. The policy is skipped if the volume already has a sizeLimit set.
Policy Definition: Require Requests and Limits for emptyDir
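The interplay between sizeLimit and the container resources can be sketched in Python as follows. This is an illustration of the rule as described, not the Kyverno policy; `emptydir_violations` is a hypothetical name.

```python
def emptydir_violations(pod_spec):
    """Flag containers that mount an unbounded emptyDir without ephemeral-storage set."""
    # emptyDir volumes with a sizeLimit are skipped, as in the description.
    unbounded = {
        v["name"]
        for v in pod_spec.get("volumes") or []
        if "emptyDir" in v and not (v["emptyDir"] or {}).get("sizeLimit")
    }
    violations = []
    containers = (pod_spec.get("initContainers") or []) + (pod_spec.get("containers") or [])
    for c in containers:
        mounts = {m["name"] for m in c.get("volumeMounts") or []}
        if not mounts & unbounded:
            continue  # this container mounts no unbounded emptyDir
        resources = c.get("resources") or {}
        for kind in ("requests", "limits"):
            if "ephemeral-storage" not in (resources.get(kind) or {}):
                violations.append(f"{c['name']}: missing ephemeral-storage {kind}")
    return violations
```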
Mutate termination Grace Periods Seconds
Pods with large terminationGracePeriodSeconds (tGPS) might prevent cluster nodes from getting drained, ultimately making the whole cluster unstable. This policy mutates all incoming Pods to set their tGPS under 30s. If the user creates a pod without specifying tGPS, then the Kubernetes default of 30s is maintained.
Policy Definition: Mutate termination Grace Periods Seconds
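In Python terms the mutation amounts to a simple cap, sketched below. Reading "under 30s" as "at most 30 seconds" is an assumption, and `cap_termination_grace_period` is a hypothetical name.

```python
import copy

MAX_TGPS = 30  # assumed cap, based on the 30s figure in the description

def cap_termination_grace_period(pod_spec):
    """Return a copy of the Pod spec with tGPS lowered to the cap if it exceeds it."""
    mutated = copy.deepcopy(pod_spec)
    tgps = mutated.get("terminationGracePeriodSeconds")
    # An unset value is left alone so the Kubernetes default still applies.
    if tgps is not None and tgps > MAX_TGPS:
        mutated["terminationGracePeriodSeconds"] = MAX_TGPS
    return mutated
```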
Network
Disallow empty Ingress host
An ingress resource needs to define an actual host name in order to be valid. This policy ensures that there is a hostname for each rule defined.
Policy Definition: Disallow empty Ingress host
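As a hedged illustration, the per-rule check can be written in Python as below, treating a missing or empty host as a violation. The function name `rules_without_host` is hypothetical, and the Ingress is assumed to be a plain dict.

```python
def rules_without_host(ingress):
    """Return the indexes of Ingress rules that have no host name."""
    rules = (ingress.get("spec") or {}).get("rules") or []
    # A missing key and an empty string are both treated as "no host".
    return [i for i, rule in enumerate(rules) if not rule.get("host")]
```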
Traefik: Disallow Default TLSOptions
The TLSOption CustomResource sets cluster-wide TLS configuration options for Traefik when none are specified in a TLS router. Since this can take effect for all Ingress resources, creating the default TLSOption is a restricted operation. This policy ensures that only a cluster-admin can create the default TLSOption resource.
Policy Definition: Traefik: Disallow Default TLSOptions
Storage
Require StorageClass
PersistentVolumeClaims (PVCs) and StatefulSets may optionally define a StorageClass to dynamically provision storage. In a multi-tenancy environment where StorageClasses are far more common, it is often better to require that storage be provisioned only from these StorageClasses. This policy requires that PVCs and StatefulSets define the storageClassName field with some value.
Policy Definition: Require StorageClass
Restrict StorageClass
StorageClasses allow description of custom "classes" of storage offered by the cluster, based on quality-of-service levels, backup policies, or custom policies determined by the cluster administrators. For shared StorageClasses in a multi-tenancy environment, a reclaimPolicy of Delete should be used to ensure a PersistentVolume cannot be reused across Namespaces. This policy requires that StorageClasses set a reclaimPolicy of Delete.
Policy Definition: Restrict StorageClass
Remove hostPath Volumes
Pods which mount hostPath volumes are provided access to the underlying filesystem of the Node on which they run. In most scenarios, this should be forbidden. In others, it may be useful to silently remove those hostPath volumes rather than blocking the Pod. This policy removes all hostPath volumes and their volumeMount references from all containers within a Pod.
Policy Definition: Remove hostPath Volumes
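The removal has two parts: dropping the volumes and dropping every mount that refers to them. A Python sketch of that mutation, with the hypothetical name `remove_hostpath_volumes` and a Pod spec held as a dict, could look like this:

```python
import copy

def remove_hostpath_volumes(pod_spec):
    """Return a copy of the Pod spec without hostPath volumes or their mounts."""
    mutated = copy.deepcopy(pod_spec)
    volumes = mutated.get("volumes") or []
    removed = {v["name"] for v in volumes if "hostPath" in v}
    if "volumes" in mutated:
        mutated["volumes"] = [v for v in volumes if v["name"] not in removed]
    # Strip the matching volumeMounts so no container refers to a missing volume.
    for key in ("initContainers", "containers"):
        for c in mutated.get(key) or []:
            if "volumeMounts" in c:
                c["volumeMounts"] = [
                    m for m in c["volumeMounts"] or [] if m["name"] not in removed
                ]
    return mutated
```

The input is deep-copied, so the original spec is left unchanged.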