1 change: 1 addition & 0 deletions .cspell/terms.txt
@@ -22,4 +22,5 @@ cephclusters
cephfilesystems
cephobjectstores
cephobjectstoreusers
ghostunnel
lvmdconfig
7 changes: 7 additions & 0 deletions docs/en/security/workload_security/index.mdx
@@ -0,0 +1,7 @@
---
weight: 40
---

# Workload Security

<Overview />
311 changes: 311 additions & 0 deletions docs/en/security/workload_security/spire.mdx
@@ -0,0 +1,311 @@
---
weight: 10
title: SPIRE
---

# Alauda Build of SPIRE

SPIRE (the SPIFFE Runtime Environment) is a toolchain for establishing trust between software systems across a wide variety of hosting platforms. SPIRE exposes the [SPIFFE](https://spiffe.io/) (Secure Production Identity Framework for Everyone) API, which allows workloads to securely authenticate each other without the need for shared secrets or hardcoded credentials.

In the Alauda Container Platform (ACP), SPIRE is provided as a cluster plugin that strengthens workload security by issuing cryptographically verifiable identities, called SPIFFE Verifiable Identity Documents (SVIDs), to all workloads.

## Core Components

The SPIRE plugin includes several components to manage identities:

- **SPIRE Server**: The central authority that manages trust and issues identities.
- **SPIRE Agent**: Runs on every node and delivers identities to local workloads.
- **SPIFFE CSI Driver**: A Container Storage Interface driver that mounts SVIDs as volumes into Pods.
- **OIDC Discovery Provider**: Exposes an OIDC discovery document for SPIRE's trust domain.

## Workflow

The SPIRE plugin operates based on a zero-trust model. The following high-level workflow describes how identities are established and delivered:

1. **Deployment**: The SPIRE Server, Agent, and CSI Driver are deployed as cluster plugins.
2. **Node Attestation**: When a node starts, the SPIRE Agent connects to the SPIRE Server and proves the node's identity (in ACP, typically with a projected Kubernetes service account token via the `k8s_psat` node attestor).
3. **Agent SVID Issuance**: The SPIRE Server evaluates the node's identity against its policies and issues an **Agent SVID** and a Trust Bundle (CA Certificate) to the Agent.
4. **Workload Request**: A workload (such as a Pod) starts on the node and requests a secure identity from the local SPIRE Agent's Workload API (exposed via the SPIFFE CSI Driver).
5. **Workload Attestation**: The SPIRE Agent gathers metadata about the workload (e.g., namespace, service account, labels) and sends it to the SPIRE Server.
6. **Workload SVID Issuance**: The SPIRE Server verifies the workload information against its registration entries and issues a **Workload SVID**.
7. **SVID Delivery**: The SPIRE Agent receives the SVID and delivers it to the workload. The SPIFFE CSI Driver mounts this identity as a volume into the Pod.
8. **Secure Communication**: The workload now has a cryptographically verifiable identity and can use it to perform mTLS-authenticated communication with other SPIFFE-aware services.
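
Once the stack is running, the result of steps 2 and 3 can be checked from the server side. As a quick sketch (assuming the plugin's components run in a `spire` namespace with a `spire-server-0` pod, matching the registration commands later in this guide):

```shell
# List the agents that have completed node attestation against the server.
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server agent list
```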

## Installation

### Prerequisites

- A running Alauda Container Platform cluster.
- An available **StorageClass** in the cluster for persistent storage.
- Sufficient resources for the SPIRE stack.

### Install via Console

1. Log in to the ACP console and navigate to **Administrator**.
2. Click **Marketplace** > **Cluster Plugins**.
3. Select the target cluster in the top navigation bar.
4. Search for **Alauda Build of SPIRE** and click to view its details.
5. Click **Install**.
6. In the configuration step, update the following parameters:
- **Cluster Name**: The unique identifier for your cluster (e.g., `dce-cluster`).
- **Trust Domain**: The trust domain name (e.g., `example.org`).
- **Common Name**: The common name for the SPIRE CA.
- **Storage Class**: Select an available storage class for SPIRE Server data storage.
7. Click **Confirm** to complete the installation.
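
After installation, you can confirm that the components listed under Core Components are healthy. A sketch, assuming the plugin deploys into a `spire` namespace and registers the CSI driver under the `csi.spiffe.io` name used later in this guide:

```shell
# SPIRE Server, Agents, CSI driver, and OIDC provider pods should be Running.
kubectl get pods -n spire

# The SPIFFE CSI driver should be registered with the cluster.
kubectl get csidriver csi.spiffe.io
```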

## Usage

### Using SPIFFE IDs in Workloads

Once SPIRE is installed, workloads can obtain their SVIDs using the SPIFFE CSI Driver.

1. **Enable SPIFFE for a Pod**: Add the SPIFFE CSI volume to your Pod specification.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-workload
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: spiffe-workload-api
          mountPath: /run/spiffe.io/public
          readOnly: true
  volumes:
    - name: spiffe-workload-api
      csi:
        driver: 'csi.spiffe.io'
        readOnly: true
```

2. **Workload Authentication**: Your application can now use the SPIFFE Workload API at `/run/spiffe.io/public` to retrieve its identity and certificates.
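
To inspect the delivered identity, you can fetch an X.509 SVID over the mounted socket with the `spire-agent` CLI (shipped in the SPIRE images; the example container above would need the binary available, and the socket filename below is an assumption that depends on the agent configuration):

```shell
# Hypothetical check from inside the pod: fetch the X.509 SVID and trust bundle
# over the Workload API socket mounted by the SPIFFE CSI Driver.
spire-agent api fetch x509 \
  -socketPath /run/spiffe.io/public/api.sock \
  -write /tmp/
```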

### Attestation

Attestation is the process by which SPIRE identifies and verifies the identity of nodes and workloads.

- **Node Attestation**: The process where a SPIRE Agent identifies itself to the SPIRE Server. In the Alauda Container Platform, this typically uses the `k8s_psat` (Projected Service Account Token) method. The Agent provides a signed Kubernetes service account token, which the SPIRE Server verifies against the Kubernetes API to confirm the Agent's identity and node location.
- **Workload Attestation**: The process where a workload identifies itself to the local SPIRE Agent. When a workload requests an identity, the Agent inspects the workload's metadata (such as its Kubernetes namespace, service account name, or Unix user ID) using "selectors". These selectors are then compared against registration entries in the SPIRE Server to determine which SVID the workload is authorized to receive.

### Workload Registration

By default, the **SPIRE Controller Manager** automatically creates SPIRE entries for Kubernetes workloads based on their labels or service accounts. You can customize this behavior using annotations on your workloads.
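
For instance, the SPIRE Controller Manager watches `ClusterSPIFFEID` custom resources that map pods to SPIFFE IDs. A minimal sketch (field names follow the upstream `spire.spiffe.io/v1alpha1` API; the label selector is illustrative):

```yaml
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: example-workloads
spec:
  # Expands to spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  podSelector:
    matchLabels:
      identity: enabled  # hypothetical label selecting which pods get entries
```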

## End-to-End Implementation Example

The following example demonstrates a complete SPIRE deployment showing how to set up mutual TLS (mTLS) authentication between workloads using SPIFFE identities.

<Steps>

### Implementation Goals and Architecture

This example shows how to:

- Deploy SPIRE Server and Agent in a single Kubernetes cluster
- Complete Node Attestation using k8s_psat
- Automatically issue SPIFFE IDs and X.509 SVIDs for example workloads
- Verify that authorized workloads can successfully obtain identities while unauthorized workloads cannot

The architecture consists of:

- **Server Workload**: SPIFFE ID `spiffe://example.org/ns/example/sa/server-sa`; retrieves its certificates via the Workload API and terminates mTLS, trusting only the client's SPIFFE ID
- **Client Workload**: SPIFFE ID `spiffe://example.org/ns/example/sa/client-sa`; retrieves its certificates via the Workload API and connects to the server over mTLS
- Both workloads talk to the local **SPIRE Agent**, which in turn communicates with the **SPIRE Server**

### Prerequisites

Before starting, ensure you have:

- The `example` namespace created: `kubectl create namespace example`

### Workload Registration

Register the client and server workloads with SPIRE:

1. Register the client workload:

```bash
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/ns/example/sa/client-sa \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:ns:example \
  -selector k8s:sa:client-sa
```
2. Register the server workload:

```bash
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/ns/example/sa/server-sa \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:ns:example \
  -selector k8s:sa:server-sa
```

Both commands will return entry IDs confirming successful registration.
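
You can confirm both registrations and inspect their selectors at any time:

```shell
# Show all registration entries known to the server.
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry show
```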

### Workload Deployment

Deploy the workloads that use ghostunnel to handle mTLS with SPIRE identities.

#### Deploy Server Workload

Create the server workload that exposes an HTTPS endpoint with mTLS:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: server-sa
  namespace: example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server-workload
  namespace: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server-workload
  template:
    metadata:
      labels:
        app: server-workload
    spec:
      serviceAccountName: server-sa
      containers:
        - name: my-app
          image: python:3.9-alpine
          imagePullPolicy: IfNotPresent
          command:
            - /bin/sh
            - -c
          args:
            - echo 'Authenticated SPIFFE Content' > index.html && python3 -m http.server 8080 --bind 127.0.0.1
          ports:
            - containerPort: 8080
              protocol: TCP
        - name: ghostunnel
          image: ghostunnel/ghostunnel:v1.9.0
          imagePullPolicy: Always
          args:
            - server
            - --listen
            - :8443
            - --use-workload-api-addr=unix:/run/spire/sockets/api.sock
            - --target
            - 127.0.0.1:8080
            - --allow-uri
            - spiffe://example.org/ns/example/sa/client-sa
          ports:
            - containerPort: 8443
              protocol: TCP
          volumeMounts:
            - name: spiffe-socket
              mountPath: /run/spire/sockets
      volumes:
        - name: spiffe-socket
          hostPath:
            path: /run/spire/agent-sockets
            type: Directory
```

Apply with: `kubectl apply -f server-workload.yaml`
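
Note that the client workload in the next step dials `server-workload:8443`, which relies on a Kubernetes Service of that name; the manifest above does not define one, so you may also need something like this minimal sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: server-workload
  namespace: example
spec:
  selector:
    app: server-workload
  ports:
    - port: 8443
      targetPort: 8443
```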

#### Deploy Client Workload

Create the client workload that connects to the server using mTLS:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: client-sa
  namespace: example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client-workload
  namespace: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client-workload
  template:
    metadata:
      labels:
        app: client-workload
    spec:
      serviceAccountName: client-sa
      containers:
        - name: ghostunnel
          image: ghostunnel/ghostunnel:v1.9.0
          imagePullPolicy: IfNotPresent
          args:
            - client
            - --listen
            - 127.0.0.1:8080
            - --target
            - server-workload:8443
            - --use-workload-api-addr=unix:/run/spire/sockets/api.sock
            - --verify-uri
            - spiffe://example.org/ns/example/sa/server-sa
          ports:
            - containerPort: 8080
              protocol: TCP
          volumeMounts:
            - name: spiffe-socket
              mountPath: /run/spire/sockets
      volumes:
        - name: spiffe-socket
          hostPath:
            path: /run/spire/agent-sockets
            type: Directory
```

Apply with: `kubectl apply -f client-workload.yaml`

### Verification

Test the mTLS authentication with success and failure scenarios.

#### Authentication Success

Execute from the client workload pod:

```bash
kubectl exec -n example deploy/client-workload -c ghostunnel -- curl -s http://127.0.0.1:8080
```

**Expected Output:**

```
Authenticated SPIFFE Content
```

This confirms successful mTLS authentication.

#### Authentication Failure

To demonstrate proper security enforcement, modify the client's `--verify-uri` parameter to an incorrect URI (e.g., `spiffe://example.org/ns/example/sa/server` instead of `spiffe://example.org/ns/example/sa/server-sa`):
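
One reproducible way to apply the change is to patch the Deployment in place and wait for the rollout (the JSON path below assumes the `--verify-uri` value is the eighth `args` element, index 7, as in the client manifest above):

```shell
# Point --verify-uri at a SPIFFE ID the server does not present.
kubectl -n example patch deploy client-workload --type='json' -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/args/7", "value": "spiffe://example.org/ns/example/sa/server"}
]'
kubectl -n example rollout status deploy/client-workload
```

Then re-run the `curl` command: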

```bash
kubectl exec -n example deploy/client-workload -c ghostunnel -- curl -s http://127.0.0.1:8080
```
**Expected Result:**

- The `curl` command fails with exit code 56 (failure receiving network data)
- Client logs show: `unauthorized: invalid principal, or principal not allowed`

This demonstrates that SPIRE properly enforces authentication and rejects unauthorized connections.

</Steps>