19 changes: 19 additions & 0 deletions docs/incident_detection/resources/htpasswd-secret.yaml
@@ -0,0 +1,19 @@
# HTPasswd Secret for creating a test user
#
# Before applying, generate the htpasswd file:
# htpasswd -c -B -b users.htpasswd testuser password123
#
# Then base64 encode it:
# base64 -w0 users.htpasswd
#
# Replace the data.htpasswd value below with your encoded content

apiVersion: v1
kind: Secret
metadata:
  name: htpass-secret
  namespace: openshift-config
type: Opaque
data:
  # base64 encoded: testuser:$2y$05$fBn5ChTgiV0A/6HEfoNKleU3CLVIWuV2816XVIsmmhwAz.fBpDObe
  htpasswd: dGVzdHVzZXI6JDJ5JDA1JGZCbjVDaFRnaVYwQS82SEVmb05LbGVVM0NMVklXdVYyODE2WFZJc21taHdBei5mQnBET2JlCg==
Comment on lines +4 to +19

⚠️ Potential issue | 🟠 Major

Remove checked-in default credentials from the manifest.

Line 4 and Line 19 effectively publish reusable credentials. Please keep this file templated and require per-environment secret generation.

Proposed change
-#   htpasswd -c -B -b users.htpasswd testuser password123
+#   htpasswd -c -B -b users.htpasswd <username> <strong-random-password>
@@
-  # base64 encoded: testuser:$2y$05$fBn5ChTgiV0A/6HEfoNKleU3CLVIWuV2816XVIsmmhwAz.fBpDObe
-  htpasswd: dGVzdHVzZXI6JDJ5JDA1JGZCbjVDaFRnaVYwQS82SEVmb05LbGVVM0NMVklXdVYyODE2WFZJc21taHdBei5mQnBET2JlCg==
+  # base64 encoded content of users.htpasswd
+  htpasswd: <REPLACE_WITH_BASE64_HTPASSWD_CONTENT>

Suggested change
-#   htpasswd -c -B -b users.htpasswd testuser password123
+#   htpasswd -c -B -b users.htpasswd <username> <strong-random-password>
 #
 # Then base64 encode it:
 # base64 -w0 users.htpasswd
 #
 # Replace the data.htpasswd value below with your encoded content

 apiVersion: v1
 kind: Secret
 metadata:
   name: htpass-secret
   namespace: openshift-config
 type: Opaque
 data:
-  # base64 encoded: testuser:$2y$05$fBn5ChTgiV0A/6HEfoNKleU3CLVIWuV2816XVIsmmhwAz.fBpDObe
-  htpasswd: dGVzdHVzZXI6JDJ5JDA1JGZCbjVDaFRnaVYwQS82SEVmb05LbGVVM0NMVklXdVYyODE2WFZJc21taHdBei5mQnBET2JlCg==
+  # base64 encoded content of users.htpasswd
+  htpasswd: <REPLACE_WITH_BASE64_HTPASSWD_CONTENT>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/incident_detection/resources/htpasswd-secret.yaml` around lines 4 - 19,
remove the checked-in default credentials: delete the base64 value under the
Secret named "htpass-secret" (metadata.name) and replace the static data key
"htpasswd" with a templated placeholder, or remove the data section entirely, so
each environment must generate the secret. Either keep `data` with a base64
placeholder (e.g., <REPLACE_WITH_BASE64_HTPASSWD_CONTENT>), switch to
`stringData` with a plain-text placeholder (stringData values are not base64
encoded), or add comments directing operators to create the secret per
environment (e.g., generate htpasswd and kubectl create secret) rather than
committing the hashed credential in the file. Ensure the Secret still targets
the namespace "openshift-config".

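The per-environment generation requested in this comment could be scripted roughly as follows. This is a minimal sketch, not part of the PR: it assumes the htpasswd file content has already been produced by the `htpasswd` tool, and the names `encode_htpasswd`, `render_secret`, and `SECRET_TEMPLATE` are hypothetical. The entry in the usage block is a placeholder, not a working credential.

```python
import base64

# Manifest skeleton mirroring htpasswd-secret.yaml; only the data value is filled in.
SECRET_TEMPLATE = """\
apiVersion: v1
kind: Secret
metadata:
  name: htpass-secret
  namespace: openshift-config
type: Opaque
data:
  htpasswd: {encoded}
"""


def encode_htpasswd(htpasswd_text: str) -> str:
    """Base64-encode htpasswd file content as a single line (comparable to `base64 -w0`)."""
    return base64.b64encode(htpasswd_text.encode("utf-8")).decode("ascii")


def render_secret(htpasswd_text: str) -> str:
    """Return the Secret manifest with the encoded htpasswd content substituted in."""
    if ":" not in htpasswd_text:
        raise ValueError("expected at least one 'user:hash' entry")
    return SECRET_TEMPLATE.format(encoded=encode_htpasswd(htpasswd_text))


if __name__ == "__main__":
    # Placeholder entry only; a real run would read users.htpasswd from disk.
    example = "exampleuser:$2y$05$REPLACE_WITH_REAL_BCRYPT_HASH\n"
    print(render_secret(example))
```

Rendering the manifest at deploy time keeps the committed file free of any hash, which is the outcome the review asks for. Base64 is an encoding rather than encryption, so a committed `data.htpasswd` value exposes the hash to anyone who can read the repository.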
31 changes: 31 additions & 0 deletions docs/incident_detection/resources/limited-permissions-user.yaml
@@ -0,0 +1,31 @@
# Namespace and RoleBinding for testing limited permissions
#
# This creates a namespace with cluster-monitoring enabled and grants
# the testuser only monitoring-rules-view access (no full monitoring access)
#
# The user will receive 403 Forbidden when accessing:
# - /api/prometheus/api/v1/rules
# - /api/alertmanager/api/v2/silences
# - /api/prometheus/api/v1/query_range
# - /api/prometheus/api/v1/query

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
  labels:
    openshift.io/cluster-monitoring: "true"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: testuser-monitoring-view
  namespace: namespace-a
subjects:
  - kind: User
    name: testuser
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: monitoring-rules-view
  apiGroup: rbac.authorization.k8s.io
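
The 403 expectations listed in the header comment of limited-permissions-user.yaml could be checked with a small helper along these lines. This is a sketch only: `fetch_status` is a hypothetical callable that a test harness would supply (for example, one that issues an authenticated GET as `testuser` and returns the HTTP status code), and nothing here is taken from the repository's own test code.

```python
from typing import Callable, Dict, Iterable

# Paths copied from the header comment of limited-permissions-user.yaml.
RESTRICTED_PATHS = (
    "/api/prometheus/api/v1/rules",
    "/api/alertmanager/api/v2/silences",
    "/api/prometheus/api/v1/query_range",
    "/api/prometheus/api/v1/query",
)


def unexpected_statuses(
    fetch_status: Callable[[str], int],
    paths: Iterable[str] = RESTRICTED_PATHS,
    expected: int = 403,
) -> Dict[str, int]:
    """Return {path: status} for every path whose status differs from `expected`."""
    mismatches = {}
    for path in paths:
        status = fetch_status(path)
        if status != expected:
            mismatches[path] = status
    return mismatches


if __name__ == "__main__":
    # Stub standing in for a real HTTP client; it pretends every path is forbidden.
    print(unexpected_statuses(lambda path: 403))
```

An empty result means every listed endpoint returned the expected status. Any entry in the result names an endpoint that the limited user could still reach, or that failed for a different reason.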
17 changes: 17 additions & 0 deletions docs/incident_detection/resources/oauth-htpasswd.yaml
@@ -0,0 +1,17 @@
# OAuth configuration to use HTPasswd identity provider
#
# This configures OpenShift to authenticate users from the htpass-secret
# Apply htpasswd-secret.yaml first before applying this

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: my_htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
Comment on lines +10 to +17

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check the actual file content and surrounding context
cat -n docs/incident_detection/resources/oauth-htpasswd.yaml

Repository: openshift/monitoring-plugin

Length of output: 610


🏁 Script executed:

# Check if there are other files in this directory and understand the context
ls -la docs/incident_detection/resources/

Repository: openshift/monitoring-plugin

Length of output: 395


🏁 Script executed:

# Search for how this file is referenced or used in the codebase
rg "oauth-htpasswd" --type yaml --type md

Repository: openshift/monitoring-plugin

Length of output: 224


🏁 Script executed:

# Look for any documentation about OAuth configuration
find docs -type f \( -name "*.md" -o -name "*.yaml" \) | xargs grep -l "oauth\|identityProviders" 2>/dev/null | head -20

Repository: openshift/monitoring-plugin

Length of output: 188


🏁 Script executed:

# Check the test documentation file that references this
cat -n docs/incident_detection/tests/3.api_calls_data_loading_flows.md | head -100

Repository: openshift/monitoring-plugin

Length of output: 6680


🏁 Script executed:

# Search for any warnings or guidance about OAuth configuration
rg "OAuth\|identityProviders\|oauth" docs/incident_detection/tests/ --type md -A 3 -B 3

Repository: openshift/monitoring-plugin

Length of output: 53


🏁 Script executed:

# Check if there's a README or setup guide in the resources directory
find docs/incident_detection -name "README*" -o -name "*setup*" -o -name "*guide*" | xargs cat 2>/dev/null

Repository: openshift/monitoring-plugin

Length of output: 53


🏁 Script executed:

# Check if there are any other identity provider examples or merge-based approaches
rg "identityProviders|mergeStrategy|patch" docs/ --type yaml --type md

Repository: openshift/monitoring-plugin

Length of output: 145


Add explicit warning that this OAuth manifest replaces all identity providers.

The existing comment implies fresh setup, but doesn't clearly warn that applying this manifest will completely replace any existing identity providers on the cluster. Even for documentation/test setup, add a prominent warning at the top of the file (e.g., "⚠️ WARNING: This manifest replaces all OAuth identity providers. Do not apply to clusters with existing authentication configured.") to prevent accidental lockout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/incident_detection/resources/oauth-htpasswd.yaml` around lines 10 - 17,
Add a prominent warning at the top of the OAuth manifest stating that applying
this spec.identityProviders manifest will replace all existing OAuth identity
providers and can lock out clusters with existing authentication; update the
docs/incident_detection/resources/oauth-htpasswd.yaml file to include the
warning before the spec (e.g., reference the affected block like the
spec.identityProviders entry and the htpasswd provider named
my_htpasswd_provider with htpasswd.fileData.name set to htpass-secret) so
readers clearly know not to apply this to production clusters.
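
The lockout risk described in this comment comes from applying a manifest whose `spec.identityProviders` list contains only the HTPasswd entry. One way to avoid it, sketched here under assumptions, is a read-modify-write: fetch the current OAuth object, append the new provider, and write the merged object back. The function works on plain dictionaries (for instance the parsed output of `oc get oauth cluster -o json`). `add_identity_provider` and the `corp_ldap` entry are hypothetical names used for illustration.

```python
import copy


def add_identity_provider(oauth: dict, provider: dict) -> dict:
    """Return a copy of the OAuth object with `provider` appended to the existing list."""
    updated = copy.deepcopy(oauth)
    # Tolerate a missing or null spec / identityProviders on a fresh cluster.
    spec = updated.get("spec") or {}
    providers = list(spec.get("identityProviders") or [])
    if any(p.get("name") == provider["name"] for p in providers):
        raise ValueError(f"identity provider {provider['name']!r} already exists")
    providers.append(copy.deepcopy(provider))
    spec["identityProviders"] = providers
    updated["spec"] = spec
    return updated


HTPASSWD_PROVIDER = {
    "name": "my_htpasswd_provider",
    "mappingMethod": "claim",
    "type": "HTPasswd",
    "htpasswd": {"fileData": {"name": "htpass-secret"}},
}

if __name__ == "__main__":
    # Hypothetical existing configuration with one provider already present.
    existing = {
        "apiVersion": "config.openshift.io/v1",
        "kind": "OAuth",
        "metadata": {"name": "cluster"},
        "spec": {"identityProviders": [{"name": "corp_ldap", "type": "LDAP"}]},
    }
    merged = add_identity_provider(existing, HTPASSWD_PROVIDER)
    print([p["name"] for p in merged["spec"]["identityProviders"]])
```

Because the input object is copied rather than mutated, the original configuration remains available for comparison or rollback.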

101 changes: 101 additions & 0 deletions docs/incident_detection/simulate_scenarios/100-alerts-14-days.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
start,end,alertname,namespace,severity,silenced,labels
7791,13621,KubeNodeUnreachable_001,openshift-monitoring,warning,false,{"component": "etcd"}
7434,12671,KubeNodeNotReady_002,openshift-storage,critical,false,{"component": "monitoring"}
1738,18654,KubePodCrashLooping_003,openshift-logging,critical,false,{"component": "storage"}
1524,20019,ThanosQueryDown_004,openshift-etcd,critical,false,{"component": "Others"}
1923,18016,KubeNodeNotReady_005,openshift-logging,info,false,{"component": "network"}
1309,19658,KubePodCrashLooping_006,openshift-monitoring,info,false,{"component": "version"}
1449,19662,PrometheusDown_007,openshift-network,info,false,{"component": "storage"}
958,19475,KubeDeploymentReplicasMismatch_008,openshift-storage,critical,false,{"component": "compute"}
1682,18597,AlertmanagerDown_009,openshift-machine-api,warning,false,{"component": "etcd"}
6959,16519,HighMemoryUsage_010,openshift-machine-api,warning,false,{"component": "api-server"}
1693,19790,AlertmanagerDown_011,openshift-machine-api,warning,false,{"component": "version"}
2799,14348,HighCPUUsage_012,openshift-machine-api,warning,false,{"component": "monitoring"}
1917,19276,ThanosQueryDown_013,openshift-etcd,warning,false,{"component": "monitoring"}
868,18682,KubePodCrashLooping_014,openshift-network,critical,false,{"component": "Others"}
17,19188,ThanosQueryDown_015,openshift-monitoring,warning,false,{"component": "compute"}
862,19193,HighMemoryUsage_016,openshift-network,warning,false,{"component": "etcd"}
232,18233,EtcdMembersDown_017,openshift-storage,critical,false,{"component": "Others"}
708,19723,KubePodNotReady_018,openshift-kube-apiserver,info,false,{"component": "etcd"}
747,19284,KubeNodeNotReady_019,openshift-kube-apiserver,info,false,{"component": "compute"}
76,19563,KubeCPUOvercommit_020,openshift-machine-api,info,false,{"component": "network"}
1566,18597,KubeDeploymentReplicasMismatch_021,openshift-monitoring,critical,false,{"component": "compute"}
472,18197,APIServerErrorsHigh_022,openshift-machine-api,warning,false,{"component": "api-server"}
975,19547,ThanosQueryDown_023,openshift-etcd,warning,false,{"component": "api-server"}
1473,18723,APIServerLatency_024,openshift-operators,info,false,{"component": "version"}
5642,16725,CephClusterWarning_025,openshift-monitoring,critical,false,{"component": "api-server"}
700,18591,NetworkLatencyHigh_026,openshift-operators,warning,false,{"component": "network"}
326,18545,EtcdHighCommitDurations_027,openshift-operators,warning,false,{"component": "api-server"}
74,19235,PrometheusDown_028,openshift-logging,critical,false,{"component": "network"}
1762,19690,KubePodCrashLooping_029,openshift-operators,warning,false,{"component": "monitoring"}
6443,6720,ThanosQueryDown_030,openshift-logging,info,false,{"component": "version"}
1103,20061,KubePodCrashLooping_031,openshift-storage,critical,true,{"component": "monitoring"}
1847,19801,KubeNodeUnreachable_032,openshift-logging,warning,false,{"component": "etcd"}
4202,13627,KubePodNotReady_033,openshift-storage,warning,false,{"component": "storage"}
1172,19317,EtcdHighCommitDurations_034,openshift-operators,critical,false,{"component": "etcd"}
363,19725,HighCPUUsage_035,openshift-kube-apiserver,critical,false,{"component": "version"}
1794,18628,APIServerLatency_036,openshift-etcd,warning,false,{"component": "monitoring"}
782,19732,PersistentVolumeErrors_037,openshift-logging,warning,false,{"component": "version"}
430,18586,KubePodCrashLooping_038,openshift-etcd,warning,false,{"component": "version"}
1260,18218,ClusterOperatorDegraded_039,openshift-storage,warning,false,{"component": "network"}
5466,12367,KubePodNotReady_040,openshift-operators,warning,true,{"component": "compute"}
6449,18321,KubeCPUOvercommit_041,openshift-logging,critical,false,{"component": "api-server"}
493,19194,HighCPUUsage_042,openshift-machine-api,info,false,{"component": "Others"}
150,19380,CephClusterWarning_043,openshift-logging,critical,true,{"component": "version"}
658,18963,PrometheusDown_044,openshift-network,info,false,{"component": "Others"}
1318,19654,KubeDeploymentReplicasMismatch_045,openshift-kube-apiserver,warning,false,{"component": "etcd"}
1808,11121,ThanosQueryDown_046,openshift-network,warning,false,{"component": "monitoring"}
5183,13768,APIServerLatency_047,openshift-operators,info,true,{"component": "storage"}
1192,11469,KubePodCrashLooping_048,openshift-operators,critical,false,{"component": "version"}
1399,18148,KubeMemoryOvercommit_049,openshift-etcd,info,false,{"component": "etcd"}
821,18726,EtcdHighCommitDurations_050,openshift-network,warning,true,{"component": "compute"}
802,19873,PrometheusTargetDown_051,openshift-storage,info,false,{"component": "Others"}
4593,11031,KubePodNotReady_052,openshift-operators,warning,false,{"component": "api-server"}
671,19852,CephClusterWarning_053,openshift-etcd,critical,false,{"component": "network"}
813,19728,APIServerErrorsHigh_054,openshift-machine-api,critical,false,{"component": "version"}
969,19155,KubeMemoryOvercommit_055,openshift-etcd,warning,true,{"component": "compute"}
606,18620,HighMemoryUsage_056,openshift-monitoring,critical,true,{"component": "Others"}
379,19131,NetworkLatencyHigh_057,openshift-storage,warning,false,{"component": "version"}
440,19274,EtcdNoLeader_058,openshift-monitoring,info,false,{"component": "Others"}
1137,18076,KubeDeploymentReplicasMismatch_059,openshift-logging,warning,false,{"component": "compute"}
6200,16600,CephClusterWarning_060,openshift-monitoring,warning,false,{"component": "etcd"}
168,18071,PersistentVolumeErrors_061,openshift-etcd,warning,false,{"component": "monitoring"}
1187,18305,EtcdNoLeader_062,openshift-operators,warning,false,{"component": "etcd"}
152,18342,ClusterOperatorDegraded_063,openshift-logging,warning,false,{"component": "api-server"}
4,18156,KubeCPUOvercommit_064,openshift-machine-api,warning,false,{"component": "compute"}
1546,18785,CephClusterWarning_065,openshift-logging,critical,false,{"component": "Others"}
688,18035,PersistentVolumeErrors_066,openshift-etcd,warning,false,{"component": "storage"}
6258,12288,EtcdHighCommitDurations_067,openshift-network,warning,false,{"component": "Others"}
518,19559,DNSLatencyHigh_068,openshift-etcd,critical,false,{"component": "Others"}
645,18003,CephClusterWarning_069,openshift-operators,critical,false,{"component": "version"}
1785,19437,CephClusterWarning_070,openshift-storage,info,false,{"component": "version"}
122,19575,KubeNodeNotReady_071,openshift-kube-apiserver,info,false,{"component": "api-server"}
1987,18033,DNSLatencyHigh_072,openshift-storage,warning,true,{"component": "storage"}
2394,12595,EtcdMembersDown_073,openshift-kube-apiserver,warning,false,{"component": "network"}
933,19901,EtcdNoLeader_074,openshift-machine-api,info,false,{"component": "storage"}
946,19710,KubeDeploymentReplicasMismatch_075,openshift-machine-api,critical,false,{"component": "monitoring"}
531,18796,EtcdMembersDown_076,openshift-kube-apiserver,warning,false,{"component": "version"}
14687,15006,KubeNodeNotReady_077,openshift-etcd,info,false,{"component": "compute"}
1722,19888,KubeCPUOvercommit_078,openshift-logging,warning,false,{"component": "network"}
1592,18315,HighCPUUsage_079,openshift-network,critical,false,{"component": "api-server"}
808,18252,KubeMemoryOvercommit_080,openshift-etcd,info,true,{"component": "Others"}
841,18753,EtcdNoLeader_081,openshift-logging,warning,false,{"component": "version"}
265,18155,EtcdMembersDown_082,openshift-machine-api,critical,false,{"component": "api-server"}
1456,18239,APIServerLatency_083,openshift-machine-api,critical,false,{"component": "compute"}
6459,14355,PrometheusDown_084,openshift-logging,warning,false,{"component": "etcd"}
1787,18753,KubePodCrashLooping_085,openshift-logging,info,false,{"component": "etcd"}
17524,17625,AlertmanagerDown_086,openshift-kube-apiserver,warning,false,{"component": "etcd"}
4192,4457,KubePodCrashLooping_087,openshift-monitoring,warning,false,{"component": "api-server"}
5,19339,KubeNodeUnreachable_088,openshift-storage,warning,false,{"component": "compute"}
1187,11825,DiskSpaceLow_089,openshift-storage,critical,false,{"component": "Others"}
698,18146,NetworkLatencyHigh_090,openshift-kube-apiserver,warning,true,{"component": "compute"}
1622,19711,KubePodNotReady_091,openshift-network,warning,false,{"component": "api-server"}
733,18541,AlertmanagerDown_092,openshift-etcd,critical,false,{"component": "monitoring"}
110,18407,NetworkLatencyHigh_093,openshift-storage,warning,false,{"component": "monitoring"}
1845,19244,CephClusterWarning_094,openshift-kube-apiserver,critical,false,{"component": "monitoring"}
76,18403,KubeDeploymentReplicasMismatch_095,openshift-operators,critical,false,{"component": "storage"}
4622,11280,DiskSpaceLow_096,openshift-kube-apiserver,warning,false,{"component": "api-server"}
1307,20050,KubePodCrashLooping_097,openshift-storage,warning,false,{"component": "Others"}
53,18791,KubeCPUOvercommit_098,openshift-operators,critical,false,{"component": "etcd"}
7420,15350,KubeMemoryOvercommit_099,openshift-network,warning,false,{"component": "Others"}
1681,19730,KubePodCrashLooping_100,openshift-etcd,critical,false,{"component": "compute"}
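
A scenario file with this column layout could be read as follows. This is a hedged sketch, not the repository's loader: `AlertRow` and `parse_alert_line` are hypothetical names, and the units of `start` and `end` are not stated in the file, so durations are reported in whatever unit the file uses. The `labels` column holds unquoted JSON, so the line is split on the first six commas only, which keeps the JSON intact even if it contains commas.

```python
import json
from dataclasses import dataclass


@dataclass
class AlertRow:
    start: int
    end: int
    alertname: str
    namespace: str
    severity: str
    silenced: bool
    labels: dict


def parse_alert_line(line: str) -> AlertRow:
    """Parse one data line of the scenario CSV into an AlertRow."""
    # maxsplit=6 leaves the trailing JSON labels column as a single field.
    start, end, alertname, namespace, severity, silenced, labels = line.strip().split(",", 6)
    row = AlertRow(
        start=int(start),
        end=int(end),
        alertname=alertname,
        namespace=namespace,
        severity=severity,
        silenced=(silenced == "true"),
        labels=json.loads(labels),
    )
    if row.end < row.start:
        raise ValueError(f"{alertname}: end {row.end} precedes start {row.start}")
    return row


# Header plus two rows copied from 100-alerts-14-days.csv.
SAMPLE = """start,end,alertname,namespace,severity,silenced,labels
7791,13621,KubeNodeUnreachable_001,openshift-monitoring,warning,false,{"component": "etcd"}
1103,20061,KubePodCrashLooping_031,openshift-storage,critical,true,{"component": "monitoring"}
"""

rows = [parse_alert_line(line) for line in SAMPLE.splitlines()[1:] if line.strip()]

if __name__ == "__main__":
    for row in rows:
        print(row.alertname, row.end - row.start, row.silenced, row.labels["component"])
```

The `end >= start` check is a cheap sanity test for generated fixtures; every row in the file above satisfies it.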