Merged
77 changes: 46 additions & 31 deletions content/6-development/6-4-using-the-crs-sandbox.md
@@ -26,7 +26,7 @@ The sandbox is located at https://sandbox.coreruleset.org/.

An easy way to use the sandbox is to send requests to it with `curl`, although you can use any HTTPS client.

-The sandbox has many options, which you can change by adding HTTP headers to your request. One is very important so we will explain it first; this is the `X-Format-Output: txt-matched-rules` header. If you add this header to your request, the sandbox will parse the WAFs output, and return to you the matched CRS rule IDs with descriptions, and the score for your request.
+The sandbox has many options, which you can change by adding HTTP headers to your request. One is very important so we will explain it first; this is the `X-Format-Output: txt-matched-rules` header. If you add this header to your request, the sandbox will parse the WAF's output, and return to you the matched CRS rule IDs with descriptions, and the score for your request.

### Example

@@ -43,21 +43,22 @@ In this example, we sent `?file=/etc/passwd` as a GET payload. The CRS should ca

You can send anything you want at the sandbox, for instance, you can send HTTP headers, POST data, use various HTTP methods, et cetera.

-The sandbox will return a 200 response code, no matter if an attack was detected or not.
+If no attack is detected, the sandbox returns an empty result in the requested format (e.g., an empty JSON array `[]` for `json-matched-rules`, or an empty response for `txt-matched-rules`).

The sandbox also adds a `X-Unique-Id` header to the response. It contains a unique value that you can use to refer to your request when communicating with us. With `curl -i` you can see the returned headers.
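
As a rough sketch of how the unique ID could be captured for a bug report, the helper below reads raw `curl -i` output on standard input and prints the value of the `X-Unique-ID` header. The function name `extract_unique_id` is made up for this illustration and is not part of the sandbox.

```shell
# Hypothetical helper (not part of the sandbox): print the value of the
# X-Unique-ID response header from raw `curl -i` output read on stdin.
extract_unique_id() {
  # HTTP header lines end in CRLF, so strip carriage returns first.
  # Stop at the first blank line (end of the header block) and match
  # the header name case-insensitively.
  tr -d '\r' | awk '
    /^$/ { exit }
    tolower($1) == "x-unique-id:" { print $2; exit }
  '
}
```

It could be used by piping a request into it, for example `curl -s -i -H 'x-format-output: txt-matched-rules' 'https://sandbox.coreruleset.org/?test=posix_uname()' | extract_unique_id`.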

### Example showing the response headers

```bash
-curl -i -H 'x-format-output: txt-matched-rules' 'https://sandbox.coreruleset.org/?test=posix_uname()'
+curl -i -H 'x-format-output: txt-matched-rules' \
+'https://sandbox.coreruleset.org/?test=posix_uname()'
HTTP/1.1 200 OK
Date: Tue, 25 Jan 2022 13:53:07 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
X-Unique-ID: YfAAw3Gq8uf24wZCMjHTcAAAANE
-x-backend: apache-3.3.2
+x-backend: apache-latest

933150 PL1 PHP Injection Attack: High-Risk PHP Function Name Found
949110 PL1 Inbound Anomaly Score Exceeded (Total Score: 5)
@@ -66,31 +67,35 @@ x-backend: apache-3.3.2

## Default options

-Its useful to know that you can tweak the sandbox in various ways. If you dont send any `X-` headers, the sandbox will use the following defaults.
+It's useful to know that you can tweak the sandbox in various ways. If you don't send any `X-` headers, the sandbox will use the following defaults.

- The default backend is _Apache 2 with ModSecurity 2.9_.
-- The default CRS version is the _latest release version_, currently 4.22.0.
+- The default CRS version is the _latest release version_.
- The default Paranoia Level is 1, which is the least strict setting.
- By default, the response is the full audit log from the WAF, which is verbose and includes unnecessary information, hence why `X-Format-Output: txt-matched-rules` is useful.

## Changing options

-Lets say you want to try your payload on different WAF engines or CRS versions, or like the output in a different format for automated usage. You can do this by adding the following HTTP headers to your request:
+Let's say you want to try your payload on different WAF engines or CRS versions, or like the output in a different format for automated usage. You can do this by adding the following HTTP headers to your request:

-- `x-crs-version`: will pick another CRS version. Available values are `4.22.0` (default) and `3.3.8`.
+- `x-crs-version`: will pick another CRS version. Available values are `latest` (default, currently the latest release) and any supported semver version (e.g. `3.3.8`).
- `x-crs-paranoia-level`: will run CRS in a given paranoia level. Available values are `1` (default), `2`, `3`, `4`.
- `x-crs-mode`: can be changed to return the http status code from the backend WAF. Default value is blocking (`On`), and can be changed using `detection` (will set engine to `DetectionOnly`). Values are case insensitive.
- `x-crs-inbound-anomaly-score-threshold`: defines the inbound anomaly score threshold. Valid values are any integer > 0, with `5` being the CRS default. ⚠️ Anything different than a positive integer will be taken as 0, so it will be ignored. This only makes sense if `blocking` mode is enabled (the default now).
- `x-crs-outbound-anomaly-score-threshold`: defines the outbound anomaly score threshold. Valid values are any integer > 0, with `4` being the CRS default. ⚠️ Anything different than a positive integer will be taken as 0, so it will be ignored. This only makes sense if `blocking` mode is enabled (the default now).
-- `x-backend` allows you to select the specific backend web server
+- `x-backend` allows you to select the specific backend web server:
- `apache` (default) will send the request to **Apache 2 + ModSecurity 2.9**.
- `nginx` will send the request to **Nginx + ModSecurity 3**.
+- `coraza` will send the request to **Coraza WAF on Caddy**.
- `x-format-output` formats the response to your use-case (human or automation). Available values are:
-- omitted/default: the WAFs audit log is returned unmodified as JSON
+- omitted/default: the WAF's audit log is returned unmodified as JSON
- `txt-matched-rules`: human-readable list of CRS rule matches, one rule per line
- `txt-matched-rules-extended`: same but with explanation for easy inclusion in publications
- `json-matched-rules`: JSON formatted CRS rule matches
- `csv-matched-rules`: CSV formatted
+- `html-matched-rules`: HTML page with a styled table of matched rules
+
+Invalid `x-format-output` values default to `json-matched-rules`.

The header names are case-insensitive.

@@ -100,13 +105,13 @@ If you work with JSON output (either unmodified or matched rules), `jq` is a use

### Advanced examples

-Lets say you want to send a payload to an old CRS version **3.2.1** and choose **Nginx + ModSecurity 3** as a backend, because this is what you are interested in. You want to get the output in JSON because you want to process the results with a script. (For now, we use `jq` to pretty-print it.)
+Let's say you want to send a payload to CRS version **3.3.8** and choose **Nginx + ModSecurity 3** as a backend, because this is what you are interested in. You want to get the output in JSON because you want to process the results with a script. (For now, we use `jq` to pretty-print it.)

The command would look like:

```bash
curl -H "x-backend: nginx" \
--H "x-crs-version: 3.2.1" \
+-H "x-crs-version: 3.3.8" \
-H "x-format-output: json-matched-rules" \
https://sandbox.coreruleset.org/?file=/etc/passwd | jq .

@@ -129,14 +134,23 @@ curl -H "x-backend: nginx" \
]
```

-Let’s say you are working on a vulnerability publication and want to add a paragraph to explain how CRS protects (or doesn’t!) against your exploit. Then the `txt-matched-rules-extended` can be a useful format for you.
+You can also test the same payload across different WAF engines to compare detection behavior:
+
+```bash
+# Test on Coraza (Caddy-based WAF)
+curl -H "x-backend: coraza" \
+-H "x-format-output: txt-matched-rules" \
+'https://sandbox.coreruleset.org/?q=<script>alert(1)</script>'
+```
+
+Let's say you are working on a vulnerability publication and want to add a paragraph to explain how CRS protects (or doesn't!) against your exploit. Then the `txt-matched-rules-extended` can be a useful format for you.

```bash
curl -H 'x-format-output: txt-matched-rules-extended' \
https://sandbox.coreruleset.org/?file=/etc/passwd

-This payload has been tested against OWASP CRS
-web application firewall. The test was executed using the apache engine and CRS version 3.3.2.
+This payload has been tested against the OWASP CRS
+web application firewall. The test was executed using the apache engine and CRS version latest.

The payload is being detected by triggering the following rules:

@@ -159,34 +173,35 @@ All requests sent to the sandbox are logged and processed by the sandbox infrast

## Architecture

-The sandbox consists of various parts. The frontend that receives the requests runs on Openresty. It handles the incoming request, chooses and configures the backend running CRS, proxies the request to the backend, and waits for the response. Then it parses the WAF audit log and sends the matched rules back in the format chosen by the user.
+The sandbox consists of various parts. The frontend that receives the requests runs on OpenResty (Nginx with Lua). It handles the incoming request, selects and configures the backend running CRS based on the request headers, proxies the request to the backend, and waits for the response. Then it parses the WAF audit log and sends the matched rules back in the format chosen by the user.

-There is a backend container for every engine and version. For instance, one Apache with CRS 4.22.0, one with CRS 3.3.8, et cetera... These are normal webserver installations with a WAF and the CRS.
+```text
+Client → OpenResty (Lua routing) → WAF Backend → Mirror Backend
+ModSecurity/Coraza CRS
+Audit Logs → Filebeat → Elasticsearch/S3
+```

-The backend writes their JSON logs to a volume to be read by a collector script and sent to S3 bucket and Elasticsearch.
+There is a backend container for every engine and version. The current backends are:

-The logs are parsed, and values like User-Agent and geolocation are extracted. We use Kibana to keep an overview of how the sandbox is used, and hopefully gain new insights about attacks.
+| Backend | Engine | Container |
+|---------|--------|-----------|
+| Apache + ModSecurity 2.9 | `apache` | `apache-latest`, `apache-3_3_8` |
+| Nginx + ModSecurity 3 | `nginx` | `nginx-latest`, `nginx-3_3_8` |
+| Coraza WAF on Caddy | `coraza` | `coraza-latest` |

+The backend writes their JSON audit logs to a shared volume. OpenResty reads the per-transaction audit log file to extract matched rules and format the response. Logs are also collected by Filebeat and sent to an S3 bucket and Elasticsearch for monitoring.
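
To illustrate how a backend might be selected from the request headers, here is a minimal sketch that derives a container name from the `x-backend` and `x-crs-version` values. It only mirrors the container names listed in this section (for example `apache-latest` and `nginx-3_3_8`). The function `backend_container` and the dot-to-underscore rule are assumptions made for illustration, and the sandbox's real Lua routing may work differently.

```shell
# Hypothetical sketch: map header values to a container name such as
# "nginx-3_3_8". The naming rule is inferred from the container names
# listed in the docs and is an assumption, not the sandbox's actual code.
backend_container() {
  backend=${1:-apache}   # documented default backend
  version=${2:-latest}   # documented default CRS version
  # Replace the dots of a semver version with underscores (3.3.8 -> 3_3_8).
  printf '%s-%s\n' "$backend" "$(printf '%s' "$version" | tr '.' '_')"
}
```

Under these assumptions, `backend_container nginx 3.3.8` prints `nginx-3_3_8`, and calling it with no arguments prints `apache-latest`.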

## Known issues

In some cases, the sandbox will not properly handle and finish your request.

-- **Malformed HTTP requests:** The frontend, Openresty, is itself a HTTP server which performs parsing of the incoming request. The backend servers running CRS are regular webservers such as Apache and Nginx. Either one of these may reject a malformed HTTP request with an error 400 before it is even processed by CRS. This happens for instance when you try to send an Apache 2.4.50 attack that depended on a URL encoding violation. If you receive an error 400, your request was rejected by the frontend or a backend, and it was not scanned by CRS.
+- **Malformed HTTP requests:** The frontend, OpenResty, is itself a HTTP server which performs parsing of the incoming request. The backend servers running CRS are regular webservers such as Apache and Nginx. Either one of these may reject a malformed HTTP request with an error 400 before it is even processed by CRS. This happens for instance when you try to send an Apache 2.4.50 attack that depended on a URL encoding violation. If you receive an error 400, your request was rejected by the frontend or a backend, and it was not scanned by CRS.
- **ReDoS:** If your request leads to a ReDoS and makes the backend spend too much time to process a regular expression, this leads to a timeout from the backend server. The frontend will cancel the request with an error 502. If you have to wait a long time and then receive an error 502, there was likely a ReDoS situation.
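
A script that calls the sandbox may want to tell these two failure modes apart from a normal result. The sketch below maps an HTTP status code to a short explanation based only on the two cases described in this section. The function name `explain_sandbox_status` and the wording of the messages are made up for this example.

```shell
# Hypothetical helper: explain a sandbox HTTP status code using the two
# known issues described in the docs (400 and 502).
explain_sandbox_status() {
  case "$1" in
    400) echo "rejected as malformed before reaching CRS" ;;
    502) echo "backend timeout, possibly ReDoS" ;;
    *)   echo "no known issue matches status $1" ;;
  esac
}
```

The status code can be obtained with curl's `-w '%{http_code}'` option, for example `explain_sandbox_status "$(curl -s -o /dev/null -w '%{http_code}' 'https://sandbox.coreruleset.org/?file=/etc/passwd')"`.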

## Questions and suggestions

If you have any issues with the CRS sandbox, please open a GitHub issue at [https://github.com/coreruleset/coreruleset/issues](https://github.com/coreruleset/coreruleset/issues) and we will help you as soon as possible.

If you have suggestions for extra functionality, a GitHub issue is appreciated.

## Working on the sandbox: adding new backends

The following notes are handy for our team maintaining the sandbox.

To add a new backend:

- Each backend has its own IP address.
- docker-compose: copy-paste a back-end container. Give it a new unused IP address in the 10.5.0.\* virtual network.
- The frontend needs to know how to reach the desired backend. There is a hardcoded list in openresty/conf/access.lua with the target IP address.
- httpd-vhosts.conf needs to be changed.