Deploying opensafely-docs with
| Latest commit: | 6c41322 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://27ea5f2a.opensafely-docs.pages.dev |
| Branch Preview URL: | https://andrewscolm-patch-3.opensafely-docs.pages.dev |
| The general principle is that **any statistic describing 7 or fewer patients, either directly or indirectly, should be redacted or combined into other statistics**. This includes: | ||
| The general principle is that **any statistic describing 5 or fewer patients, either directly or indirectly, should be redacted or combined into other statistics**. This includes: | ||
| * Redacting counts <=7 in frequency tables. Row and column totals should be recalculated after you have redacted the cell values, to ensure that the redacted values can not be inferred from the totals. |
Rounding to the nearest 5 offers protection against this
docs/outputs/sdc.md
Outdated
| In general, good SDC is consistent with good statistics: many observations, no influential outliers, well-behaved distributions etc both prevent disclosure and increase confidence in the statistics. The one area to be wary of is where you can say something for certain about entire groups (‘all patients presenting with X also needed treatment for Y’). Be cautious about statements like this. | ||
| To understand what checks have to be made to outputs it is important to understand the **attribute types** that exist in data and how these could lead to **primary or secondary disclosure**. Importantly, OpenSAFELY requires that researchers redact any outputs based on counts <= 7 before they can be released. | ||
| To understand what checks have to be made to outputs it is important to understand the **attribute types** that exist in data and how these could lead to **primary or secondary disclosure**. Importantly, OpenSAFELY requires that researchers redact any outputs that can identify <=5 individuals. In order to achieve this for counts rounded to the nearest 5 counts of 7 or fewer must be redacted before rounding. |
A comma would help:
In order to achieve this for counts rounded to the nearest 5, counts of 7 or fewer must be redacted before rounding.
However, I don't think this is correct. We don't have to redact (= completely remove a value) if the rounding precision doesn't lead to a rounding band with width <5. For example, if I round everything to the nearest 20, then we have [-9, 9], [10, 29], [30, 49],... mapping to values 0, 20, 40, ..., which is allowed, and doesn't require any redaction. Similarly for midpoint-5 and above.
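To make the rounding bands in this comment concrete, here is a minimal sketch. It assumes counts are non-negative integers and that ties round up; `round_to_nearest` and `rounding_band` are hypothetical helper names for illustration, not OpenSAFELY functions.

```python
def round_to_nearest(count: int, base: int) -> int:
    """Round a non-negative count to the nearest multiple of `base` (ties round up)."""
    return ((count + base // 2) // base) * base


def rounding_band(rounded: int, base: int) -> tuple[int, int]:
    """Return the inclusive range of non-negative counts that round to `rounded`."""
    low = max(0, rounded - base // 2)
    high = rounded + (base - 1) // 2
    return low, high


# Nearest 5: a published 5 could be any true count from 3 to 7.
print(rounding_band(5, 5))    # (3, 7)

# Nearest 20: the bands match the ones listed in the comment above
# (with the lowest band clipped at 0, since counts cannot be negative).
print(rounding_band(0, 20))   # (0, 9)
print(rounding_band(20, 20))  # (10, 29)
print(rounding_band(40, 20))  # (30, 49)
```

Under these assumptions, the band behind each published value can be read off directly, which is the quantity the comment is reasoning about.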
I have reworded for clarity:
Importantly, OpenSAFELY requires that researchers redact any outputs that can identify <=5 individuals. For example, if you plan to round your counts to the nearest 5, you would need to redact counts of 7 or fewer before rounding.
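The reworded example (redact counts of 7 or fewer, then round to the nearest 5) could be sketched as follows. This is an illustration only: `redact_and_round` is a hypothetical helper, `None` stands in for a redacted value, and ties are assumed to round up.

```python
from typing import Optional


def redact_and_round(count: int, threshold: int = 7, base: int = 5) -> Optional[int]:
    """Redact counts at or below `threshold`, then round the rest to the nearest `base`.

    Returns None for a redacted count (an assumed convention for this sketch).
    """
    if count <= threshold:
        return None
    # Half-up rounding to the nearest multiple of `base`.
    return ((count + base // 2) // base) * base


counts = [0, 4, 7, 8, 12, 13]
print([redact_and_round(c) for c in counts])
# [None, None, None, 10, 10, 15]
```

Redacting before rounding matters here because, without it, any count from 3 to 7 would be published as 5.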
Add clarification to rounding requirements and redaction of values <=7