Personally Identifiable & Protected Information Guide


The goal of this document is to help community users understand how we prefer to approach personally identifying and protected information on

Controlling how personally identifiable information (PII) is displayed online is important. PII refers to emails, phone numbers, full names, IP addresses, and other information that offers a direct connection between a user and who they are in real life. This information may also include private and protected data used in a reproducible example, such as protected health information (PHI).

Sharing Your Own Personally Identifying Information in Threads:
For example, including an email address, IP address, phone number, or mailing address in a topic or reply.

  • PII should not appear in topics or reply posts.
  • PII should only appear in user profiles. If a user would like to refer to their personal information, we ask that they do so with a reference to the information in their profile or via a private message.
  • There are certainly exceptions. For example, consultants referring business, businesses with job posts, and people who are aware of the guidelines but feel a need to give PII anyway.
  • Please flag all cases where PII appears in a reply. If necessary, moderators may manually delete the offending information and let the user know.

Sharing Protected or Private Data
For example, protected health information (PHI).

  • It is never okay to post private data online.
  • If you have anonymized/simulated your data but still refer to variables that suggest it may be protected, please explicitly call out your anonymization.
    • For example "These data mask all identifiers" or "All patient data has been simulated"
    • Simply masking identifiers over real data is not permitted by most privacy policies. We suggest simulating new data that follows a similar schema instead.
  • If you suspect data in a thread may include protected private data, please flag it, and add a custom message with your concern.
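
As a rough illustration of the "simulate new data on a similar schema" suggestion above, here is a minimal Python sketch. The column names, value ranges, and the `simulate_patients` helper are all hypothetical and chosen only for this example; the same idea applies in whatever language your reproducible example uses. The point is that every value is drawn from a random generator, so no row is derived from a real record.

```python
import random


def simulate_patients(n, seed=42):
    """Return n fully simulated rows shaped like a patient table.

    The schema (age, sex, systolic_bp, diagnosis) is a made-up
    stand-in for whatever columns your real, unshared data has.
    """
    rng = random.Random(seed)  # fixed seed so others can reproduce the example
    rows = []
    for _ in range(n):
        rows.append({
            "age": rng.randint(18, 90),                    # integer years
            "sex": rng.choice(["F", "M"]),                 # categorical
            "systolic_bp": round(rng.gauss(120, 15), 1),   # numeric measurement
            "diagnosis": rng.choice(["A", "B", "C"]),      # placeholder codes
        })
    return rows


# All patient data below has been simulated.
simulated = simulate_patients(5)
for row in simulated:
    print(row)
```

When posting data produced this way, it still helps to say so explicitly, for example with a comment such as "All patient data has been simulated", so that readers and moderators do not need to guess.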

Motivation for guideline:

  • This guideline hopes to offer people better control over their personal information. PII only appears in one place, which can be changed or deleted. Should someone wish to leave and remove their account, the default method is to anonymize the user and all posts. If someone posted PII in a reply, that reply still links who they are in real life to the anonymized user. (Users are free to delete posts, but they must do so manually.)
  • This guideline hopes to protect users and people in protected datasets. The full privacy policy may be found here.

Draft Notes:

  • This page is for PII (i.e., IP addresses, emails, or phone numbers), and also protected data that might have been included in a reprex.
  • Should we specifically call out "protected health information (PHI)" and other commonly used terms? That is, to help with search? Probably.

How to flag posts with personal information?
Community Sustainer (Moderator) Guide
Shiny debugging and reprex guide
FAQ: What's a reproducible example (`reprex`) and how do I do one?

From @Leon on this, on private health info:

The way I "learned" it, was basically if there is an ID it's not anonymised. This includes if a random ID has been generated, with no connection to the patient.

If a data set - no matter how anonymised and how small - is part of a real data set, then it should not be permitted.

In these cases, the only way the poster can get input, is to create a dummy data set with random values with no connection to the actual data.

I.e., perhaps the current version is too soft. Should the rule be to use the same schema, and that you must definitively use, and say you're using, simulated data?



Thanks, I called this out more explicitly under "Sharing Protected or Private Data".