Maintaining patient privacy while geocoding patient addresses: Do Not Use R to Geocode!

Imagine if a clinical researcher were to disclose a list of patient addresses to a third party (a government agency, for-profit company, or not-for-profit entity) outside their hospital or health system. Imagine the researcher then publicly announced that they had disclosed the addresses to the third party, that the addresses belonged to patients with a specific disease, and that those patients were being treated at a specific hospital. The researcher’s Institutional Review Board (IRB) and Health Insurance Portability and Accountability Act (HIPAA) compliance office would be outraged at these violations of patient privacy. Yet this sequence of events can happen inadvertently when studying how neighborhood conditions, such as access to medical facilities or neighborhood food environments, affect clinical outcomes in specific patient populations. A quick search of Google Scholar shows many articles that, through this sequence of events, have disclosed patient health data.

In a recent pre-press publication, Rundle and colleagues show how geocoding patient or study subject addresses using a variety of R packages, Stata, SAS and QGIS can set off a cascade of events that discloses Personal Identifying Information (PII) and Protected Health Information (PHI) in violation of usual IRB and HIPAA rules. We also show the flaws in several approaches proposed to protect PII and PHI in neighborhood health effects research and propose best practices to protect patient and study subject confidentiality in studies on neighborhood health effects.
