University of Sussex

For the greater good? Patient and public attitudes to use of medical free text data in research

conference contribution
posted on 2023-06-09, 05:51 authored by Elizabeth Ford, Jessica Stockdale, Richard Jackson, Jackie Cassell
Objectives: Electronic health records (EHRs) contain rich information for understanding health conditions and their treatment. A large proportion of clinical information in EHRs is stored in narrative free text. This text is currently under-utilised because of privacy concerns, as it is harder to remove patient identifiers from text than from structured data. Automated de-identification of clinical text is now possible using heuristic or machine-learning-based systems. We conducted a review of the literature on patient and public understanding of, and attitudes towards, the use of patients’ medical data for research, particularly seeking views on free text. The aim was to inform and develop a governance framework for the de-identification and use of medical free text for research, and to instigate a wider discussion on the topic.

Approach: We undertook a systematic search in Web of Science and ScienceDirect with terms such as “public attitudes” and “electronic health records”. The 3480 results were sifted by title, abstract and full text. Forty-two articles were retained for review; these reported on studies of patient and public perceptions, understanding and attitudes towards the use of patients’ medical data in research.

Results: Research participants were positively inclined towards information in records being used in research “for the greater good”. However, no clear patterns by age, ethnicity, education level or socioeconomic status (SES) emerged as to who was more favourable to data use. Participants generally trusted health care professionals and public sector researchers with de-identified medical data, whereas government health agencies and commercial entities were not trusted. Participants did not articulate any explicitly feared harms associated with data use. However, the general objections appeared to be a dislike of personal data being exploited for commercial gain, and a dislike of personal data being moved around and used without personal knowledge or consent. Notably, the use of EHR medical text for research did not emerge as a specific patient or public concern. De-identification was important to participants, but text was not identified as a distinct privacy risk.

Conclusion: This review demonstrates that transparency about data usage, and working “for the greater good” rather than for financial gain, appear to be the most important public concerns to be addressed when using patients’ medical data. Governance frameworks for using EHRs must now be enhanced to provide for the use of medical text. This will involve informing both regulators and the public about the current capabilities of automated de-identification, and developing other assurances to safeguard patients’ privacy.


Publication status

  • Published

File Version

  • Published version

Journal

International Journal of Population Data Science

Publisher

Swansea University

Event name

International Population Data Linkage Network (IPDLN) 2016 Conference

Event location

Swansea, Wales

Event date

24-26 August 2016

Department affiliated with

  • Primary Care and Public Health Publications

Full text available

  • Yes

Peer reviewed?

  • Yes
