Challenges encountered and lessons learned when using a novel anonymised linked dataset of health and social care records for public health intelligence: the Sussex integrated dataset

Ford, Elizabeth; Tyler, Richard; Johnston, Natalie; Spencer Hughes, Vicki; Evans, Graham; Elsom, Jon; Madzvamuse, Anotida; Clay, Jacqueline; Gilchrist, Kate; Rees-Roberts, Melanie

information-14-00106.pdf (1.59 MB)

journal contribution

posted on 2023-06-10, 06:11 authored by Elizabeth Ford, Richard Tyler, Natalie Johnston, Vicki Spencer Hughes, Graham Evans, Jon Elsom, Anotida Madzvamuse, Jacqueline Clay, Kate Gilchrist, Melanie Rees-Roberts

Background: In the United Kingdom National Health Service (NHS), digital transformation programmes have resulted in the creation of pseudonymised linked datasets of patient-level medical records across all NHS and social care services. In the Southeast England counties of East and West Sussex, public health intelligence analysts based in local authorities (LAs) aimed to use the newly created “Sussex Integrated Dataset” (SID) to identify cohorts of patients at risk of early-onset multiple long-term conditions (MLTCs). Analysts from the LAs were among the first to have access to this new dataset.
Methods: Data access was assured as the analysts were employed within joint data controller organisations and logged into the data via virtual machines following approval of a data access request. Analysts examined the demographics and medical history of patients against multiple external sources, identifying data quality issues and developing methods to establish true values for cases with multiple conflicting entries. Service use was plotted over timelines for individual patients.
Results: Early evaluation of the data revealed multiple conflicting within-patient values for age, sex, ethnicity and date of death. This was partially resolved by creating a “demographic milestones” table, capturing demographic details for each patient for each year of the data available in the SID. Older data (≥5 years) was found to be sparse in events and diagnoses. Open-source code lists for defining long-term conditions were poor at identifying the expected number of patients, and bespoke code lists were developed by hand and validated against other sources of data. At the start, the age and sex distribution of patients submitted by GP practices was substantially different from that published by NHS Digital, and errors in data processing were identified and rectified.
Conclusions: While new NHS-linked datasets appear a promising resource for tracking multi-service use, MLTCs and health inequalities, substantial investment in data analysis and data architect time is necessary to ensure data of high enough quality for meaningful analysis. Our team made conceptual progress in identifying the skills needed for programming analyses and understanding the types of questions which can be asked and answered reliably in these datasets.
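
The “demographic milestones” idea in the Results can be illustrated with a minimal Python sketch. The abstract does not state the rules the authors used to choose between conflicting entries, so the majority rule below, the record layout, and the names `records` and `build_milestones` are all assumptions made for illustration only, not the authors' method.

```python
from collections import Counter, defaultdict

# Hypothetical input rows: (patient_id, year, field, recorded_value).
# The layout and values are invented for illustration; they are not SID data.
records = [
    ("p1", 2019, "sex", "F"),
    ("p1", 2019, "sex", "F"),
    ("p1", 2019, "sex", "M"),        # conflicting entry within the same year
    ("p1", 2020, "sex", "F"),
    ("p1", 2019, "ethnicity", "A"),
    ("p1", 2019, "ethnicity", "B"),  # unresolved tie
    ("p2", 2020, "sex", "M"),
]


def build_milestones(rows):
    """Return {(patient_id, year): {field: value or None}}.

    Assumed rule: the most frequently recorded value wins; an exact tie
    between the top two values is left as None for manual review.
    """
    counts = defaultdict(Counter)
    for pid, year, field, value in rows:
        counts[(pid, year, field)][value] += 1

    table = defaultdict(dict)
    for (pid, year, field), counter in counts.items():
        ranked = counter.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        table[(pid, year)][field] = None if tied else ranked[0][0]
    return dict(table)


milestones = build_milestones(records)
```

Leaving ties as `None` rather than picking a value keeps unresolved conflicts visible, which matches the abstract's statement that the conflicts were only partially resolved.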

History

Publication status

Published

File Version

Published version

Journal

Information

ISSN

2078-2489

Publisher

MDPI

External DOI

https://doi.org/10.3390/info14020106

Issue

2

Volume

14

Page range

1-15

Department affiliated with

Primary Care and Public Health Publications

Full text available

Yes

Peer reviewed?

Yes

Legacy Posted Date

2023-02-08

First Open Access (FOA) Date

2023-02-16

First Compliant Deposit (FCD) Date

2023-02-08

Keywords

health data; electronic health records; data linkage; data quality; public health

Licence

CC BY 4.0
