Using data to support public health strategies is nothing new. An article published in the Bulletin of the World Health Organization asserts, "The concept of using mortality and morbidity data as a basis for public health action arose in Europe some 600 years ago with the emergence of scientific thought during the Renaissance, and subsequently spread to the Americas with the European settlers."1
In 1662, Londoner John Graunt analyzed the "Bills of Mortality" that recorded deaths to quantify and compare death rates among different parts of the population. He is credited with influencing the thinking of mathematicians, doctors and demographers with his approach, providing a "template for numerical analysis of demographic and health data and [initiating] the concepts of statistical association, statistical inference and population sampling" that are the basis of modern work in epidemiology.2
Although the basic concepts remain the same, technological improvements such as machine learning, along with evolving professional practices, have significantly changed how health data is collected, disseminated and analyzed to improve health outcomes at the population level. This post will explore how health data is collected and analyzed today, examining current challenges and how they are being addressed.
Using Data to Predict Public Health Outcomes
Health data analytics, a term encompassing data collection, storage and analysis, can provide evidence-based insights into population health trends, disparities, risk factors and the effectiveness of policies and intervention strategies.
Epidemiological modeling is a form of predictive analytics that uses mathematical models and techniques like machine learning to create simplified representations of the spread of infectious diseases. Modeling can be used with public health surveillance data to monitor disease outbreaks and trends and provide early warning systems for health emergencies.
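To make the idea of a "simplified representation" concrete, here is a minimal sketch of one classic compartmental model, the SIR (susceptible, infected, recovered) model, stepped forward one day at a time. The function name and every parameter value (population size, transmission rate `beta`, recovery rate `gamma`) are illustrative assumptions, not figures from any real outbreak or from the sources cited in this post.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model using one-day steps.

    beta  : assumed average number of infectious contacts per person per day
    gamma : assumed fraction of infected people who recover each day
    Returns a list of (susceptible, infected, recovered) tuples, day 0 first.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contact between susceptible and infected people.
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical scenario: 10,000 people, 10 initial cases.
history = simulate_sir(population=10_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=160)

# Find the day on which the number of infected people is highest.
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(f"Peak infections on day {peak_day}: {history[peak_day][1]:.0f}")
```

Models used in practice are usually far more detailed (age groups, geography, vaccination, uncertainty estimates), but the same logic of moving people between compartments underlies many of them.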
The COVID-19 pandemic is one of the largest infectious disease outbreaks of the last hundred years, but public health management professionals monitor and defend against more than 200 communicable diseases. Learn more about the wide variety of public health threats that health data analytics can help address.
Using Evidence-Based Decision-Making in Public Health Policy
Predictive analytics, including epidemiological modeling and public health surveillance, supports data-driven decisions that can improve health outcomes at individual and population levels and lead to more equitable and effective public health policies.
Intervention and Prevention Strategies
Analyzing data on disease prevalence, risk factors, health behaviors and outcomes can support individual patients, healthcare workers and policymakers in making informed decisions. For example, study data could help a person with hypertension learn about effective treatment options and preventive care for heart attack and stroke.
At the population level, data analytics can help identify strategies for scaling up effective interventions. Less effective interventions can be modified or discontinued to optimize public health impact.
Health data analysis can also help identify populations at higher risk for certain diseases or health conditions. By analyzing demographic, geographic and socioeconomic data, public health officials can identify disparities in health outcomes and allocate resources effectively to address the needs of vulnerable populations and reduce health inequalities.
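As a rough illustration of how a disparity might be quantified, the sketch below computes a crude rate per 100,000 people for two groups and then a rate ratio against the lowest-rate group. The neighborhood names and counts are made up for the example, and a real analysis would typically also adjust for age and other factors.

```python
def rate_per_100k(cases, population):
    """Crude rate: cases per 100,000 people."""
    return cases * 100_000 / population


# Hypothetical case counts and population sizes for two areas.
groups = {
    "Neighborhood A": {"cases": 120, "population": 40_000},
    "Neighborhood B": {"cases": 45, "population": 30_000},
}

rates = {name: rate_per_100k(g["cases"], g["population"])
         for name, g in groups.items()}

# Use the group with the lowest rate as the reference for comparison.
reference = min(rates, key=rates.get)
rate_ratios = {name: rate / rates[reference] for name, rate in rates.items()}

for name in groups:
    print(f"{name}: {rates[name]:.1f} per 100,000 "
          f"(rate ratio vs. {reference}: {rate_ratios[name]:.2f})")
```

A rate ratio well above 1 for a group is one signal that it may warrant a closer look when resources are allocated.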
Health Data Collection and Sources
Data for public health analysis comes from a few primary sources:
- Population health data, such as that from public health surveillance systems or individual studies
- Data generated in the course of healthcare research
- Data associated with the health care of individual patients
Clinical Data
Clinical data refers to information collected during the provision of healthcare services. This data can include:3
- Patient demographics
- Medical history
- Diagnostic test results
- Treatment plans
- Outcomes
Electronic Health Records
Today, healthcare providers document clinical data in electronic health records (EHRs). EHRs can facilitate the collection and sharing of clinical data among healthcare providers and organizations. This helps with continuity of patient care, clinical decision-making and quality improvement efforts.3
Early electronic record systems date back several decades, but adoption accelerated in the 2000s, and most health systems now use them. "As of 2021, nearly 4 in 5 office-based physicians (78%) and nearly all non-federal acute care hospitals (96%) adopted a certified EHR."4
Healthcare Information Exchanges
Government agencies and private industry have worked to develop healthcare information exchanges (HIEs) to facilitate the interchange of EHRs between different organizations. In April 2024, the Office of the National Coordinator for Health Information Technology (ONC) within the U.S. Department of Health and Human Services released the Trusted Exchange Framework and Common Agreement℠ (TEFCA℠) to simplify connectivity between providers and make it easier for individuals to access their records by establishing "a universal governance, policy, and technical floor for nationwide interoperability."5
Researchers and policymakers can gain valuable insights from EHR data and retrieve anonymized data through healthcare information exchanges.
Epidemiological Data
Epidemiological data focuses on patterns and determinants of disease occurrence and distribution within populations and is used to protect the public’s health and safety. Data may include disease incidence, prevalence, risk factors, transmission routes and outcomes.
Epidemiological data sources include:6
- Mortality statistics
- Notifiable diseases reporting
- Population-based surveys
- Laboratory data
- Environmental exposure data
Disease Registries
Disease registries are centralized databases that contain information on individuals with specific diseases or conditions and play an important role in collecting and maintaining epidemiological data. Examples include cancer registries, diabetes registries and Alzheimer’s registries. Registries providing comprehensive data on disease prevalence, treatment outcomes, and long-term follow-up are valuable resources for epidemiological research, clinical trials, healthcare planning and quality improvement initiatives.7
Other Public Health Surveillance Systems
Several national and international systems track population health to identify potential threats. Some are tightly focused on a specific type of health research, while others are more generalized. Examples include:
- The National Notifiable Diseases Surveillance System (NNDSS), operated by the U.S. Centers for Disease Control and Prevention (CDC), tracks instances of infectious diseases, foodborne illnesses like E. coli infection and environmentally caused conditions, such as lead poisoning.8
- The Canadian Antimicrobial Resistance Surveillance System (CARSS), operated by the Public Health Agency of Canada, integrates and synthesizes information on antimicrobial resistance and antimicrobial use surveillance in Canada.9
- The Global Influenza Surveillance and Response System (GISRS), operated by the World Health Organization (WHO), tracks data from member states on seasonal, pandemic and zoonotic influenza outbreaks to facilitate preparedness and response.10
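Counts reported to surveillance systems like those above are often screened for unusual increases. The sketch below shows one very simple approach, flagging any week whose count exceeds the mean of the preceding weeks by more than two standard deviations. It is an illustrative assumption, not the method used by NNDSS, CARSS or GISRS, and the weekly counts, the function name and the default parameters are all hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_weeks(weekly_counts, baseline_weeks=8, threshold_sd=2.0):
    """Return the indexes of weeks whose count is unusually high.

    A week is flagged when its count exceeds the mean of the preceding
    `baseline_weeks` weeks by more than `threshold_sd` standard deviations.
    """
    alerts = []
    for week in range(baseline_weeks, len(weekly_counts)):
        baseline = weekly_counts[week - baseline_weeks:week]
        limit = mean(baseline) + threshold_sd * stdev(baseline)
        if weekly_counts[week] > limit:
            alerts.append(week)
    return alerts


# Hypothetical weekly case counts for one reportable condition.
weekly_counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 31, 14]
alerts = flag_unusual_weeks(weekly_counts)
print("Weeks flagged for follow-up:", alerts)
```

A flagged week is only a prompt for investigation. Operational systems generally account for seasonality, reporting delays and small numbers before raising an alert.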
Behavioral Data
Behavioral data capture individuals’ health-related behaviors, lifestyle factors and social determinants of health. The purpose of these data is to find connections between behavior and health.11 They may include dietary habits, physical activity levels, smoking status, substance use, sexual behaviors and socioeconomic indicators. Behavioral data are typically collected through surveys, interviews, self-reported assessments and observational studies. Professionals use these data to better understand health behaviors, inform health promotion initiatives and develop targeted interventions.12
Surveys and questionnaires are commonly used to collect data on individuals’ health behaviors, perceptions and experiences. These data collection methods may include:7
- National health surveys
- Community health assessments
- Health risk assessments
Example: The WHO Framework Convention on Tobacco Control
The World Health Organization Framework Convention on Tobacco Control (WHO FCTC) collects data on the consequences of tobacco use and exposure. This data-driven approach has led to the implementation of evidence-based policies to decrease smoking rates and improve public health among member states.13
Challenges in Data Collection and Management
Given the scope and multiple sources of data used in the health sector, challenges in managing the data are inevitable. General areas of concern include:
- Quality and consistency: Data quality refers to the accuracy and reliability of the data. A patient’s EHR can be long and filled with treatment plans, medications, tests, X-rays and more. Poor data processes can introduce errors that degrade the quality of patient care and can pose severe health risks, including death.
- Interoperability: One of the benefits of EHRs is the ability to see a patient’s treatment plan across platforms. At the same time, this can be a challenge because data does not always transfer seamlessly from system to system.14
- Security and privacy: Healthcare data contains extremely sensitive information about patients and their health. Data breaches can harm patients and cause organizations to suffer legal, financial and reputational ramifications.
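The quality and consistency challenge above is often tackled with automated checks that run before records are analyzed or exchanged. The following is a minimal sketch of such a check. The field names, the plausible blood pressure range and the sample records are hypothetical and chosen only for illustration; they do not come from any EHR standard.

```python
from datetime import date

# Hypothetical set of fields every record is expected to contain.
REQUIRED_FIELDS = ("patient_id", "birth_date", "systolic_bp")


def find_quality_issues(record, today):
    """Return a list of human-readable problems found in one record."""
    issues = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")

    birth_date = record.get("birth_date")
    if isinstance(birth_date, date) and birth_date > today:
        issues.append("birth_date is in the future")

    # Assumed plausibility window for systolic blood pressure (mmHg).
    systolic = record.get("systolic_bp")
    if isinstance(systolic, (int, float)) and not 50 <= systolic <= 300:
        issues.append("systolic_bp outside plausible range")
    return issues


# Made-up records: one clean, one with bad values, one with missing fields.
records = [
    {"patient_id": "A1", "birth_date": date(1980, 5, 17), "systolic_bp": 128},
    {"patient_id": "A2", "birth_date": date(2090, 1, 1), "systolic_bp": 1280},
    {"patient_id": "", "birth_date": None, "systolic_bp": 118},
]

report = {index: find_quality_issues(record, today=date(2024, 4, 29))
          for index, record in enumerate(records)}
for index, issues in report.items():
    print(index, issues or "no issues found")
```

Checks like these catch only obvious problems such as missing or implausible values. They complement, rather than replace, the standards and governance efforts described in the next section.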
Addressing Data Challenges with Standards and Governance
Public and private entities across the healthcare sector, including software developers, professional associations, healthcare services providers and others, are working to address the various challenges associated with collecting, storing and disseminating health data.
HIPAA, HITECH and the previously mentioned TEFCA℠ are all examples of governmental efforts to improve health data collection and management. The 1996 Health Insurance Portability and Accountability Act (HIPAA) and the 2009 Health Information Technology for Economic and Clinical Health Act (HITECH) primarily address patient data security and patient access rights.15 TEFCA℠ addresses data standardization and interoperability issues.
Health Level Seven International (HL7) and the Healthcare Information and Management Systems Society (HIMSS) are two nonprofit organizations addressing these issues. HIMSS advocates for a unified, global approach to health cybersecurity and data privacy, including "use cases and implementation guidance that are scalable for a wide range of healthcare organizations and inclusive to all care settings," and for the development of a cyberthreat intelligence information-sharing pipeline.16 HL7 is an ANSI-accredited standards-developing organization "dedicated to providing a comprehensive framework and related standards for the exchange, integration, sharing, and retrieval of electronic health information that supports clinical practice and the management, delivery and evaluation of health services."17
Data Mining and Analysis Techniques for Public Health
Public health relies on big data and big data analytics to generate research insights and help formulate evidence-based policies. Therefore, public health professionals must develop data analysis, interpretation and communication skills. Kent State University's online Master of Public Health (MPH) program includes courses in the core curriculum and the Epidemiology specialization to train students to use large public health datasets.
The core course, Biostatistics in Public Health, familiarizes you with basic statistical methods in public health research. It also teaches you how to use statistical analysis software and interpret and present your results to public health professionals and educated lay audiences.
Applied Regression Analysis of Public Health Data is part of the Epidemiology specialization course sequence and can be taken as an elective in other Kent State MPH specializations. In this course, you'll gain proficiency in building and evaluating regression models for public health studies. You'll learn about exploratory and descriptive methods, simple and multiple linear regression models, predictor selection, binary and multinomial logistic regression models, survival analysis, repeated measures and generalized linear models.
Gain the Data Skills to Support Population Health with Kent State
Are you ready to impact public health? Kent State University's online MPH prepares you with theoretical knowledge and practical skills, including data analysis skills, to practice effectively in your choice of public health areas. Complete the affordable program on your schedule, learning in convenient asynchronous classes. Stand out in this vital part of the healthcare sector with specialized expertise in health policy and management, social and behavioral sciences, and epidemiology. Schedule a call with an admissions outreach advisor to learn more.
1. Retrieved on April 29, 2024, from ncbi.nlm.nih.gov/pmc/articles/PMC2486528/pdf/bullwho00413-0101.pdf
2. Retrieved on April 29, 2024, from pubmed.ncbi.nlm.nih.gov/35167377/
3. Retrieved on April 29, 2024, from gdc.cancer.gov/Encyclopedia/pages/Clinical_Data/
4. Retrieved on April 29, 2024, from healthit.gov/data/quickstats/national-trends-hospital-and-physician-adoption-electronic-health-records
5. Retrieved on April 29, 2024, from healthit.gov/topic/interoperability/policy/trusted-exchange-framework-and-common-agreement-tefca
6. Retrieved on April 29, 2024, from cdc.gov/eis/field-epi-manual/chapters/collecting-data.html
7. Retrieved on April 29, 2024, from kms-healthcare.com/types-of-healthcare-data/
8. Retrieved on April 29, 2024, from cdc.gov/nndss/docs/NNDSS-Overview-Fact-Sheet-508.pdf
9. Retrieved on April 29, 2024, from canada.ca/en/public-health/services/surveillance.html
10. Retrieved on April 29, 2024, from who.int/initiatives/global-influenza-surveillance-and-response-system
11. Retrieved on April 29, 2024, from salveohealth.org/what-is-the-difference-between-mental-and-behavioral-health/
12. Retrieved on April 29, 2024, from ourworldindata.org/how-do-researchers-study-the-prevalence-of-mental-illnesses
13. Retrieved on April 29, 2024, from who.int/activities/monitoring-tobacco-use
14. Retrieved on April 29, 2024, from healthitanalytics.com/news/top-10-challenges-of-big-data-analytics-in-healthcare
15. Retrieved on April 29, 2024, from hhs.gov/hipaa/for-professionals/index.html
16. Retrieved on April 29, 2024, from himss.org/what-we-do-public-policy-advocacy/policy-center
17. Retrieved on April 29, 2024, from hl7.org/about/index.cfm