Finding Data for Research

Looking for data for a research project? There is so much data to choose from that it can be overwhelming. To help students get started, we’ve created a list of commonly used datasets that are accessible to most students, organized by the unit of analysis. If this list does not have what you’re looking for, scroll down for links to searchable websites.

Data on U.S. States

  • Centers for Disease Control and Prevention (CDC)

    - Measures: Alcohol consumption, cholesterol awareness, chronic health indicators, colorectal cancer screening, e-cigarette use, days of poor health, demographics, fruits/vegetables, health care access/coverage, health status, HIV/AIDS, hypertension awareness, immunization, injury, oral health, overweight/obesity, physical activity, prostate cancer, tobacco use, and women’s health (available years vary by indicator)

    - Measures: Cancer incidence and cancer mortality, for all cancers combined and by cancer type (2017; 2013-2017)

    - Measures: Burden/magnitude, preventive care practices, health status/disability, risk factors for complications, end-stage renal disease, risk factors for diabetes (available years vary by indicator)

    - Measures: Behaviors that contribute to unintentional injuries and violence, sexual behaviors, alcohol/other drug use, tobacco use, unhealthy dietary behaviors, and inadequate physical activity (available years vary by indicator)


  • Measures: Medicare spending per beneficiary, state averages (quality measures, staffing, fine amount, number of deficiencies), unplanned hospital visits, inpatient psychiatric facility quality measure data, patient survey, payment, timely and effective care, ambulatory surgical center quality measures, complications/deaths, outpatient imaging efficacy, health care associated infections, patient survey PPS-exempt cancer hospital, outpatient/ambulatory consumer assessment of healthcare providers and systems, home healthcare, and dialysis facility (years vary by indicator)


  • Measures: Medicare reimbursements, primary care access/quality measures, end-of-life chronic disease, mortality, hospital and post-acute care, health care for an aging population, prescription drug use (available years vary by indicator)


  • Measures:  Demographics and the economy, disparities, health costs/budgets, health coverage/uninsured, health insurance/managed care, health reform, health status, HIV/AIDS, Medicaid/CHIP, Medicare, providers/service use, and women’s health


  • Measures:  Demographics, screening and risk factors, incidence, prevalence, mortality (available years vary by indicator)

Data on Virginia Counties/Cities


  • Measures: Cancer incidence and cancer mortality, for all cancers combined and by cancer type (2017; 2013-2017)


  • Measures:  Health outcomes (length of life, quality of life), health behaviors, clinical care, social/economic factors, and physical environment (2010-2020)


  • Measures: Medicare spending (2003-2017), selected primary care access/quality measures (2003-2015), hospital post-discharge events (2004, 2009-2017)


  • Measures:  Demographics, screening and risk factors, incidence, mortality (available years vary by indicator)


  • Measures: Communicable diseases (2008-2018), demographics (1999-2016), health behaviors (available years vary by indicator), injury and violence (2016-2018), maternal and child health (2006-2017), environmental health (available years vary by indicator), social determinants of health (poverty [2016] and unemployment [2000-2016])
    NOTE:  Each portal page provides guidance to researchers on how to access CSV and Excel files for their own use.

Data on Individuals/Households


  • Includes data from annual surveys of adults from 1985 to 2019. Survey measures include alcohol consumption, cholesterol awareness, chronic health indicators, colorectal cancer screening, e-cigarette use, days of poor health, demographics, fruits/vegetables, health care access/coverage, health status, HIV/AIDS, hypertension awareness, immunization, injury, oral health, overweight/obesity, physical activity, prostate cancer, tobacco use, and women’s health (available years vary by indicator)


  • Includes data from biennial surveys of older adults. Survey measures include demographic traits, health status, health care utilization, health care costs, cognition, functional limitations, expectations, family structure, housing, income, current and last job, job history, retirement and pension, social security, disability, health and life insurance, widowhood, divorce, internet use, physical measures, and COVID-19
    NOTE:  W&M researchers interested in these data should contact Jennifer Mellor at jmmell@wm.edu

  • Includes data from surveys of nearly 3 million persons each year from 2005 to 2019. The American Community Survey (ACS), developed by the U.S. Census Bureau, “provides an annual snapshot of the American population.” This survey provides the Census Bureau with important data on sources of health insurance coverage and the uninsured in the U.S. population. Other information is available on family interrelationships, demographics, race/ethnicity, education, work, income, socioeconomic status, migration, activity five years ago, disability, veteran status, and place of work, among others. To get the data, visit .


  • Includes data from “surveys of families and individuals, their medical providers (doctors, hospitals, pharmacies, etc.), and employers across the United States. MEPS collects data on the specific health services that Americans use, how frequently they use them, the cost of these services, and how they are paid for, as well as data on the cost, scope, and breadth of health insurance held by and available to U.S. workers.”


  • Includes restricted-use data from "a continuous, multipurpose survey of a nationally representative sample of the Medicare population, conducted by the Office of Enterprise Data and Analytics (OEDA) of the Centers for Medicare & Medicaid Services (CMS) through a contract with NORC at the University of Chicago." Available measures include expenditures and sources of payment for all services used by Medicare beneficiaries (co-payments, deductibles, and non-covered services); all types of health insurance coverage; and outcomes over time (health status and the impacts of Medicare program changes on satisfaction with care).
    NOTE:  W&M researchers interested in these restricted data should contact Jennifer Mellor at jmmell@wm.edu


  • Is a "longitudinal study of a nationally representative sample of over 20,000 adolescents who were in grades 7-12 during the 1994-95 school year, and have been followed for five waves to date, most recently in 2016-18." Public use data are available for waves 1-4 from three different sources.  Add Health datasets for public use "contain all the survey data from In-Home Interviews but only for a subset of the full Add Health sample....Public-use data doesn’t contain ID numbers of friends, siblings or romantic partners, nor does it contain files on Obesity and Neighborhood Environment, genetics, disposition, political context and alcohol density."  


  • Is a set of surveys, sponsored by the U.S. Bureau of Labor Statistics, which is "designed to gather information at multiple points in time on the labor market activities and other significant life events of several groups of men and women." Public use data are available for the following seven cohorts: 
    National Longitudinal Survey of Youth 1997 (NLSY97), National Longitudinal Survey of Youth 1979 (NLSY79), NLSY79 Child and Young Adult, Older Men, Mature Women, Young Men, and Young Women.


  • Includes patient-level demographic, clinical, and financial information for every discharge that occurs in Virginia hospitals (most years from 2008-2015)
    NOTE:  W&M researchers interested in these data should contact Jennifer Mellor at jmmell@wm.edu.

Are you looking for something different? Here are some places to try:

  • Contains information on over 4,700 health-related datasets and is searchable by keyword, agency, or dataset name.
  • Maintained by the U.S. Census Bureau, with links to data on health insurance, disability employment, health care industries, disability, fertility, HIV/AIDS, and small area health insurance estimates, among others.
  • Maintained by the University of California at Berkeley, with links to national and international datasets as well as California datasets.
  • Maintained by the University of Michigan, with the ability to browse data by topic, series title/description, thematic data collection, geography, restriction type, data format, time period, and funding agency, among other things.
  • Maintained by Dartmouth, with links to national, state, and local datasets as well as datasets specific to health spending, utilization, and quality of care; food and drugs; and workplace injuries. Also includes links to global health data, disease/condition data, and interdisciplinary health data.