| Database Name/Link | Data Description | Category | Subcategory |
| --- | --- | --- | --- |
| SEER (Surveillance, Epidemiology, and End Results) | Cancer incidence and survival data by race, ethnicity, and geography. | Disease-Specific Data | Cancer |
| STRIDES | The NIH STRIDES Initiative provides cloud-based access to large biomedical datasets, computational tools, and infrastructure for NIH-funded researchers. | General Research Repositories | Large Dataset Distribution |
| STRING (Functional Protein Association Networks) | A database of protein-protein interactions that supports functional genomics research. | Proteomics | Protein-Protein |
| Sleep-EDF | Polysomnographic sleep recordings (EEG, EOG, chin EMG) with manually scored hypnograms. | Clinical & Cohort Data | Sleep |
| Small Molecule Pathway Database (SMPDB) | A comprehensive, high-quality, freely accessible, online database containing more than 600 small molecule (i.e., metabolic) pathways found in humans, supporting pathway elucidation and discovery in metabolomics, transcriptomics, proteomics, and systems biology. | Metabolomics | Large Dataset Distribution |
| SmedGD | A genome database for the planarian Schmidtea mediterranea. | Genomics & Multi-Omics | Parasites |
| Social Determinants of Health (SDOH) Data - CDC | CDC's database on SDOH factors such as housing, income, and healthcare access. | Public Health & Epidemiology | Health Disparities |
| Social Determinants of Health Database - AHRQ | Provides variables across five key SDOH domains, including social and economic contexts, education, physical infrastructure, and healthcare, linkable by geography. | Public Health & Epidemiology | Health Disparities |
| Social Determinants of Health by U.S. Census Tract | Offers SDOH data elements for each U.S. census tract, based on ACS data and rural commuting area definitions. | Disease-Specific Data | Cancer |
| Social Vulnerability Index (SVI) | CDC tool for assessing community-level vulnerability based on social determinants. | Public Health & Epidemiology | Health Disparities |
| Stanford Center for Artificial Intelligence in Medicine & Imaging (AIMI) Datasets | Provides datasets such as gated coronary CT images with corresponding segmentations and scores, and CT pulmonary angiography for patients susceptible to pulmonary embolism. | Clinical & Cohort Data | Heart Disease |
| Study on Global Ageing and Adult Health (SAGE) | Conducted by the WHO, SAGE compiles longitudinal data on the health and well-being of adult populations and the aging process across different countries. | Public Health & Epidemiology | Aging |
| Summary of Datasets with Social Determinants of Health Indicators - NY State | Provides a summary of datasets with SDOH indicators across New York State, including education, employment, and health statistics. | Public Health & Epidemiology | Health Disparities |
| Supercentenarian Genome Project | A unique genome dataset of people aged 110+ to uncover genetic markers for extreme longevity. | Genomics & Multi-Omics | Aging |
| SureChEMBL | A patent informatics resource integrating chemical structures from patents with bioactivity data, supporting drug discovery and development. | Drug Discovery | Patents |

 

© 2025 by Center of Excellence – Consortium of Educators.

 
