Database Name/Link
Data Description
Category
Subcategory
1000 Genomes Project
Deep catalog of human genetic variation.
Genomics & Multi-Omics
Human Genomics
ADNI (Alzheimer's Disease Neuroimaging Initiative)
Neuroimaging and biomarker database with Parkinson's and AD sub-studies.
Disease-Specific Data
Alzheimers Disease
AGRIS (FAO International System for Agricultural Science and Technology)
A database providing information on food systems, agriculture, and related nutritional outcomes.
Agriculture & Food Systems
Food Systems
AHRQ's SDOH Data
Provides data across five key SDOH domains: social context (e.g., age, race/ethnicity, veteran status); economic context (e.g., income, unemployment rate); education; physical infrastructure (e.g., housing, crime, transportation); and healthcare context (e.g., health insurance).
Public Health & Epidemiology
Health Disparities
AIMI Dataset Index
Managed by the Stanford Center for Artificial Intelligence in Medicine and Imaging, this repository features curated and annotated clinical imaging data across various modalities, including echocardiograms, brain CT scans, MRIs, radiographs, and ultrasounds.
Imaging & AI Datasets
Clinical
AMP PD (Parkinson's Disease)
Public-private partnership providing large-scale Parkinson's disease omics data.
Disease-Specific Data
Parkinsons Disease
AMP-AD (Alzheimer's Disease)
Precision medicine partnership integrating multi-omics Alzheimer's data.
Disease-Specific Data
Alzheimers Disease
AYUSH Research Portal
A governmental database offering information on research in Ayurveda, Yoga & Naturopathy, Unani, Siddha, and Homeopathy, including details on medicinal plants and their therapeutic uses.
Natural Medicines
Ayurveda / AYUSH
Academic Torrents
A distributed system for sharing very large datasets, including those used in machine learning research.
General Research Repositories
Large Dataset Distribution
All of Us Research Program
A national project collecting diverse health and genetic data.
Public Health & Epidemiology
Large Dataset Distribution
Allen Brain Atlas
A comprehensive mapping of gene expression in the human and mouse brain.
Neuro Data
Brain Gene Expression
AlphaFold Protein Structure Database
Provides AI-predicted 3D structures of proteins, facilitating advancements in drug discovery and synthetic biology.
Drug Discovery
Protein Interactions
Alzheimer's Disease Neuroimaging Initiative (ADNI)
Longitudinal study tracking Alzheimer's disease progression and risk factors.
Disease-Specific Data
Alzheimers Disease
Alzheimer's Disease Sequencing Project (ADSP)
NIH initiative sequencing genomes of Alzheimer's patients and controls.
Disease-Specific Data
Alzheimers Disease
America's Health Rankings
Provides comprehensive data on various health measures, including behaviors, community and environment, policy, clinical care, and outcomes, offering insights into how dietary habits influence health across different populations.
Public Health & Epidemiology
Diet and Health
American Community Survey (ACS)
The American Community Survey (ACS) helps local officials, community leaders, and businesses understand the changes taking place in their communities. It is the premier source for detailed population and housing information about the United States.
General Research Repositories
Population Trends
American Gut Project
Crowdsourced study of the human microbiome.
Immunology
Microbiome
An Open-Source Dataset on Dietary Behaviors and DASH Eating Plan Optimization Constraints
A dataset based on dietary behaviors, demographics, and pre-existing conditions, suitable for input to linear optimization models.
Public Health & Epidemiology
Diet and Health
Area Deprivation Index (ADI)
Neighborhood-level measure of socioeconomic disadvantage linked to health outcomes.
Public Health & Epidemiology
Health Disparities
Area Deprivation Index Datasets
Measures socio-economic disadvantage at the neighborhood level, widely used in health disparities research.
Public Health & Epidemiology
Health Disparities
ArrayExpress
A repository for functional genomics experiments.
Genomics & Multi-Omics
Human Genomics
Atlas of Genetics and Cytogenetics in Oncology and Haematology
An online journal and database covering chromosomes, genes, and cancers, integrating various types of knowledge in a single resource.
Genomics & Multi-Omics
Cancer
Australian Antarctic Data Centre (AADC)
A repository for Antarctic research data.
General Research Repositories
Large Dataset Distribution
Awesome Healthcare Datasets
Curated list of healthcare datasets for ML in clinical data, imaging, and genomics.
Clinical & Cohort Data
Large Dataset Distribution
BRENDA (The Comprehensive Enzyme Information System)
Provides data on enzyme functions, structures, and properties, supporting research in metabolism, drug development, and biotechnology.
Biological Data
Enzyme Function
Behavioral Risk Factor Surveillance System (BRFSS)
State-based surveillance system tracking chronic diseases and health behaviors.
Public Health & Epidemiology
Large Dataset Distribution
Berkeley Single Cell Computational Microscopy (BSCCM) Dataset
Contains over 12 million images of individual white blood cells, captured with multiple illumination patterns on an LED array microscope, aimed at advancing computational microscopy and computer vision applications.
Immunology
White Blood Cells
Bgee
Offers gene expression data across species and conditions, aiding AI-driven research in developmental biology and disease modeling.
Genomics & Multi-Omics
Human Genomics
BigBrain Atlas
Ultra-high-resolution human brain atlas for neuroscience research.
Neuro Data
Brain Atlas
BigQuery Public Datasets (Healthcare & Life Sciences)
Google Cloud's BigQuery provides large-scale, AI-ready datasets for healthcare and clinical trial analytics.
Clinical & Cohort Data
Multidisciplinary Data
BindingDB
A public, web-accessible database of measured binding affinities, focusing on the interactions of proteins considered to be drug-targets with small, drug-like molecules.
Proteomics
Binding Affinities
BioCyc Database Collection
A collection of Pathway/Genome Databases (PGDBs) that provide reference to genome and metabolic pathway information for thousands of organisms, supporting multiomics analyses.
Biological Data
Pathway Databases
BioGRID
A biomedical interaction repository with data compiled through comprehensive curation efforts, encompassing protein-protein interactions, genetic interactions, chemical interactions, and post-translational modifications across multiple species.
Biological Data
Large Dataset Distribution
BioStudies
A repository for biological study data.
Biological Data
Large Dataset Distribution
Biological General Repository for Interaction Datasets (BioGRID)
A protein-protein interaction database.
Proteomics
Protein-Protein
BiomedCLIP Dataset
A multimodal biomedical foundation model pretrained on 15 million scientific image-text pairs, supporting various biomedical imaging tasks and applications.
Imaging & AI Datasets
Multimodal Imaging
Black Women's Health Study (BWHS)
A long-term observational study initiated in 1995, following 59,000 Black women to investigate health issues, including maternal health disparities, with the goal of improving health outcomes.
Public Health & Epidemiology
Health Disparities
Brain Genomics Superstruct Project (GSP)
Brain imaging and cognitive data for genetic and neuroscience research.
Neuro Data
Multimodal Imaging
CARDIA (Coronary Artery Risk Development in Young Adults) - Obesity Data
Study tracking obesity and heart disease risk from young adulthood.
Public Health & Epidemiology
Heart Disease
CARDIA (Coronary Artery Risk Development in Young Adults) - Obesity Data
Study tracking obesity and heart disease risk from young adulthood.
Public Health & Epidemiology
Obesity
CARDIOGRAMplusC4D Consortium
Large-scale genetic database of cardiovascular diseases.
Genomics & Multi-Omics
Heart Disease
CDC Environmental Health Tracking Network (EPHT)
National environmental public health tracking program providing health and exposure data.
Public Health & Epidemiology
Environmental
CDC Social Determinants of Health Database
Links between social factors (income, housing, transport) and disease.
Public Health & Epidemiology
Large Dataset Distribution
CDC Social Vulnerability Index (SVI)
Measures community vulnerability to disasters and pandemics using 15 U.S. Census social factors (poverty, housing, minority status, access to transportation, etc.).
Public Health & Epidemiology
Health Disparities
CDC's Behavioral Risk Factor Surveillance System (BRFSS)
Large dataset for public health and disease risk modeling.
Public Health & Epidemiology
Disease Risk
CFDE Data Portal
Searches metadata and processed datasets from Common Fund programs.
General Research Repositories
Large Dataset Distribution
CKAN
An open-source data portal platform used by governments and organizations to manage and publish collections of data, powering numerous data portals worldwide.
Public Health & Epidemiology
Open Data Portals Global
CLSA (Canadian Longitudinal Study on Aging)
Tracks 50,000+ Canadians over 20 years to analyze aging, genetics, lifestyle, and environmental factors.
Public Health & Epidemiology
Aging
CMS Medicare & Medicaid Data
Includes Medicare and Medicaid claims data, utilization, and provider information.
Public Health & Epidemiology
Medical Insurance
COSMIC (Catalogue of Somatic Mutations in Cancer)
Catalog of somatic mutations in cancer from sequencing studies.
Disease-Specific Data
Cancer
COVID-19 Community Vulnerability Index (CCVI)
Social and healthcare factors contributing to COVID-19 disparities.
Public Health & Epidemiology
Covid 19
CRISPR Screen Data from Broad Institute
Gene knockout & perturbation datasets for understanding disease pathways and drug discovery.
Genomics & Multi-Omics
CRISPR and RNAi screening
CXR8 Chest X-ray Dataset
112,000+ labeled chest X-ray images with 14 different pathologies, ideal for AI-based disease detection models.
Clinical & Cohort Data
Multimodal Imaging
CalorieKing Food Database
A trusted food database offering nutrition facts for favorite brands and fast-food restaurants, along with tools like a free online calorie counter to assist with dietary tracking.
Agriculture & Food Systems
Nutrition Data
CalorieNinjas - Nutrition Facts and Recipe API
Provides an easy-to-use nutrition facts and recipe API, offering nutritional information for a vast array of foods, including fast-food items, to support dietary tracking and analysis.
Agriculture & Food Systems
Nutrition Data
Cambridge Centre for Ageing and Neuroscience (Cam-CAN)
Brain imaging dataset for aging and cognitive neuroscience research.
Neuro Data
Multimodal Imaging
Canadian Longitudinal Study on Aging (CLSA)
A national, longitudinal study following approximately 50,000 men and women aged 45 to 85 at recruitment for at least 20 years to collect information on aging.
Public Health & Epidemiology
Aging
Cancer Cell Line Encyclopedia (CCLE)
Large-scale database of cancer cell lines for drug response and genomic profiling.
Disease-Specific Data
Cancer
Cancer Dependency Map (DepMap)
Systematic mapping of cancer cell dependencies and drug targets.
Disease-Specific Data
Cancer
Cancer Imaging Archive (TCIA)
A massive public repository of medical imaging data for training AI-driven cancer detection models.
Disease-Specific Data
Cancer
Centers for Medicare & Medicaid Services (CMS)
Healthcare utilization data, including disparities in service use among Medicaid and Medicare populations.
Public Health & Epidemiology
Health Disparities
ChEMBL
A manually curated chemical database of bioactive molecules with drug-like properties, maintained by the European Bioinformatics Institute (EBI), providing information on compound bioactivity data against drug targets.
Drug Discovery
Chemistry
ChEMBL
Manually curated database of bioactive molecules with drug-like properties.
Drug Discovery
Chemistry
Chan Zuckerberg Biohub Cell Atlas
Comprehensive cell atlas mapping human cells to understand disease biology.
Biological Data
Large Dataset Distribution
ChemSpider
A free chemical structure database providing fast access to over 100 million structures, properties, and associated information.
Drug Discovery
Chemistry
Chinese Health and Retirement Longitudinal Study (CHARLS)
A nationally representative longitudinal study of Chinese residents aged 45 and older, collecting a wide range of information on their health and economic status.
Public Health & Epidemiology
China
ClinVar
A freely available resource on clinically relevant genetic variants.
Clinical & Cohort Data
Human Genomics
Clinical Cohort at BioLINCC
Clinical and genetic disease datasets hosted by NHLBI BioLINCC.
Clinical & Cohort Data
Human Genomics
Clinical Pharmacogenetics Implementation Consortium (CPIC) Database
A curated pharmacogenomic database providing guidelines for drug-gene interactions in personalized medicine.
Drug Discovery
Drug - Gene
ClinicalTrials.gov
A registry of publicly and privately funded clinical trials worldwide.
Clinical & Cohort Data
Large Dataset Distribution
ClinicalTrials.gov Diabetes Studies
Searchable database of diabetes-focused clinical trials.
Disease-Specific Data
Diabetes
Collaborative Drug Discovery (CDD) Vault
A web-based database solution for managing drug discovery data, focusing on small molecules and associated bio-assay data, facilitating collaboration among research teams.
Drug Discovery
Large Dataset Distribution
Common Crawl
A nonprofit organization that crawls the web and freely provides its archives and datasets to the public, consisting of petabytes of data collected since 2008.
General Research Repositories
Large Dataset Distribution
Comparative Toxicogenomics Database (CTD)
A publicly available resource that curates scientific data describing relationships between chemicals, genes, and diseases, including information on environmental exposures linked to neurodegenerative diseases like Alzheimer's and Parkinson's.
Disease-Specific Data
Parkinsons Disease
Comparative Toxicogenomics Database (CTD)
A publicly available resource that curates scientific data describing relationships between chemicals, genes, and diseases, including information on environmental exposures linked to neurodegenerative diseases like Alzheimer's and Parkinson's.
Disease-Specific Data
Alzheimers Disease
ConnectomeDB
Repository of structural and functional connectivity MRI studies.
Imaging & AI Datasets
MRI
County Health Rankings & Roadmaps
Database ranking health disparities and social determinants at the county level.
Public Health & Epidemiology
Health Disparities
DGV (Database of Genomic Variants)
A catalog of genomic structural variations in humans.
Genomics & Multi-Omics
Human Genomics
DIAAS Dataset (Digestible Indispensable Amino Acid Score)
Global reference database for protein quality assessment based on amino acid digestibility.
Agriculture & Food Systems
Proteins
DNA DataBank of Japan (DDBJ)
A DNA sequence repository.
Genomics & Multi-Omics
Human Genomics
Data Commons
An open-source platform created by Google that provides an open knowledge graph, combining economic, scientific, and other public datasets into a unified view.
General Research Repositories
Large Dataset Distribution
Data Sharing for Demographic Research (DSDR)
Advances research on maternal and child health by making demographic data discoverable and accessible for secondary analysis, adhering to FAIR principles.
Public Health & Epidemiology
Maternal and Child Health
Data for Global Health Equity Repository
Provides over 70 datasets on SDOH across multiple countries, aiming to inform policies and programs to reduce health inequities.
Public Health & Epidemiology
Health Disparities
Data.gov
The U.S. government's open data site, providing access to datasets from various federal agencies.
General Research Repositories
Large Dataset Distribution
DataONE
A network of interoperable data repositories facilitating data sharing, discovery, and open science, particularly in the Earth and environmental sciences.
Agriculture & Food Systems
Environmental
Database of Genomic Variants Archive (DGVa)
A repository for genomic structural variation.
Genomics & Multi-Omics
Human Genomics
Database of Interacting Proteins (DIP)
A database of experimentally validated protein interactions.
Proteomics
Protein Interactions
DeepChem
An open-source toolkit integrating deep learning with chemistry, providing datasets and models to accelerate drug discovery using AI.
Drug Discovery
Chemistry
DeepChem AI for Drug Discovery
Open-source AI/ML models and datasets for automated drug discovery and predictive modeling.
Drug Discovery
Predictive Modeling
Demographic and Health Surveys (DHS)
Data on maternal/child health, nutrition, and infectious diseases in developing countries.
Public Health & Epidemiology
Infectious Disease
Demographic and Health Surveys (DHS)
Data on maternal/child health, nutrition, and infectious diseases in developing countries.
Public Health & Epidemiology
Maternal and Child Health
Diabetes Genes Database (T2D-Genes)
Genetic database linking Type 2 diabetes to genetic variations.
Disease-Specific Data
Diabetes
Diabetes Prevention Program (DPP)
Longitudinal study tracking the effectiveness of diabetes prevention programs.
Disease-Specific Data
Diabetes
Dietary Supplement Ingredient Database (DSID)
Developed by the U.S. Department of Agriculture in collaboration with the National Institutes of Health, the DSID provides estimated levels of ingredients in dietary supplement products sold in the United States, aiding in the assessment of nutrient intake from supplements.
Agriculture & Food Systems
Nutrition Data
DisGeNET
A discovery platform integrating information on gene-disease associations, aiding in the exploration of genetic and environmental factors contributing to chronic diseases.
Public Health & Epidemiology
Disease Risk
Dr. Duke's Phytochemical and Ethnobotanical Databases
Developed by Dr. James A. Duke at the USDA, this database provides detailed information on the phytochemical constituents of plants, their ethnobotanical uses, and associated biological activities. It serves as a valuable resource for exploring the chemical compounds in plants and their traditional medicinal applications.
Agriculture & Food Systems
Plant Chemicals
Drug-Induced Liver Injury Network (DILIN)
Research network focused on drug-induced liver injury.
Drug Discovery
Drug Induced Liver Injury
DrugBank
A unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information, supporting pharmaceutical research and drug development.
Drug Discovery
Chemistry
DrugBank
Comprehensive resource for in silico drug discovery and exploration.
Drug Discovery
Chemistry
Dryad
An international open-access repository of research data, particularly data underlying scientific and medical publications, making data discoverable, freely reusable, and citable.
General Research Repositories
Multidisciplinary Data
Dryad Digital Repository
An open data repository for scientific and medical research data.
General Research Repositories
Multidisciplinary Data
ECHO Normal Database
Normal echocardiography database for reference values.
Imaging & AI Datasets
Heart Disease
ECHO-NET Dynamic
Deep learning database for echocardiography analysis.
Imaging & AI Datasets
Heart Disease
EMBL Nucleotide Sequence Database (ENA)
A nucleotide sequence database.
Genomics & Multi-Omics
Human Genomics
EMory BrEast imaging Dataset (EMBED)
A racially diverse dataset of 3.5 million screening and diagnostic mammograms from 116,000 women, including annotated lesions linked to imaging descriptors and pathologic outcomes.
Public Health & Epidemiology
Breast Cancer
ENCODE
Encyclopedia of DNA elements, providing functional genomic data.
Genomics & Multi-Omics
Human Genomics
ENCODE (Encyclopedia of DNA Elements)
A public research project that aims to build a comprehensive parts list of functional elements in the human genome, providing data on genome-wide mapping of regulatory elements, transcription factor binding sites, histone modifications, chromatin accessibility, and RNA transcripts.
Genomics & Multi-Omics
Human Genomics
EPA Integrated Risk Information System (IRIS)
A database of risk assessments for environmental substances, including their immunotoxicological effects.
Immunology
Disease Risk
Edamam Food Database API
Offers a food database and nutrition data API, providing detailed nutritional information for various foods, including fast-food items, to support health and wellness applications.
Agriculture & Food Systems
Nutrition Data
Electronic Medical Records and Genomics (eMERGE) Network
Genomic and electronic health record integration for cardiovascular studies.
Public Health & Epidemiology
Heart Disease
English Longitudinal Study of Ageing (ELSA)
A multidisciplinary study collecting data on the health, social, wellbeing, and economic aspects of aging in England.
Public Health & Epidemiology
Aging
English Longitudinal Study of Ageing (ELSA)
A longitudinal study collecting multidisciplinary data from a representative sample of the English population aged 50+, focusing on aging, health trajectories, and socioeconomic factors.
Public Health & Epidemiology
Aging
Ensembl Genome Browser
Genome browser for exploring annotated genes and variants.
Genomics & Multi-Omics
Human Genomics
Ensembl Parkinson's Disease Genomics
Genomic database integrating Parkinson's disease-associated mutations.
Disease-Specific Data
Parkinsons Disease
Environmental Data Initiative Repository
A repository for environmental research data.
Agriculture & Food Systems
Environmental
Environmental Genome Project (EGP)
Focuses on understanding the impact of environmental exposures on human disease by studying genetic susceptibility, including genes involved in immune responses.
Public Health & Epidemiology
Disease Risk
Environmental Influences on Child Health Outcomes (ECHO) Program
ECHO investigates how environmental exposures in early development (from conception through early childhood) influence child health outcomes, including pregnancy outcomes.
Public Health & Epidemiology
Disease Risk
Environmental Justice Screening and Mapping Tool (EJSCREEN)
EPA tool mapping environmental and social justice health disparities.
Public Health & Epidemiology
Health Disparities
Environmental Public Health Tracking Network (EPHTN)
A system by the CDC providing data on environmental exposures and health outcomes, facilitating research on environmental factors in chronic disease prevention.
Public Health & Epidemiology
Disease Risk
Environmental Risk Factors for Alzheimer's and Parkinson's Diseases Database
A database compiling information on environmental risk factors, such as air pollution and pesticide exposure, associated with the incidence and progression of Alzheimer's and Parkinson's diseases, supporting research into environmental determinants of these neurodegenerative conditions.
Disease-Specific Data
Parkinsons Disease
Eukaryotic Pathogen Database Resources (EuPathDB)
A genomic database for eukaryotic pathogens.
Genomics & Multi-Omics
Pathogens
European Genome-Phenome Archive (EGA)
A European repository for genotype and phenotype data.
Genomics & Multi-Omics
Human Genomics
European Longitudinal Study of Pregnancy and Childhood (ELSPAC)
ELSPAC is a longitudinal study that investigates the health and development of children in relation to environmental factors during pregnancy and early childhood across several European countries.
Public Health & Epidemiology
Maternal and Child Health
European Prospective Investigation into Cancer and Nutrition (EPIC)
A large cohort study investigating the relationships between diet, nutritional status, lifestyle, environmental factors, and the incidence of chronic diseases.
Public Health & Epidemiology
Disease Risk
European Union Open Data Portal
Open data published by European Union institutions and bodies.
General Research Repositories
Large Dataset Distribution
ExRNA Atlas for Parkinson's Disease
Atlas of extracellular RNA biomarkers for neurodegenerative diseases.
Neuro Data
Neurodegeneration
Exposome Explorer
A database of biomarkers related to environmental exposures and their potential immunological impacts.
Immunology
Disease Risk
FDA Adverse Event Reporting System (FAERS)
Database containing information on adverse event and medication error reports submitted to the FDA.
Drug Discovery
Drug Side Effects
FDA OTC Database
The FDA Over-the-Counter (OTC) Database provides information on approved OTC drugs, including active ingredients, formulations, labeling, and regulatory status.
Drug Discovery
OTC Database
FDA Open Data
A collection of publicly available datasets from the U.S. FDA.
General Research Repositories
Large Dataset Distribution
FRAILOMIC (Frailty & Aging Biomarkers)
A multi-omics dataset designed to predict frailty, cognitive decline, and healthy aging biomarkers.
Genomics & Multi-Omics
Neurodegeneration
Fast Food Nutrition Dataset - Kaggle
A comprehensive dataset providing nutritional information for various fast food products from popular chains, including calorie counts, macronutrients, and micronutrients.
Public Health & Epidemiology
Nutrition Data
Fast Food Nutrition Facts
Provides nutritional facts, Weight Watchers points, allergens, and ingredients for menu items from various fast food restaurants, allowing users to make informed dietary choices.
Public Health & Epidemiology
Nutrition Data
Fast Food Nutritional Database - GitHub
An ETL project compiling nutritional information from several national U.S. fast-food chains into a relational database, facilitating analysis and comparison of nutritional content across different restaurants.
Public Health & Epidemiology
Nutrition Data
FatSecret Platform API
Provides access to a vast dataset of global food nutrition information, including data from fast-food franchises, supporting applications in meal planning and dietary analysis.
Public Health & Epidemiology
Nutrition Data
Figshare
An open-access repository where researchers can preserve and share their research outputs, including datasets, images, and videos.
General Research Repositories
Multidisciplinary Data
FinnGen Diabetes Data
Finnish biobank with genetic and clinical diabetes data.
Disease-Specific Data
Diabetes
FinnGen Parkinson's Disease Data
Finnish biobank study linking Parkinson's disease to genetic and clinical data.
Disease-Specific Data
Parkinsons Disease
Firebrowse (Broad Institute TCGA Data Access)
TCGA dataset access tool with curated genomic and clinical data.
Genomics & Multi-Omics
Cancer
Florida Alzheimer's Disease Research Center (ADRC)
The National Alzheimer's Coordinating Center (NACC) functions as the centralized data repository, and collaboration and communication hub, for the National Institute on Aging's (NIA) Alzheimer's Disease Research Centers (ADRC) Program.
Disease-Specific Data
Alzheimers Disease
Florida Behavioral Risk Factor Surveillance System (BRFSS)
Health risk behavior data from Florida's adult population.
Public Health & Epidemiology
Population Health
Florida Birth Defects Registry (FBDR)
Registry tracking birth defects and congenital anomalies in Florida.
Public Health & Epidemiology
Maternal and Child Health
Florida CHARTS (Community Health Assessment Resource Tool Set)
Statewide public health data including mortality, disease, and demographics.
Public Health & Epidemiology
Large Dataset Distribution
Florida COVID-19 Data and Surveillance
COVID-19 surveillance and case reporting in Florida.
Public Health & Epidemiology
Covid 19
Florida Cancer Data System (FCDS)
Statewide cancer registry collecting incidence and survival data.
Disease-Specific Data
Cancer
Florida Department of Health - Public Health Statistics
Comprehensive health statistics and disease surveillance for Florida.
Public Health & Epidemiology
Florida Only Data
Florida Environmental Public Health Tracking
Tracking environmental factors and their effects on public health in Florida.
Public Health & Epidemiology
Florida Only Data
Florida HIV/AIDS Surveillance Program
HIV/AIDS case surveillance and epidemiological tracking in Florida.
Public Health & Epidemiology
Florida HIV
Florida Health Data Warehouse
Centralized warehouse for Florida public health datasets.
Public Health & Epidemiology
Florida Only Data
Florida Health Equity and Disparities Data
Health equity data tracking disparities among Florida communities.
Public Health & Epidemiology
Health Disparities
Florida Injury Surveillance System (FL-ISS)
Statewide injury surveillance and prevention data.
Public Health & Epidemiology
Florida Only Data
Florida Medicaid Data
Medicaid health data for policy and healthcare analysis.
Public Health & Epidemiology
Florida Medicare/Medicaid
Florida Prescription Drug Monitoring Program (PDMP)
Prescription drug monitoring program to reduce opioid misuse.
Public Health & Epidemiology
Florida Prescription Drug Use
Florida Rural Health Research Data
Health data focused on rural populations in Florida.
Public Health & Epidemiology
Florida Rural
Florida Trauma Registry
Statewide trauma registry collecting injury and emergency response data.
Public Health & Epidemiology
Florida Only Data
Florida Vital Statistics
Florida birth, death, and marriage records for health research.
Public Health & Epidemiology
Florida Only Data
FlyBase
A database for Drosophila genetics and research.
Genomics & Multi-Omics
Drosophila
FooDB
A database containing detailed compositional data on food and its metabolites.
Agriculture & Food Systems
Food Science
FooDB
The world's largest resource on food constituents, chemistry, and biology, providing detailed information about chemical compounds found in food.
Agriculture & Food Systems
Food Science
FooDB (The Food Database)
A comprehensive resource detailing the chemical composition of foods, including information on macronutrients and micronutrients such as vitamins and minerals, as well as their known health effects.
Agriculture & Food Systems
Food Science
Food Access Research Atlas (USDA)
Provides data on food deserts and food insecurity by census tract.
Agriculture & Food Systems
Food Insecurity
Food Frequency Questionnaire (FFQ) Data
FFQs are standardized questionnaires used to assess habitual dietary intake over a specified period, providing valuable data for studying associations between dietary patterns and health outcomes.
Public Health & Epidemiology
Disease Risk
Food Metabolome Repository
A repository for food metabolome data obtained using liquid chromatography-mass spectrometry (LC-MS).
Agriculture & Food Systems
Microbiome
Food and Microbiome Longitudinal Investigation
A dataset from NYU researchers focusing on the longitudinal study of diet and its impact on the human microbiome.
Immunology
Microbiome
FoodRepo: An Open Food Repository
An open food repository of barcoded food items, programmatically accessible through an API, suitable for large-scale studies in digital nutrition.
Agriculture & Food Systems
Food Science
Framingham Heart Study
Longitudinal cardiovascular study tracking risk factors for heart disease.
Public Health & Epidemiology
Heart Disease
Functional Connectomes Project International Neuroimaging Data-Sharing Initiative (FCP/INDI)
A neuroimaging data-sharing platform.
Neuro Data
Multimodal Imaging
GEO (Gene Expression Omnibus)
A repository of functional genomics datasets.
Genomics & Multi-Omics
Human Genomics
GISAID
Global Initiative on Sharing Avian Influenza Data, focused on SARS-CoV-2 genome sequences.
Public Health & Epidemiology
Covid 19
GISAID (Genomic Epidemiology of Viruses)
Viral genome sequence data supporting public health and epidemiology research.
Public Health & Epidemiology
Viral Infections
GMrepo
A curated human gut microbiome database with a focus on disease markers and cross-dataset comparisons.
Immunology
Microbiome
GPM DB
A proteomics database for protein identification.
Proteomics
Protein identification
GTEx (Genotype-Tissue Expression Project)
A resource linking genetic variation with gene expression in multiple tissues.
Genomics & Multi-Omics
Human Genomics
GenBank
A comprehensive sequence database.
Genomics & Multi-Omics
Human Genomics
Gene Expression Omnibus (GEO)
A gene expression repository.
Genomics & Multi-Omics
Transcriptomics
Genetic Perturbation Platform (GPP)
Provides resources for analyzing CRISPR and RNAi screening data.
Genomics & Multi-Omics
CRISPR and RNAi screening
Genetics of Alzheimer's Disease Data (NIAGADS)
Genetic repository for Alzheimer's disease GWAS and sequencing data.
Disease-Specific Data
Alzheimers Disease
Genetics of Type 2 Diabetes (GoT2D)
Comprehensive genetic study of Type 2 diabetes risk factors.
Disease-Specific Data
Diabetes
Genome Aggregation Database (gnomAD)
A large-scale reference database of human genetic variation.
Genomics & Multi-Omics
Human Genomics
Genome Aggregation Database (gnomAD)
A resource for aggregated human genetic variation data from multiple sequencing projects.
Genomics & Multi-Omics
Human Genomics
GenomeRNAi
A database for RNAi screening data.
Genomics & Multi-Omics
CRISPR and RNAi screening
Genomic Data Commons (GDC)
Centralized repository for cancer genomics data.
Disease-Specific Data
Cancer
Genomic Data Commons (GDC) Data Portal
NCI's unified data repository for cancer genomic research.
Disease-Specific Data
Cancer
Genomics of Parkinson's Disease (GP2)
Global Parkinson's genomics project analyzing genetic risk factors.
Disease-Specific Data
Parkinsons Disease
German Neuroinformatics Node/G-Node (GIN)
A neuroscience research data repository.
Neuro Data
Large Dataset Distribution
GeroSense Wearable Data
AI-based biological aging clocks derived from real-world wearable device data (Fitbit, Apple, Garmin).
Public Health & Epidemiology
Wearable Devices
GigaDB
A database for large-scale biological data.
Biological Data
Large Dataset Distribution
Global Biodiversity Information Facility (GBIF)
A global biodiversity database.
General Research Repositories
Health Disparities
Global Burden of Disease Study (GBD)
A comprehensive regional and global research program assessing mortality and disability from major diseases, injuries, and risk factors, providing insights into environmental determinants of chronic diseases.
Public Health & Epidemiology
Disease Risk
Global Diabetes Footprint Dataset
Global study tracking the impact of diabetes worldwide.
Disease-Specific Data
Diabetes
Global Health Observatory (WHO)
Data on global health inequalities across regions.
General Research Repositories
Health Disparities
Global Microbiome Dataset (GMrepo)
Multi-continent human microbiome sequencing dataset for gut health research and immune system studies.
Immunology
Microbiome
Global Network Maternal and Newborn Health Registry
Registry tracking maternal and newborn health outcomes in low-resource settings.
Public Health & Epidemiology
Maternal and Child Health
Global Nutrition Report Dataset
Contains data for all indicators used in country profiles, compiled from sources like UNICEF, WHO, and the World Bank.
Public Health & Epidemiology
Large Dataset Distribution
Global Virus Network (GVN)
Early-warning system for new pandemics & viral outbreaks, providing surveillance of emerging infectious diseases.
Health Surveillance
Viral Infections
Glycemic Index Research Database
Database on the glycemic index of foods and their impact on blood sugar.
Agriculture & Food Systems
Glycemic Index
Google BigQuery Public Datasets
A collection of open datasets optimized for cloud-based big data analysis.
General Research Repositories
Large Dataset Distribution
Google Dataset Search
A tool that enables users to discover datasets stored across the web, covering various disciplines and topics.
General Research Repositories
Large Dataset Distribution
Google DeepMind's AlphaFold Protein Structure Database
AI-driven protein structure predictions, accelerating drug discovery and protein engineering.
Drug Discovery
Protein Interactions
Google Genomics
A cloud-based platform for storing and analyzing genomic data.
Genomics & Multi-Omics
Human Genomics

 

© 2025 by Center of Excellence – Consortium of Educators.