DNA and Protein Sequence Databases

Primary Source - Sequence Data Repositories

The International Nucleotide Sequence Database Collaboration (INSDC) comprises the DNA Data Bank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange sequence data daily and provide a variety of search and analysis tools for nucleotide sequences.

Sequence Resources linked within the SBKB

The SBKB has collected annotations from the following DNA and protein sequence resources, accessible through a sequence or ID-based search.

AGD (Ashbya and yeast db): genome/transcriptome database containing gene annotation and high-density oligonucleotide microarray expression data for protein-coding genes from Ashbya gossypii and Saccharomyces cerevisiae.


BioCyc (metabolic pathways): each database in the BioCyc collection describes the genome and metabolic pathways of a single organism.


CleanEx (gene expression portal): portal providing access to multiple curated public gene expression data resources


CYGD (yeast db): comprehensive yeast genome database


Dictybase (Dictyostelid genomics): full genomics, material, and networking resource for the Dictyostelid community


EchoBase (E. coli K-12 strain db): integrated post-genomic database for Escherichia coli K-12 strain MG1655


EcoGene (E. coli db): database of Escherichia coli sequence and function


euHCVdb (hepatitis C database): oriented towards protein sequence, structure and function analyses and structural biology of HCV


EvoTrace (phylogenic mapping): creates an integrated report about the evolutionary propensity of individual residues


FlyBase (fruit fly db): genome annotation and phenotype image database for Drosophila melanogaster (fruit fly)


GeneCards (human gene db): searchable, integrated database of human genes that provides concise genomic-related information on all known and predicted human genes


GeneDB (pathogen db): provides access to the latest sequence data and annotation/curation of over 40 pathogenic organisms


GeneFarm (thale cress db): genome annotation database for Arabidopsis thaliana (thale cress)


GenoList (multi microbial dbs): integrated environment for comparative exploration of over 700 microbial genomes


Gramene (grasses db): comparative genome annotation database for several grass species


HAMAP (protein family db): semi-automatic annotation of proteins that are part of well-conserved families or subfamilies


HGNC (human gene naming): assigns unique gene symbols and names to over 33,000 human loci


HInv-DB (human gene db): curated annotations of human genes


HOGENOM (homologous genes db): database of homologous gene families from complete genomes


KEGG (genes and functions): database resource for understanding high-level functions and utilities of the biological system


MaizeGDB (corn family db): community-oriented informatics service to researchers focused on the crop plant and model organism Zea mays


MEROPS (peptidase db): information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them


MGD (mouse genomics db): integrated, community-driven data resource on mouse genes, genome features, and phenotypes

NMPDR (microbial pathogen db): curated annotations of food-borne and sexually transmitted pathogens


NCBI Nucleotide db (NCBI's db): collection of sequences from several sources, including GenBank, RefSeq, TPA and PDB. Genome, gene and transcript sequence data provide the foundation for biomedical research and discovery.


NCBI RefSeq (non-redundant DNA data): comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.


PANTHER (protein classification): Protein ANalysis THrough Evolutionary Relationships; library of protein families and subfamilies indexed by function


PCDDB (circular dichroism db): data repository and searchable archive for circular dichroism spectra from proteins.


PeptideAtlas (peptide proteomics db): multi-organism proteomics data


PeroxiBase (peroxidase db): manually annotated database of sequences encoding peroxidase superfamilies


Pfam (sequence family db): database that curates protein sequence families, each represented by multiple sequence alignments and hidden Markov models (HMMs).


PhosphoSitePlus (post-translational db): systems biology resource providing comprehensive information and tools for the study of protein post-translational modifications


PlasmoDB (malaria proteomics): genomic and proteomic data for different species of the parasitic eukaryote Plasmodium, the cause of malaria.


PptaseDB (phosphatase db): prokaryotic protein phosphatase database


PRINTS (protein fingerprint db): compendium of conserved sequence motifs used to characterise a protein family


ProDom (domain family db): comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases


ProMEX (tryptic digest db): tryptic peptide fragmentation mass spectra derived from plants


ProSite (domain classification): protein domains, families and functional sites, as well as the associated patterns and profiles used to identify them


PseudoCAP (Pseudomonas db): comparative analysis of Pseudomonas aeruginosa with other species


RGD (rat genome db): integrated, community-driven data resource on rat genes, genome features, and phenotypes


SGD (Saccharomyces genome db): comprehensive integrated biological information for budding yeast, along with search and analysis tools to explore these data


TAIR (thale cress db): database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana


NCBI Taxonomy (taxonomic classification): curated classification and nomenclature for all of the organisms in the public sequence databases


TIGR/JCVI (multi genome db): genome annotation projects from the J. Craig Venter Institute


UniGene (gene transcript predictor): computationally identifies transcripts from the same locus; analyzes expression by tissue, age, and health status; and reports related proteins (protEST) and clone resources


UniProt (universal protein resource): comprehensive catalog of information on proteins.


VectorBase (human pathogen db): NIAID Bioinformatics Resource Center for Invertebrate Vectors of Human Pathogens


World-2DPage (gel-based proteomics portal): links to known 2-D PAGE database servers, as well as to 2-D PAGE related servers and services.


WormBase (worm genome db): community-driven integrated resource of biology and genome of C. elegans


ZFINs (zebrafish genome db): community-driven integrated resource of biology and genome of D. rerio

