Multiple sequence alignment is an extension of pairwise alignment that incorporates more than two sequences at a time. It is often used to identify conserved sequence regions, which are assumed to be evolutionarily related. By supporting the construction of phylogenetic trees, it helps establish evolutionary relationships.
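As a rough illustration of what "conserved regions" means in an alignment, the sketch below reports the columns of an already-aligned set of sequences in which every sequence carries the same residue. The function name `conserved_columns` and the toy sequences are made up for this example; real tools use scoring schemes that tolerate similar, not just identical, residues.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where all rows share one residue.

    `alignment` is a list of equal-length strings; '-' marks a gap.
    Columns containing a gap are never reported as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {row[i] for row in alignment}
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy alignment (hypothetical sequences, not taken from any database).
toy_alignment = [
    "MKT-AYI",
    "MKTLAYV",
    "MRT-AYI",
]
print(conserved_columns(toy_alignment))  # prints [0, 2, 4, 5]
```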
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
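To make "local similarity" concrete, here is a minimal sketch of the Smith-Waterman local alignment score. This is not how BLAST works internally (BLAST uses a faster heuristic and adds statistical significance estimates); it only shows the idea of scoring the best-matching region of two sequences. The scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACGT" (4 matches x 2) dominates the unrelated flanks.
print(local_alignment_score("GGACGTCC", "TTACGTAA"))  # prints 8
```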
BLAST-Protein (blastp) is used both to identify a query amino acid sequence and to find similar sequences in protein databases. Like other BLAST programs, blastp is designed to find local regions of similarity. When sequence similarity spans the whole sequence, blastp will also report a global alignment, which is the preferred result for protein identification purposes.
Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data.
EHCO (Encyclopedia of Hepatocellular Carcinoma genes Online) is an integrative platform to systematically collect, organize and compare the pileup of unsorted HCC-related studies by using natural language processing and softbots.
FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The simplicity of FASTA format makes it easy to manipulate and parse sequences using text-processing tools and scripting languages, and sequence analysis programs such as BLAST accept it as input.
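Because each record is just a header line starting with ">" followed by one or more sequence lines, a FASTA file can be read with a few lines of code. The following is a minimal sketch; the function name `parse_fasta` and the two example records are hypothetical, and it does not validate the residue alphabet.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up records for illustration only.
example = """>seq1 hypothetical nucleotide record
ACGTACGT
ACGT
>seq2 hypothetical peptide record
MKTAYIAK
"""
for name, seq in parse_fasta(example):
    print(name, len(seq))
```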
The Conserved Domain Database (CDD) is a protein annotation resource that consists of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins. These are available as position-specific score matrices (PSSMs) for fast identification of conserved domains in protein sequences via RPS-BLAST (Reverse Position-Specific BLAST).
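A position-specific score matrix assigns each residue a different score at each position of a domain model. The sketch below slides a tiny PSSM along a protein sequence and reports the best-scoring window. The matrix values, the `default` penalty, and the function name `best_pssm_hit` are all invented for illustration; RPS-BLAST itself uses far larger matrices, gapped alignment, and significance statistics.

```python
def best_pssm_hit(sequence, pssm, default=-4):
    """Return (start, score) of the best ungapped window, or None if too short.

    `pssm` is a list with one dict per model position, mapping residue -> score.
    Residues missing from a position's dict receive the `default` score.
    """
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(
            pssm[k].get(sequence[start + k], default) for k in range(width)
        )
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy three-position matrix (hypothetical scores).
toy_pssm = [
    {"G": 5, "A": 1},
    {"K": 4, "R": 3},
    {"S": 4, "T": 4},
]
print(best_pssm_hit("MAGKTLL", toy_pssm))  # prints (2, 13)
```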
The HomoloGene database identifies homologs among the annotated genes of more than a dozen completely sequenced eukaryotic genomes using an automated procedure. It can be used to find gene homologs (both paralogs and orthologs) based on protein sequence similarity and to access pre-computed multiple alignments of homologous proteins.
The Reference Sequence (RefSeq) database is a non-redundant collection of richly annotated DNA, RNA, and protein sequences from diverse taxa. RefSeq also reports information about exon and intron boundaries and lengths.
Map Viewer allows you to view and search the complete genome of an organism, display chromosome maps, and zoom into progressively greater levels of detail, down to the sequence data for a region of interest.
ORF Finder identifies all possible ORFs in a DNA sequence by locating the standard and alternative stop and start codons. The deduced amino acid sequences can then be used to BLAST against GenBank.
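The basic idea of scanning for open reading frames can be sketched as follows: walk each of the six reading frames codon by codon, open an ORF at a start codon, and close it at the next in-frame stop codon. This is a simplified illustration, not ORF Finder's implementation; it assumes only the standard start codon ATG and the standard stops TAA, TAG and TGA, and the names `find_orfs` and `min_codons` are made up here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_codons=2):
    """Return a list of (strand, start, end, orf) tuples.

    Coordinates are 0-based, end-exclusive, and measured on the strand
    that was scanned ('+' is the input, '-' its reverse complement).
    `min_codons` counts the codons before the stop, including ATG.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, seq[start:i + 3]))
                    start = None
    return orfs


# Hypothetical input: one short ORF (ATG AAA TGA) in forward frame 2.
print(find_orfs("CCATGAAATGACC"))  # prints [('+', 2, 11, 'ATGAAATGA')]
```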
Gene pathway represents molecular pathways for metabolism, genetic information processing, environmental information processing, other cellular processes, human diseases, and drug development.
The dbSNP resource serves both as a repository for genomic variation data (including single nucleotide polymorphisms, microsatellites and small insertion/deletion mutations) and as a computational analysis resource. dbSNP provides researchers with extensive data on genetic variation for studies of evolution, disease and more.
GeneCards® is an integrated database of human genes that includes automatically-mined genomic, proteomic and transcriptomic information, as well as orthologies, disease relationships, SNPs, gene expression, gene function, and service links for ordering assays and antibodies.
The Gene Expression Omnibus (GEO) is a public repository that stores original submitter-supplied curated gene expression DataSets. This video shows you how to enter search terms to locate experiments of interest and interpret GEO DataSets results pages.
The Human Protein Reference Database represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.
OMIM (Online Mendelian Inheritance in Man) is a catalog of human genes and genetic disorders, with links to literature references, sequence records, maps, and related databases. It is based on the book, Mendelian Inheritance in Man. The online version is updated daily.
Protein-protein interactions (PPIs) are critical to every aspect of biological processes. Even though a number of software tools are available to facilitate PPI network analysis, an integrated tool is crucial to alleviate the burden of querying across multiple web servers and software tools. POINeT, an integrated web service, has been constructed to simplify the process of PPI searching, analysis, and visualization.
"Entrez Structure", also known as Molecular Modeling DataBase (MMDB), is a database of experimentally determined structures obtained from the RCSB Protein Data Bank (PDB). It provides a wealth of information on the biological function, on mechanisms linked to the function, and on the evolutionary history of relationships between macromolecules.
PubMed is a free literature database which comprises more than 20 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites.