Proteogenomics and Gene Annotation

Introduction
Proteogenomics is a field that combines proteomics and genomics. Proteomics supplies protein sequence information, and genomics supplies genome sequence information. Together they are used to annotate whole genomes and, in particular, protein-coding genes. Proteomic data supports genome annotation because peptides obtained from expressed proteins can confirm predicted coding regions and correct wrongly annotated ones. Establishing the sequence and function of protein-coding regions is a priority in annotation because proteins carry out most of the functions of a cell. Genome annotation involves both experimental and computational stages, such as identifying genes, determining their structure and function, and locating coding regions. Ab initio gene prediction methods can be used to predict exons and splice sites. Because annotating protein-coding genes manually is very time-consuming, gene prediction methods are widely used for genome annotation. Resources such as NCBI and Ensembl provide sequenced genomes together with their gene annotations. However, a predicted annotation alone does not show that the corresponding protein is actually present in a cell.
The main idea of proteogenomic methods is to identify peptides in samples by combining these resources with mass spectrometry. Instead of searching only a protein database, tandem mass spectrometry (MS/MS) spectra are searched against a translation of the genome sequence, which makes it possible to identify peptide sequences that are missing from current annotations. Mass spectrometry data can also contribute to annotating protein-protein interactions. In this way, proteogenomic methods and tools help interpret genome data by combining genomic, transcriptomic and proteomic information. Much of the information about the proteins an organism encodes can also be obtained from gene prediction algorithms, cDNA sequences and comparative genomics.
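
Searching against a translation of the genome can be illustrated with a small sketch. The Python code below is only a simplified illustration, not how a real MS/MS search engine works: it assumes a six-frame translation with the standard genetic code, and it matches a peptide as a plain string instead of scoring spectra. The function names (`six_frame_translation`, `find_peptide`) and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(dna, peptide):
    """List (frame, position in translated frame) for each frame with a match."""
    return [
        (frame, protein.find(peptide))
        for frame, protein in six_frame_translation(dna).items()
        if peptide in protein
    ]


# Hypothetical example: a short sequence encoding Met-Ala-Lys, then a stop.
hits = find_peptide("ATGGCCAAGTAA", "MAK")
print(hits)
```

A peptide that matches a frame where no gene is annotated would be a candidate for a new or corrected coding region, which is the kind of evidence this approach is meant to provide.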
Peptide mass spectrometry can generate the large proteomic datasets that proteogenomics needs, since the approach uses proteomic data to annotate the genome. Proteogenomic tools can be applied when a genome sequence is available for the organism or for closely related organisms. The resulting data can be compared across related species, and homology relationships among their proteins help make annotations more accurate. Proteogenomic studies can reveal frameshifts, gene start sites, exon-intron boundaries, alternative splicing events, proteolytic cleavage sites in proteins and post-translational modification sites, and they can confirm or refine gene predictions.
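
One way to picture how peptide evidence is used to check an annotation is to compare where a peptide maps on the genome with the annotated coding regions. The sketch below is an assumption-laden illustration only: it treats coordinates as half-open intervals on a single strand, ignores reading frame and splicing, and uses a hypothetical function name (`classify_peptide`) and made-up coordinates.

```python
def classify_peptide(peptide_span, annotated_cds):
    """Compare a mapped peptide with annotated coding intervals.

    peptide_span and each entry of annotated_cds are (start, end) pairs,
    half-open, on the same strand (a simplifying assumption).
    """
    start, end = peptide_span
    # Keep only the annotated intervals that overlap the peptide.
    overlaps = [(s, e) for s, e in annotated_cds if start < e and s < end]
    if not overlaps:
        return "novel region"
    if any(s <= start and end <= e for s, e in overlaps):
        return "confirms annotation"
    return "crosses annotated boundary"


# Hypothetical annotation with two coding exons.
cds = [(100, 200), (300, 400)]
for span in [(120, 150), (90, 120), (220, 250)]:
    print(span, classify_peptide(span, cds))
```

In this simplified picture, a peptide that crosses an annotated boundary could point to a different gene start site or exon boundary, and a peptide in a novel region could point to an unannotated gene; real pipelines would also need to account for strand, frame and spliced peptides.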
