Minisymposium 23: Genome Annotation
M2303: New gene discovery in un-annotated regions of the Arabidopsis genome
Presenter: Xiao, Yongli
Authors: Xiao, Yongli (A); Moskal, William A. (A); Wu, Hank (A); Underwood, Beverly A. (A); Monaghan, Erin L. (A); Town, Christopher D. (A)
Affiliations:
(A): The Institute for Genomic Research
The whole-genome sequence of Arabidopsis, together with its annotation, was completed at the end of 2000. Since then, four versions of the genome re-annotation have been released sequentially, based on new experimental evidence and improved computational analysis. Evidence based mainly on sequence conservation indicates that more genes remain in un-annotated regions of the Arabidopsis genome. TwinScan and EuGene are two relatively new gene prediction programs that incorporate comparative genomic information. Compared with genome re-annotation release version 5, EuGene predicts 1559 and TwinScan predicts 1440 genes in intergenic regions of the Arabidopsis genome, with 365 predictions in common. To verify the novel genes predicted by EuGene and TwinScan, a high-throughput method of rapid amplification of cDNA ends (RACE), using cDNA from 11 diverse RNA populations, was applied to 918 predictions in intergenic regions. We recovered transcripts from 429 predictions, which yielded 323 novel full-length cDNAs. In addition, a comparative study of Arabidopsis and Brassica yielded a number of conserved Arabidopsis genome regions (CAGSs), 28% of which aligned to un-annotated regions, suggesting the presence of un-annotated novel genes. We similarly targeted 192 intergenic CAGSs with the RACE pipeline and found an additional 25 un-annotated genes. In total, 454 novel genes in the intergenic regions of Arabidopsis have been recovered. Open reading frame (ORF) clones for those with full-length cDNAs will be generated in a Gateway-compatible vector and provided to the research community.
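The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline; all variable and function names are hypothetical. It assumes that the 365 shared predictions correspond one-to-one between EuGene and TwinScan, and that each of the 429 predictions with a recovered transcript counts as one gene, which is what the stated total of 454 implies.

```python
# Counts taken directly from the abstract.
EUGENE_PREDICTIONS = 1559    # EuGene intergenic predictions
TWINSCAN_PREDICTIONS = 1440  # TwinScan intergenic predictions
SHARED_PREDICTIONS = 365     # predictions common to both programs
RACE_TARGETS = 918           # predictions tested by RACE
RACE_CONFIRMED = 429         # predictions with recovered transcripts
FULL_LENGTH_CDNAS = 323      # novel full-length cDNAs obtained
CAGS_TARGETS = 192           # intergenic CAGSs tested by RACE
CAGS_CONFIRMED = 25          # additional un-annotated genes found


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: shared predictions pair up one-to-one, so the number of
# distinct predicted loci follows from inclusion-exclusion.
distinct_predictions = (
    EUGENE_PREDICTIONS + TWINSCAN_PREDICTIONS - SHARED_PREDICTIONS
)

# Fraction of tested targets for which a transcript or gene was recovered.
race_success = percent(RACE_CONFIRMED, RACE_TARGETS)
cags_success = percent(CAGS_CONFIRMED, CAGS_TARGETS)

# Fraction of confirmed predictions that yielded a full-length cDNA.
full_length_share = percent(FULL_LENGTH_CDNAS, RACE_CONFIRMED)

# Total reported in the abstract: confirmed predictions plus CAGS-derived genes.
total_novel_genes = RACE_CONFIRMED + CAGS_CONFIRMED

print("distinct predictions:", distinct_predictions)
print("RACE success on predictions (%):", race_success)
print("RACE success on CAGSs (%):", cags_success)
print("confirmed predictions with full-length cDNA (%):", full_length_share)
print("total novel genes:", total_novel_genes)
```

Under these assumptions the total reproduces the 454 novel genes stated in the abstract, and the sketch makes it easy to compare the recovery rate for gene-finder predictions with the rate for CAGS targets.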