Drosophila sechellia


Abbreviation: dsec

Annotation of Broken or Missequenced Genes

We are currently undertaking a computational analysis of all D. sechellia homologues of D. melanogaster genes. We began by locating the D. sechellia genomic sequence corresponding to the D. melanogaster gene regions and aligning it to them. After extracting the apparent coding domains, we compared the melanogaster, simulans and sechellia sequences to screen the genes for premature stop codons and possible frame-shifting insertions or deletions. The work is ongoing, but we have prepared an initial list of affected genes. This list is likely to expand as the analysis continues.
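The two checks described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline actually used for the analysis: the function names are hypothetical, the input is assumed to be a coding sequence already in frame (for the stop-codon check) or one row of a pairwise alignment with `-` marking gaps (for the indel check), and the standard nuclear stop codons are assumed.

```python
import re

# Standard nuclear stop codons (assumed to apply here).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(cds):
    """Return True if an in-frame stop codon occurs before the last codon.

    `cds` is assumed to start in frame; alignment gaps ('-') are removed first.
    """
    cds = cds.upper().replace("-", "")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def frameshift_gaps(aligned_row):
    """Return the lengths of gap runs that are not a multiple of three.

    A gap run in the D. sechellia row suggests a deletion in D. sechellia;
    a gap run in the D. melanogaster row suggests an insertion.
    """
    runs = re.finditer(r"-+", aligned_row)
    return [len(m.group()) for m in runs if len(m.group()) % 3 != 0]
```

For example, `frameshift_gaps("ATG-CCTAA")` reports a single one-base gap, while a three-base gap such as the one in `"ATG---CCC"` is treated as a whole-codon indel and is not reported.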

Alignments to D. simulans and D. melanogaster

We recently aligned D. sechellia and D. simulans genomic sequence to D. melanogaster CDS sequences. Using the resulting 14674 alignments (this number includes alternative transcripts), we computationally identified potential problems in the sequence alignments.

The results revealed 2349 alignments with significant problems or alignment errors. Of these alignments, 318 appeared to have single-base insertions or deletions. Of the remaining 12643 well-aligned genes, 1164 (or 9.2 percent) also had single-base insertions or deletions. These latter indels are likely due to assembly errors, since the alignment quality is otherwise quite good.

Further analysis shows that 7052 of the genes have truly excellent alignments, with no single-base insertions or deletions and no premature stop codons. Many of these genes, however, do show insertions or deletions of whole codons, which we hope to explore further in a later analysis. We have lists of genes in each category. Please contact Corbin Jones with questions.

Branch lengths

Using PHYLIP and dodgy alignments of D. sechellia, D. simulans, and D. melanogaster, we calculated branch lengths from the common ancestor of D. sechellia and D. simulans (the red dot on the three-species tree) to the extant species. We WILL redo this analysis with the new data sometime soon.

Image:node.jpg

Heat Map

On the graphs below, the top row is D. sechellia and the bottom row is D. simulans. Blue = short branch length; Red = long branch length. Gene order goes from left to right. Chromosome order is X, 2L, 2R, 3L, and 3R. Red areas indicate regions of rapid sequence divergence.
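As a rough illustration of the blue-to-red colouring, the sketch below maps a branch length onto an RGB colour by linear interpolation. The exact colour scale used for the images is not stated here, so the linear scale, the function name `branch_length_to_rgb`, and the `min_len`/`max_len` parameters are all assumptions made for this example.

```python
def branch_length_to_rgb(length, min_len, max_len):
    """Map a branch length to an (R, G, B) tuple: blue = short, red = long.

    Linear interpolation between pure blue and pure red is assumed;
    values outside [min_len, max_len] are clamped to the end colours.
    """
    if max_len <= min_len:
        raise ValueError("max_len must be greater than min_len")
    t = (length - min_len) / (max_len - min_len)
    t = min(1.0, max(0.0, t))  # clamp to the [0, 1] range
    return (round(255 * t), 0, round(255 * (1 - t)))
```

Plotting one coloured cell per gene, in chromosomal order, with one row per species would then give a strip comparable in layout to the graphs below.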

Cheers, Corbin Jones and William Jeck

X

Image:Xsechsim.jpg

2L

Image:2Lsechsim.jpg

2R

Image:2Rsechsim.jpg

3L

Image:3Lsechsim.jpg

3R

Image:3Rsechsim.jpg