02250: Introduction to Computational Molecular Biology
Spring 2016, Homework 5, Your name:
Your gene: BRAF
Choosing a model organism is a complex task. We did part of it earlier using
protein sequence similarity. Now let’s evaluate the possible model organisms from a
whole-genome view. In particular, practically no gene acts alone; on the contrary,
there is usually a cluster of genes working together, and in many cases
they are located close to each other on the genome. Let’s do some research to find out
about other genes related to your gene and the disease it is involved in.
1. (1 pt) Locate your gene in the human genome, using the latest build. On which
chromosome and at which positions is it found?
2. (2 pts) Which genes lie close to your gene? Find at least 10 closely
located genes. Find an image of the genome in that region showing the gene locations.
List all the genes.
3. (5 pts) Explore each gene. Based on the known functions of these genes, which would
be the most plausible candidate(s) for involvement in the disease associated with your
gene, and which could also be a target for mutation, and why? There can be several,
depending on their functional relations to the disease and to your gene; choose the best
two. These genes, together with your gene, will be your test set.
4. (3 pts) Evaluate the quality of each gene in your test set. How was the intron/exon
structure of these genes determined? Given your answer, should we have high confidence
that the gene’s structure is annotated correctly? (Hint: The mapview entry for the mRNAs
of the gene will have details about the source of the sequence.)
5. (4 pts) Using a genome browser (the UCSC Genome Browser works better for this task),
identify the best model organism, one that can represent all the genes in your test set
and also the intergenic regions. Locate every gene from your test set in the genomes and
plot the region, setting the “RefSeq Genes,” “Other RefSeq,” “NSCAN,” and “Conservation”
tracks to “full”. (Hint: The resulting picture for each gene may look like the one on the
lecture slide with conservation.)
6. (3 pts) We would expect introns to evolve more quickly than exons, and
therefore we can find likely exon positions by looking for more conserved nucleotides.
Based on your plot for problem 5, does conservation across the organisms at the base
level provide support for the specific exon locations in the gene model? Why or why not?
7. (1 pt) Again looking at your plot from problem 5, which organism provides the best
evidence from conservation for or against the specific human gene structure model?
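
The reasoning behind problem 6 (exons should look more conserved than the introns between them) can be sketched numerically. The Python example below is only an illustration under stated assumptions: the per-base scores and exon coordinates are made-up toy values, not data exported from any browser track, and the function names mean_conservation and introns_from_exons are hypothetical helpers, not part of any tool used in this course.

```python
from statistics import mean


def introns_from_exons(exons):
    """Return the gaps between consecutive exons.

    Assumes exons are sorted, non-overlapping, half-open (start, end) pairs.
    """
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def mean_conservation(scores, intervals):
    """Average the per-base scores over a list of half-open intervals."""
    values = [scores[i] for start, end in intervals for i in range(start, end)]
    return mean(values)


# Toy per-base conservation scores (ASSUMED values, for illustration only).
scores = [0.9, 0.8, 0.9, 0.1, 0.2, 0.1, 0.0, 0.95, 0.85, 0.9]

# Toy gene model with two exons (ASSUMED coordinates, 0-based, half-open).
exons = [(0, 3), (7, 10)]
introns = introns_from_exons(exons)

exon_mean = mean_conservation(scores, exons)
intron_mean = mean_conservation(scores, introns)

print(f"mean exon conservation:   {exon_mean:.2f}")
print(f"mean intron conservation: {intron_mean:.2f}")
print("supports exon model" if exon_mean > intron_mean else "does not support exon model")
```

A clearly higher exon mean is consistent with the annotated exon boundaries, but it is not proof: conserved regulatory elements inside introns, or weakly conserved exons, can blur the picture, so the answer to problem 6 should still rest on inspecting the plot itself.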