Archive for the ‘critsum0mg’ Category

PLOS Genetics: A Massively Parallel Pipeline to Clone DNA Variants and Examine Molecular Phenotypes of Human Disease Mutations

Saturday, February 7th, 2015

Massively Parallel Pipeline to Clone DNA Variants & Examine…Disease Mutations
http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004819 CloneSeq leverages NextGen sequencing

With the advance of sequencing technologies, tens of millions of genomic variants have been discovered in the human population. However, no method available to date can determine the functional impact of these variants on a large scale, which has become a major bottleneck for population genetics and personal genomics. The Clone-seq and comparative interactome-profiling pipeline is the first to address this issue.

Can be coupled to many readouts.

Price AL, Kryukov GV, de Bakker PI, Purcell SM, Staples J, Wei LJ, Sunyaev SR. Pooled association tests for rare variants in exon-resequencing studies. American Journal of Human Genetics (2010) 86: 832-838.

Sunday, February 1st, 2015

Pooled association tests for rare variants in exon-resequencing http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3032073 Simulation shows advantage of mult. rarity thresholds

Price AL, Kryukov GV, de Bakker PI, Purcell SM, Staples J, Wei LJ,
Sunyaev SR. Pooled association tests for rare variants in
exon-resequencing studies. American Journal of Human Genetics (2010)
86: 832-838.

SUMMARY

Multiple studies indicate a strong association between rare variants
and phenotype. This paper describes a population-genetics simulation
framework to study how variant allele frequency relates to the
corresponding phenotype. In prior studies, the relationship between
variants and phenotype was assessed by performing an association test
on the set of variants with allele frequency below a fixed threshold.
Here, however, the simulations show that tests based on a variable
allele-frequency threshold are more accurate than the fixed-threshold
model. In addition, including predicted functional effects of variants
(Polyphen-2 scores) further increases the accuracy of the
variable-threshold model. Overall, this paper describes a novel
methodology that can be used to explore the association between rare
variants and various diseases.
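
The variable-threshold idea summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (vt_statistic, vt_pvalue), the unweighted burden z-score, and the toy data are all assumptions made for this example. The sketch scans every candidate allele-frequency threshold, keeps the largest burden z-score, and calibrates it by permuting phenotypes.

```python
import math
import random


def vt_statistic(genotypes, phenotypes, mafs):
    """Largest burden z-score over all allele-frequency thresholds.

    genotypes: one row per individual, one minor-allele count per variant.
    phenotypes: one numeric phenotype per individual.
    mafs: minor-allele frequency of each variant (hypothetical inputs).
    """
    mean_p = sum(phenotypes) / len(phenotypes)
    centered = [p - mean_p for p in phenotypes]
    best = 0.0
    for threshold in sorted(set(mafs)):
        # Keep only variants at or below the current frequency threshold.
        keep = [j for j, f in enumerate(mafs) if f <= threshold]
        num = 0.0
        den = 0.0
        for row, c in zip(genotypes, centered):
            for j in keep:
                num += row[j] * c
                den += row[j] ** 2
        if den > 0:
            best = max(best, num / math.sqrt(den))
    return best


def vt_pvalue(genotypes, phenotypes, mafs, n_perm=1000, seed=0):
    """Permutation p-value for the maximized statistic."""
    rng = random.Random(seed)
    observed = vt_statistic(genotypes, phenotypes, mafs)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, mafs) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 6 individuals, 3 rare variants.
toy_genotypes = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]]
toy_phenotypes = [1, 1, 1, 0, 0, 0]
toy_mafs = [0.02, 0.01, 0.05]
print(vt_statistic(toy_genotypes, toy_phenotypes, toy_mafs))
print(vt_pvalue(toy_genotypes, toy_phenotypes, toy_mafs, n_perm=200))
```

Because the threshold is chosen to maximize the statistic, the permutation step re-maximizes over thresholds for every shuffled phenotype vector; comparing against a fixed-threshold null would overstate significance. Predicted functional scores could enter as per-variant weights on the genotype counts, which this sketch leaves out.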

PLOS Genetics: Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network

Sunday, December 28th, 2014

Correlated Genome Associations to Quantitative Trait #Network (QTN) http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000587
Uses fused #lasso for estimation of relationships

Kim & Xing (’09) provide a new method for calculating how genetic
markers associate with phenotypes by incorporating phenotype
connectivity features into the correlation structure between markers
and phenotypes. Their model attempts to quantify pleiotropic
relationships between different phenotypes and assumes a common
genotypic origin for clusters of correlated phenotypes, which their
algorithm uses to reduce the number of significant genetic markers.
In particular, Kim and Xing present a method for quantitative trait
analysis that implements two novel approaches to inferring the
contribution of a genetic marker (e.g., a SNP) to a quantitative
trait. The first is the organization of traits into a quantitative
trait network (QTN). The second is the use of fused lasso, a variant
of multivariate regression that jointly minimizes the squared error
and the number of non-zero coefficients. These two approaches are
combined in an attempt to minimize noise (in the form of small
coefficients for SNPs that don’t really make a contribution) and to
focus on truly relevant SNPs while dealing with the correlated nature
of quantitative traits. Based on two datasets – simulated HapMap data
and data from the Severe Asthma Research Program – the authors show
marked improvement in accuracy and a reduction of false positives over
simpler multivariate regression methods.
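
To make the combination of sparsity and network-guided fusion concrete, here is a small Python sketch of a graph-guided fused lasso objective. It is an assumption-laden illustration rather than the authors' code: the function name gflasso_objective, the use of |r| as the edge weight, and the toy matrices are hypothetical choices made for this example. The objective adds a lasso penalty on every coefficient and a fusion penalty that pulls a SNP's coefficients for two connected traits toward each other (or toward opposite signs when the traits are negatively correlated).

```python
def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Squared error + lasso penalty + network fusion penalty.

    X: n x J genotype matrix (list of rows).
    Y: n x K trait matrix.
    B: J x K coefficient matrix (SNP j, trait k).
    edges: (m, l, r) tuples linking traits m and l with correlation r.
    lam, gamma: penalty weights (hypothetical tuning parameters).
    """
    n, J, K = len(X), len(B), len(B[0])

    sq_err = 0.0
    for i in range(n):
        for k in range(K):
            pred = sum(X[i][j] * B[j][k] for j in range(J))
            sq_err += (Y[i][k] - pred) ** 2

    # Lasso term: drives small, irrelevant coefficients to zero.
    sparsity = sum(abs(B[j][k]) for j in range(J) for k in range(K))

    # Fusion term: correlated traits are encouraged to share SNP effects.
    fusion = 0.0
    for m, l, r in edges:
        sign = 1.0 if r >= 0 else -1.0
        for j in range(J):
            fusion += abs(r) * abs(B[j][m] - sign * B[j][l])

    return sq_err + lam * sparsity + gamma * fusion


# Two SNPs, two positively correlated traits, both driven by SNP 0.
X = [[1, 0], [0, 1]]
Y = [[1, 1], [0, 0]]
shared = [[1, 1], [0, 0]]    # SNP 0 affects both traits
unshared = [[1, 0], [0, 0]]  # SNP 0 affects only trait 0
edges = [(0, 1, 0.5)]
print(gflasso_objective(X, Y, shared, edges, lam=1.0, gamma=2.0))
print(gflasso_objective(X, Y, unshared, edges, lam=1.0, gamma=2.0))
```

In this toy case the coefficient matrix that shares the SNP effect across the two linked traits scores lower than the one that does not, which is the behavior the fusion penalty is meant to reward. An actual fit would minimize this objective over B with a convex solver, which the sketch does not attempt.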

Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nature Methods (2010) 7: 248-249.

Saturday, October 11th, 2014

Server for predicting damaging missense #mutations
http://www.nature.com/nmeth/journal/v7/n4/full/nmeth0410-248.html Polyphen2 uses both structure & sequence (eg ASA & conservation)

http://www.ncbi.nlm.nih.gov/pubmed/20354512

Polyphen2 uses both structural and sequence features to predict the effect of nonsynonymous substitutions on protein function. Like many other methods, Polyphen2 uses evolutionary conservation as one of the features to identify functionally important residues. Its strengths compared to other sequence-based software are the integration of 3D structure, membrane-specific features (PHAT matrix for TM regions), and other features such as protein-domain and active-site annotations, which make it a good tool for prediction.

Kiezun A, Garimella K, Do R, Stitziel NO, Neale BM, McLaren PJ, Gupta N, Sklar P, Sullivan PF, Moran JL, Hultman CM, Lichtenstein P, Magnusson P, Lehner T, Shugart YY, Price AL, de Bakker PI, Purcell SM, Sunyaev SR. Exome sequencing and the genetic…

Sunday, July 20th, 2014

#Exome sequencing & #genetic basis of complex traits
http://www.nature.com/ng/journal/v44/n6/full/ng.2303.html Key pt: amt of rare variants exceeds that from neutral model

Kiezun A, Garimella K, Do R, Stitziel NO, Neale BM, McLaren PJ, Gupta N, Sklar P, Sullivan PF, Moran JL, Hultman CM, Lichtenstein P, Magnusson P, Lehner T, Shugart YY, Price AL, de Bakker PI, Purcell SM, Sunyaev SR. Exome sequencing and the genetic basis of complex traits. Nature Genetics (2012) 44: 623-630

SUMMARY

This article serves as part review, and part research article, focusing on using exome sequencing to detect associations between variants and complex traits.

An important fact they point out, with a wide range of implications for studying disease, is that the number of rare variants exceeds the number predicted by the neutral model. Figure 1 illustrates nicely this excess of rare variants.
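
As a rough illustration of what "predicted by the neutral model" means here, the sketch below uses the textbook result for a constant-size neutral population, in which the expected number of sites with derived-allele count i is proportional to 1/i. That this is the baseline the authors use is an assumption, and the function names and the toy counts are hypothetical.

```python
def neutral_sfs_fractions(n_chromosomes):
    """Expected share of segregating sites at derived-allele counts
    1 .. n-1 under the standard neutral, constant-size model."""
    weights = [1.0 / i for i in range(1, n_chromosomes)]
    total = sum(weights)
    return [w / total for w in weights]


def singleton_excess(observed_counts):
    """Ratio of the observed singleton share to the neutral expectation.

    observed_counts[i - 1] is the number of sites whose derived allele
    was seen i times, so the sample has len(observed_counts) + 1
    chromosomes. Values above 1 indicate an excess of rare variants.
    """
    n = len(observed_counts) + 1
    observed_share = observed_counts[0] / sum(observed_counts)
    expected_share = neutral_sfs_fractions(n)[0]
    return observed_share / expected_share


# Toy spectrum from 4 chromosomes: 8 singletons, 2 doubletons, 1 tripleton.
print(neutral_sfs_fractions(4))
print(singleton_excess([8, 2, 1]))
```

With real exome data the sample is far larger and the spectrum is usually folded (minor rather than derived allele counts), so this sketch only conveys the shape of the comparison.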

I agree with their statement that the majority of these mutations are not “neutral”. They attribute this excess to population expansion or purifying selection, but another plausible explanation for the excess, which is found in all organisms regardless of demographic history, is linked selection.

The authors compare statistics derived before and after filtering exome sequencing data from 438 individuals (HIV and schizophrenia data sets), illustrating the importance of filtering for obtaining high-quality calls. WGS (CGI data on 37 individuals) was used as a benchmark for the number of called SNPs in different categories (silent, missense, nonsense).

They then proceed to analyze the effect of population stratification on significance values by combining different ratios of individuals from the European-American HIV cohort and the Swedish schizophrenia cohort. (Theory predicts that older populations should have more rare variants because recombination has had more time to break up linkage blocks, and because newer populations have most likely gone through homogenizing bottlenecks.) They find that calculating p-values using a permutation test yields fewer type I errors (false positives), and that this technique can competently deal with population stratification when conducting association studies.

The draft genome of sweet orange (Citrus sinensis) – Nat Genet.

Friday, January 24th, 2014

The draft #genome of sweet orange: Nearly 30K genes in only ~370 Mb + #RNAseq to find key Vitamin C genes
http://www.nature.com/ng/journal/v45/n1/full/ng.2472.html

The authors present a draft genome of sweet orange (Citrus sinensis) which covers 87.3% of the relatively compact orange genome (approximately 367 Mb). Self-alignment of the citrus genome sequences identified one ancient triplication event, shared with a number of diverse plants including Arabidopsis thaliana, and no recent whole-genome duplication events, which partially explains the compact size of the genome. A combination of simple sequence repeat (SSR) and SNP markers revealed that sweet orange is an interspecific hybrid between pummelo and mandarin (1:3 in genome composition, with the female parent of pummelo origin). Characterization of the unique protein-coding genes in the citrus genome, together with transcriptome analysis (RNA-Seq and RNA-PET) of different tissues of the citrus plant, was used to identify the specific genes involved in the accumulation of Vitamin C in the fruit (the rate-limiting GalUR in the galacturonate pathway is present in 12 copies, which are developmentally regulated). Overall, the genome has almost 30,000 genes.

The draft genome of sweet orange (Citrus sinensis).
Xu Q, Chen LL, …., Ruan Y.
Nat Genet. 2013 Jan;45(1):59-66.
PMID: 23179022

Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes – Genome Res.

Friday, December 27th, 2013

Long-span PET mapping reveals characteristic patterns of #SVs in… cancer [v norm] genomes, but no MEIs or small events
http://genome.cshlp.org/content/early/2011/04/05/gr.113555.110.abstract

The described study used long paired-end tags (PET) to analyze and compare SVs in cancer and normal genomes. It determined the prevalence of different types of SVs in normal and cancer samples. Overall, the results are interesting and convincing on a qualitative level; however, for the reasons outlined below, a more precise and quantitative delineation of the observed effects is highly desirable.

1) Small sample size of normal genomes (only 2 normal genomes)

2) Validation rate was low (< 77%) for everything except deletions, and for singletons it was even lower.

3) Long PET is not good for finding smaller events (few kbps). Thus, this analysis missed smaller scale SVs and cancer rearrangements.

4) While there is a discussion about breakpoints and associated repeats, it is not very informative as breakpoint locations were not determined to basepair resolution.

5) No MEIs were considered; in particular, no cancer MEIs were considered in the analysis, although somatic retrotransposition was recently found to occur in cancer (Lee et al., PMID: 22745252).

Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes.

Hillmer AM, Yao F, Inaki K, Lee WH, Ariyaratne PN, Teo AS, Woo XY, Zhang Z, Zhao H, Ukil L, Chen JP, Zhu F, So JB, Salto-Tellez M, Poh WT, Zawack KF, Nagarajan N, Gao S, Li G, Kumar V, Lim HP, Sia YY, Chan CS, Leong ST, Neo SC, Choi PS, Thoreau H, Tan PB, Shahab A, Ruan X, Bergh J, Hall P, Cacheux-Rataboul V, Wei CL, Yeoh KG, Sung WK, Bourque G, Liu ET, Ruan Y.

Genome Res. 2011 May;21(5):665-75. doi: 10.1101/gr.113555.110. Epub 2011 Apr 5.

Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions

Friday, December 6th, 2013

Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions
L Ward & M Kellis
http://www.sciencemag.org/content/337/6102/1675.abs

In general, we know that conservation across species and conservation within humans are correlated. In this paper the authors focus on the exceptions to this trend. They show that although only ~5% of the human genome is conserved across mammals, regulatory regions in an additional 4% of the genome are conserved amongst humans. They also show that some elements are conserved across mammals but lack functional activity in ENCODE data and do not show purifying selection amongst humans. The authors pinpoint regulatory regions near color-vision and nerve-growth genes that show human-specific constraint. This has been criticized in various publications, since there are other genes that rank higher in the authors’ list but are harder to explain in terms of lineage-specific constraint.

Epigenomic alterations in localized and advanced prostate cancer – Neoplasia

Wednesday, November 27th, 2013

Summary for:

Lin PC, Giannopoulou E, Park K, Mosquera JM, Sboner A, Tewari AK, Garraway LA, Beltran H, Rubin MA*, Elemento O*. 2013. Epigenomic alterations in localized and advanced prostate cancer. Neoplasia

http://www.ncbi.nlm.nih.gov/pubmed/23555183

In this paper, the authors take advantage of new advances in reduced representation bisulfite sequencing, a method for measuring DNA methylation patterns genome-wide with high coverage and single-nucleotide resolution, to study methylation patterns in prostate cancer. Working with a prostate cancer cohort already studied with DNA-Seq and RNA-Seq analyses, the authors identified differentially methylated regions (DMRs) by comparing the methylation of prostate cancer samples to that of benign prostate samples. The analysis found an increase in DNA methylation in prostate cancer samples, and that the methylation was more diverse and heterogeneous than in benign samples. Furthermore, genes near hypermethylated DMRs tended to have decreased expression, while genes near hypomethylated DMRs tended to have increased expression. Additional analyses revealed that breakpoints associated with prostate-cancer-specific deletions, duplications, and translocations tended to be highly methylated in benign prostate tissue. Finally, a study of CpG islands at different stages of prostate cancer (benign vs. PCa vs. castration-resistant prostate cancer, CRPC) revealed that certain islands become increasingly methylated with disease severity. The authors used these data as the basis for two classification models: one to discriminate between benign prostate tissue and PCa tissue, and another to discriminate between PCa tissue and CRPC tissue. Both models demonstrated high sensitivity and specificity, indicating that CpG islands with high discriminatory power could serve as a diagnostic basis for predicting disease aggressiveness.
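
For readers less familiar with the two metrics quoted for the classification models, here is a minimal Python sketch of how sensitivity and specificity are computed from true and predicted labels. The function name and the toy labels are hypothetical and carry no data from the paper.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive):
    """Return (sensitivity, specificity), treating `positive` as the
    disease class and every other label as negative."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from the true labels.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels only: 3 tumor samples, 2 benign samples.
truth = ["PCa", "PCa", "benign", "benign", "PCa"]
calls = ["PCa", "benign", "benign", "PCa", "PCa"]
print(sensitivity_specificity(truth, calls, positive="PCa"))
```

Sensitivity is the share of true disease samples the model flags, and specificity is the share of non-disease samples it correctly leaves unflagged; a diagnostic marker needs both to be high.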

HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants

Wednesday, November 27th, 2013

http://nar.oxfordjournals.org/content/40/D1/D930.long

HaploReg explores functional annotations, such as chromatin states in varied cell types, sequence conservation, regulatory motif alterations and eQTLs, of linked SNPs or indels within the LD block of queried SNPs. The output provides a guide for developing hypotheses about the functional impact of non-coding variants, especially GWAS SNPs. HaploReg is currently limited to known variants (e.g. 1000 Genomes and dbSNP variants) and is unable to deal with private variants.