Posts Tagged ‘mining’

BIOKDD 2014

Saturday, June 7th, 2014

http://www.kdd.org/kdd2014/
8/24-8/27

13th International Workshop on Data Mining in Bioinformatics (BIOKDD’14) August 24, 2014 * New York City, NY, USA
http://home.biokdd.org/biokdd14

What is a support vector machine?

Wednesday, February 5th, 2014

What is a support vector machine? A nice overview w/o equations, just pictures. Great for #teaching!
http://www.nature.com/nbt/journal/v24/n12/abs/nbt1206-1565.html #SVM .@GenomeNathan YES, but see
http://noble.gs.washington.edu/papers/noble_what.html …, which has an expanded, “free” version.

Machine Learning Cheat Sheet (for scikit-learn)

Friday, January 17th, 2014

http://peekaboo-vision.blogspot.com/2013/01/machine-learning-cheat-sheet-for-scikit.html

http://scikit-learn.org/dev/

How Netflix Reverse Engineered Hollywood – Atlantic Mobile

Friday, January 17th, 2014

.@kdnuggets 77K genres from auto. & manual methods like
gene-classification approaches. How $NFLX Rev. Eng. Hollywood http://theatln.tc/1cKP4gx

http://m.theatlantic.com/technology/archive/2014/01/how-netflix-reverse-engineered-hollywood/282679/

Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes – Genome Res.

Friday, December 27th, 2013

Long-span PET mapping reveals characteristic patterns of #SVs in… cancer [v norm] genomes, but no MEIs or small events
http://genome.cshlp.org/content/early/2011/04/05/gr.113555.110.abstract

The described study used long paired-end tags (PETs) to analyze and compare SVs in cancer and normal genomes. It determined the prevalence of different types of SVs in normal and cancer samples. Overall, the results are interesting and convincing on a qualitative level; however, for the reasons outlined below, a more precise and quantitative delineation of the observed effects is highly desirable.

1) Small sample size of normal genomes (only 2 normal genomes)

2) Validation rate was low (< 77%) for everything except deletions, and for singletons it was even lower.

3) Long PET is not good for finding smaller events (a few kbp). Thus, this analysis missed smaller-scale SVs and cancer rearrangements.

4) While there is a discussion about breakpoints and associated repeats, it is not very informative as breakpoint locations were not determined to basepair resolution.

5) No MEIs were considered. In particular, no cancer MEIs were considered in the analysis, although somatic retrotransposition was recently found to occur in cancer (Lee et al., PMID: 22745252).

Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes.

Hillmer AM, Yao F, Inaki K, Lee WH, Ariyaratne PN, Teo AS, Woo XY, Zhang Z, Zhao H, Ukil L, Chen JP, Zhu F, So JB, Salto-Tellez M, Poh WT, Zawack KF, Nagarajan N, Gao S, Li G, Kumar V, Lim HP, Sia YY, Chan CS, Leong ST, Neo SC, Choi PS, Thoreau H, Tan PB, Shahab A, Ruan X, Bergh J, Hall P, Cacheux-Rataboul V, Wei CL, Yeoh KG, Sung WK, Bourque G, Liu ET, Ruan Y.

Genome Res. 2011 May;21(5):665-75. doi: 10.1101/gr.113555.110. Epub 2011 Apr 5.

Why Everyone Will Totally Read This Column – WSJ.com

Tuesday, December 24th, 2013

How a #Gawker Editor Picks the ‘Viral’ Content Readers Can’t Resist: Human intuition trumps data #mining
http://m.us.wsj.com/articles/SB10001424052702304579404579231772007379090

Neetzan Zimmerman is the human intuition behind Gawker’s successful shares. Sort of like BuzzFeed….

How a Gawker Editor Picks the ‘Viral’ Content Readers Can’t Resist Sharing

Five Web-based Apps to help you visualize big data – TechRepublic

Monday, December 16th, 2013

Five Web-based Apps to help you #visualize big data: Many Eyes, ICharts, Visualize Free, Wolfram Alpha, Data Wrangler
http://www.techrepublic.com/blog/five-apps/five-web-based-apps-to-help-you-visualize-big-data ==
Also Visualize Free to complete the 5 MT @KirkDBorne 5 Web-based Apps to help you visualize #BigData
http://www.techrepublic.com/blog/five-apps/five-web-based-apps-to-help-you-visualize-big-data

Burkhard Bilger: Inside Google’s Driverless Car : The New Yorker

Friday, December 13th, 2013

A REPORTER AT LARGE

AUTO CORRECT

Has the self-driving car at last arrived?

BY BURKHARD BILGER

NOVEMBER 25, 2013

Has the self-driving car at last arrived? Almost, thanks to DARPA challenges, maps & machine learning
http://www.newyorker.com/reporting/2013/11/25/131125fa_fact_bilger via @gigajordan

Choosing the right estimator — scikit-learn 0.14 documentation

Sunday, December 8th, 2013

Might be good for course slides!

MT @sjackman @anshul
http://scikit-learn.org/stable/tutorial/machine_learning_map … is useful for #teaching, providing students a practical way to wade through all the approaches

Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions

Friday, December 6th, 2013

Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions
L Ward & M Kellis
http://www.sciencemag.org/content/337/6102/1675.abs

In general, we know that conservation across species and conservation within humans are correlated. In this paper, the authors focus on the exceptions to this trend. They show that although only ~5% of the human genome is conserved across mammals, regulatory regions in an additional 4% of the genome are conserved amongst humans. They also show that some elements are conserved across mammals but lack functional activity in ENCODE data and do not show purifying selection amongst humans. The authors pinpoint regulatory regions near color-vision and nerve-growth genes that show human-specific constraint. This has been criticized in various publications, since other genes rank higher in the authors’ list but are harder to explain in terms of lineage-specific constraint.