Posts Tagged ‘data’

Big Data’s Promise and Limitations : The New Yorker

Saturday, May 4th, 2013

Facebook ‘Likes’ reveal more about you than you think | Detroit Free Press

Sunday, March 17th, 2013

Twitter users forming tribes with own language, tweet analysis shows

Sunday, March 17th, 2013

Thoughts on “A few useful things to know about machine learning”

Thursday, February 14th, 2013

Some thoughts on a good paper that gives intuition for machine learning approaches.

In particular, the paper gives good intuition about:

– overfitting (e.g. how it’s related to multiple testing & bias vs. variance)
– the curse of dimensionality (in high-D all neighbors look about equally far away)
– the impracticality of theoretical guarantees
– how learners with different frontiers can give the same predictions
– ensembles (which reduce variance greatly without increasing bias that much)
– ensembles vs. Bayesian model averaging (the latter essentially selects the single best model)
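
To make the curse-of-dimensionality point concrete, here is a minimal sketch (mine, not from the paper) that draws random points in the unit hypercube and measures how much farther the farthest point is than the nearest one from a random query. The function name `distance_contrast`, the sample size, and the uniform distribution are all assumptions chosen for illustration.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Relative gap between the farthest and nearest point from a query.

    Points are drawn uniformly from the unit hypercube (an illustrative
    assumption). A value near 0 means every point looks about equally far.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # The contrast should shrink as the dimension grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(distance_contrast(dim), 3))
```

In low dimensions the nearest neighbor is many times closer than the farthest point, while in high dimensions the ratio collapses toward zero, which is why nearest-neighbor style reasoning gets harder as features are added.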

Illumina Platinum Genomes

Sunday, February 10th, 2013
A family trio (NA12877, NA12878, and NA12882) sequenced on a HiSeq 2000 system. An individual (NA18507) sequenced on a HiSeq 2500 system.

A few useful things to know about machine learning

Saturday, February 9th, 2013