Posts Tagged ‘privaseq3’

GA4GH’s paper on data sharing

Saturday, July 28th, 2018

Perspective | OPEN | Published: 23 July 2018

Responsible sharing of biomedical data and biospecimens via the “Automatable Discovery and Access Matrix” (ADA-M)

J. Patrick Woolley,
Emily Kirby,
Josh Leslie,
Francis Jeanson,
Moran N. Cabili,
Gregory Rushton,
James G. Hazard,
Vagelis Ladas,
Colin D. Veal,
Spencer J. Gibson,
Anne-Marie Tassé,
Stephanie O. M. Dyke,
Clara Gaff,
Adrian Thorogood,
Bartha Maria Knoppers,
John Wilbanks &
Anthony J. Brookes

npj Genomic Medicine volume 3, Article number: 17 (2018)

New paper in npj Genomic Medicine:
https://www.nature.com/articles/s41525-018-0057-4

NOT-OD-17-110: Request for Comments: Proposal to Update Data Management of Genomic Summary Results Under the NIH Genomic Data Sharing Policy

Monday, December 4th, 2017

https://grants.nih.gov/grants/guide/notice-files/NOT-OD-17-110.html

differential privacy

Thursday, November 30th, 2017


Box 1 of the following paper has a nice definition of differential privacy in the genomics sense (phenotypic differential privacy): http://www.cell.com/cell-systems/fulltext/S2405-4712(16)30121-1

Alignment-free sequence comparison: benefits, applications, and tools

Monday, November 13th, 2017

Might be useful for noncoding comparisons

Alignment-free seq. comparison: benefits, apps & tools
https://GenomeBiology.biomedcentral.com/articles/10.1186/s13059-017-1319-7 Great tidbits, viz: Shannon asked von Neumann what to call his info measure – “Why don’t you call it entropy…no one understands entropy…so in any discussion, you’ll be in a position of advantage.”

QT:{{”
“Reportedly, Claude Shannon, who was a mathematician working at Bell Labs, asked John von Neumann what he should call his newly developed measure of information content; “Why don’t you call it entropy,” said von Neumann, “[…] no one understands entropy very well so in any discussion you will be in a position of advantage […]”. The concept of Shannon entropy came from the observation that some English words, such as “the” or “a”, are very frequent and thus unsurprising” ….
“The calculation of a distance between sequences using complexity (compression) is relatively straightforward (Fig. ). This procedure takes the sequences being compared (x = ATGTGTG and y = CATGTG) and concatenates them to create one longer sequence (xy = ATGTGTGCATGTG). If x and y are exactly the same, then the complexity (compressed length) of xy will be very close to the complexity of the individual x or y. However, if x and y are dissimilar, then the complexity of xy (length of compressed xy) will tend to the cumulative complexities of x and y.”

“Intriguingly, BLOSUM matrices, which are the most commonly used substitution matrix series for protein sequence alignments, were found to have been miscalculated years ago and yet produced significantly better alignments than their corrected modern version (RBLOSUM); this paradox remains a mystery.”
“}}
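
A rough sketch of the compression-distance idea in the quote above, not the paper's own implementation. Assumptions on my part: zlib stands in for the compressor, the score is normalized with the usual normalized compression distance formula, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), and the function names and random test sequences are made up for illustration.

```python
import random
import zlib


def compressed_len(seq: str) -> int:
    """Length in bytes of the zlib-compressed sequence (a crude stand-in for complexity)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small for similar inputs, close to 1 for unrelated ones."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)  # complexity of the concatenation xy
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy data (made up): two unrelated random DNA strings.
rng = random.Random(0)
x = "".join(rng.choice("ACGT") for _ in range(2000))
y = "".join(rng.choice("ACGT") for _ in range(2000))

print("ncd(x, x) =", round(ncd(x, x), 3))  # identical: xy compresses almost as well as x
print("ncd(x, y) =", round(ncd(x, y), 3))  # unrelated: xy costs roughly C(x) + C(y)
```

On very short inputs such as the 7-base example in the quote, zlib's fixed header overhead dominates the compressed length, so the distances only become meaningful for longer sequences.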