We applied a hidden Markov model (HMM) to the 1 million exomes from the Regeneron Genetics Center and the 250 thousand exomes from the United States All of Us Research Program to identify regions of the genome that are likely to be constrained and to assign a probability of constraint to each position in the genome.
Dr. Scott Oshiro, a postdoc in the laboratory with a background in music theory and quantum computing, decided to sonify constraint in the genome across these two cohorts.
Sonification is the mapping or conversion of data to audio and/or musical elements. For example, a collection of sensor data can be mapped to the frequency or pitch of a musical pattern, or to its rhythm. This audio file is the sonification of 64 sequences of the human genome. For each genetic sequence there is a value between 0 and 1 representing the probability of a mutation for that particular sequence. These values are mapped to specific pitches in a particular musical scale.
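To make the idea of mapping values to pitches concrete, here is a minimal sketch in Python. It is not the mapping used for the audio file: the C major scale, the two-octave range, the MIDI note numbers, and the function names (`build_scale`, `value_to_midi`, `midi_to_hz`, `sonify`) are all assumptions chosen for illustration. Each value between 0 and 1 selects one note of the scale, with higher values landing on higher pitches.

```python
# Illustrative sketch only: the scale, range, and function names are
# assumptions, not the mapping used to produce the audio file.

# Semitone offsets of a major scale within one octave.
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]


def build_scale(root_midi=60, octaves=2):
    """Return MIDI note numbers for a major scale starting at root_midi.

    root_midi=60 is middle C; the top root note is appended so the
    scale ends on the tonic.
    """
    notes = [root_midi + 12 * octave + step
             for octave in range(octaves)
             for step in MAJOR_STEPS]
    notes.append(root_midi + 12 * octaves)
    return notes


def value_to_midi(value, scale):
    """Map a value in [0, 1] to one note of the scale (higher value, higher pitch)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie between 0 and 1")
    # Split [0, 1] into equal-width bins, one per note; clamp value == 1.0
    # into the last bin.
    index = min(int(value * len(scale)), len(scale) - 1)
    return scale[index]


def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def sonify(values, scale=None):
    """Map a list of values in [0, 1] to a list of MIDI note numbers."""
    if scale is None:
        scale = build_scale()
    return [value_to_midi(v, scale) for v in values]


if __name__ == "__main__":
    # Hypothetical example values, one per sequence.
    example_values = [0.0, 0.25, 0.5, 0.75, 1.0]
    for value, note in zip(example_values, sonify(example_values)):
        print(f"{value:.2f} -> MIDI {note} ({midi_to_hz(note):.1f} Hz)")
```

Restricting the output to the notes of a scale, rather than mapping values directly to a continuous frequency, is what tends to make the result sound musical; the trade-off is that nearby values can collapse onto the same note.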
A lot of information is compressed in wave data, and I’m excited about the next iteration of our project.
