New regulatory roles continue to emerge for both natural and engineered RNAs, many of which have specific structures essential to their function. This highlights a growing need for technologies that enable rapid and accurate characterization of structural features within complex RNA populations. Yet the experimental techniques that are reliable are severely limited by technological constraints, while the accuracy of popular computational methods is generally poor. These limitations pose a major barrier to comprehensive determination of structure from sequence.
To address this need, we have developed a high-throughput structure characterization assay, called SHAPE-Seq, which simultaneously measures quantitative, nucleotide-resolution structural information for hundreds of distinct RNAs. SHAPE-Seq combines a novel chemistry with next-generation sequencing of its products. Following sequencing, we extract the structural information using a fully automated algorithmic pipeline that we developed. In this talk, I will focus on SHAPE-Seq's analysis methodology, which relies on a novel probabilistic model of a SHAPE-Seq experiment, coupled with maximum-likelihood parameter estimation. I will demonstrate the accuracy, simplicity, and efficiency of our approach, and will then present an algorithm that uses SHAPE-Seq data to inform computational RNA secondary structure prediction.
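
To give a flavor of what maximum-likelihood estimation over sequencing counts can look like, here is a minimal Python sketch under a deliberately simplified model. It is not the SHAPE-Seq model itself, which this abstract does not specify. The sketch assumes that reverse transcription visits sites in order, that it stops independently at each site, and that a modified site adds a stop probability on top of the natural drop-off seen in an untreated control. The function names, the argument layout, and the example counts are all hypothetical.

```python
def dropoff_rates(stops, full_length):
    """Per-site stop probabilities under an independent-stopping model.

    stops[i] is the number of fragments that ended at site i, with sites
    listed in the order the polymerase reaches them; full_length is the
    number of fragments that passed every site. The maximum-likelihood
    estimate at each site is (stops at the site) / (fragments that reached it).
    """
    remaining = sum(stops) + full_length
    rates = []
    for count in stops:
        rates.append(count / remaining if remaining > 0 else 0.0)
        remaining -= count
    return rates


def estimate_modification(treated_stops, treated_full, control_stops, control_full):
    """Estimate per-site modification probabilities (assumed model).

    Assumes the treated stop probability at a site is
    1 - (1 - g) * (1 - b), where g is the natural drop-off rate measured in
    the control and b is the modification probability. Solving for b gives
    (p - g) / (1 - g); negative values are clamped to zero.
    """
    p = dropoff_rates(treated_stops, treated_full)
    g = dropoff_rates(control_stops, control_full)
    return [
        max(0.0, (pk - gk) / (1.0 - gk)) if gk < 1.0 else 0.0
        for pk, gk in zip(p, g)
    ]


# Hypothetical counts for a three-site RNA (illustration only, not real data).
betas = estimate_modification([10, 50, 10], 30, [10, 10, 10], 70)
print(betas)
```

In this toy input, the second site shows far more stops in the treated sample than the control accounts for, so it receives the largest estimated modification probability. Sites with a higher estimate would be read as more flexible, and so more likely unpaired.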