How Amazon’s cloud unit is helping researchers analyze genetics
As healthcare becomes increasingly digitized, scientists, physicians and researchers must decipher unprecedented amounts of data to properly personalize care. The volume of information available to these experts often exceeds their ability to consume and analyze it. Amazon’s cloud unit has worked to close this gap.
Amazon Web Services recently announced the general availability of Amazon Omics, which helps researchers store and analyze omics data such as DNA, RNA, and protein sequences. The service provides customers with the underlying infrastructure they need to make sense of large amounts of data so they can spend more time making new scientific discoveries.
AWS generates a substantial portion of Amazon’s revenue, bringing in $20.5 billion in the third quarter. The cloud computing business has grown in healthcare, and while AWS does not disclose revenue projections for particular services, the global genomic data analytics market is expected to reach $2.15 billion by 2030, according to a report by Straits Research.
Dr. Taha Kass-Hout, chief medical officer at AWS, said the vast majority of healthcare data is unstructured in nature, meaning around 97% of it goes unused. Indexing and making sense of this information is a challenge, especially when researchers collect omics data from tens of thousands of patients.
Prior to his time at Amazon, Kass-Hout served two terms under President Barack Obama and was the first Director of Health Information at the United States Food and Drug Administration.
Sequencing a human genome can require between 80 and 150 gigabytes of storage, Kass-Hout said, and some research projects deal with petabytes and exabytes of genomic information.
“You’re talking about almost nine Harry Potter if you want to print it on a printer,” Kass-Hout told CNBC. “And that’s just for a human being.”
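To put those figures in context, the short Python sketch below scales the 80 to 150 gigabyte range Kass-Hout cited up to whole research cohorts. It is a back-of-the-envelope illustration only: the cohort sizes, the function name `cohort_storage_pb` and the use of decimal units (1 petabyte = 1,000,000 gigabytes) are assumptions made for this example, not figures from AWS.

```python
# Per-genome storage range cited by Kass-Hout, in gigabytes.
GB_PER_GENOME_LOW = 80
GB_PER_GENOME_HIGH = 150

# Assumption: decimal units, so 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def cohort_storage_pb(num_genomes):
    """Return the (low, high) storage estimate in petabytes for a cohort."""
    low = num_genomes * GB_PER_GENOME_LOW / GB_PER_PB
    high = num_genomes * GB_PER_GENOME_HIGH / GB_PER_PB
    return low, high


if __name__ == "__main__":
    # Cohort sizes are illustrative assumptions, not reported figures.
    for n in (1, 10_000, 50_000):
        low, high = cohort_storage_pb(n)
        print(f"{n:>6} genomes: {low:.5f} to {high:.5f} PB")
```

Under these assumptions, a study that sequences 10,000 patients would need roughly 0.8 to 1.5 petabytes for raw genome data alone, which is consistent with the petabyte-scale projects Kass-Hout described.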
Amazon Omics helps researchers sort through their data by providing three components that can be used individually or together. Omics-compatible object storage helps researchers store and share raw sequence data; Omics Workflows runs workflows that process raw sequence data at scale; and Omics Analytics simplifies the output of sequence processing.
More than a dozen customers and partners have tested a beta version of the service and are already using Amazon Omics.
For Jeffrey Pennington, director of research informatics at Children’s Hospital of Philadelphia, the service has already had a noticeable impact.
Pennington works in the Department of Biomedical and Health Informatics, which uses data and technology to solve child health problems. He said the department spent five years expanding the infrastructure to analyze omics data, and now it’s not something they need to build or sustain.
“We’re a large pediatric academic medical center, but we’re not big enough yet to learn and build everything needed to productively use omics data,” Pennington said. “Our time and energy, our effort, our financial means are much better spent putting the puzzle together than generating those pieces in the first place.”
Amazon Omics also encourages collaboration between large research groups, smaller clinical groups, and intelligence and pharmaceutical companies, said Boris Oklander, co-founder and chief technology officer of C2i Genomics.
C2i is a biotechnology company working to use genomic data to develop personalized cancer treatments. Oklander said the company participated in the Amazon Omics beta after trying to develop its own data analytics technology.
He said Amazon Omics has created a collaborative ecosystem that eliminates the need for researchers to build complex technology from scratch.
“We are just democratizing,” he said. “This type of service is something that allows [us] to unlock the value of the investments that different players are making in this space.”
Other big tech companies have developed similar tools. Microsoft Azure, Microsoft’s cloud computing platform, launched Microsoft Genomics in 2018 to help researchers interpret data generated by genomics technologies. Google’s Cloud Life Sciences technology also allows researchers to process large-scale biomedical data.
Pennington said the Broad Institute and DNAnexus also offer popular genomic data analysis services, but that they can be difficult to maintain and can analyze fewer types of data than Amazon Omics.
Given the sensitive and deeply personal nature of omics data, Kass-Hout said protecting patient privacy and data is “task zero” for AWS. He said AWS uses more than 300 security, compliance, and governance services and supports 98 security standards and compliance certifications. In doing so, AWS goes “way beyond” regulatory compliance, Kass-Hout said, and also provides its customers with best-practice encryption resources and tools.
Customers are also responsible for building secure applications on top of Amazon Omics’ services, which prevent AWS from seeing or using the data.
Kass-Hout said that ultimately Amazon Omics serves as a way to efficiently index information so researchers can focus on real advances in precision medicine.
“While the past decade has been about the digitization that the health and life sciences industry has experienced, I truly believe the next decade will be about making sense of that data in a current way [where] we can find new therapies, new diagnostics, more targeted therapies,” he said.