Generative artificial intelligence can analyze complex medical datasets far faster than human teams, according to an early real-world test. Researchers at the University of California, San Francisco (UCSF) and Wayne State University found that generative AI processed large volumes of medical data significantly faster than traditional computer science teams and, in some cases, produced better results. The finding suggests that AI could accelerate discoveries that previously required months of human analysis.

To quantify the difference, the researchers set up a direct comparison in which identical analytical tasks were assigned to different groups. Some teams relied solely on human expertise, while others paired scientists with AI tools. The task was to build predictive models for preterm birth using data from more than 1,000 pregnant women.

The results were striking, even for a junior research pairing. Working with AI support, UCSF master’s student Reuben Sarwal and high school student Victor Tarca built working prediction models. The AI generated the necessary computer code within minutes, a task that typically takes experienced programmers several hours or even days. That efficiency came from the AI’s ability to write analytical code from short but highly specific prompts, in effect translating natural language instructions into executable code.

Not all AI systems performed equally well: only 4 of the 8 chatbots tested produced usable code. Those that succeeded, however, did not need large, specialized teams to guide them. Needing fewer people, particularly in the initial coding stages, added to the speed advantage.

The faster pace allowed the junior research team to run their experiments, verify their findings, and submit their results to a scientific journal within a few months. That turnaround matters given the urgency of much health research.

Marina Sirota, PhD, a professor of Pediatrics and interim director of the Bakar Computational Health Sciences Institute (BCHSI) at UCSF, is the principal investigator of the March of Dimes Prematurity Research Center at UCSF. "These AI tools could relieve one of the biggest bottlenecks in data science: building our analysis pipelines," Dr. Sirota said. "The speed-up couldn’t come sooner for patients who need help now." Dr. Sirota co-led the study, which was published in Cell Reports Medicine on February 17th.

The significance of accelerating preterm birth research cannot be overstated. Preterm birth is a critical global health issue, standing as the leading cause of newborn mortality and a major contributor to long-term motor and cognitive challenges in children. In the United States alone, approximately 1,000 babies are born prematurely each day, underscoring the immediate need for improved diagnostic tools and preventative strategies.

Despite extensive research, the causes of preterm birth are still not fully understood. To investigate potential risk factors, Dr. Sirota’s team compiled microbiome data from around 1,200 pregnant women whose birth outcomes had been tracked across nine separate studies. "This kind of work is only possible with open data sharing, pooling the experiences of many women and the expertise of many researchers," said Tomiko T. Oskotsky, MD, co-director of the March of Dimes Preterm Birth Data Repository, associate professor in UCSF BCHSI, and a co-author of the paper.

Analyzing such a large and intricate dataset was a considerable challenge. To tackle it, the researchers had earlier turned to a global crowdsourcing competition known as DREAM (Dialogue on Reverse Engineering Assessment and Methods). Dr. Sirota co-led one of the three DREAM pregnancy challenges, which focused on vaginal microbiome data. More than 100 teams worldwide took part, developing machine learning models to identify patterns associated with preterm birth. Most groups completed their work within the three-month competition window, but consolidating the findings and publishing them took nearly two years.

Hoping to shorten such timelines, Dr. Sirota’s group collaborated with researchers led by Adi L. Tarca, PhD, a co-senior author and professor in the Center for Molecular Medicine and Genetics at Wayne State University in Detroit, MI. Dr. Tarca had led the other two DREAM challenges, which were aimed at improving methods for estimating pregnancy stage.

In the joint study, the researchers instructed eight AI systems to independently generate algorithms using the same datasets from the three DREAM challenges, without direct human coding. The chatbots were given carefully crafted natural language instructions. As with ChatGPT, the systems were guided by detailed prompts, which were designed to steer their analysis of the health data along lines comparable to the approaches of the original DREAM participants.

The objectives presented to the AI systems mirrored those of the earlier human-led challenges. The AI systems were tasked with analyzing vaginal microbiome data to detect indicators of preterm birth and examining blood or placental samples to accurately estimate gestational age. Pregnancy dating is inherently an estimation, yet it critically dictates the type of medical care pregnant women receive as their pregnancies progress. Inaccurate dating can lead to difficulties in preparing for labor and delivery.

After the AI systems generated their code, the researchers ran it against the DREAM datasets. Only 4 of the 8 AI tools produced models that matched the performance of the human teams, but in several instances the AI-generated models outperformed them. The entire generative AI project, from conception to submission of a research paper, was completed in just six months, a fraction of the time taken by the original human-led DREAM challenges.
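To make the comparison step concrete, here is a minimal sketch of how predictions from an AI-generated model could be scored on held-out data. The article does not say which metrics the study used, so the two shown here are assumptions: area under the ROC curve for the preterm birth classification task and root-mean-square error for gestational age estimation. The function names and the toy numbers are illustrative only and are not taken from the study.

```python
import math


def auroc(labels, scores):
    """Area under the ROC curve, computed by comparing every
    positive/negative pair (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rmse(actual, predicted):
    """Root-mean-square error between true and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values, invented for illustration (not study data).
preterm_labels = [1, 0, 1, 0, 0]            # 1 = preterm birth
preterm_scores = [0.9, 0.2, 0.6, 0.7, 0.1]  # model risk scores
weeks_true = [30, 34, 38]                   # gestational age in weeks
weeks_pred = [31, 33, 40]

print(f"AUROC: {auroc(preterm_labels, preterm_scores):.3f}")  # AUROC: 0.833
print(f"RMSE:  {rmse(weeks_true, weeks_pred):.3f} weeks")     # RMSE:  1.414 weeks
```

Scoring every model, whether written by a human team or generated by a chatbot, with the same functions on the same held-out data is what makes a head-to-head comparison of this kind meaningful.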

The scientists stress that generative AI, for all its promise, still requires careful human oversight. These systems can produce misleading or erroneous results, so human expertise remains essential for interpreting findings and ensuring scientific rigor. Even so, generative AI could significantly reduce the time researchers spend on tasks such as troubleshooting code, leaving them more time to interpret results and formulate scientific questions.

Dr. Tarca pointed to the broader implications. "Thanks to generative AI, researchers with a limited background in data science won’t always need to form wide collaborations or spend hours debugging code," he said. "They can focus on answering the right biomedical questions." Wider access to advanced data analysis of this kind could speed up research in many other fields.

The UCSF authors contributing to this study include Reuben Sarwal, Claire Dubin, Sanchita Bhattacharya, MS, and Atul Butte, MD, PhD. Other notable authors are Victor Tarca from Huron High School in Ann Arbor, MI; Nikolas Kalavros and Gustavo Stolovitzky, PhD, from New York University; Gaurav Bhatti from Wayne State University; and Roberto Romero, MD, D(Med)Sc, from the National Institute of Child Health and Human Development (NICHD).

Funding for this research was provided by the March of Dimes Prematurity Research Center at UCSF and by ImmPort. The data used in the study was partly generated with support from the Pregnancy Research Branch of the NICHD.