To rigorously compare performance, researchers designed a study in which identical tasks were assigned to distinct groups. Some teams relied solely on human expertise, while others paired scientists with AI tools. The central challenge was to develop predictive models for preterm birth using data from more than 1,000 pregnant women. Preterm birth is the leading cause of newborn mortality and a significant contributor to long-term developmental challenges in children, which makes identifying the factors behind it a research priority. In the United States alone, approximately 1,000 babies are born prematurely each day, underscoring the urgency of this research.
Remarkably, even a junior research duo, UCSF master’s student Reuben Sarwal and high school student Victor Tarca, succeeded in developing prediction models with AI assistance. The generative AI systems produced functional computer code within minutes, a task that typically takes experienced programmers several hours or even days. This advantage stemmed from the AI’s ability to write analytical code from concise yet highly specific prompts. Not all AI systems performed well: only four of the eight AI chatbots evaluated produced usable code. However, the successful systems did not require large teams of specialists to guide them.
The speed of generative AI allowed the junior researchers to complete their experiments, verify their findings, and submit their results to a journal within a few months, a stark contrast to traditional research timelines. "These AI tools could relieve one of the biggest bottlenecks in data science: building our analysis pipelines," said Marina Sirota, PhD, a professor of Pediatrics and interim director of the Bakar Computational Health Sciences Institute (BCHSI) at UCSF, and the principal investigator of the March of Dimes Prematurity Research Center at UCSF. "The speed-up couldn’t come sooner for patients who need help now." Dr. Sirota is the co-senior author of the study, which was published in Cell Reports Medicine on February 17.
Accelerating preterm birth research matters because better diagnostic tools could lead to earlier interventions and improved outcomes for both newborns and mothers. Despite extensive research, the precise causes of preterm birth remain incompletely understood. To investigate potential risk factors, Dr. Sirota’s team compiled microbiome data from approximately 1,200 pregnant women whose pregnancy outcomes were tracked across nine separate studies. Pooling data from so many women, and drawing on the expertise of many researchers, is crucial for tackling such complex biological questions. Tomiko T. Oskotsky, MD, co-director of the March of Dimes Preterm Birth Data Repository, associate professor in UCSF BCHSI, and a co-author of the paper, highlighted the significance of open data sharing in enabling this type of research.
However, analyzing such a vast and intricate dataset presented significant challenges. To address this, the researchers initially turned to a global crowdsourcing competition known as DREAM (Dialogue on Reverse Engineering Assessment and Methods). Dr. Sirota co-led one of the three DREAM pregnancy challenges, which specifically focused on vaginal microbiome data. The competition attracted over 100 teams worldwide, all developing machine learning models designed to identify patterns associated with preterm birth. While most teams completed their work within the three-month competition window, the process of consolidating findings and publishing the results took nearly two years.
Driven by the potential of generative AI to drastically shorten such timelines, Dr. Sirota’s group collaborated with researchers led by Adi L. Tarca, PhD, co-senior author and professor in the Center for Molecular Medicine and Genetics at Wayne State University in Detroit, MI. Dr. Tarca had previously led the other two DREAM challenges, which aimed to improve methods for estimating pregnancy stage. Together, the researchers instructed eight AI systems to independently generate algorithms using the same datasets from the three DREAM challenges, bypassing direct human coding.
The AI chatbots were given carefully crafted natural language instructions. Like conversational AI systems such as ChatGPT, they were guided through detailed prompts designed to direct their analysis of the health data in ways comparable to the original DREAM participants. Their objectives mirrored the earlier challenges: the AI systems were tasked with analyzing vaginal microbiome data to identify indicators of preterm birth, and with examining blood or placental samples to estimate gestational age. Accurate pregnancy dating is crucial because it determines the type of medical care women receive throughout pregnancy, and inaccuracies can complicate labor preparation.
The researchers then executed the AI-generated code using the DREAM datasets. The results were promising: four out of the eight AI tools successfully produced models that matched or even surpassed the performance of the human teams. The entire generative AI initiative, from its inception to the submission of a research paper, was completed in a remarkably short six-month period.
Scientists involved in the study emphasize that despite these impressive advancements, AI still requires careful human oversight. These systems are capable of generating misleading results, and human expertise remains indispensable for interpretation and validation. Nevertheless, by rapidly sifting through massive health datasets, generative AI has the potential to significantly reduce the time researchers spend on coding and troubleshooting, allowing them to dedicate more effort to interpreting results and formulating critical scientific questions.
"Thanks to generative AI, researchers with a limited background in data science won’t always need to form wide collaborations or spend hours debugging code," stated Dr. Tarca. "They can focus on answering the right biomedical questions." This democratization of advanced data analysis capabilities could accelerate research across numerous fields.
The study’s authors include UCSF contributors Reuben Sarwal, Claire Dubin, Sanchita Bhattacharya, MS, and Atul Butte, MD, PhD. Additional authors are Victor Tarca (Huron High School, Ann Arbor, MI), Nikolas Kalavros and Gustavo Stolovitzky, PhD (New York University), Gaurav Bhatti (Wayne State University), and Roberto Romero, MD, D(Med)Sc (National Institute of Child Health and Human Development, NICHD).
The research was funded by the March of Dimes Prematurity Research Center at UCSF and by ImmPort. The data used in this study were generated in part with support from the Pregnancy Research Branch of the NICHD. This multidisciplinary collaboration and funding reflect a significant commitment to advancing preterm birth research with the help of AI technologies.

