Research beyond silos: How CAIA is accelerating the fight against cancer

Every individual’s journey through the healthcare system represents a wealth of data. There’s a unique story in every test, scan, or treatment a patient receives. Rightfully, this data is closely safeguarded by healthcare institutions and is bound by essential privacy and regulatory guidelines. Unlocking clues in this data is one of the great opportunities in modern healthcare. AI has the potential to help, but only if it can learn from the experiences of millions. While each institution can build AI models from its own patients’ experiences, truly unique insights and patterns emerge only when millions of these stories are brought together.

These were the ideas that led to the formation of the Cancer AI Alliance (CAIA) in 2024. Four cancer centers joined forces with AI technology leaders to explore how researchers can collaborate across institutional silos while protecting patient privacy. CAIA’s goal is to accelerate the pace of cancer discovery tenfold by building a platform that allows researchers to securely share data and build AI models that will ultimately help diagnose, treat, and cure cancer.

How can some of the world’s leading cancer centers collaborate with technology pioneers who are aggressively competing to build the most powerful AI systems?

An alliance of this scale is possible because everyone involved is motivated by a single unifying idea: cancer is nonpartisan; it touches us all.

An alliance built on trust

Combining insights from the nation's top cancer centers was the first challenge to solve. Jeff Leek, who spearheaded early efforts to form the alliance, explains that when researchers have an idea and try to test a hypothesis, there is a massive amount of initial work involved. “It's all the build-up. It's getting a group of people together, it's getting the legal agreements in place. All of that groundwork is like the iceberg underneath the water,” says Jeff. That “iceberg” means that negotiating a single data-sharing agreement can take two to three years, dramatically slowing the pace of discovery.

CAIA came together in months rather than years through direct outreach to trusted colleagues. When the idea for an alliance between leading cancer centers came about, Jeff began making phone calls to data and scientific leaders at the cancer centers that now make up CAIA. At the same time, CAIA's early advisory board members reached out to their friends and counterparts at the world's leading technology companies, bringing them into the fold.

Bringing together experts in cancer research, data infrastructure, high-performance computing, and AI, CAIA focused on making data available across institutions for AI modeling without compromising patient privacy or violating legal and regulatory restrictions.

This led to the development of a secure platform to enable AI at scale across the alliance via federated learning.

Diverse data, smarter models

Brian Bot, who directs the Strategic Coordinating Center at CAIA, explains that AI models are often limited by the data they are trained on.

“The people who we see in Seattle are very different for a number of reasons than patients who are seen in Baltimore or who are seen in New York City.”
— Brian Bot, CAIA

A model trained in only one location may not work well for patients elsewhere. By building a model that learns from a diverse and representative sample of patients across the country, CAIA aims to make its insights more robust and equitable.

With a federated learning approach, AI models learn from diverse data sources without the data ever leaving its home institution. Take a real-world research inquiry: a scientist wants to study treatment outcomes for esophageal cancer in patients over 70. A single cancer center may not have enough patients in this specific group to conduct a meaningful study. With federated learning, instead of pooling private patient files, a central AI model is sent to each institution. The model learns from the local data, and then as Brian explains, “instead of sending any patient-level data back, the center sends only summaries and model weights—mathematical updates representing what the model learned.”

This updated, more intelligent model is then sent back to the centers, and the cycle begins again. Each iteration makes the model smarter and more accurate, all while the patient records themselves remain safely behind each institution’s firewalls.
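
For readers who want to see the cycle in miniature, here is a toy sketch of federated averaging, one common federated learning algorithm. It is an illustration only and not CAIA's platform: the three "centers", their synthetic numbers, and the function names (`local_update`, `federated_average`, `train`) are all assumptions made for this example. Each center fits a simple linear model on its own data and returns only the updated weights and a sample count, and the coordinator averages those weights.

```python
from typing import List, Tuple

Weights = Tuple[float, float]          # (slope, intercept) of a linear model
Dataset = List[Tuple[float, float]]    # (feature, outcome) pairs held by one center


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, steps: int = 10) -> Tuple[Weights, int]:
    """Runs inside one center: train on local data, return weights and a count.

    The raw (feature, outcome) pairs are never returned.
    """
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradient of the mean squared error on this center's data only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Runs at the coordinator: average weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    w = sum(wt[0] * n for wt, n in updates) / total
    b = sum(wt[1] * n for wt, n in updates) / total
    return (w, b)


def train(centers: List[Dataset], rounds: int = 50) -> Weights:
    """Repeat the cycle: send the model out, train locally, average, repeat."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in centers]
        global_weights = federated_average(updates)
    return global_weights


# Synthetic data for three hypothetical centers, all following y = 2x + 1.
# Each center sees a different slice of the feature range.
centers = [
    [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)],
    [(1.0, 3.0), (1.5, 4.0), (2.0, 5.0), (2.5, 6.0)],
    [(0.5, 2.0), (2.0, 5.0)],
]

w, b = train(centers)
print(f"slope={w:.3f} intercept={b:.3f}")
```

In this toy setup the shared model settles close to a slope of 2 and an intercept of 1 even though no center ever exposes its individual records. A production system would add protections this sketch leaves out, such as secure aggregation and access controls.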

A rising tide for cancer research

This approach will be particularly transformative for studying rare cancers. An individual institution may only see a handful of patients with a rare malignancy, which isn't enough to conduct a meaningful study. But by combining the learnings from patients across all CAIA centers, researchers can achieve the scale needed to uncover new patterns and shed light on new therapies.

In the past 10 months, CAIA has focused on establishing its federated learning infrastructure, harmonizing data across the centers, and navigating complex legal and regulatory challenges. The alliance has already begun training the first models for applications in clinical decision support and patient trajectory mapping. The goal is to deliver an initial proof of concept by the end of 2025, a remarkable pace given the scale of the undertaking. From there, the ambition is to grow. "If you can get four cancer centers to agree, then you've shown that this is possible," says Jeff. "It makes adding every new cancer center much, much easier."

Building CAIA is a team sport

CAIA’s work is defined by its collaborative spirit. The project reflects the leadership, intellectual contribution, and effort from all partners, creating what Jeff calls a "true team sport" built on a shared commitment to a mission that transcends institutional boundaries.

The next phase of CAIA’s vision is to scale the alliance to dozens of institutions, creating a vast, secure laboratory that empowers the world's best scientific minds. By moving beyond institutional silos, CAIA is building a future where collaboration unlocks the next generation of cancer discovery, treatment, and cures.


If you’d like to learn more about CAIA, subscribe to our newsletter for updates and follow us on LinkedIn and X.
