The role of data federation in cancer research: 4 Insights from the 2025 Deloitte AI Forum
At the 2025 Deloitte AI Forum, Pete Shimer (CAIA Executive Committee Chair) spoke with Brian M. Bot (Director of the Strategic Coordinating Center for CAIA; Director of AI and Data Science Partnerships, Fred Hutch) about our organization's mission to drive breakthroughs in cancer research.
They discussed the critical role of data federation in unlocking progress in cancer research. Our approach to data federation involves enabling AI insights across CAIA’s founding cancer centers without compromising patient data, privacy, or security.
CAIA places technology at the intersection of medicine, science, and data, enabling cancer centers to focus on generating new insights and advancing cancer discovery.
Below, we offer 4 takeaways from Pete and Brian’s conversation on CAIA’s data federation framework.
1. Data federation enables AI insights without compromising privacy
Data federation, as implemented by CAIA, is a framework aimed at aggregating AI insights across a distributed network of cancer centers without compromising patient data, privacy, or security.
CAIA’s data federation framework is underpinned by an AI training approach called federated learning.
This training approach allows AI models to leverage data and insights from multiple institutions while keeping sensitive, individual patient-level data secure and private at its source:
The system works by establishing “edge nodes” at each cancer center, so that individual patient data never leaves that center’s firewall.
Instead of moving data, an AI model is sent from an orchestration layer and is trained locally on the data that resides there.
Only the model updates or the summaries of those models are then sent back to the orchestration layer.
This happens iteratively across the federated network to update a global model while keeping all individual data secure and private.
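The loop described above can be illustrated with a toy sketch. This is not CAIA's implementation: the class and function names (EdgeNode, train_locally, aggregate, run_federation) are hypothetical, the model is a two-parameter linear fit rather than a clinical model, and the aggregation rule is sample-weighted averaging of model weights (often called federated averaging), which is assumed here because the conversation does not say which rule CAIA uses. The point it shows is that only model weights and sample counts cross the node boundary, never the records themselves.

```python
from dataclasses import dataclass
from typing import List, Tuple

Weights = Tuple[float, float]  # (slope, intercept) of a toy linear model


@dataclass
class EdgeNode:
    """Hypothetical stand-in for one center's local environment."""
    name: str
    _records: List[Tuple[float, float]]  # (x, y) pairs; never returned to callers

    def train_locally(self, global_weights: Weights,
                      lr: float = 0.05, steps: int = 20) -> Tuple[Weights, int]:
        """Run gradient descent on local data, starting from the global model.

        Returns only the updated weights and the local sample count.
        """
        w, b = global_weights
        n = len(self._records)
        for _ in range(steps):
            grad_w = sum(2 * (w * x + b - y) * x for x, y in self._records) / n
            grad_b = sum(2 * (w * x + b - y) for x, y in self._records) / n
            w -= lr * grad_w
            b -= lr * grad_b
        return (w, b), n


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Orchestration step: average node weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)


def run_federation(nodes: List[EdgeNode], rounds: int = 50) -> Weights:
    """Repeat: send the global model out, train locally, aggregate the updates."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [node.train_locally(global_weights) for node in nodes]
        global_weights = aggregate(updates)
    return global_weights


# Made-up data: both nodes hold points on the line y = 2x + 1,
# but each covers a different range of x.
nodes = [
    EdgeNode("center_a", [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]),
    EdgeNode("center_b", [(3.0, 7.0), (4.0, 9.0)]),
]
slope, intercept = run_federation(nodes)
print(f"global model: y = {slope:.3f}x + {intercept:.3f}")
```

In this sketch, the orchestration code only ever calls train_locally and sees its return value, so the global model is fit using both nodes' data without either node's records being copied anywhere. A production system would add pieces this sketch omits, such as secure transport, access controls, and privacy protections on the updates themselves.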
2. Federation fosters greater data diversity
Individual institutions may struggle to build effective AI models because their data lacks sufficient depth and diversity. A cancer center only has information about the patients who come to its clinics for treatment, so a model developed on data from a single institution can only learn about that subset of the population. This limits how well the model generalizes to other populations. For example, patients seen in New York City might differ significantly from those seen in Seattle.
Data federation in cancer research can overcome this limitation by enabling groups of researchers and institutions to work together with a more diverse data set. The key idea that led to the creation of the Cancer AI Alliance is that if cancer centers work together, then we can make demonstrable progress to accelerate treatments and cures.
As the old proverb says: to go fast, go alone; to go far, go together.
3. Data federation is a catalyst for collaboration
Healthcare institutions have a number of safeguards and regulations in place to secure patient data. These protections are necessary, but they also leave healthcare data siloed, which has made cross-institutional collaboration challenging. Data federation allows institutions to collaborate without moving patient data, helping them overcome institutional barriers and accelerate progress.
To set up our federated network, CAIA had to establish governance processes, legal agreements, technological infrastructure, and a common data mapping process across our founding cancer centers. By establishing this infrastructure to enable data federation, CAIA has achieved milestones that would be impossible through isolated efforts.
4. Data federation enables scalability and ease of expansion
Unlike centralized data projects that face massive logistical hurdles when adding new partners, the federated network is designed for expansion. Each Alliance member’s data source can be added to the federated framework as an additional edge node.
Ease of expansion is one of the core benefits of CAIA’s federated framework. Adding an edge node to the network still requires work, but the foundation that the initial CAIA institutions have laid will make adding new members easier. This groundwork will enable CAIA’s network to expand, making more data available to Alliance members and accelerating insights to fulfill its mission.
If you’d like to learn more about CAIA, subscribe to our newsletter for updates and follow us on LinkedIn and X.