Multi-cloud federated learning: The role of edge nodes at CAIA
CAIA’s multi-cloud federated learning approach allows participating cancer centers to retain control over their own data and enables them to operate on the cloud environments that work best for them.
In this blog post, we’ll take a look at the security, governance, and edge node architecture that make multi-cloud federated learning possible.
How edge nodes enable decentralized federated learning
While cancer centers maintain their own data and analytical environments, differing security and operating protocols prevent them from easily communicating. Traditionally, the solution has been to extract this data and pool it together in a centralized repository. However, pooling creates significant regulatory and legal risks and puts patient privacy at stake.
CAIA’s multi-cloud federated learning approach flips this traditional script with an interoperable, decentralized model. The federated learning platform leaves the data exactly where it is and sends AI models to the data instead. Installing secure edge nodes at each participating center is a key part of this process.
An edge node is a secure compute environment (like a computer or a server) that serves as a gateway between the cancer center and the rest of the Alliance. Within CAIA, edge nodes manage the information flowing into and out of a cancer center, and they train AI models locally. To enable this flow, each edge node must actively connect to the central orchestration layer of CAIA’s federated learning platform. Additionally, each scientific project has its own access controls governing who can do what within the federated environments.
When an AI model is ready for federated learning, the central orchestration layer distributes the model code down to each individual edge node. The node then hosts and runs this code locally within each center’s environment, training the model directly on its secure, local data.
Once this local training is complete, the edge node creates a summary of its findings and transmits it back to the orchestration layer. The orchestration layer aggregates the summaries from all participating edge nodes and combines them into an improved global model.
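The distribute, train, and aggregate loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say which aggregation method CAIA uses, so the sketch uses sample-weighted federated averaging, and the "model" is a single weight fitted to toy numbers. The names EdgeNode, train_locally, aggregate, and run_rounds are hypothetical and do not come from CAIA’s platform.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical stand-in for one center's private records: (feature, label) pairs.
Dataset = List[Tuple[float, float]]


@dataclass
class EdgeNode:
    """Illustrative edge node; the fields are assumptions, not CAIA's schema."""
    name: str
    local_data: Dataset  # stays inside the center's environment

    def train_locally(self, weight: float, lr: float = 0.05,
                      epochs: int = 20) -> Tuple[float, int]:
        """Fit y = weight * x by gradient descent on local data only.

        Returns the updated weight and the local sample count. These two
        numbers are the only "summary" sent back to the orchestrator.
        """
        n = len(self.local_data)
        for _ in range(epochs):
            # Gradient of the mean squared error with respect to the weight.
            grad = sum(2 * (weight * x - y) * x for x, y in self.local_data) / n
            weight -= lr * grad
        return weight, n


def aggregate(updates: List[Tuple[float, int]]) -> float:
    """Sample-weighted average of the nodes' weights (federated averaging)."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def run_rounds(nodes: List[EdgeNode], rounds: int = 10) -> float:
    """Orchestrator loop: send the global weight out, average what comes back."""
    global_weight = 0.0
    for _ in range(rounds):
        updates = [node.train_locally(global_weight) for node in nodes]
        global_weight = aggregate(updates)
    return global_weight


# Toy data: both centers hold samples of the same relationship, y = 3x.
nodes = [
    EdgeNode("center_a", [(1.0, 3.0), (2.0, 6.0)]),
    EdgeNode("center_b", [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)]),
]
final_weight = run_rounds(nodes)
```

In this toy run the orchestrator only ever sees each node’s weight and sample count, never the (feature, label) pairs themselves, and the global weight settles close to 3.0.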
By managing this exchange via edge nodes, CAIA ensures that a cancer center’s specific choice of cloud environment is not a barrier to collaborative cancer research.
Governance and security standards for multi-cloud AI training
To establish how the edge nodes would interact with the federated learning platform, the CAIA team conducted deep-dive information security reviews. The team also coordinated with institutional IT, security, and legal offices at each cancer center to ensure compliance with regulatory standards. Additionally, CAIA developed a governance framework to identify specific use cases for the platform, assign personnel, and validate appropriate access.
Key security features of the CAIA governance framework
Data selection
Each research project has its own access controls and its own subset of the data resources, available only to that project’s team.
Granular access control
Cancer centers can mandate that only pre-approved code is run, ensuring that only specific, validated AI models can be trained on their data.
Local compute execution
All model training is executed locally in cancer centers’ environments, governed by their existing security protocols.
Security and privacy
Enhanced protection technologies, including differential privacy, model encryption, and confidential computing, are available for specific projects to ensure the privacy and security of sensitive patient data and trained models.
Exploratory analysis tools
Before any model training begins, researchers perform secure, localized data preparation and exploratory analysis, aligning each site’s data with a common data model such as OMOP to ensure compatibility across different systems. Participating teams can evaluate data distributions, identify potential outliers, and standardize variables across sites so that the data is uniform and high-quality while sensitive clinical data stays secure.
Activity logging
Edge nodes automatically maintain complete local logs of every action performed on their data, providing cancer centers with the transparency needed for thorough monitoring and independent quality control.
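Two of the features above, pre-approved code and activity logging, could be combined on an edge node roughly as follows. This is a minimal sketch, not CAIA’s implementation: the class EdgeNodeGate, its methods, and the log fields are all hypothetical, and the hash chaining that makes the log tamper-evident is an assumption added for illustration, since the post only says that logs are complete and local.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, List, Set


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class EdgeNodeGate:
    """Hypothetical gatekeeper: accepts only pre-approved code, logs every decision."""

    def __init__(self, approved_hashes: Set[str]) -> None:
        self.approved_hashes = approved_hashes
        self.log: List[Dict[str, str]] = []

    def _record(self, action: str, detail: str) -> None:
        # Each entry stores the previous entry's hash, so later edits to the
        # log can be detected (an illustrative assumption, see lead-in).
        prev = self.log[-1]["entry_hash"] if self.log else ""
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "detail": detail,
            "prev_hash": prev,
        }
        entry["entry_hash"] = sha256_hex(json.dumps(entry, sort_keys=True).encode())
        self.log.append(entry)

    def submit(self, project: str, code: bytes) -> bool:
        """Return True only if the code's digest is on the center's allowlist."""
        digest = sha256_hex(code)
        allowed = digest in self.approved_hashes
        self._record("accepted" if allowed else "rejected", f"{project}:{digest}")
        return allowed

    def verify_log(self) -> bool:
        """Recompute the chain and report whether every entry is intact."""
        prev = ""
        for entry in self.log:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev:
                return False
            if sha256_hex(json.dumps(body, sort_keys=True).encode()) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True


# Usage: the center approves one model script; anything else is refused.
approved_model = b"def train(data): ..."
gate = EdgeNodeGate({sha256_hex(approved_model)})
accepted = gate.submit("project-a", approved_model)
rejected = gate.submit("project-a", b"import os")
```

Both the accepted and the rejected submission end up in the local log, so the center can review what was attempted as well as what actually ran.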
If you’d like to learn more about CAIA, subscribe to our newsletter for updates and follow us on LinkedIn and X.