AI and Data Centres
The Artificial Intelligence revolution is powered by hyperscale data centres, which house massive clusters of specialized compute (GPUs, NPUs, FPGAs and other accelerators), storage arrays, and incredible volumes of training data. The security of this AI supply chain is paramount, from the silicon to the complex orchestration software. Equally important are the physical infrastructure and operational technologies providing power, cooling, and building management.
Security flaws in the firmware of servers, accelerators, and networking gear can expose highly valuable model weights and sensitive training data to theft, manipulation (such as data poisoning), or sabotage. Given the immense financial and intellectual property value concentrated in these facilities, a foundational hardware or firmware vulnerability can lead to nation-state-level breaches and catastrophic business disruption.
Meanwhile, the underlying physical infrastructure relies on power distribution (UPS), thermal management (liquid cooling), and Building Management Systems (BMS) running embedded firmware that is rarely audited to modern standards. In these complex “systems of systems”, a vulnerability in a $50 cooling controller can take a $500M compute cluster offline.
Services
Tetrel provides crucial assurance for vendors, cloud providers, and users of AI, high-performance computing, storage, and networking accelerators. We specialize in hardening the most fundamental components of the data centre infrastructure on which AI applications rely.
We deliver comprehensive services focused on foundational security for hardware and firmware, including:
- Deep-Dive Firmware and RTL Review: Assessing boot processes, trusted execution environments, confidential computing, secure management access (e.g., BMC), and cryptographic implementations on all core devices.
- Unified Supply Chain Risk Assurance: Evaluating the security posture of third-party components and ensuring robust security implementations are in place. Tetrel provides security assurance that bridges the IT/OT divide:
  - Facility-Level: Designing and auditing critical infrastructure networks to the IEC 62443 standard, establishing strict Security Level Targets (SL-T).
  - Silicon-Level: Validating the exact components used in those facilities (servers, accelerators, cooling controllers, and more) through OCP S.A.F.E. and Caliptra Trademark Audit assessments.
Selected Publications
A curated list of publications and presentations by our team:
- Trends in Server Platform Security - Platform Security Summit 2019
- Secure Firmware Development Best Practices
- Much Ado About Hardware Implants
- A Case for a Trustworthy BMC (Cloud Security Industry Summit)
- Importance of Embedded Systems Security Requirements
- OCP Common Security Threats v1.0
- 2020 OCP Virtual Summit - Panel Discussion: CSIS Security
- Secure Device Manufacturing: Supply Chain Security Resilience