Role: Senior Cloud Architect - GPU Infrastructure & AI Platforms
Function: Cloud Infrastructure & Platform Engineering
Location: Singapore
Type: Full-time
Compensation: Not specified
Industry: Information Technology & Services
About Company
We are a deep-tech startup building the next generation of AI-native infrastructure. As a Zero-Trust and Confidential-by-Design hyperscaler, we are developing the Kluisz Secure Fabric™ to redefine how the world handles massive-scale compute.
We are looking for a visionary Senior Cloud Architect to lead the design, deployment, and operation of our large-scale GPU cloud infrastructure in Singapore. You will own the end-to-end GPU platform architecture—from hardware and cluster design to Kubernetes scheduling and customer solutioning.
Position Overview
You will architect and deploy clusters of 2,000+ GPUs that power the next generation of AI infrastructure, bridging specialized hardware teams and AI engineers to deliver production-ready cloud solutions. This role defines the technical direction of a hyperscaler fleet from the ground up, with high ownership and direct impact on global cloud infrastructure.
Role & Responsibilities
- Design and evolve large-scale GPU clusters (2,000+ GPUs) for AI training and inference workloads
- Define GPU server topology including PCIe/NVLink configurations and high-speed networking architecture
- Architect and oversee Kubernetes-based GPU platforms with scheduling, isolation, and multi-tenant strategies
- Support deployment and optimization of AI models while troubleshooting complex distributed training issues
- Evaluate and select GPU, server, and networking vendors, and provide technical input for RFPs
- Lead customer engagements for solution architecture and provide technical leadership
- Mentor infrastructure and platform engineers while driving capacity planning initiatives
Must Have Criteria
- 10-15 years of experience in cloud infrastructure, platform engineering, or systems architecture
- Proven experience designing or operating GPU clusters with at least 2,000 GPUs
- Strong production experience running GPU workloads on Kubernetes, including resource scheduling
- Deep understanding of GPU architecture and compute optimization across different GPU vendors
- Experience with high-performance networking protocols (InfiniBand or RoCE) for GPU clusters
- Hands-on experience with distributed storage systems optimized for AI workloads
- Experience in customer-facing solution architecture or technical leadership roles
Nice to Have
- Experience with AI/ML frameworks and large-scale training platforms (PyTorch, TensorFlow)
- Exposure to multi-cloud or hybrid cloud environments
- Track record of improving GPU utilization and cost efficiency at scale
- Experience with container orchestration tools beyond Kubernetes (Docker Swarm, Nomad)
- Background in hyperscaler or cloud service provider environments
What We Offer
- Opportunity to build next-generation AI infrastructure from the ground up
- Leadership role in a cutting-edge Zero-Trust and Confidential-by-Design hyperscaler
- Work with state-of-the-art GPU technology and secure fabric architecture
- High ownership and impact in shaping global cloud infrastructure
- Collaborative environment focused on redefining massive-scale compute