Role: Senior GPU Platform Engineer
Function: Infrastructure Engineering
Location: Singapore
Type: Full-time
Industry: Cloud Infrastructure, AI/ML, Enterprise Software
About Company
A deeptech startup developing an AI-native private cloud platform. The company automates orchestration, security, compliance, and workload management for enterprise environments.
Based in Bangalore with a team of 30 talented professionals. Backed by $9.6 million in seed funding from RTP Global.
Position Overview
You'll architect and optimize GPU infrastructure for an AI-native cloud platform that powers enterprise workloads at scale. You'll build next-generation cloud infrastructure from the ground up while working with cutting-edge GPU technologies. This role offers high ownership in designing distributed systems that run large-scale GPU computing environments, tuned for performance, reliability, and cost efficiency.
Role & Responsibilities
- Design and implement scalable GPU infrastructure systems for AI workloads and high-performance computing environments
- Optimize distributed GPU computing environments for performance, reliability, and cost efficiency at enterprise scale
- Build monitoring, orchestration, and automation tools to manage GPU performance at large scale
- Develop infrastructure APIs and services for intelligent GPU resource allocation and workload scheduling
- Collaborate with global platform teams to integrate GPU resources with cloud orchestration systems
- Drive technical excellence in GPU platform architecture and distributed systems design patterns
- Implement performance optimization strategies for compute-intensive AI and ML workloads across hybrid cloud environments
Must Have Criteria
- 5-12 years of software engineering experience in HPC, network infrastructure, or cloud platforms
- Hands-on experience with GPU infrastructure management and optimization (NVIDIA CUDA, A100/H100, or Google TPUs)
- Proven experience building distributed systems handling 100K+ concurrent sessions or 30M+ monthly active users
- Strong programming skills in systems languages (Go, Java, Python, or Rust) with focus on performance optimization
- Production experience with container orchestration platforms (Kubernetes, Docker) and microservices architecture
- Experience at hyperscale technology companies (Meta, Google, AWS, Microsoft, Grab, GoTo, Traveloka)
- Proven track record of leading technical initiatives or mentoring engineering teams in infrastructure domains
Nice to Have
- Deep experience with NVIDIA GPU architectures (V100, A100, H100) and CUDA optimization techniques
- Knowledge of GPU cluster management and distributed training frameworks (Horovod, DeepSpeed, Ray)
- Experience with high-performance networking (InfiniBand, RDMA) for GPU clusters and data center interconnects
- Familiarity with GPU virtualization technologies (NVIDIA MIG, vGPU) and multi-tenancy solutions
- Experience with performance profiling and optimization tools for GPU workloads (Nsight, nvprof)
What We Offer
- Opportunity to build cutting-edge AI-native cloud infrastructure from scratch with global impact
- High ownership and technical leadership in a fast-growing deeptech startup backed by top-tier VCs
- Work with seasoned tech veterans and industry experts on globally relevant engineering challenges
- Competitive compensation package with significant equity upside in a high-growth startup
- Flexible work environment with access to latest GPU technologies and cloud infrastructure tools