Infrastructure Platform Engineer

Fidel Consulting KK
📍 Tokyo (Remote), Japan 💼 Full-time 🕒 Posted February 28, 2026

Job Description


Appealing points:


End-to-end ownership of production GPU cluster platforms: lead architecture, provisioning, standardization, and lifecycle management of GPU clusters running mission-critical workloads.
Build reliability and operational excellence into large-scale systems: design SLOs/SLIs, observability, safe rollouts, and incident response workflows with a strong automation-first mindset.
Work on performance-sensitive, heterogeneous infrastructure at scale: tackle challenges in capacity planning, utilization optimization, and reliability across diverse GPU generations and platforms.


Annual Salary: 8 million yen and above

Job Responsibilities:


Own GPU cluster architecture and operations: provisioning, node images, driver/runtime lifecycle, GPU plugin/operator lifecycle, and standardized deployment patterns for serving pools and system services.
Define and maintain the production baseline: golden node con...

Ready to Apply?

Submit your application today and join our talented team at Fidel Consulting KK.


Job Details

  • Location: Tokyo (Remote)
  • Job Type: Full-time
  • Category: other-general
  • Posted Date: February 28, 2026
  • Application Deadline: April 09, 2026