Training, Process Management Engineer
OpenAI
Hiring Activity Score: 83 (Very Active, 80-100)
- Base score
- Just posted (1d, flat to 14d)
- Has location and a quality description (6515 chars)
- 417 new listings in 30d (×0.98 age 1d)
- 4 skills
- Tier 1 reputable (OpenAI) ×0.98 age 1d
- Low confidence (30%)
- Direct ATS (ashby)
London, UK
First seen 1 day, 15 hours ago
Last seen 3 hours, 39 minutes ago
Job Description
AI Summary
• Develop and maintain distributed process management software in Rust that orchestrates ML training workloads across massive clusters with hundreds of thousands to millions of machines and accelerators
• Design high-performance, scalable systems that balance researcher productivity with hardware efficiency, focusing on reliability, observability, and fault tolerance for long-running distributed jobs
• Profile, optimize, and debug complex distributed systems across supercomputers while adapting to evolving ML infrastructure needs and novel challenges at frontier scale
• Requires strong systems engineering expertise, proficiency in Rust and Python, and demonstrated ability to design end-to-end platforms with deep understanding of high-performance distributed architectures
• Based in London, UK with hybrid work (3 days/week in office) and relocation assistance available; role sits within OpenAI's Training Runtime team building core infrastructure for AI research and model training
Skills
rust
python
aws
go
Job Information
- Company: OpenAI
- Location: London, UK
- Job Type: Full-Time
- Work Location: Remote
- Experience Level: Mid
- Source: Ashby
- Status: Active
Activity Score: 83/100 (Very Active)
Higher scores indicate a greater likelihood of active hiring, based on listing freshness, company activity, and other signals.
More from OpenAI
- Senior Staff Backend Software Engineer, Codex for Finance (San Francisco)
- Staff / Senior Staff Product Engineer, Full Stack - Codex for Finance (San Francisco)
- Manager, AI Deployment Engineering (Korea) (Seoul, South Korea)
- Solutions Architect, Digital Natives (San Francisco)
- AI Systems Engineer, Codex Agents (San Francisco)