DevOps Engineer
Added 2 days ago
  • England, London, City of London
  • Full Time, Permanent
  • £85,000 per annum, negotiable
Job Description:
DevOps Engineer - Reinforcement Learning Platforms
We are seeking an experienced DevOps Engineer to help build and scale a web-based platform for reinforcement learning (RL) training and RLOps. You will design, implement, and maintain the cloud infrastructure, CI/CD pipelines, and deployment systems that support large-scale RL workloads.
Responsibilities
* Design and manage scalable cloud infrastructure for high-performance RL training and distributed environments
* Build and optimise CI/CD pipelines for open-source and enterprise components
* Implement containerisation and orchestration using Docker and Kubernetes
* Develop Infrastructure as Code solutions (Terraform, CloudFormation, Pulumi)
* Implement monitoring, logging, and alerting for distributed ML systems
* Collaborate with ML teams on resource optimisation and cost efficiency
* Apply security best practices, manage access controls, and ensure compliance
* Automate operational tasks: backups, disaster recovery, maintenance
* Support GPU clusters and distributed compute resources for RL workloads
* Maintain availability and performance of production ML systems
Requirements
* Degree in Computer Science/Engineering or 3+ years of DevOps/infrastructure experience
* Strong background with AWS, GCP, or Azure, including ML/AI workloads
* Proficiency with Docker, Kubernetes, and ML-focused orchestration
* Experience with Terraform/CloudFormation/Pulumi and configuration management
* Solid understanding of CI/CD tools (GitHub Actions, GitLab CI, Jenkins)
* Knowledge of monitoring/observability tools (Prometheus, Grafana, OpenObserve)
* Experience with GPU infrastructure and distributed ML compute frameworks
* Familiarity with MLOps tools and model lifecycle management
* Strong scripting skills (Python, Bash)
* Understanding of cloud networking, security, and database fundamentals
* Experience with HPC environments or schedulers is a plus
* Strong problem-solving and communication skills
Compensation & Benefits
* Stock options
* 30 days’ holiday plus bank holidays
* Flexible and remote working options
* Enhanced parental leave
* £500 annual learning and development budget
* Pension scheme
* Regular socials and quarterly gatherings
* Bike-to-Work scheme
Job number 3193729

Company Details:
Matchtech
Company size: 250–499 employees
Industry: Other
We’re motivated by our mission to bridge the STEM skills gap. Since our doors opened in 1984, Matchtech has grown to become one of the UK’s ...