
Lead Platform Engineer - Site Reliability at Ometria (London, UK)

Job Description

Lead Platform Engineer - Site Reliability
London, England, United Kingdom · Engineering

Viribus are partnered with a fast-growing AI company called Ometria. They are backed by top VC funds and successful entrepreneurs and are at a really exciting time in their growth, having closed their Series A funding.

As part of their growth and plans for scaling internationally, they are looking for a Lead Platform Engineer to manage the operational aspects of their AI-powered platform, which collects and aggregates data from all customer touch points in real time and uses machine learning to profile customers. You will work with the engineers and co-ordinate developer operations activities.

They're looking for someone to:

- Foster and champion a development operations culture within engineering and the wider company.
- Lead and build (along with the VP of Engineering/CTO) a platform engineering team that focuses on site reliability engineering.
- Define the infrastructure/operations strategy with the CTO and engineering teams.
- Manage the operational aspects of the platform: change management, support/escalations and incident management.
- Mentor and co-ordinate developer operations activities across the teams.

What you'll be doing:

- Infrastructure Strategy - Define the infrastructure strategy for the platform with the CTO. This includes the production, test and development environments and the overall solution design. Ensure that the system can meet both the scalability and reliability demands of our clients as our business grows.
- Co-ordinate Developer Operations - Co-ordinate activities across the teams in line with the infrastructure strategy. You'll mentor the engineers on operations activities and various technologies (Kubernetes, Docker, Prometheus etc.). This includes their data platform and database management systems (Postgres, Redshift) and other infrastructure components (e.g. queuing technologies, load balancing). You'll also own the change management process with the CTO.
- Manage System Qualities, System Design - Manage their AWS cloud infrastructure from a performance, security, cost, robustness and reliability perspective. You'll work with the teams to ensure that the designs of major features and new services have suitable non-functional requirements defined. You'll ensure that appropriate runbooks are in place for highly available systems.
- Manage Support & Escalations, Incidents and Root Cause Analysis - Manage the support and escalations processes and the issues raised by the customer support team. You'll work with the engineering teams to ensure timely and appropriate responses with a high level of communication. You'll manage the incident logging process with the engineers and perform post-mortems with the teams when incidents occur.

You should apply if:

- You have at least five years' experience building out infrastructure solutions, with at least two (ideally) spent working with non-trivial SaaS platforms.
- You have experience in either the Python or Golang language and ecosystem.
- You have an excellent understanding of Kubernetes concepts and experience of running it in production (we use Tectonic).
- You have an in-depth understanding of cloud native applications in an AWS environment and of Linux systems.
- You have experience with provisioning tools (Terraform preferred).
- You have experience of working with relational databases from an operational perspective (Postgres a plus).
- You have experience with application performance monitoring tools (Prometheus, StatsD), security monitoring etc.
- Technical Leadership - you have led a team before, mentored more junior engineers and worked with a degree of autonomy.
- Communication - you have excellent communication skills, both verbal and written.
- Attention to detail - you take pride in your work and you don't cut corners. You are able to think at a detailed level, but also step back and see the bigger picture. You understand the tension between attention to detail and the pragmatic need for timely delivery of business needs, and you are able to prioritise accordingly along with your team.
- Creativity, passion and knowledge - you share these interests and are hungry to learn about what others are doing, as well as to get immersed in what Ometria does and the market it's in. You have a passion for the latest development and deployment technologies and will help us shape our infrastructure to maintain our competitive edge.
- Problem solver - you enjoy working both in a team and independently to solve problems in a robust and scalable way.

Benefits:

Here are a few of the benefits that make them a great place to work:

- Unlimited holiday
- Regular socials and activities
- Committees and sports clubs
- Subsidised gym membership
- Personal development budget
- Personal wellness fund
- Salary sacrifice scheme for cycle to work and personal electronics
- Fully stocked kitchen with breakfast, snacks and drinks
- Your choice of equipment