Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions that support the broader engineering organization. Among these are our multi-cloud-provider Kubernetes infrastructure, networking, load balancing (including our public-facing edge and internal service mesh), and observability and alerting systems.
The ideal candidate should:
- Have 6+ years of experience in software development and operating distributed systems
- Be proficient in Python, Go, or a similar language
- Have proven experience building and operating large-scale continuous integration and continuous deployment (CI/CD) pipelines
- Possess a customer-focused mindset
- Value efficiency in processes and operations
- Prefer automation over manual processes (“allergic to ops work”); we are a small team of software engineers with a strong bias toward software solutions that avoid toil
- Have experience using and extending containerization technologies, particularly Kubernetes, to enhance application agility, optimize resource utilization, and accelerate time-to-market
- Have expertise in cloud infrastructure platforms such as AWS, Google Cloud Platform (GCP), or Azure
- Understand Linux operating system internals and networking concepts (e.g., TCP/IP, DNS, TLS, routing)
Expectations
- Contribute to developing a world-class continuous deployment experience, enabling the rapid and reliable shipment of MongoDB products
- This includes, but is not limited to, contributing to open-source projects or engineering software-based approaches, such as Kubernetes operators, to streamline processes
- Own the onboarding flow other engineering teams follow when launching a new product or service
- Collaborate with other teams within Platform Engineering to ensure a consistent service-onboarding experience
- Provide internal support for our deployment systems, including answering questions and addressing issues
- Participate in a 24/7 on-call rotation to resolve issues involving the deployment infrastructure