is looking for a Reliability Engineer to join our Reliability team. This role will be integral to ensuring the robustness and dependability of our platform, impacting millions of users globally.
Maintain a comprehensive understanding of our service architecture and its dependencies.
Identify and mitigate risks associated with tightly coupled services and complex interconnections.
Lead service re-architecture initiatives to improve reliability and scalability.
Review new services and ensure they meet our reliability standards.
Advocate for Chaos Engineering, collaborate with R&D teams, build tools and environments, and improve system resilience.
Manage the full lifecycle of reliability tools and services, adhering to our architectural guidelines.
Collaborate with teams to define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) that align with business goals and user expectations.