The Role
In this role, you will be at the forefront of maintaining the health and efficiency of our SaaS infrastructure. You will monitor system performance, identify potential issues, and proactively implement solutions before they affect our customers. Automating repetitive tasks and streamlining operations will be key to your success, allowing you to focus on scaling our systems as the business grows. You will collaborate with development and product teams to ensure that new features are reliable and scalable. You will also manage incident response, conduct root cause analysis, and implement long-term solutions to prevent recurrence. Finally, you will help improve deployment processes, reduce downtime, and ensure seamless updates.
What you bring:
Strong experience with cloud platforms (GCP, Azure, AWS)
Proficiency in scripting and automation (Python, Bash, etc.)
7+ years of experience as a Site Reliability Engineer or in a similar role
Deep understanding of monitoring, logging, and alerting best practices
Excellent problem-solving skills and a proactive mindset
What we consider beneficial:
Experience with containerization, orchestration, and infrastructure-as-code tools (Docker, Kubernetes, Helm, Terraform)
Familiarity with CI/CD pipelines (e.g., Jenkins) and DevOps practices
Understanding of networking and security in cloud environments
Knowledge of database management and optimization
A passion for learning new technologies and improving system resilience
Job Segment: Cloud, ERP, Database, Application Engineering, Supply Chain, Technology, Engineering, Operations