So, what’s the role all about?
- Running the production environment by monitoring availability and taking a holistic view of system health
- Building software and systems to manage platform infrastructure and applications
- Improving reliability, quality, and time-to-market of our suite of software solutions
- Measuring and optimizing system performance, to push our capabilities forward, get ahead of customer needs, and innovate to improve continually
- Providing primary operational support and engineering for multiple large distributed software applications
How will you make an impact?
- Gathering and analyzing metrics from both operating systems and applications to assist in performance tuning and fault-finding
- Partnering with development teams to improve services through rigorous testing and release procedures
- Participating in system design consulting, platform management, and capacity planning
- Creating sustainable systems and services through automation and uplift
- Balancing feature development speed and reliability with well-defined service level objectives
Have you got what it takes?
- At least three years of work experience in a production environment
- Experience with AWS development and solution architecture
- Linux system administration skills
- Ability to program (structured and OO) in one or more high-level languages, such as Python, Java, Ruby, or JavaScript
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
You will have an advantage if you also have:
- Experience with NoSQL and streaming technologies in a big data environment (Kafka, Flink, DynamoDB)
- Experience with monitoring and log analysis tools such as ELK, Prometheus, Grafana, or Datadog
- Experience with IaC tools (Terraform)
- Experience with Docker and Kubernetes technology
- Bachelor’s degree in computer science or another highly technical or scientific discipline
- Previous success in technical engineering
Reporting into: Group Lead, Engineering, Actimize
Individual Contributor