Job responsibilities
- Regularly provides technical guidance and direction to support the business and its technical teams, contractors, and vendors
- Develops secure and high-quality production code, and reviews and debugs code written by others
- Drives decisions that influence the product design, application functionality, and technical operations and processes
- Serves as a function-wide subject matter expert in one or more areas of focus
- Actively contributes to the engineering community as an advocate of firm-wide frameworks, tools, and practices of the Software Development Life Cycle
- Influences peers and decision-makers to consider the use and application of leading-edge technologies
- Adds to the team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification on site reliability engineering concepts and 10+ years of applied experience in building SRE and observability solutions to solve specific business problems
- Practical knowledge of the OpenTelemetry (OTel) specification and observability tools such as Dynatrace, Jaeger, Grafana, Cortex, and Prometheus
- Hands-on practical experience architecting and delivering system design, application development, testing, and operational stability
- Practical hands-on experience coding in languages such as Python
- Experience leading technology projects and managing technologists
- Proficient in application, data, and infrastructure architecture disciplines
- Proficient in cloud-native architecture, design, and implementation across all systems
- Advanced knowledge of software applications and technical processes with considerable in-depth knowledge in one or more technical disciplines (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)
- Ability to tackle design and functionality problems independently with little to no oversight
- Practical experience designing and deploying applications on AWS
- Knowledge of microservices, Spring Boot, web UI applications, and observability solutions for each layer
Preferred qualifications, capabilities, and skills
- Experience with batch controllers (AutoSys, Control-M, Airflow) is a plus
- Experience with batch observability solutions
- Data engineering experience on big data platforms such as Databricks and Snowflake