You’ll join a diverse team of software and security experts, operations managers, and people in other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety, security, and availability. You’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
Key job responsibilities
Define and Deliver Business Priorities
You will be a key contributor and owner of the direction of the global AWS Incident Response team. You will define, plan, track and deliver on strategic goals for the team, while ensuring that the team remains unblocked and focused.
Cross-Site, Cross-Team Coordination
You will be responsible for coordinating with your counterparts to ensure that a clear communication channel exists between AWS Operations teams. You will also work closely with systems and product teams to create and maintain a proper process for monitoring and alarming on services. A portion of this process will include establishing both solid operational acceptance criteria and a concrete feedback loop for resolving deviations from that process.
Incident/Change Management
Performance Management/Team Health
You will own all facets of performance and career management for the team.
- 5+ years of direct experience with cloud hosting technologies (AWS, Azure, etc.)
- 5+ years of experience managing an engineering team operating at scale.
- Deep understanding of infrastructure delivered through the software development lifecycle in an API-enabled environment – including agile development, software /patterns, and modern cloud services.
- Experience in implementing, supporting, and evaluating tools and services with a security, scalability, and performance mindset
- Ability to handle multiple competing priorities in a fast-paced environment
- Ability to interact with and influence people at all levels.
- Excellent written and verbal communication skills and ability to get ideas across to the team, peers and customers.
- Strong understanding of fundamental operational best practices such as monitoring, alerting, deployment and change policies (ITIL a plus)
- Experience running agile frameworks or other workflow methodologies in a DevOps setting.
- Experience dealing with customers during issue resolution and operating under pressure.
- Routine communication of status to senior management
- SLA definition and refinement
- Goal-setting for reduction and elimination of customer facing defects
- Leading post-mortem analysis, including ensuring a high quality bar for analysis and follow through of consequent action items