Experience managing a team of engineers by setting clear expectations, keeping team members energized, and delivering great results.
Software engineering experience, including delivering highly available cloud-based services and platforms.
Experience in designing, analyzing, and troubleshooting distributed systems.
Minimum of a Bachelor's degree in Computer Science, Computer Engineering, Software Design, Software Engineering, or a related field, or equivalent alternative education, skills, and/or practical experience.
Preferred Qualifications:
Prior hands-on development experience, including leading architecture for large-scale systems, designing and writing code, and coding in C++.
Prior experience and/or subject matter expertise in REST, service reliability, Network Security, Encryption and Authentication.
Prior experience and/or subject matter expertise in Storage or File-Systems, Networking, Distributed Systems, Operating Systems and/or Applications at scale.
Other Requirements:
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Responsibilities
Manage the team: Attract and retain talent while supporting the growth and career development of team members.
Create a quality-first team culture that directly contributes to service resiliency capabilities as part of the stack.
Research and deep-dive into technical issues and bring clarity to the team as needed.
Own planning and ensure top deliverables are met for the success of Microsoft, Azure Storage, and the team.
Contribute to the technical direction and approach alongside other managers across geographic locations.
Act as a Designated Responsible Individual (DRI) and guide other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate.