Responsibilities
- Team Leadership: Lead a team of Engineers and Analysts, providing guidance, mentoring, and performance management. Foster a culture of continuous improvement.
- Process Improvement: Develop and refine processes that streamline workflows, reduce bottlenecks, and increase overall velocity.
- Training and Support: Provide training and documentation to other team members on the effective use of existing toolsets and best practices. Act as the primary point of contact for staff issues.
- Collaboration and Communication: Work closely with partner teams to understand their pain points and automation goals, and help them work more efficiently.
- Continuous Improvement: Participate in or lead continuous improvement projects driven by automation.
- On-Call Rotation: Participate in an on-call rotation as needed.
- Additional Duties: Perform any other activities as directed by management.
Knowledge and Experience
- Experience: 5+ years of experience as a people manager or in a team lead role with delegation duties in a technical environment.
- Strategic Thinking: Demonstrated ability to think strategically about business and product goals as well as technical challenges and staff workloads.
- Software Development: Prior experience with software development, infrastructure development, or operations.
- Scripting Languages: Strong proficiency in scripting languages such as Bash, Python, and/or PowerShell.
- Relational Databases: Experience with relational databases.
- Server Administration: Strong proficiency with Linux and Windows Server administration.
- Automation Frameworks: Experience in architecting automation frameworks with proficiency in tools such as Jenkins, Chef, Puppet, Ansible, or similar.
- Agile Methods: Experience with Agile methods (Scrum/Kanban) for organizing project deliverables and tracking progress (Jira).
- Version Control: Experience with Git and/or code repository services (BitBucket, GitHub, etc.).
- Cloud Services: Experience with open-source technologies and cloud services (AWS/Azure).
- Monitoring Tools: Experience with monitoring and alerting tools (Splunk, Nagios, BigPanda, PagerDuty).
- Infrastructure as Code: Experience with infrastructure as code (Terraform, CloudFormation).
- Container Technology: Knowledge of and exposure to container technology and orchestration is a plus.
- API Interaction: Experience interacting with REST APIs (GET/POST requests), webhooks, and API client tools (Postman).
- Problem-Solving: Excellent problem-solving and troubleshooting skills.
- Documentation: Process-oriented with great documentation skills (Confluence).
- Data Structures: Experience with structured data and configuration formats such as XML, JSON, YAML, and HCL.
- Business Continuity: Experience with automation of business continuity/disaster recovery.
Preferred
- Production Operations: Experience with managing production operations, monitoring, alerting, notifications, etc.
- Coding Experience: Moderate coding experience in any combination of Perl, Ruby, Bash, PowerShell, Java, or other languages.
- Scheduling Tools: Experience with Rundeck and/or Cisco Tidal Enterprise Scheduler.
- Monitoring Tools: Experience with BigPanda and PagerDuty.
- AI Ops: Experience with AI Ops.