You will collaborate closely with development, operations, and other teams to implement and maintain efficient and resilient systems.
Responsibilities:
- Infrastructure Automation: Develop, deploy, and oversee Infrastructure as Code (IaC) solutions using tools such as Terraform and Ansible to automate provisioning, configuration, and deployment.
- Cloud Platform Expertise: Deep understanding of AWS cloud services, including EC2, S3, VPC, RDS, EKS, ECS, CF and more. Experience with serverless architecture and AWS Lambda functions is a plus.
- Containerization and Orchestration: Proficiency in containerization technologies (Docker) and orchestration platforms (Kubernetes), with experience deploying applications using tools like Helm.
- CI/CD Pipelines: Build and maintain robust CI/CD pipelines using tools like Jenkins.
- Monitoring and Alerting: Implement comprehensive monitoring and alerting solutions using tools like ELK, Datadog, CloudWatch, and Grafana to proactively identify and resolve issues.
- Incident Management: Drive incident response processes, troubleshoot complex issues, and perform root cause analysis (RCA) with corrective and preventive actions (CAPA) to prevent recurrence.
- Performance Tuning: Continuously optimize system performance, identify bottlenecks, and implement strategies to improve scalability and efficiency.
- Cost Optimization: Identify and implement strategies to reduce cloud costs while maintaining performance and reliability.
- Security Best Practices: Adhere to security best practices and implement measures to protect infrastructure and data from vulnerabilities and threats.
- Collaboration and Communication: Work effectively with cross-functional teams to understand business requirements and provide technical guidance.
- SOP Documentation: Create and maintain documentation for infrastructure, processes, and incident management protocols.