Infrastructure Management: Fully manage and maintain our AWS/GCP infrastructure for video creation and AI production, including administration of Kubernetes clusters.
Developer Enablement: Empower developers with technologies like Terraform for streamlined environment creation and management.
Monitoring and Observability: Ensure the reliability of our production environment by implementing robust monitoring, alerting, and observability solutions using tools such as Datadog, Splunk, Prometheus, and Grafana.
CI/CD Pipeline Management: Enhance and support our CI/CD pipelines, working with systems such as Jenkins and ArgoCD.
Innovation and Development: Research and implement new technologies to enhance our development lifecycle, environments, and production systems.
Requirements:
2+ years of experience in high-scale production environments
Proficiency in at least one high-level programming language (preferably Python or Go)
Production experience with Kubernetes and Helm
Experience with at least one public cloud provider (preferably AWS or GCP)
Skilled in CI/CD methodologies, GitOps, and Infrastructure as Code (IaC)
Creative problem solver with excellent debugging skills
Detail-oriented with great communication and documentation skills