THE ROLE
As the Cloud Development Expert, you will be at the forefront of ensuring the reliability, availability, and performance of our Data Integration Services. You will work with the technical leads and management of the Cloud Infrastructure/Observability team, and act independently as the technical leader for the team in Brazil. You will help maintain and improve the provisioning, deployment, and orchestration steps needed to run our services for thousands of customers all over the world.
What you will do
- Provide technical guidance and best practices to the team, ensuring consistent and high-quality work.
- Oversee the design and implementation of highly available, scalable, and performant infrastructure and services.
- Establish and manage monitoring, logging, and alerting systems to ensure early detection and rapid resolution of issues.
- Architect multi-region and multi-zone deployments to guarantee service resilience and fault tolerance.
- Contribute to the development and maintenance of CI/CD pipelines to enhance deployment speed and reliability.
- Implement proactive troubleshooting measures.
- Lead the implementation of security best practices within the infrastructure, ensuring compliance with industry standards.
- Resolve critical outages, coordinating with cross-functional teams to ensure timely recovery.
- Drive root cause analysis and implement long-term solutions to prevent future incidents.
Qualifications
Candidates should meet the following requirements:
- Proficiency in one or more programming languages: Golang, Python, Groovy, Java, shell scripting, or JavaScript
- Container technologies: Docker and Kubernetes
- Development of highly available systems using Kubernetes
- Cloud integration and deployment on AWS, GCP or Azure
- Building CI/CD pipelines: Jenkins, Argo
- Experience with the Kubernetes operator framework
- Knowledge of BTP build services and runtimes (e.g., Hyperspace, xmake, Piper, Kyma, Gardener) is a plus
- Experience with implementing and managing observability tools for monitoring, logging, and tracing (e.g., Prometheus, Grafana, ELK Stack, Jaeger)
- Strong understanding of distributed tracing and metrics collection to ensure system reliability and performance optimization
- Ability to diagnose and troubleshoot system issues using observability best practices
Work Experience
- Solid relevant experience in a Cloud Infrastructure Development role, DevOps, or a related field, with at least 2 years in a leadership role
- Proven track record of infrastructure design and implementation projects
- Deep understanding of DevOps methodologies and principles
- Hands-on experience with Microservice Delivery processes and technologies
- Strong communication and collaboration abilities, particularly in cross-functional teams and with senior management