Your Role and Responsibilities
- Develop and maintain scalable distributed systems on IBM Cloud, AWS, and on-premises infrastructure.
- Develop and maintain high-performance Kubernetes (k8s) clusters across multiple regions.
- Develop and maintain telemetry infrastructure and service instrumentation (Python) for metrics, distributed tracing, and logging.
- Support infrastructure for a petabyte-scale data platform and stream analysis services.
- Work with Audio and Speech AI Engineers to accelerate development and deployment of heterogeneous analysis and training pipelines.
- Participate in the definition and management of SLIs, SLOs and error budgets for infrastructure and production services.
- Design and implement infrastructure-as-code pipelines.
Required Technical and Professional Expertise
- 4+ years of cloud development experience (IBM Cloud preferred, plus AWS) designing, implementing, and supporting cloud-based infrastructure.
- 3+ years of experience architecting, deploying, and supporting Kubernetes in cloud and on-premises environments.
- 2+ years of experience designing and supporting distributed systems.
- Experience writing production code in one or more languages such as Python (preferred), Java, or Go in a microservices environment.
- 2+ years of experience configuring, supporting, and optimizing Linux systems. Red Hat experience is a bonus.
Preferred Technical and Professional Expertise
- Familiarity with running distributed ML workloads in cluster-orchestrated environments.
- Experience building and supporting telemetry and related infrastructure (OpenTelemetry, Jaeger, Grafana, Prometheus).
- Experience with k8s ecosystem tooling such as Helm and deployment tools such as Argo CD.
- Experience designing and implementing infrastructure-as-code pipelines.
- Experience designing and implementing traffic routing strategies in edge and microservices environments.
What we offer:
- Working for a top-5 IT company according to the Forbes 2022 best employers ranking
- International and prestigious projects
- Highly skilled teams of experts
- Wide range of IBM training courses and certifications
- Access to Udemy, Harvard Business Review, O’Reilly, Interskill, IBM AI Skills Academy
And what is more:
- Contract of employment
- Competitive compensation – salary range , depending on your skills and experience
- Private medical care and life insurance
- Employee Assistance Program
- Sport, charity & other networking groups
- Summer / winter camps for children
- Discounts with an IBM employee badge
- Referral Bonus Program
- Home office option
- No dress code