What you will do
Understand the complex architecture of distributed Kubernetes systems and deploy lab environments across dozens or hundreds of nodes that reflect customer expectations and demands
Work closely with management, product owners, developers, and quality engineers to understand product requirements and build suitable performance test plans
Simulate real-world workloads to systematically stress environments through comprehensive end-to-end automation, leveraging custom-built and state-of-the-art open source tools and frameworks
Deep dive into performance issues with the intent of discovering their root cause on complex distributed systems
Develop and enhance orchestration, benchmarking, monitoring and reporting tools used within and beyond the Performance and Scale teams
Document your research and results clearly and concisely, communicate findings both internally and externally, and provide continuous feedback to Engineering teams and leadership
What you will bring
5+ years of experience in a role like Software Engineering, Performance Engineering, or Site Reliability Engineering (SRE).
Proven experience with scripting and automation, particularly in Python, Go, and/or Ansible.
Significant hands-on experience deploying and managing container orchestration platforms like Kubernetes or Red Hat OpenShift.
Solid Linux system administration and engineering skills, with a good understanding of bare metal server operations.
A solid understanding of performance analysis methodologies and experience with system-level performance tools (e.g., iostat, vmstat, sar, perf).
Familiarity with observability stacks (e.g., Prometheus, Grafana, Jaeger, OpenTelemetry, ELK, Splunk).
Understanding of cloud-native architectures, microservices, CI/CD pipelines, and collaborative software development methodologies and tools, including version control (Git, GitLab).
Knowledge of TCP/IP, DNS, DHCP, load balancing, and container networking.
Demonstrated ability to take initiative, work independently, proactively seek collaboration, and drive projects to completion.
The following are considered a plus:
Direct experience with telecommunications workloads (5G Core, RAN) or telco architectures.
Experience working with public clouds like AWS, Azure, GCP, or IBM Cloud.
Experience contributing to open-source projects.
Knowledge of AI/ML concepts and their application in tooling or network optimization.
Red Hat (https://www.redhat.com/) is the world’s leading provider of enterprise open source (https://www.redhat.com/en/about/open-source) software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.