Expoint - all jobs in one place

Cisco Site Reliability Engineer 
United States, Texas, Austin 
317067287

06.03.2025

The application window is expected to close on 3/20/25.

Job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

As part of this team, you’ll contribute across the stack—from code and infrastructure to database optimization. We work closely with engineering teams to refine service architecture, guide performance testing, and provide the tools and insights needed to optimize systems.

Your Impact

As a Site Reliability Engineer, you'll play a pivotal role in ensuring the reliability and scalability of our platform. You will be responsible for maintaining and enhancing our in-house load generation tool, managing our performance testing infrastructure, and collaborating closely with engineering and SRE teams. This role is exciting because you'll directly influence the performance of critical services by building testing frameworks, troubleshooting complex issues, and ensuring that we deliver high-quality, performant systems at scale. You'll get hands-on with cutting-edge technologies like Kubernetes, AWS, and observability tools, while also shaping testing strategies that align with service architectures.


Key Responsibilities:
  • Maintain and Enhance Load Generation Tools: Oversee the management and continual improvement of our internal load generation tool, ensuring it meets the needs of our performance testing efforts.
  • Test Infrastructure Management: Manage and optimize our test infrastructure, built on Kubernetes (K8s) and EC2-based AWS deployments. Collaborate with other teams to ensure the infrastructure supports scalable, efficient testing.
  • Performance Test Planning & Execution: Work directly with engineering teams to develop detailed performance test plans tailored to specific services. Ensure the execution of these tests, track their progress, and resolve any issues that arise.
  • Tooling and Observability: Use observability tools like DataDog, OpenSearch, and Grafana to collect, analyze, and report on performance test metrics. Identify potential performance bottlenecks and work with teams to resolve them.
  • Python Scripting & Automation: Write performance tests and automation scripts in Python to validate service performance and scalability. Ensure tests are robust, efficient, and provide valuable insights.
  • Troubleshooting and Problem Resolution: Troubleshoot test failures, infrastructure issues, and performance bottlenecks in Kubernetes, EC2, and MySQL RDS environments. Ensure that test environments are stable and that performance testing runs smoothly.
  • Collaboration & Partnerships: Partner with engineering teams to understand the architecture of services and develop test plans that align with their goals. Collaborate closely with SRE teams to performance-test infrastructure components and ensure overall platform health.
  • Performance Reporting: Identify, report, and analyze any performance-related issues encountered during tests. Provide clear and actionable recommendations to improve service performance.
Minimum Qualifications:
  • 5+ years of experience in performance testing, SDT (Software Development in Test), or infrastructure management.
  • Experience with Python for writing automated performance tests and tools.
  • Experience with Kubernetes (K8s), EC2, and AWS resources for deploying and managing test environments.
  • Professional work experience with MySQL RDS and cloud-based infrastructure, with a demonstrated ability to troubleshoot performance issues.
  • Experience with Argo Workflows for orchestrating tests in Kubernetes and using observability tools like DataDog, OpenSearch, and Grafana.
Preferred Qualifications:
  • SDT Experience: Strong background in the principles and practices of software development in test to build robust, scalable testing solutions.
  • SRE Experience: Experience working with SRE teams on infrastructure and performance testing, contributing to overall system reliability and performance.
  • Software Engineering Background: Solid understanding of software engineering principles to help integrate performance testing effectively within the broader development lifecycle.
  • Experience with performance testing tools like JMeter, Gatling, or similar.
  • Experience with CI/CD pipelines and integrating performance testing into continuous integration processes.
  • Background in infrastructure or DevOps roles with expertise in cloud platforms like AWS and container orchestration tools.