Expoint - all jobs in one place

F5 Site Reliability Engineer III 
India, Telangana, Hyderabad 
341537327

18.02.2025

Software engineering is a core discipline at F5 for many roles. As a software engineer specializing in site reliability, you will bring a software engineering and automated solution mindset to your work.

The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and scalability of critical systems and SaaS platforms. Systems under the care of an SRE III must operate effectively and reliably through scalable builds and deployments, frequent releases, and complex architectures that encompass modern technologies. You will work closely with technical and non-technical teams throughout the organization to facilitate the design and implementation of scalable solutions, drive automation initiatives, and monitor and maintain the performance of critical systems.


What You’ll Do

  • Apply modern engineering principles and practices to operational functions throughout the full system lifecycle, from initial concept and architecture through deployment, daily operation, and overall optimization, and apply these practices to refining existing systems.

  • Support and maintain technology systems to ensure optimal performance, reliability, and security.

  • Scale systems sustainably through mechanisms such as automation and evolve systems by fostering changes that improve velocity.

  • Troubleshoot and resolve complex issues, including systems failures, connectivity problems, and performance bottlenecks.

  • Partner with cross-functional teams to design and implement scalable and robust system architecture to improve services on an ongoing basis.

  • Investigate open source and proprietary technologies, components, libraries, and tools, and help build a highly available, highly scalable, and easily manageable system.

  • Apply observability and data skills to proactively measure system performance, diagnose service needs, and quickly identify solutions.

  • Participate in service operation and root cause analysis (RCA) activities, and assist with defining SLOs and SLIs for business stakeholders.

  • Implement and enforce security best practices to protect our systems, data, and infrastructure against unauthorized access, cyber threats, and vulnerabilities.

  • Create and maintain comprehensive knowledge bases for system documentation, including standard operating procedures, configurations, and troubleshooting guides, to support end-users' ability to use the systems effectively.

  • Participate in on-call rotation.

  • Uphold F5’s Business Code of Ethics and promptly report violations of the Code or other company policies.

  • Perform other related duties as assigned.

What You’ll Bring

  • A code-first approach to managing resources across cloud and SaaS platforms.

  • Expertise in managing Docker container applications and orchestrating Kubernetes clusters.

  • Proficiency in Agile methodologies, DevOps principles, SRE practices, and associated tools and technologies.

  • Strong capability to support web applications running on Tomcat, Apache, NGINX, and Node.js.

  • Comprehensive administration skills for core platforms, including backups, recovery, monitoring, maintenance, and upgrades.

  • Experience in scripting and automation with Infrastructure as Code (IaC) tools such as Azure Resource Manager, AWS CloudFormation, Ansible, or Terraform.

  • Proficiency in writing YAML code to build and manage Azure DevOps pipelines.

  • Practical experience with CI/CD pipelines and tools, specifically in writing YAML code for Azure DevOps.

  • Expertise in Azure resource management and operations.

  • Strong familiarity with Linux system internals and administration.

  • Understanding of compliance and regulatory guidelines.

  • Solid grasp of cybersecurity principles and best practices.

  • Demonstrated ability to work independently and collaboratively as an integral member of an agile team.

  • Experience with observability tooling, including logging infrastructure, time series metrics databases, tracing systems, and alert definitions.

  • Proficient communication, planning, problem-solving, troubleshooting, and organizational skills.

  • Flexibility to adapt to changing project requirements and timelines.

Qualifications

  • BS/BA or equivalent work experience

  • 5+ years' experience as a software engineer specializing in site reliability, or in a similar role, in a technology environment

  • Technical confidence and familiarity with DevOps tools and SRE practices

  • Strong proficiency in scripting and/or programming languages (Python, Bash, TypeScript or Java preferred)

  • Hands-on experience with technology systems, tools, protocols, and platforms

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.