Position Summary
F5 Inc. is actively seeking an exceptional Senior Site Reliability Engineer to play a pivotal role in our SRE team for the groundbreaking F5XC Product.
Primary Responsibilities
F5XC SRE: Work as a hands-on SRE focused on automation and toil reduction, and participate in Ops cycles to support our product.
Perform on-call support on a rotation basis, resolving issues promptly and ensuring operational excellence in managing and maintaining distributed networking and security products.
Mentor junior team members to support their professional growth and development
Easy-to-Use Automation: Continue to grow the infrastructure automation (Kubernetes, ArgoCD, Helm charts, Golang services, AWS, GCP, Terraform) with a focus on ease of configuration.
Environment Stability using Observability: Build new observability (metrics and alerts), evolve what already exists, and participate in regular monitoring of infrastructure for stability.
Collaborative Engagement: Collaborate closely with application owners and SRE team members as part of roadmap execution and continuous improvement of existing systems.
Scalable & Resilient Systems: Design and deploy systems and infrastructure that are highly available and resilient across the configured failure domains.
Design systems using strong security principles with security by default.
The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.
Knowledge, Skills and Abilities
Hands-on programming experience in at least one language (Python or Golang), plus shell scripting.
Hands-on Terraform expertise.
Strong networking fundamentals and experience dealing with different layers of the networking stack.
SRE/DevOps on Linux & Kubernetes: Excellent hands-on knowledge of deploying workloads and managing their lifecycle on Kubernetes, with practical experience debugging issues.
Experience in upgrading workloads for SaaS services without downtime.
On-call: Experience managing day-to-day Ops for production environments, handling production alerts, and using dashboards to debug issues.
GitOps: Experience with Helm charts/Kustomizations and GitOps tools like ArgoCD/FluxCD.
CI/CD: Experience working with/designing functional CI/CD systems.
Cloud Infrastructure: Prior experience in deploying workloads and managing their lifecycle on any cloud provider (AWS/GCP/Azure).
Experience with Disaster Recovery and Migration is a plus.
Qualifications
Typically requires at least 8 years of related experience with a bachelor’s degree, 6 years with a master’s degree, or 4 years with a PhD, or equivalent experience.
Excellent organizational agility and the ability to communicate effectively across the organization.
Environment
Empowered Work Culture: Experience an environment that values autonomy, fostering a culture where creativity and ownership are encouraged.
Continuous Learning: Benefit from the mentorship of experienced professionals with solid backgrounds across diverse domains, supporting your professional growth.
Team Cohesion: Join a collaborative and supportive team where you'll feel at home from day one, contributing to a positive and inspiring workplace.