Being the cybersecurity partner of choice, protecting our digital way of life.
Your Impact
- Work with development teams to ensure that applications have scalability and reliability built in from day one. Agile is second nature to you, and you’re excited to work in scrum teams and represent the SRE perspective
- Design and enhance software architecture to improve scalability, service reliability, cost, and performance. You’ve helped create services that are critical to customers’ success
- Deploy automation for provisioning and operating infrastructure at large scale. You are experienced in Infrastructure as Code concepts and have put them into production
- Partner with teams to improve CI/CD processes and technology. Helping teams deliver value early is what you strive for
- Mentor staff members on large-scale cloud deployments. You’re an expert in deploying to the cloud and can bring a teaching mindset to help others benefit from your experience
- Drive the adoption of observability practices and a data-driven mindset. You love metrics, graphs, and gaining a deep understanding of why things happen in a system, and you help others gain visibility into the things they build
- Set up processes such as on-call rotations, postmortems, and runbooks to continue supporting the infrastructure owned by the SRE team, while finding ways to reduce time to resolution and improve the reliability of services
- Support, optimize, and deploy mission-critical front-end and back-end production systems
- Improve site performance, monitoring, and overall stability of our infrastructure
Your Experience
- Bachelor’s/Master’s degree in Computer Science or a related field, or equivalent military experience, required
- 5+ years of industry experience in engineering
- Fluent scripting skills, preferably in Python or Bash
- 3+ years of working with microservices architectures on Kubernetes
- Hands-on experience with container-native tools such as Helm, Istio, and Vault for managing workloads running in Kubernetes
- Experience with public cloud (AWS, GCP/Google Cloud, or Azure) at medium to large scale
- Proficient in CI/CD platforms such as GitLab CI, Jenkins, CircleCI, etc.
- In-depth knowledge of operating systems (processes, threads, concurrency, etc.)
- Extensive experience working with Unix/Linux systems from kernel to shell and beyond
- Drive enhancement of observability by implementing distributed tracing, logging standards, dashboard standardization, profiling, and other relevant practices to meet current Service Level Objectives (SLOs)
- Hands-on experience with monitoring tools such as Prometheus and Grafana
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
- Experience with RabbitMQ, Kafka, and Postgres tuning and performance is a huge plus
- Lead the long-term strategy on critical components such as Kafka, Elasticsearch, Postgres, and MongoDB, evaluating options for either reliable self-hosted or managed solutions. Hands-on production experience with at least one of these is required
- An exceptional communicator within and across teams who takes the lead
All your information will be kept confidential according to EEO guidelines.
COVID-19 Vaccination Information for Palo Alto Networks Jobs
- Vaccine requirements and disclosure obligations vary by country.
- Unless applicable law requires otherwise, you must be vaccinated for COVID or qualify for a reasonable accommodation if:
  - The job requires accessing a company worksite
  - The job requires in-person customer contact and the customer has implemented such requirements
  - You choose to access a Palo Alto Networks worksite
- If you have questions about the vaccine requirements of this particular position based on your location or job requirements, please inquire with the recruiter.