Expoint - all jobs in one place

Cisco Technical Lead Site Reliability Engineering 
United States, Georgia, Atlanta 
435871991

12.06.2024

As part of this role you will:

  • Guide the design and operation of the Observability platform, ensuring it meets business and technical requirements.
  • Work closely with product engineering teams to tailor observability solutions to their needs, facilitating seamless integration and optimal functionality.
  • Oversee the service from concept through deployment, including designing, building, and operating the platform.
  • Ensure high service availability and quality while balancing the rapid delivery of new features and releases.
  • Utilize industry-standard tools and processes to manage distributed big data systems at scale, ensuring efficiency and reliability.
  • Take part in daily stand-ups and weekly sprint reviews, and join the on-call rotation to support a live production system.
  • Participate in incident post-mortems and the planning and design of new features.

Minimum Qualifications

  • Bachelor's degree in computer science, information technology, or a related field with 5 years of related experience, or equivalent professional experience.
  • Minimum 5 years of experience building, deploying, and troubleshooting containerized workloads.
  • Minimum 3 years of experience with either the ELK (Elasticsearch, Logstash, Kibana) + Kafka stack or the Grafana stack (Grafana + Prometheus/Thanos/Cortex/Mimir).
  • Minimum 3 years of experience preparing and presenting project proposals, including requirements gathering and solution design.

Additional Qualifications

  • Experience with Kubernetes in production use cases at scale
  • Proficiency in Terraform, Ansible, or a related configuration-as-code technology
  • Proficiency with Git source control
  • Experience in cloud computing and working with cloud providers (AWS, GCP, Azure)
  • Experience configuring and operating Unix-like operating systems
  • Familiarity with an interpreted programming language such as Python or Ruby

Nice to Have

  • Network design and troubleshooting, including firewall, load balancer, and subnet configurations in either a public cloud or a private datacenter
  • Experience with PKI and certificate management
  • Configuring CI/CD pipelines
  • Secrets management using HashiCorp Vault

We tackle whatever challenges come our way. We have each other’s backs, we recognize our accomplishments, and we grow together. We celebrate and support one another – from big and small things in life to big career moments. And giving back is in our DNA (we get 10 days off each year to do just that).