Palo Alto Principal DevOps Engineer Cortex 
India, Karnataka, Bengaluru 
663973141

Our mission: to be the cybersecurity partner of choice, protecting our digital way of life.

Your Impact

As a Principal Engineer in the Global SRE Automation group, you will shape the future of infrastructure reliability, scale, and developer productivity. You will lead the design and development of cloud-native automation tools, streamline operational workflows, and embed resilience into every layer of the platform.

You will:

  • Architect and build automation systems that support self-healing, observability, and service-level assurance
  • Contribute to the developer experience and internal tooling ecosystem, driving reliability through code
  • Influence the SRE strategy by introducing innovations in cloud-native backend services, Kubernetes automation, and platform engineering
  • Partner with global teams to deliver reliable infrastructure, integrating AI models, event-driven systems, and data pipelines to unlock operational insights
  • Set standards for code quality, system design, and operational excellence across the organization

Your Experience

  • 10+ years of experience in Cloud Engineering, DevOps, or Infrastructure Software Development, with a strong focus on automation, reliability, and platform scalability
  • Deep expertise in AWS and Google Cloud Platform (GCP), with a strong understanding of networking, compute, serverless, and cost-optimization services
  • Proficient in Python or Go, with a solid grasp of modern backend development frameworks (e.g., Flask, FastAPI, Gin) and cloud-native application design
  • Hands-on experience building RESTful APIs, microservices, and cloud-native platforms supporting high availability and self-service
  • Experience designing and integrating Generative AI and LLM-based pipelines, including Retrieval-Augmented Generation (RAG), into internal tooling and operational systems to enhance developer productivity and incident response
  • Experience applying predictive analytics, anomaly detection, and MLOps to use cases such as cost forecasting, capacity planning, and proactive incident management
  • Experience building and optimizing Cloud FinOps tooling to monitor usage patterns, reduce waste, and provide actionable insights into cloud spend
  • Experience developing AI-driven automation agents (bots) for cloud operations, alert triage, knowledge retrieval, and ticket deflection
  • Strong experience with:
    - Infrastructure-as-Code: Terraform, CDK
    - Kubernetes: Cluster lifecycle management, Helm/Kustomize, GitOps (ArgoCD)
    - CI/CD pipelines, observability frameworks (Prometheus, Grafana, ELK), and SRE tooling for incident automation
  • Proficient in SQL and NoSQL databases, such as PostgreSQL and Elasticsearch
  • Exposure to Kafka and event-driven architectures for real-time data streaming and integration
  • Excellent problem-solving, debugging, and systems design skills
  • Demonstrated leadership in cross-functional engineering teams, including mentoring, architectural guidance, and influencing long-term platform direction

All your information will be kept confidential according to EEO guidelines.