Key Responsibilities:
- Logging Strategy & Implementation:
- Design, implement, and maintain a unified logging strategy and architecture for SAP's cloud environments across AWS, Azure, and GCP.
- Ensure consistency in log collection, formatting, and ingestion from diverse sources, including hyperscaler platform logs, operating systems, databases, and enterprise applications.
- Develop and enforce logging standards, conventions (e.g., OpenTelemetry, structured logging formats such as JSON), and metadata requirements for all log types.
- Platform & Layer Coverage:
- Configure and optimize logging for AWS (e.g., CloudWatch, S3, Lambda logs), Azure (e.g., Azure Monitor, Event Hubs, Storage Blobs), and GCP (e.g., Cloud Logging, Cloud Storage).
- Establish consistent logging practices for various operating systems (Linux, Windows), database systems (e.g., SAP HANA, traditional databases), and custom SAP and third-party applications.
- Collaborate with network and security teams to ensure comprehensive network flow and security event logging.
- Tooling & Automation:
- Implement, configure, and manage logging tools and platforms for centralized log aggregation and analysis, such as Splunk, the ELK Stack (Elasticsearch, Logstash, Kibana), Prometheus, Grafana, Thanos, Loki, and OpenTelemetry collectors.
- Develop automation scripts and tools (e.g., Python, Go) for deploying, configuring, and managing logging agents and pipelines.
- Integrate logging solutions with CI/CD pipelines to ensure observability is built-in from the start.
- Observability & Insights:
- Collaborate with SRE, DevOps, and Development teams to define logging requirements for new services and applications.
- Ensure logs provide actionable insights for anomaly detection, performance monitoring, root cause analysis, and security incident response.
- Contribute to the development of real-time analytics and monitoring dashboards based on log data.
- Compliance & Security:
- Ensure logging practices comply with relevant industry standards, regulatory requirements, and SAP's internal security policies.
- Implement log retention, archiving, and protection mechanisms to safeguard log integrity and availability.
- Troubleshooting & Support:
- Provide Level 2/Level 3 support for logging infrastructure incidents, including troubleshooting issues related to log collection, processing, and availability.
What You Bring:
- Bachelor’s degree in Computer Science, Software Engineering, or a related field.
- Minimum of 5 years of experience in cloud operations, DevOps, or SRE roles with a strong focus on logging and observability.
- Deep expertise in logging methodologies and tools across multi-cloud environments (AWS, Azure, GCP).
- Proven experience with logging relevant to operating systems (Linux, Windows), databases (e.g., HANA, SQL), and application layers.
- Hands-on experience with log aggregation and analysis platforms (e.g., Splunk, ELK Stack, Grafana Loki, Prometheus, Azure Log Analytics and Data Collection Rules (DCRs)).
- Experience with open telemetry standards and schemas (e.g., OpenTelemetry, OCSF) and implementing telemetry solutions.
- Proficiency in at least one scripting/programming language (e.g., Python, Go, Java) for automation and tool development.
- Familiarity with containerization (Docker) and orchestration (Kubernetes) in a cloud-native context.
- Strong understanding of microservices architecture and distributed systems.
- Excellent problem-solving skills and the ability to work in a fast-paced, collaborative, agile environment.
- Strong communication skills (written and verbal) to articulate technical concepts and collaborate with cross-functional teams.
Bonus Points:
- Experience with SAP-specific logging mechanisms.
- Certifications in AWS, Azure, GCP, AliCloud, or IBM Cloud.
- Contributions to open-source observability projects.
- Experience with Machine Learning for anomaly detection on log data.
Why Join Us:
- Be a part of a world-class cloud security engineering team shaping the security of critical infrastructure.
- Solve complex problems in a high-scale, multi-cloud enterprise environment.
- Drive meaningful impact on the resilience and trustworthiness of business-critical applications.
Successful candidates might be required to undergo a background verification with an external vendor.
Job Segment: Cloud, ERP, Compliance, Computer Science, Software Engineer, Technology, Legal, Engineering