Expoint - all jobs in one place

PayPal Site Reliability Engineer Intern
France, Auvergne-Rhône-Alpes 
404013067

Yesterday

Key Responsibilities

  • Chaos Engineering:
    • Design and execute chaos experiments to identify system vulnerabilities (e.g., network latency, node failures).
    • Use tools like Gremlin, Chaos Toolkit, or custom scripts to simulate real-world failures.
    • Analyze results and collaborate on mitigations to improve system resilience.
  • LLM Integration:
    • Explore use cases for LLMs (e.g., OpenAI, Hugging Face) in SRE workflows, such as:
      • Automating incident response (e.g., generating runbooks or analyzing logs).
      • Enhancing documentation by summarizing technical content.
      • Predicting or diagnosing system issues using NLP-driven insights.
  • SRE Collaboration:
    • Participate in on-call rotations and incident response.
    • Help improve monitoring, logging, and alerting pipelines (e.g., Prometheus, Grafana).
    • Document processes and share findings through post-mortems or technical blogs.

Qualifications

  • Education: Pursuing a degree in Computer Science, Software Engineering, or a related field.
  • Technical Skills:
    • Proficiency in at least one scripting language (Python, Bash, or similar).
    • Familiarity with cloud platforms (AWS, GCP, Azure) and infrastructure-as-code (Terraform, Kubernetes).
    • Basic understanding of distributed systems and networking.
  • Chaos Engineering:
    • Exposure to chaos engineering concepts (e.g., Netflix’s Chaos Monkey) or coursework/projects in system reliability.
    • Experience with chaos tools (bonus: hands-on use of Gremlin or other chaos engineering frameworks).
  • LLM Knowledge:
    • Interest in AI/ML, with knowledge of how LLMs (e.g., GPT, Llama) work.
    • Experience using LLM APIs for automation or problem-solving (e.g., log analysis, code generation).

Preferred Qualifications

  • Completed coursework or projects in distributed systems, DevOps, or cloud architecture.
  • Prior experience with SRE/DevOps tools (e.g., Prometheus, Grafana, Kubernetes).
  • Open-source contributions or side projects related to chaos engineering or LLM applications.
  • Familiarity with incident management platforms (PagerDuty, Slack, Jira).

Our Benefits:

Any general requests for consideration of your skills, please