Expoint - all jobs in one place

NetApp Site Reliability Engineer 
United States, North Carolina 
540495369

30.04.2024
Job Summary

As a Cloud Infrastructure/Site Reliability Engineer, you operate seamlessly between development and operations. You will engage in and improve the lifecycle of cloud services - from design to development, deployment, operation, and refinement. You will maintain services by measuring and monitoring availability, latency, and overall system health. You will play an important role in scaling systems sustainably through automation and evolving them by pushing for changes to improve reliability and velocity. You will administer cloud-based environments that support our SaaS/IaaS offerings, which are implemented on a microservices, container-based architecture (Kubernetes).

Essential Functions:
• Work with other Cloud Infrastructure Engineers and developers to ensure maximum performance, reliability and automation of our deployments and infrastructure.
• Work with, consult and influence developers on new features and software architecture to ensure scalability.
• Develop software, both as components of our solution and outside of the solution, for deployment automation, packaging, and monitoring visibility.
• Identify tasks and areas where automation can be applied to achieve time efficiencies and risk reduction.
• Debug and troubleshoot service bottlenecks throughout the whole software stack.
• Measure and monitor availability, latency, and overall system health.
• Provide advanced escalation support (tier 2 and 3) to NetApp’s Cloud Data Services solutions.
• You will have direct influence on the decisions and outcomes related to solution implementation.


Job Requirements

• Ability to embrace new technologies and work in a fast-paced, global environment.
• Systematic problem-solving approach coupled with a sense of ownership and drive.
• Excellent written and verbal communication skills.
• Ability to manage competing priorities and multiple deadlines.
• Good interpersonal communication and customer service skills are needed to work successfully with stakeholders in high-stress and/or ambiguous situations.
• Strong understanding of systems design and networking with regard to performance and scale.
• Familiarity with the lifecycle of cloud services - from design to deployment, operation, and refinement.


Responsibility and Interaction:

Responsibility
• The types of tasks this individual is responsible for are often unique, non-routine and unstructured, requiring creative solutions.
• This individual will apply attained experiences and knowledge in solving routine to moderately complex problems.
• This role includes on-call work and travel from time to time.

Interaction
• This individual interacts primarily with their direct manager, site reliability team, development team, and hyperscaler partners on assigned projects and deployments. This may involve interaction across functions, geo-locations, and from staff to Vice President level.
• Limited management direction is provided on new projects or assignments; general guidance is provided on new assignments.
• The ideal candidate will be a proactive contributor and subject matter expert on team projects.
• To be successful, this individual must demonstrate favorable results through coaching and influencing others.

• A minimum of 5 years of experience is required. 7-9 years of experience is preferred.
• A Bachelor of Science degree in Computer Science, a master’s degree, or a PhD, or equivalent experience, is required.
• Demonstrated Linux/Unix and CoreOS experience.
• Development experience in Java or .NET is a plus.
• Scripting and infrastructure automation using, for example, Ansible, Python, Go, Perl or Ruby.
• Deep working knowledge of containers, Kubernetes, and serverless computing implementation.
• Understanding of the SDLC and DevOps development methodologies.
• Demonstrated ability to complete multiple moderately complex technical tasks.
• Familiarity with distributed systems design patterns using tools such as Kubernetes.
• Familiarity with AWS, Azure or Google Cloud compute.