Job Responsibilities
- Applies strong technical acumen and problem-solving methodologies to network operations of reasonable scope, focusing on improving network operations, data, and systems running at scale, and ensures end-to-end monitoring of network infrastructure and applications
- Resolves most nuances and determines appropriate escalation path
- Executes conventional approaches to build or break down technical problems
- Drives the daily activities supporting the standard capacity process for infrastructure and applications
- Partners with application and infrastructure teams to identify potential capacity risks and govern remediation statuses
- Considers upstream and downstream network, data, systems, and other technical implications
- Is accountable for significant decisions on projects spanning multiple technologies and applications
- Supports and is accountable for incident, problem, and change management; risk; vendor and client relationships; internal and external communications; service delivery; lifecycle management; and service evolution
- Advocates for the customer base when faced with infrastructure or business-critical issues, acting independently and applying critical thinking
- Adds to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills
- Bachelor’s degree or equivalent in computer science or related fields
- At least 5 years of site reliability engineering or related experience
- Networking experience across data center, enterprise, perimeter, routing, switching, security, and application-layer networking (CCNP or CCIE)
- Experience in triaging and diagnosing issues in complex distributed architectures leveraging infrastructure and application telemetry
- Strong knowledge of one or more infrastructure disciplines such as hardware, networking terminology, databases, storage engineering, deployment practices, integration, automation, scaling, resilience, and performance assessments
- Strong knowledge of one or more scripting languages or automation tools (e.g., Python, Ansible)
- Strong background in packet capture tools and analysis
- Experience with multiple cloud technologies with the ability to operate in and migrate across public and private clouds
- Ability to communicate complex technical findings to management clearly, concisely, and in understandable terms
- Drive to develop infrastructure engineering knowledge in additional domains, along with data fluency and automation skills
Preferred qualifications, capabilities, and skills
- Certification in automation is a plus