8+ years of progressive experience in Linux server administration with enterprise-level skills in physical and virtual environment management.
Deep understanding of Linux security methodologies, hardening procedures, and best practices.
Deep understanding of Linux internals: process management, memory handling, kernel tuning, and capacity planning.
Evaluate and maintain server storage and other IT standards for the enterprise and all supporting products and services.
Excellent problem solving and analytical skills.
Analyze server and storage characteristics (e.g., CPU behavior, latency, transmission speeds, packet loss, and throughput); triage and troubleshoot problems. Knowledge and experience of Root Cause Analysis (RCA) required
Familiarity with complex multi-site and global network design, multi-layered switch environments, firewalls, VPNs, load balancers, and DNS
Familiar with the fundamentals of web application and relational database architectures
Aptitude for influencing others without direct authority
Ability to prioritize and work well under pressure
Excellent communication and documentation skills in both technical and non-technical arenas
Ability to work with a global support team
Ability and willingness for on-call and after-hours work
Ability to thrive in a new hybrid work schedule: days in office and days remote
Adept at learning and applying new technologies and solving new problems
A real passion and excitement about improving processes with new technology
Ability to coordinate and engage with stakeholders
Skills & Qualifications – Nice to Have:
Scripting / Coding Knowledge: Understanding and writing code and scripts (Shell, Python, Perl, YAML, and/or PHP)
Hands-on experience with automation tools (Ansible, Chef, Puppet)
Knowledge of / hands-on experience with analytics / observability tools (Grafana, Kibana, ELK, Splunk, AppDynamics)
Experience with container and cloud technologies (Docker, Kubernetes, OpenShift, AWS, GCP).
Code Version Control: Knowledge of or certification in Bitbucket, GitHub, or AWS CodeCommit
Software Collaboration/Repository Tools: Knowledge of or certification in JFrog Artifactory, Jira
Understanding of agile and other development processes and methodologies
Role Description:
Deliver expert-level support and resolution for complex incidents and problems within the Linux ecosystem, ensuring minimal business impact and adherence to SLAs.
Troubleshoot and resolve escalated issues in the Linux environment.
Proactively monitor system health, performance and capacity to ensure stability, availability and reliability.
Perform kernel upgrades, patching, and vulnerability management to ensure compliance with the organization's standards.
Design and enforce system hardening, access control, and security baselines aligned with the compliance framework.
Serve as an escalation point and technical advisor for Level 1 and 2 administrators; develop technical runbooks and knowledge base articles.
Participate in and lead initiatives for architectural reviews, infrastructure modernization, and enterprise risk assessments.
Lead major incidents and engage with stakeholders to perform Root Cause Analysis and provide permanent solutions.
Education:
Bachelor's/University degree in Computer Science/Engineering preferred, or equivalent experience
Full time
Irving, Texas, United States
$125,760.00 - $188,640.00