Key job responsibilities
Operations & On-Call Support
- Serve as primary on-call support for DC infrastructure incidents, responding to high-severity issues (Sev1/Sev2) and troubleshooting failures in MDF and IDF environments.
- Provide technical troubleshooting and resolution for power, cooling, and networking infrastructure, ensuring rapid recovery and minimal downtime.
- Configure, maintain, and monitor DCIM and power monitoring systems to proactively detect and prevent infrastructure failures.
Lifecycle Management & Remediation
- Lead lifecycle replacement projects for aging infrastructure, including UPS, PDU, HVAC, cabling, and rack refreshes.
- Execute remediation efforts for non-compliant, aging, or underperforming infrastructure based on audit findings and risk assessments.
- Work with vendors, TPMs, and engineers to plan and execute upgrades while minimizing operational disruption.
Project & Deployment Support
- Assist TPMs with deployment projects, ensuring new infrastructure meets Amazon standards and compliance requirements.
- Support design reviews, BOM validation, and implementation guidance for new MDF/IDF builds and network expansions.
- Validate post-build infrastructure performance, ensuring compliance with engineering specifications and audit readiness.
Standardization & Documentation
- Develop and maintain standard operating procedures (SOPs), troubleshooting guides, and operational best practices.
- Document incident resolution steps and contribute to a knowledge base for improving response times and reducing recurring failures.
- Participate in root cause analysis (RCA) efforts, identifying trends and proposing infrastructure improvements.
A day in the life
As a DCIO Engineer, you’ll start your day triaging Sev1/Sev2 infrastructure alerts, coordinating with Field IT to restore service quickly. You’ll troubleshoot UPS, cabling, or HVAC issues, lead lifecycle replacements, and support remediation of non-compliant infrastructure. Between incidents, you’ll review post-build audits, validate new deployments, and update SOPs. You’ll collaborate with TPMs, engineers, and vendors to improve system uptime, ensure compliance, and support ongoing infrastructure projects, balancing real-time operations with long-term reliability and standardization.
- Medical, Dental, and Vision Coverage
- Maternity and Parental Leave Options
- Paid Time Off (PTO)
- 401(k) Plan
- 3+ years of experience in data center operations, IT infrastructure, or network engineering.
- Hands-on experience with UPS, ATS, PDU, HVAC, structured cabling, and fiber optics.
- Strong troubleshooting skills in space, power, cooling, and network infrastructure.
- Experience supporting high-availability IT environments with 24/7 operations.
- Ability to work in a fast-paced environment, managing multiple priorities effectively.
- Strong documentation and process development skills.
- Must obtain the BICSI DCDC certification within the first six months of hire.
- Experience with DCIM platforms, power monitoring, and predictive maintenance tools.
- Knowledge of network architecture, structured cabling standards (TIA-942, BICSI), and low-voltage power systems.
- Familiarity with Amazon IT infrastructure and fulfillment center operations.
- Certifications such as BICSI, RCDD, or Uptime Institute Accredited Tier Specialist (ATS) are a plus.