Expoint - all jobs in one place

Facebook Production Systems Engineer Fleet AI Lead 
United States, Washington, Bellevue 
892552838

30.03.2025
Production Systems Engineer, Fleet AI Systems Lead Responsibilities
  • Lead interfacing with external vendors and internal hardware, mechanical, power, thermal, manufacturing, and software engineers to understand system architecture, and develop and execute test suites for various architectures.
  • Contribute as a Tech Lead, owning and proactively creating experiments and tooling to detect and diagnose hardware/firmware/software health issues, in organized and collaborative efforts.
  • Develop a test framework for large-scale test automation inside the fleet during product development and after mass production.
  • Implement remediations across the software and hardware stack according to plan, while keeping a thorough procedural record and data log.
  • Develop and publish updates on resolutions and communicate findings internally.
  • Troubleshoot, diagnose, and root-cause system failures, isolating the components and failure scenarios while working with internal and external stakeholders.
  • Develop visibility through data visualization and implement systemic solutions to hardware health issues.
  • Drive necessary discussion with external and internal teams on test specification and methodologies to improve test quality continuously.
  • Develop robust, industry leading practices for supporting hardware infrastructure at scale.
Minimum Qualifications
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • 7+ years of experience in hardware server system support, troubleshooting server architecture and components, analyzing, triaging, and solving systems level issues.
  • Expertise with Linux and scripting (Python or similar).
  • 5+ years of experience in changing system configurations and measuring change impact, working through full lifecycle progressions of computer systems products.
  • 3+ years of experience engineering innovations in support of different server system/data center products.
Preferred Qualifications
  • Master's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
  • 10+ years of experience in production support at scale (e.g., 10K storage servers and over 100K HDDs) working through full system technologies.
  • 3+ years of experience in post-production, hyperscale environments, delivering solutions to complex systems issues.
  • 4+ years of experience supporting AI or HPC systems and/or related systems, at scale.
  • 3+ years of experience working in a matrix organization, owning or driving initiatives as a leading contributor.
About Meta

$163,000/year to $225,000/year + bonus + equity + benefits
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.