

We are seeking passionate and innovative software engineers to design and build cutting-edge networking infrastructure that powers large-scale AI training. This role focuses on developing next-generation networking capabilities to ensure high performance, low latency, and minimal jitter for distributed AI workloads. You will play a critical role in enabling state-of-the-art AI systems to achieve their full potential.
Networking & Infrastructure Expertise:
2+ years of experience with networking protocols (Ethernet, TCP/IP, RDMA, gRPC, InfiniBand), network virtualization, SDN, and performance tuning.
Experience managing live-site operations for large-scale, mission-critical network infrastructure (e.g., data center networks, cloud interconnects, or AI supercomputing fabrics).
Experience with network monitoring, telemetry, alerting systems, and operational readiness practices to maintain uptime and service reliability at scale.
Distributed Systems & Engineering Leadership:
Experience designing, building, and scaling large, fault-tolerant distributed systems, along with knowledge of system design, architecture, and high-availability engineering practices.
Other Qualifications:
Preferred Qualifications:
AI & Hardware Integration:
Familiar with AI accelerators like GPUs (NVIDIA, AMD) and TPUs, and their interaction with networking infrastructure. Experienced with telemetry and observability tools for large-scale network monitoring.
Programming & Mentorship:
Proficient in modern programming languages (e.g., C++, Java, C#, Python, Go, Rust). Demonstrated ability to lead technical initiatives and mentor engineers across various levels.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications for the role until October 20, 2025.