The ideal candidate will have proven experience in the field, focusing on kernel development and cluster automation (build, OS/Kubernetes upgrades, and decommissioning). You will also drive the implementation of observability practices to monitor, troubleshoot, and ensure the reliability of our infrastructure at scale.
What you will accomplish:
Design, develop, and maintain a stable, high-performance Linux operating system optimized for the Kubernetes platform, along with the supporting cluster management system.
Contribute to kernel development and performance tuning to enhance system scalability, reliability, and efficiency; stay up to date with the latest advancements in kernel and security technologies.
Build high-performance tools and services using Go and Python to support infrastructure automation and diagnostics.
Develop BPF-based tools for in-depth OS diagnostics and implement Cilium/BPF-based network segmentation and service mesh solutions.
Collaborate with cross-functional teams to validate, adopt, and integrate optimized Linux OS distributions across diverse infrastructure environments.
Implement robust observability frameworks to monitor system health, ensure performance, and support proactive issue resolution at scale.
What you will bring:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Minimum of 5 years of hands-on experience with Linux systems, including a strong understanding of Linux kernel development and OS internals such as process scheduling, memory management, file systems, and networking.
Proficient in programming with C++, Go, or Python.
Deep expertise in orchestrating containerized applications and building scalable cluster management systems.
Skilled at identifying system-level gaps and cross-functional issues, proposing effective solutions, and driving end-to-end resolution.
Demonstrated ability to lead and mentor team members, manage small projects, and collaborate effectively across teams to drive impactful change.