

What you will do:
Own the resilience testing roadmap for vLLM and llm-d: define resilience indicators, prioritize fault scenarios, and establish go/no-go gates for releases and CI/CD
Design GPU/accelerator-aware fault experiments that target vLLM and the stack beneath it (drivers, GPU Operator/DevicePlugin, NCCL/collectives, storage/network paths, NUMA/topology)
Build an automated harness (preferably extending krkn-chaos, https://github.com/krkn-chaos/krkn) to run controlled experiments with a scoped blast radius and evidence capture (logs, traces, metrics)
Integrate fault signals into pipelines (GitHub Actions or otherwise) as resilience gates alongside performance gates
Develop detection and diagnostics: dashboards and alerts for pre-fault signals (e.g., vLLM queue depth, GPU throttling, P2P downgrades, KV-cache pressure, allocator fragmentation)
Triage and root-cause resilience regressions from field/customer issues; upstream bugs and fixes to vLLM and llm-d
Explore and experiment with emerging AI technologies relevant to software development and testing, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.
Publish learnings (internal/external): failure patterns, playbooks, SLO templates, experiment libraries, and reference architectures; present at internal/external forums
What you will bring:
3+ years in reliability and/or performance engineering on large-scale distributed systems
Expertise in systems-level software design
Expertise with Kubernetes and modern LLM inference server stacks (e.g., vLLM, TensorRT-LLM, TGI)
Observability and forensics skills, with experience in Prometheus/Grafana, OpenTelemetry tracing, eBPF/BPFTrace/perf, Nsight Systems, and PyTorch Profiler; adept at converting raw signals into actionable narratives.
Fluency in Python (data & ML), strong Bash/Linux skills
Exceptional communication skills - able to translate raw data into customer value and executive narratives
Commitment to open-source values and upstream collaboration
The following is considered a plus:
Master's or PhD in Computer Science, AI, or a related field
History of upstream contributions and community leadership, or public talks or blogs on resilience or chaos engineering
Competitive benchmarking and failure characterization at scale.
The salary range for this position is $127,890.00 - $211,180.00. Actual offer will be based on your qualifications.
Pay Transparency
Comprehensive medical, dental, and vision coverage
Flexible Spending Account - healthcare and dependent care
Health Savings Account - high deductible medical plan
Retirement 401(k) with employer match
Paid time off and holidays
Paid parental leave plans for all new parents
Leave benefits including disability, paid family medical leave, and paid military leave

Key responsibilities:
Provide technical information on Red Hat cloud products and solutions, particularly on building systems with Red Hat Enterprise Linux (RHEL).
Draw on product knowledge and deep technical expertise to deliver proofs of concept (POCs), presentations, and demos to customers.
Introduce complex solutions to prospective customers, design value-driven architectures, and explain the applications and cost-effectiveness of those technical solutions, emphasizing in particular the advantages of Linux-based solutions.
Work with the sales team to provide solutions that help win new contracts.
Develop a deep understanding of the customer's business and IT environment and, together with the sales team, assess how Red Hat products (RHEL in particular) can be adopted.
Communicate value and progress to all stakeholders.
Required skills:
Experience in presales, sales engineering, solution architecture, or similar roles.
Experience proposing, designing, building, and operating Red Hat or comparable products, especially Red Hat Enterprise Linux (RHEL), Kubernetes, and cloud-native technologies.
A well-balanced combination of enthusiasm for open source (Linux in particular), knowledge of cloud and software, and a deep understanding of customers' business and IT problems.
Excellent communication, presentation, and negotiation skills for moving deals forward.
Ability to build relationships with customers at the engineering, business, and executive levels.
Motivation to learn continuously and acquire new capabilities.
Basic English proficiency (reading and listening in particular).
Application development experience with Java is a plus.
Experience with IoT, machine learning (AI), FinTech, integration, microservices, or public cloud is a plus.

Job responsibilities:
As a technical advisor, guide customers from presales through post-sale implementation and ensure a successful adoption.
Lead technical validation through demos, workshops, and pilot projects, mapping customer needs to Ansible capabilities.
Develop reusable solution frameworks and content to support the sales team and deliver a consistent level of results to customers.
Work with product teams to improve the customer experience and advocate for customer needs within Red Hat.
As a member of the team, help prepare responses to RFPs to drive customer success.
Technical skills:
Expertise in Ansible Automation Platform (certification preferred) and tools such as Puppet, Chef, SaltStack, and Terraform. Strong hands-on skills.
6+ years of experience in automation and 5-10 years in architecture, development, or consulting.
Proficiency with Linux (RHEL/Satellite), Cisco network automation, and DevOps practices.
Business skills:
Ability to engage executive-level stakeholders and propose cross-platform solutions that address enterprise IT challenges.
Experience building relationships across large IT organizations and managing end-to-end proof-of-concept processes.
Preferred qualifications:
Red Hat certifications (RHCE, Ansible Specialist, Architect) and a degree in computer science or engineering.
Established standing as an industry thought leader through contributions such as white papers and conference talks, and up-to-date knowledge of the latest trends in automation.

Job responsibilities:
Own the business growth strategy for customer accounts based on Red Hat Ansible Automation Platform solutions and use cases
Work with the account team during account planning, analyze the customer's business drivers, and craft a narrative that positions Red Hat automation solutions as a key element in achieving technology-led innovation and digital transformation
Work with account management teams, solution architects, and professional services teams to manage complex sales cycles from prospecting to close
Meet quantitative and qualitative performance expectations
Use leadership skills and extensive professional experience to engage executives (C-level decision makers) and earn their trust, creating transformational projects
Demonstrate the business impact of Red Hat technology solutions to establish a compelling reason for customers to commit to a project
Tailor Red Hat solutions to customers' business requirements
Secure understanding and agreement from customer decision makers on the differentiated business value of Red Hat solutions and on Red Hat's competitive advantages
Help Red Hat sales teams and partners communicate the business value of Red Hat solutions effectively to enterprises
Use Red Hat's journey-based services engagement programs and commercial purchasing programs to build long-term, strategic customer relationships
Qualifications:
10+ years of experience selling automation and management software products, cloud services, or related technology products
Experience in value-based solution selling; ability to connect customers' business and transformation goals with the value that technology solutions deliver
Creative thinking, communication, and presentation skills
Passion for open source technology and an understanding of Red Hat's software subscription business model
Track record of collaborating seamlessly with global, cross-functional teams to achieve customer success
Expertise in the following areas:
IT automation and management
Business process automation
Robotic process automation (RPA)
IT security and compliance
Artificial intelligence (AI) and operations
DevOps, continuous integration (CI) and continuous delivery (CD), testing, software development life cycle (SDLC), and agile methodologies
Hybrid cloud, public cloud, and private cloud
Containers and Kubernetes
Ability to present the business value of technology solutions
Consumption-based pricing models, software subscriptions, and licensing
Understanding of Red Hat's software portfolio and competing products

Primary Job Responsibilities
Manages the development and application of a mature, dynamic multi-year customer account plan based on proven methodologies to manage a sustainable, long-term business portfolio. Leads strategies for the assigned account that drive high-volume sales and open new opportunities for both the customer and Red Hat, aligned to goals, budgets, and forecasts.
Leads and coordinates a diverse team on plan execution and drives accountability to execute and deliver on account plans and grow the account, leveraging industry expertise.
Proactively expands the strategic network of key internal and external partners and decision makers, including vertical industry partners, to ensure execution of core tasks and account transactions, and to provide a comprehensive account management experience.
Demonstrates an understanding of the customer's business model to articulate growth opportunities, leveraging industry expertise to shape the ecosystem. Influences relevant internal and external stakeholders and resources to drive change on behalf of the customer, enhance team capabilities, and improve Red Hat offerings.
Required Skills
7+ years of experience working in IT sales with an exceptional track record
Ability to work as part of a fast-paced and growing team as well as on your own
Good understanding of the companies and opportunities that exist within Japan
Good communication and technical skills to develop relationships at engineering, commercial, and executive levels throughout organizations
Good understanding of the enterprise market and partner ecosystem
High ethical standards and integrity
Understanding of Container, Linux, and middleware software-related sales cycles is a plus
Experience selling open source software technology or other software services in a subscription model is a plus

Primary Job Responsibilities
Build customer relationships with designated enterprise accounts that are important and strategically significant to Red Hat
Lead relationship building at all levels of the organization, including C-level executives, while maintaining and developing new relationships within designated accounts
Develop account plans and lead quarterly goals and the overall strategic development of the account
As account team leader, bring together internal and external stakeholders, including presales, support, and consulting services, to grow Red Hat's entire product portfolio within the assigned accounts
Manage the full sales cycle from opportunity creation to closing, with an emphasis on pipeline generation and accurate forecasting
Required Skills
3幎以äžã®äŒæ¥åãITã»ãŒã«ã¹ã®çµéšãæããå€§äŒæ¥åãã¢ã«ãŠã³ããŸãã¯ã°ããŒãã«ã¢ã«ãŠã³ãã®ç®¡çã«æåããå®çžŸãããããš - ã¢ã«ãŠã³ãæŠç¥ã®èšå®ãã³ãããã¡ã³ããããäºæž¬ã®å®çŸãè²©å£²ç®æšã®è¶ ééæã®åè¶ããèšé²
åžžã«æè»ãªæèãæã¡ãæ°ããªå¯èœæ§ã远æ±ãããã€ã³ãã»ãã
é¢ä¿è ãã¹ãŠãšå¿ççå®å šæ§ãç¯ããäººéæ§
ãããªãã¯ã¹åçµç¹ã«ãããŠå€æ§ãªã¹ããŒã¯ãã«ããŒãå·»ã蟌ãåªãããªãŒããŒã·ãããšã³ãã¥ãã±ãŒã·ã§ã³å
ã»ãŒã«ã¹ãµã€ã¯ã«ã®çè§£ã䌎ãå å®ãªæŠç¥ç«æ¡èœå
æ åœãšãªã¢å ã®é¡§å®¢ããã³ããŒãããŒã«ã€ããŠã®æ·±ãçè§£ïŒé¡§å®¢ããžãã¹ãæ¥çååãç«¶åç¶æ³ãRed Hatã®å·®å¥åèŠçŽ ãšæäŸäŸ¡å€ãå«ãïŒ
Red Hatã®ãœãªã¥ãŒã·ã§ã³ã®äŸ¡å€ãå·®å¥åãã€ã³ããããžãã¹æ©äŒã顧客ããã³ããŒãããŒã«æç¢ºã«äŒãã

