Expoint – all jobs in one place
Where the best experts and companies meet

Support Engineer jobs in United States, California, Sacramento

Realize your potential in the high-tech industry with Expoint! Search for job opportunities as a Support Engineer in United States, California, Sacramento and join the thousands who have already found jobs at leading companies. Start your journey today and find your ideal career as a Support Engineer with Expoint.
16 jobs found
06.09.2025

Red Hat Senior Performance Resilience Engineer - LLM Inference, United States, California, Sacramento

Limitless High-tech career opportunities - Expoint
Description:

What you will do:

  • Own the resilience testing roadmap for vLLM and llm-d: define resilience indicators, prioritize fault scenarios, and establish go/no-go gates for releases and CI/CD

  • Design GPU/accelerator-aware fault experiments that target vLLM and the stack beneath it (drivers, GPU Operator/DevicePlugin, NCCL/collectives, storage/network paths, NUMA/topology)

  • Build an automated harness (preferably extending krkn-chaos, https://github.com/krkn-chaos/krkn) to run controlled experiments with scoped blast radius and evidence capture (logs, traces, metrics)

  • Integrate fault signals into pipelines (GitHub Actions or otherwise) as resilience gates alongside performance gates

  • Develop detection and diagnostics: dashboards and alerts for pre-fault signals (e.g., vLLM queue depth, GPU throttling, P2P downgrades, KV-cache pressure, allocator fragmentation)

  • Triage and root-cause resilience regressions from field/customer issues; upstream bugs and fixes to vLLM and llm-d

  • Explore and experiment with emerging AI technologies relevant to software development and testing, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.

  • Publish learnings (internal/external): failure patterns, playbooks, SLO templates, experiment libraries, and reference architectures; present at internal/external forums

What you will bring:

  • 3+ years in reliability and/or performance engineering on large-scale distributed systems

  • Expertise in systems‑level software design

  • Expertise with Kubernetes and modern LLM inference server stack (e.g., vLLM, TensorRT-LLM, TGI)

  • Observability and forensics skills, with experience in Prometheus/Grafana, OpenTelemetry tracing, eBPF/BPFTrace/perf, Nsight Systems, and PyTorch Profiler; adept at converting raw signals into actionable narratives

  • Fluency in Python (data & ML), strong Bash/Linux skills

  • Exceptional communication skills - able to translate raw data into customer value and executive narratives

  • Commitment to open‑source values and upstream collaboration

The following is considered a plus:

  • Master’s or PhD in Computer Science, AI, or a related field

  • History of upstream contributions and community leadership, or public talks or blogs on resilience or chaos engineering

  • Competitive benchmarking and failure characterization at scale.

The salary range for this position is $127,890.00 - $211,180.00. Actual offer will be based on your qualifications.

Pay Transparency

● Comprehensive medical, dental, and vision coverage

● Flexible Spending Account - healthcare and dependent care

● Health Savings Account - high deductible medical plan

● Retirement 401(k) with employer match

● Paid time off and holidays

● Paid parental leave plans for all new parents

● Leave benefits including disability, paid family medical leave, and paid military leave

04.07.2025

Red Hat Senior Technical Support Engineer, United States, California, Sacramento

Description:

What you will do:

  • Provide an exceptional customer experience by communicating professionally and applying product knowledge and deep troubleshooting to perform direct actions in cluster environments to resolve various issues.

  • Contribute to global initiatives and projects to constantly reduce customer effort, improve tooling, and design and write automation software to improve efficiency.

  • Act as the direct contact and advisor for customer inquiries and issues with their Cloud Services through our Customer Portal, conference calls, and remote access.

  • Proactively analyze cluster status, identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions.

  • Record customer interactions including investigation, troubleshooting, and resolution of issues, to document diagnostic steps and issue resolution to create reusable solutions for future incidents.

  • Create and maintain knowledge articles aligned with the KCS (Knowledge-Centered Service) methodology.

  • Partner with internal teams and external parties to deliver seamless infrastructure support for Red Hat’s Cloud Services.

  • Manage incident and issue workloads to ensure that all customer issues are handled and resolved in a timely manner.

  • Maintain a strong work ethic, able to work effectively as part of a team, and focus on customers and resolving their issues.

  • Be available to perform weekend shift duties on a rotational schedule.

What you will bring:

  • 5+ years of experience in a customer-facing technical support or solutions engineering role.

  • Proven experience in Infrastructure Implementation, Deployment, Administration, and Production Support of container technologies and orchestration platforms (e.g., CRI-O, Kubernetes, xKS, Docker, OpenShift Container Platform).

  • Experience with developer workflows, Continuous Integration (e.g., Jenkins), and Continuous Deployment paradigms.

  • Exceptional technical, analytical, and troubleshooting skills using tools like curl, strace, oc (kubectl), and Wireshark analysis to investigate and form precise action plans for issue remediation with components such as networking, system performance issues, Kubernetes, OpenShift Container Platform, Service Mesh, and RESTful API calls.

  • Experience working with tools surrounding the Kubernetes ecosystem such as Prometheus, Grafana, FluentD, etc.

  • Experience working with configuration management tools (e.g., Ansible, Terraform) and monitoring and automation tools (e.g., Ansible, Splunk).

  • Proficient scripting and automation skills (e.g., Python, Bash, Go) for converting manual and maintenance functions into fully orchestrated automation are a plus.

  • Ability to operate in complex, highly secure, and highly available environments and interact with Site Reliability Engineering (SRE) domain experts maintaining those environments.

  • Familiarity with established ITIL practices such as Incident, Change, Problem, and Release Management.

  • Excellent English communication skills (written and verbal) and interpersonal skills, with a desire to mentor other members of the support team and share technical knowledge in a helpful and timely fashion.

  • Experience logging issues and working with issue tracking tools such as Jira.

  • Ability to work effectively as part of an agile team, actively communicate status, and complete deliverables on schedule with a strong sense of initiative and ownership.

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

  • Ability to work effectively and collaborate within a geographically distributed, global team.

The salary range for this position is $84,400.00 - $134,970.00. Actual offer will be based on your qualifications.

Pay Transparency

● Comprehensive medical, dental, and vision coverage

● Flexible Spending Account - healthcare and dependent care

● Health Savings Account - high deductible medical plan

● Retirement 401(k) with employer match

● Paid time off and holidays

● Paid parental leave plans for all new parents

● Leave benefits including disability, paid family medical leave, and paid military leave


More jobs that may interest you

09.05.2025

Fortinet Systems Engineer SLED United States, California, Sacramento

Description:


• Pre-sales - assist in qualifying sales leads from a technical standpoint.
• Sales calls - be the main technical resource on sales calls, answering questions and educating the customer on issues ranging from features, specifications, and functionality to integration.
• Conversant with networking applications and solutions.
• Post-sales - be the lead technical contact for identified accounts on technical issues and work closely with the technical support team and engineering to answer, escalate, and resolve customers' technical issues.
• Provide assistance to identified customers with post-sales training.


Required Skills:

• 5-8 years of experience in technical/pre-sales support as a sales or systems engineer
• 5-8 years of experience in LAN/WAN/Internet services administration
• Strong understanding of DNS, NFS, SMTP, HTTP, and TCP/IP
• Knowledge of the following technologies: Routing, Switching, VPN, LAN, WAN, Network Security, Intrusion Detection, and Antivirus
• Strong understanding of the following technologies and protocols: RADIUS, PKI, IKE, Certificates, L2TP, IPsec, Firewall, 802.1Q, MD5, SSH, SSL, SHA1, DES, 3DES
• Experience with encryption and authentication technologies required
• Strong presentation skills

• The Systems Engineer, SLED is required to customarily and regularly work outside of their office or home office engaged in selling, including travel as needed to make a sale.

• Bachelor’s Degree or equivalent experience. Graduate degree preferred.

Wage ranges are based on various factors including the labor market, job type, and job level. Earnings for this position are expected to be $215,400 - $278,700. Need to talk to recruiter. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location.


04.05.2025

Jacobs Bridge Engineer United States, California, Sacramento

Description:
Your impact

Our Bridge Engineers:

  • Serve in a lead technical support role on a variety of bridge project sizes
  • Lead successful delivery of high-quality projects within budget and on schedule
  • Effectively collaborate with others and lead transportation and bridge teams in all aspects of bridge analysis and design, from conceptual planning and preliminary design to final design and construction
  • Identify creative and innovative engineering solutions based on client, project, and site constraints
  • Provide technical guidance and oversight, and perform quality reviews of work by others
  • Commit to quality and continuous improvement as individuals as well as part of a team
  • Complete assigned tasks with a high level of quality within schedule and budget constraints while collaborating with teams of professionals from multiple disciplines
  • Lead and support project execution, quality management, and safety plans
  • Develop plans, specifications, cost estimates, and final bid packages for bridges and other transportation structures
  • Train, mentor, and direct the work of less-experienced engineers
  • Identify schedule and cost variances and develop/implement recommendations for corrective action in a timely manner
  • Demonstrate leadership by organizing and actively participating in technical development and other networking activities both internally and externally
  • Assist in marketing activities to procure new opportunities, coordinating with client account management leads
  • Have strong written and oral communication skills and a team-oriented attitude

This position will be based out of any of our Northern CA offices including Sacramento, CA, Redding, CA, San Francisco, CA, Oakland, CA and San Jose, CA, and may include limited travel.

Plan your next career move in the high-tech industry with Expoint! Our platform offers a wide range of Support Engineer jobs in the United States, California, Sacramento area and gives you access to the best companies in the field. Whether you are looking for a new challenge or a change of scenery, Expoint makes it easy to find the perfect job match for you. With our easy-to-use search engine, you can quickly find job opportunities and connect with leading companies. Sign up today and take the next step in your high-tech career with Expoint.