Expoint – all jobs in one place

eBay AI Capacity Engineer
United States, Texas, Euless 
316722010

15.07.2025

As a Capacity Engineer, you will be responsible for analyzing, modeling, and forecasting the infrastructure needs of our organization. Your expertise in capacity engineering and forecasting will play a crucial role in optimizing resource allocation and ensuring efficient utilization of our technology assets. Experience with public and private Kubernetes-based clouds, application performance principles, and workload optimization will be highly valuable in this role. The successful candidate will have strong analytical skills, superb communication abilities, and a deep understanding of technology infrastructure.

Responsibilities:

  • Collaborate with Core AI teams across multiple geographies to analyze, plan, and execute AI initiatives. Analyze AI capacity intake requirements for prioritization and scheduling. Seek out and lead performance optimizations of our AI-related assets to ensure efficient use. Scope includes Nvidia SuperPod, on-prem Kubernetes, Azure, and GCP-based clouds

  • Understand key performance metrics and scaling characteristics of LLM and non-LLM AI models

  • Understand key concepts and sizing metrics related to RAG, vector search, and grounding

  • Influence customer choice of AI models to improve ROI and cost efficiency

  • Design and build dashboards to support management of AI workloads and infrastructure. Be familiar with GPU-relevant metrics and how they are used. Strong experience with Grafana, Prometheus, Thanos, and ELK stacks

  • Analyze historical data, trends, and growth patterns to develop accurate capacity models and forecasts covering compute, network, storage, and platform optimization requirements

  • Collaborate with multi-functional teams to gather relevant information on business objectives, technology requirements, and upcoming projects

  • Evaluate and refine existing capacity engineering processes and methodologies to improve accuracy and efficiency

  • Monitor system performance metrics and utilization levels to identify potential bottlenecks or areas of underutilization. Take ownership of realizing the gains from improved utilization

  • Collaborate with technology partners to understand future technology trends and initiatives, anticipate resource demands, and develop proactive capacity plans

  • Run "what-if" analyses to assess the impact of different business scenarios and help guide decision-making

  • Manage capacity for large federated Kubernetes environments, primarily on private cloud with some public cloud

  • Must be articulate and able to communicate capacity insights, recommendations, and performance metrics to key partners. Use data to advocate for initiatives that provide clear business value

Qualifications:

  • Bachelor's degree in computer science, information systems, statistics or a related field and 2+ years of experience

  • 2 years of experience in capacity engineering, resource allocation, and forecasting in a technology-intensive environment

  • Specialized experience in AI

  • Strong proficiency with Grafana, Prometheus, Thanos, Elasticsearch, and Kibana (ELK)

  • Proficiency in Kubernetes. Should understand how applications operate in a k8s environment and have experience running apps in k8s

  • Strong analytical skills with the ability to analyze complex data sets and identify relevant patterns and trends

  • Familiarity with technology infrastructure components, including Ubuntu Linux, servers, databases, networks, storage systems, and cloud platforms

  • Knowledge of recommendation engine concepts and their application in infrastructure management

The base pay for this position is expected to fall within the range below:

$95,200 - $168,700