Expoint – all jobs in one place
The place where the best experts and companies meet

Apple LLM Ops Engineer 
India, Telangana, Hyderabad 
180072061

Yesterday
KEY RESPONSIBILITIES:
  • Design and build scalable infrastructure for fine-tuning and deploying large language models.
  • Develop and optimize inference pipelines using popular frameworks and engines (e.g., TensorRT, vLLM, Triton Inference Server).
  • Implement observability solutions for model performance, latency, throughput, GPU/TPU utilization, and memory efficiency.
  • Own the end-to-end lifecycle of LLMs in production, from experimentation to continuous integration and continuous deployment (CI/CD).
  • Automate and harden model deployment workflows using Python, Kubernetes, containers, orchestration tools such as Argo Workflows, and GitOps practices.
  • Design reproducible model packaging, versioning, and rollback strategies for large-scale serving.
  • Stay current with advances in LLM inference acceleration, quantization, distillation, and model compilation techniques (e.g., GGUF, AWQ, FP8).
  • 5+ years of experience in LLM/ML Ops, DevOps, or infrastructure engineering with a focus on machine learning systems.
  • Advanced proficiency in Python or Go, with the ability to write clean, performant, and maintainable production code.
  • Deep understanding of transformer architectures, LLM tokenization, attention mechanisms, memory management, and batching strategies.
  • Proven experience deploying and optimizing LLMs using multiple inference engines.
  • Strong background in containerization and orchestration (Kubernetes, Helm).
  • Familiarity with monitoring tools (e.g., Prometheus, Grafana), logging frameworks, and performance profiling.
  • Experience integrating LLMs into microservices or edge inference platforms.
  • Experience with distributed inference using Ray.
  • Hands-on experience with quantization libraries.
  • Contributions to open-source ML infrastructure or LLM optimization tools.
  • Familiarity with cloud platforms (AWS, GCP) and infrastructure-as-code (Terraform).
  • Exposure to secure and compliant model deployment workflows.