Responsibilities
The scope of this role includes, but is not limited to:
- LLM Model Management: Assist in building and optimizing a management system for large language models, covering deployment, scheduling, and monitoring.
- LLM Image Management: Manage container images for Kubernetes, including image building, optimization, and pull strategies.
- Deployment Management Platform Development: Help develop and maintain an LLM deployment management platform to improve efficiency and stability.
- Automation & Operations: Streamline containerized application workflows to improve automation and resource utilization.
- Performance Optimization: Analyze and optimize the LLM runtime environment to improve inference efficiency and system stability.
Essential Skills:
- Currently pursuing a degree in Computer Science or a related field (Bachelor's or above; Master's preferred).
- Proficiency in at least one programming language (Python, Go, or Java); cloud-native development experience is a plus.
- Understanding of deep learning frameworks (e.g., TensorFlow, PyTorch) and LLM-related concepts is preferred.
- Familiarity with Linux and experience with Docker/Kubernetes is a plus.
- Strong engineering skills and teamwork abilities, with a keen interest in AI and cloud-native technologies.
Desirable Skills:
- Experience with LLMs (e.g., DeepSeek, Llama, Qwen, GPT).
- Knowledge of Docker/Kubernetes.
- Familiarity with GPU scheduling and model inference optimization.
Communication:
- Excellent documentation skills.
Our Benefits:
Any general requests for consideration of your skills, please