BS, MS, or PhD in Computer Science or a related field, or equivalent practical experience.
5+ years of experience designing and developing software applications and distributed systems (e.g., Java, microservices).
Knowledge of data tools and frameworks (e.g., Python, PyTorch, NumPy, pandas, TensorFlow, R, Spark).
In-depth knowledge of Transformer architectures, encoders, and embeddings at scale.
Understanding of LLMs, LangChain, custom GPTs, prompt management, and RLHF, and the ability to fine-tune base models to build efficient, production-grade LLM applications.
Computer science fundamentals: data structures, algorithms, performance complexity, and the implications of computer architecture for software performance (e.g., I/O and memory tuning).
Experience integrating applications and platforms with cloud technologies (e.g., AWS SageMaker).
Nice to have (Proficiency II):
Knowledge of machine learning (e.g., classification, regression, clustering, neural networks) and MLOps.
Understanding of and ability to apply machine learning principles (training, weights, validation, testing, error, cost), optimizing for accuracy.
Nice to have: Knowledge of data cleaning, streaming, transformations at scale, and storage and ingestion pipelines.