Responsibilities:
- Build and train production-grade ML models on large-scale datasets to solve business use cases for Commercial Banking.
- Use large-scale data processing frameworks such as Spark and AWS EMR for feature engineering, and work proficiently with both structured and unstructured data.
- Use deep learning models such as CNNs, RNNs, and NLP models (e.g. BERT) to solve business use cases such as named entity resolution, forecasting, and anomaly detection.
- Build ML models across public and private clouds, including container-based Kubernetes environments.
- Develop end-to-end ML pipelines necessary to transform existing applications and business processes into true AI systems.
- Build both batch and real-time model prediction pipelines that integrate with existing applications and front ends.
- Collaborate on large-scale data modeling experiments, evaluate them against strong baselines, and extract key statistical insights and/or cause-and-effect relationships.
Required qualifications, capabilities and skills:
- 6+ years of working experience as a Data Scientist.
- Advanced degree in Computer Science, Data Science, or an equivalent discipline.
- Expertise with Python, PySpark, MLOps, and DL frameworks such as TensorFlow.
- Experience designing and building highly scalable distributed ML models in production (Scala, applied machine learning, proficiency in statistical methods and algorithms).
- Experience with analytics (e.g. SQL, Presto, Spark, Python, AWS suite).
- Experience with machine learning techniques and advanced analytics (e.g. regression, classification, clustering, time series, econometrics, causal inference, mathematical optimization).
Preferred qualifications, capabilities and skills:
- Experience working with end-to-end pipelines using frameworks such as Kubeflow and TensorFlow, and/or with crowd-sourced data labeling, is a plus.