As a Data Scientist in the Compliance, Conduct, and Operational Risk Data Analytics team, you will be responsible for devising and developing Proofs of Concept (POCs) and deployable models using AI/ML techniques, algorithms, and other statistical and numerical methods. You will need to be able to extract and work with large volumes of structured and unstructured data from multiple sources, transforming it into an analysis-ready format to build the data pipeline. Additionally, you are expected to independently translate business problems into methodologies and into quantitative and analytical tasks.
Job Responsibilities
- Analyze complex/unstructured data to understand the business problem and use case
- Analyze business requirements, then design and develop an appropriate methodology
- Develop deployable, scalable, and effective models and analytical methods, either as part of a technology-managed system or as a self-service application for business users
- Work collaboratively and creatively with other data scientists, technology partners, risk professionals, model validation teams, etc.
- Prepare technical documentation of quantitative models for internal model risk and governance review
Required qualifications, capabilities, and skills
- 6+ years of related experience in Python, R, or Scala, with a Bachelor of Science degree in Computer Science, Physical Sciences, Econometrics, Statistics, or any other quantitative discipline.
- Demonstrable theoretical and applied knowledge of Machine Learning methods and/or Statistical Models
- Demonstrable hands-on experience and familiarity with any or all of the following packages and algorithms, or their alternatives: Graph Learning Packages (NetworkX, Torch-Geometric, GraphFrames, Graphistry); ML Packages (Pandas, Scikit-Learn, XGBoost, CatBoost, LightGBM, AutoML, Optuna, Hyperopt); Visualization Packages (Matplotlib, Seaborn, GeoPandas); Algorithms (Ensemble Louvain / Hierarchical Clustering, Label Propagation, Connected Component Analysis, Graph Neural Networks (Graph Attention Network), PageRank, Centrality Analysis, Tree-based Analysis, Outlier Detection Methods, Zero-Shot / Few-Shot Learning)
- Demonstrable experience with graph analytics, graph-based learning, and graph representation/visualization
- Experience with graph databases: TigerGraph, Neo4j
- Experience with query languages: Hive, Cypher (graph query language)
- Experience in the financial services industry, especially in Anti-Money Laundering and Know Your Customer model development.
- Hands-on professional experience in software development, especially with analytical and computationally intensive systems, and with digital transformations leveraging cloud technologies (AWS, GCP, Azure, Databricks, etc.)
- Experience developing and operationalizing data pipelines
- Experience assimilating large amounts of data from multiple databases and using it to produce actionable outcomes; adhering to a standardized analysis and project methodology; and documenting quantitative analysis
- Experience with the processes, controls, and governance of a highly regulated environment
Preferred qualifications, capabilities, and skills
- A postgraduate degree (Master’s, PhD, etc.) is preferred
- Working knowledge of C, C#, C++, or similar languages is a plus
- Real-world exposure to Agile SDLC, ModelOps, and/or Design Thinking is desirable.
- Familiarity with Natural Language Processing techniques is a plus
- Self-starter with strong influencing and communication skills