Job responsibilities
- Executes software solutions, design, development, and technical troubleshooting with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
- Creates secure and high-quality production code and maintains algorithms that run synchronously with appropriate systems
- Produces architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development
- Gathers, analyzes, synthesizes, and develops visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems
- Proactively identifies hidden problems and patterns in data and uses these insights to drive improvements to coding hygiene and system architecture
- Deploys and maintains infrastructure (e.g., SageMaker notebooks) that provides an effective model development platform for data scientists and ML engineers and integrates with the enterprise data ecosystem. Builds, deploys, and maintains ingress/egress and feature generation pipelines that calculate input features for model training and inference
- Deploys and maintains infrastructure for batch and real-time model serving in high-throughput, low-latency applications at scale. Identifies, deploys, and maintains high-quality model monitoring and observability tools
- Deploys and maintains infrastructure for compute-intensive tasks such as hyperparameter tuning, interpretability, and explainability. Partners with product, architecture, and other engineering teams to define scalable and performant technical solutions. Leverages deep technical expertise to design extensible and scalable solutions, and to coach and grow individuals and teams
- Ensures the team executes work according to compliance standards, SLAs, and business requirements to meet the objectives of an initiative. Anticipates the needs of broader teams and potential dependencies with other teams
- Identifies and mitigates issues to execute a book of work, escalating issues as necessary. Proactively helps maintain high operational excellence standards for production systems. Encourages development of technological methods and techniques within the team
- Coaches the agility lead and team to use agility practices effectively. Creates a culture of diversity, equity, inclusion, and respect for team members and prioritizes diverse representation
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 3+ years of applied experience
- Hands-on practical experience in system design, application development, testing, and operational stability
- Proficient in coding in one or more languages
- Experience in developing, debugging, and maintaining code in a large corporate environment with one or more modern programming languages and database querying languages
- Overall knowledge of the Software Development Life Cycle
- Deep experience with, and passion for, model training, build, deployment, and execution ecosystems such as SageMaker and/or Vertex AI. Experience with monitoring and observability tools that track model inputs/outputs and feature statistics
- Operational experience with big data tools such as Spark, EMR, and Ray. Experience with and interest in ML model architectures: linear/logistic regression, gradient-boosted trees, and neural networks
- Solid grounding in engineering fundamentals and an analytical mindset. Bias for action and iterative development
- Experience with recommendation and personalization systems is a plus
- Programming languages: Python and some Java. Solid fundamentals and experience with containers (Docker ecosystem), container orchestration systems (Kubernetes, ECS), and DAG orchestration (Airflow, Kubeflow, etc.)
- Solid fundamentals and experience with cloud technologies (EC2, SageMaker, IAM). Good knowledge of databases
Preferred qualifications, capabilities, and skills
- Familiarity with modern front-end technologies
- Exposure to cloud technologies