Collaborate with data scientists and research/machine learning engineers to deliver products to production.
Build and maintain data pipelines for analytics, model evaluation, and training (including versioning, compliance, and validation).
Build and maintain scalable infrastructure as code in the cloud (private and public).
Prototype rapidly and shorten development cycles for our software and AI/ML products:
Build, automate, and maintain our AI/ML data pipelines and workstreams, from data analysis, experimentation, model training, model evaluation, deployment, operationalization, and tuning through to visualization.
Improve and maintain our automated CI/CD pipeline while collaborating with our stakeholders, various testing partners, and model contributors.
Increase our deployment velocity, including the process for deploying models and data pipelines into production.
Contribute to architecture and software management discussions and tasks across the team.
Requirements
Minimum Bachelor of Science degree in Computer Science, Software Engineering, Electrical Engineering, Computer Engineering, or a related field.
Experience in Python programming, OOP, databases, and big data.
Experience with some of the following technologies: Hadoop, PySpark, AWS cloud services (DynamoDB, Lambda, Redshift, Kinesis, Athena, Glue), Databricks, Elasticsearch, Airflow, Docker, Kubernetes, Terraform.
3+ years of professional experience in a software or data engineering role.
Excellent problem-solving and debugging skills.
Strong interpersonal skills; able to work independently as well as in a team.
Experience in containerization and infrastructure as code.
Familiarity with monitoring tools such as Prometheus, Grafana, Splunk, and Datadog.
Desirable
You have a strong commitment to development best practices and code reviews.
You believe in continuous learning, sharing best practices, and encouraging and elevating less experienced colleagues as they learn.
Experience with data labelling, validation, provenance, and versioning.