Required qualifications, capabilities, and skills
- Formal training or certification in data engineering concepts and 3+ years of applied experience.
- Experience in software development, including the software development life cycle, coding standards, documentation, code reviews, source control management, continuous integration, build processes, testing, and operations.
- Demonstrate proficiency in Python, with experience in developing and maintaining production-level code.
- Exhibit proficiency in data engineering, querying various types of data stores, efficiently moving data, working with large datasets, and data preprocessing.
- Have experience with cloud platforms, such as AWS and Azure, for designing and deploying infrastructure to run processes, web apps, and training and inference pipelines for AI/ML models, including handling networking and scaling of these systems.
- Show experience with workflow management and container orchestration tools such as Airflow and Kubernetes.
- Possess strong problem-solving and analytical skills.
- Demonstrate excellent documentation, communication, and collaboration skills.
- Have knowledge of infrastructure operations.
- Have experience across the data lifecycle.
- Possess significant experience with statistical data analysis and the ability to determine the appropriate tools and data patterns for an analysis.
Preferred qualifications, capabilities, and skills
- Have experience in the design or architecture of new and existing systems, including design patterns, reliability, and scaling.
- Demonstrate experience in machine learning engineering.
- Be familiar with DevOps practices for AI/ML model deployment and monitoring.
- Have knowledge of graph databases and familiarity with graph query languages.