Required qualifications, capabilities, and skills
- Formal training or certification on Data Engineering concepts and 3+ years of applied experience with AWS and Kubernetes
- Proficiency in one or more large-scale data processing frameworks such as Apache Spark (Java or PySpark), along with knowledge of Data Pipelines (DPL), Data Modeling, Data Warehousing, and Data Migration
- Experience across the data lifecycle, along with expertise in consuming data in any of the following modes: batch (file), near real-time (IBM MQ, Apache Kafka), and streaming (AWS Kinesis, Amazon MSK)
- Advanced SQL skills (e.g., joins and aggregations)
- Working understanding of NoSQL databases
- Experience developing, debugging, and maintaining code in a large corporate environment using one or more modern programming languages and database querying languages
- Solid understanding of agile methodologies and related practices such as CI/CD, Application Resiliency, and Security
- Significant experience with statistical data analysis and the ability to determine the appropriate tools and data patterns for a given analysis
- Experience customizing and configuring tools to generate product outputs
Preferred qualifications, capabilities, and skills