Job responsibilities
- Support review of controls to ensure sufficient protection of enterprise data
- Advise on and make custom configuration changes in one or two tools to generate a product at the request of the business or customer
- Update logical or physical data models based on new use cases
- Use SQL and understand NoSQL databases and their niche in the marketplace
- Analyze, design, develop, and drive performance enhancements, with a focus on significantly increasing default ingestion speeds to meet substantial data demands and keep our systems operating at peak efficiency
- Implement automation, optimization, performance tuning, and scaling techniques to keep pipelines performing efficiently (see the PySpark sketch after this list)
- Handle new and complex challenges, continuously seeking innovative solutions to improve data processing and performance
- Work closely with cross-functional teams, sharing insights and best practices, and mentoring other engineers to foster a culture of continuous learning and improvement
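For illustration, the ingestion tuning referenced above might look like the following minimal PySpark sketch, assuming a Delta Lake target on Databricks; the S3 path, schema, and table names are hypothetical placeholders, not part of this posting.

```python
# Minimal sketch of ingestion tuning on Databricks/PySpark.
# The S3 path, schema, and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F, types as T

spark = SparkSession.builder.getOrCreate()  # pre-provided on Databricks

# An explicit schema skips the full scan Spark would otherwise run for
# schema inference, often the first easy win for ingestion speed.
schema = T.StructType([
    T.StructField("event_id", T.StringType()),
    T.StructField("event_ts", T.TimestampType()),
    T.StructField("payload", T.StringType()),
])

df = (
    spark.read.schema(schema)
         .json("s3://example-bucket/raw/events/")  # hypothetical source path
         .withColumn("event_date", F.to_date("event_ts"))
)

# Repartition on the partition key to avoid small files, then write a
# partitioned Delta table so downstream reads can prune partitions.
(
    df.repartition("event_date")
      .write.format("delta")
      .partitionBy("event_date")
      .mode("append")
      .saveAsTable("analytics.events")  # hypothetical target table
)
```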
Required qualifications, capabilities, and skills
- Formal training or certification on software engineering concepts and 3+ years of applied experience
- Proficiency in Databricks, AWS, and PySpark for data processing and analytics
- Experience with Databricks cluster configuration and Unity Catalog configuration
- Strong programming skills in Python and experience writing complex SQL queries
- Advanced SQL skills (e.g., joins, aggregations, and window functions; see the Spark SQL sketch after this list)
- Experience in data engineering and cloud architecture, specifically with Databricks and AWS
- Proven ability to migrate data load models developed on an ETL framework to multi-node Databricks compute
- Good understanding of developing data warehouses, data marts, etc.
- Excellent problem-solving skills for structuring the right analytical solutions, with a strong sense of teamwork, ownership, and accountability
- Understanding of system architectures and design patterns, with the ability to design and develop applications using these principles
- Ability to work in a fast-paced environment with tight schedules
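For illustration, the window-function and Databricks migration skills referenced above might look like the following minimal sketch, run as Spark SQL from PySpark; the table and column names are hypothetical placeholders.

```python
# Minimal sketch of a window-function step run as Spark SQL on Databricks.
# The table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # pre-provided on Databricks

# Keep only the most recent row per customer, a common deduplication step
# when migrating ETL data loads onto multi-node Databricks compute.
latest = spark.sql("""
    WITH ranked AS (
        SELECT customer_id, order_ts, amount,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY order_ts DESC
               ) AS rn
        FROM sales.orders  -- hypothetical source table
    )
    SELECT customer_id, order_ts, amount
    FROM ranked
    WHERE rn = 1
""")

# Hypothetical target table for the deduplicated result.
latest.write.format("delta").mode("overwrite").saveAsTable("sales.orders_latest")
```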
Preferred qualifications, capabilities, and skills
- Experience in data modeling using modeling tools such as Erwin
- Working understanding of NoSQL databases
- Experience with statistical data analysis and the ability to determine the appropriate tools and data patterns for an analysis