We are seeking a highly skilled and experienced Data Engineer with expertise in Python data engineering, PostgreSQL, and Apache Airflow. The ideal candidate will have 5 to 8 years of experience in data engineering and will play a crucial role in designing, implementing, and maintaining data pipelines and orchestration workflows. You will work closely with data scientists, analysts, and other stakeholders to ensure efficient and reliable data processes.
Responsibilities:
- Design, develop, and maintain scalable data pipelines using Python, PostgreSQL, and Apache Airflow (a minimal DAG sketch follows this list).
- Collaborate with data scientists and analysts to understand data requirements and implement solutions.
- Optimize and tune database queries for performance and scalability.
- Integrate Kafka into data pipelines for asynchronous data processing (see the consumer sketch after this list).
- Ensure data quality and integrity through robust validation and monitoring processes.
- Develop and implement a data pipeline framework to integrate data from various sources.
- Monitor and troubleshoot data pipelines, ensuring timely resolution of issues.
- Implement best practices for data engineering, including code versioning, testing, and documentation.
- Work with on-premises CI/CD platforms to deploy and manage data pipelines.
- Collaborate with software engineering teams to integrate data solutions into applications.
- Stay current with the latest trends and technologies in data engineering and PostgreSQL, including performance improvements for vector-database workloads.
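
To give a concrete feel for the pipeline work above, here is a minimal sketch of an Airflow DAG that extracts rows from PostgreSQL. It assumes Airflow 2.x with the PostgreSQL provider installed; the DAG id, connection id, table name, and schedule are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def extract_orders(**context):
    # Pull the logical date's rows from a hypothetical "orders" table.
    hook = PostgresHook(postgres_conn_id="warehouse_pg")  # hypothetical connection id
    rows = hook.get_records(
        "SELECT id, amount FROM orders WHERE created_at::date = %s",
        parameters=[context["ds"]],  # Airflow's logical date (YYYY-MM-DD)
    )
    return len(rows)  # pushed to XCom for downstream tasks


with DAG(
    dag_id="orders_daily_pipeline",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
```

In a real pipeline this extract task would feed transform and load tasks chained with Airflow's dependency operators.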
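For the Kafka integration responsibility, a consumer loop along these lines is one common pattern. The topic, brokers, and group id are placeholders, and the sketch assumes the kafka-python client.

```python
import json

from kafka import KafkaConsumer  # pip install kafka-python

# Topic, brokers, and group id are placeholders for illustration.
consumer = KafkaConsumer(
    "orders-events",
    bootstrap_servers=["localhost:9092"],
    group_id="pipeline-workers",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    # Messages arrive decoupled from the producer, which is what makes the
    # processing asynchronous; a real handler would validate and load the event.
    event = message.value
    print(f"partition={message.partition} offset={message.offset} event={event}")
```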
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5 to 8 years of experience in data engineering.
- Strong proficiency in Python programming for data engineering tasks.
- In-depth knowledge of PostgreSQL and experience in database design, optimization, and management (a query-tuning sketch follows this list).
- Hands-on experience with Apache Airflow for orchestrating data workflows.
- Experience with data pipeline processes and tools.
- Experience with Kafka for data processing.
- Familiarity with DevOps and CI/CD practices, including Docker containerization.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration skills.
- Understanding of data privacy, security, and compliance considerations.
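
As a concrete example of the PostgreSQL optimization work referenced above, the snippet below inspects a slow query plan and adds a covering index. The connection string, table, and column are hypothetical, and the sketch assumes psycopg2.

```python
import psycopg2  # pip install psycopg2-binary

# Connection string and schema are hypothetical.
conn = psycopg2.connect("dbname=warehouse user=etl")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY must run outside a transaction

with conn.cursor() as cur:
    # Inspect the plan first: a sequential scan on a large table is a
    # common sign that an index is missing.
    cur.execute("EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = %s", (42,))
    for (line,) in cur.fetchall():
        print(line)

    # Add an index on the filter column so the planner can switch to an
    # index scan; CONCURRENTLY avoids locking writes on a live table.
    cur.execute(
        "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_customer_id "
        "ON orders (customer_id)"
    )

conn.close()
```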
Good to Have Skills:
- Experience with other storage technologies (e.g., Dell's S3-compatible object storage).
- Experience with DevOps practices and tools (e.g., Docker, Kubernetes, Terraform).
- Understanding of MLOps and integrating data engineering with machine learning workflows.
- Ability to work in an Agile development environment.
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.