
EY - GDS Consulting | AI & Data | AWS Data Engineer, Senior
India, Kerala, Kochi
Job ID: 536465294
Posted: Today


We are seeking a highly skilled and motivated AWS Data Engineer with 3-7 years of experience in AWS Glue, AWS Redshift, S3, and Python to join our dynamic team. As a Data Engineer, you will be responsible for designing, developing, and optimizing data pipelines and solutions that support business intelligence, analytics, and large-scale data processing. You will work closely with data scientists, analysts, and other engineering teams to ensure seamless data flow across our systems.

Key Responsibilities:

  • Design and Develop ETL Pipelines: Leverage AWS Glue to design and implement scalable ETL (Extract, Transform, Load) processes that move and transform data from various sources into AWS Redshift or other storage systems; a minimal sketch of such a pipeline appears after this list.
  • Engineer governed batch and near-real-time data pipelines using AWS-native technologies such as Direct Connect, S3, Lambda, Glue, Kinesis, and CloudTrail, or equivalents.
  • Design and implement serverless data engineering workloads on the AWS ecosystem: take inputs from S3, RDS, and other cloud-based sources (e.g., SaaS data), apply business transformations using distributed compute (e.g., EMR, Glue, Spark), and persist insights in the target store (e.g., S3, Redshift, DynamoDB).
  • Maintain, optimize, and scale AWS Redshift clusters to ensure efficient data storage, retrieval, and query performance.
  • Utilize Amazon S3 to store raw data, manage large datasets, and integrate with other AWS services to ensure secure, scalable, and cost-effective data solutions.
  • Create and manage AWS Glue crawlers and jobs to automate data cataloging and ingestion processes across various structured and unstructured data sources.
  • Use Python (and PySpark within Glue) to write scripts for data transformation, integration, and automation tasks, ensuring clean, efficient, and reusable code.
  • Ensure data accuracy and integrity by implementing data validation, cleansing, and error-handling processes in ETL pipelines.
  • Optimize AWS Glue jobs, Redshift queries, and data flows to ensure optimal performance and reduce processing times and costs.
  • Enable data consumption from reporting and analytics business applications using AWS services (e.g., QuickSight, SageMaker, JDBC/ODBC connectivity).
  • Identify, define, and design logical data models, including required entities, relationships, data constraints, and dependencies, focused on enabling reporting and analytics business use cases.
  • Work closely with data scientists, analysts, and stakeholders to understand data requirements and provide solutions that enable data-driven decision-making.
  • Monitoring and Troubleshooting: Develop and implement monitoring strategies to ensure data pipelines are running smoothly. Quickly troubleshoot and resolve any data-related issues or failures.
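As a rough illustration of the Glue-based pipelines described above, the sketch below shows a minimal PySpark Glue job that reads a cataloged S3 dataset, applies a simple cleansing step, and loads the result into Redshift. The database, table, connection, and bucket names are placeholders, not part of this role's actual environment.

    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from awsglue.job import Job
    from pyspark.context import SparkContext

    # Standard Glue job bootstrap
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read raw data registered in the Glue Data Catalog (placeholder database/table)
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="orders_raw"
    )

    # Basic cleansing with PySpark: drop duplicates and rows missing a key field
    df = source.toDF().dropDuplicates().filter("order_total IS NOT NULL")
    cleaned = DynamicFrame.fromDF(df, glue_context, "cleaned")

    # Load into Redshift through a pre-configured Glue connection (placeholder names)
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=cleaned,
        catalog_connection="redshift-conn",
        connection_options={"dbtable": "analytics.orders", "database": "dev"},
        redshift_tmp_dir="s3://example-temp-bucket/redshift-staging/",
    )

    job.commit()

The same pattern extends to the near-real-time sources (e.g., Kinesis) and other targets (S3, DynamoDB) mentioned above.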


Required Skills and Qualifications:

  • 3-7 years of experience in data engineering or a similar role, with a focus on AWS technologies.
  • Academic background in Computer Science.
  • Strong programming background: PySpark, SQL, stored procedures, and Python.
  • Strong experience with AWS Glue: building ETL pipelines, managing crawlers, and working with the Glue Data Catalog.
  • Proficiency in AWS Redshift: designing and managing Redshift clusters, writing complex SQL queries, and optimizing query performance.
  • Hands-on experience with Amazon S3: data storage, data lifecycle policies, and integration with other AWS services.
  • Solid programming skills in Python, especially for data manipulation (using libraries such as pandas) and automation of ETL jobs; a short sketch follows this list.
  • Experience with PySpark within AWS Glue for large-scale data transformations.
  • Proficiency in writing and optimizing SQL queries for data manipulation and reporting.
  • Familiarity with data warehouse concepts: star schemas, partitioning, indexing, and data normalization.
  • Strong problem-solving skills and attention to detail.
  • Experience with version control systems such as Git or SVN.
  • Experience with data streaming technologies such as AWS Kinesis and Kafka implementations on AWS.
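As a brief example of the Python/pandas data-manipulation skills listed above, the sketch below reads a CSV object from S3 and applies simple validation; the bucket, key, and column names are hypothetical.

    import io

    import boto3
    import pandas as pd

    # Fetch a raw CSV object from S3 (placeholder bucket/key)
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="example-raw-bucket", Key="orders/2024/01/orders.csv")
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    # Simple validation and cleansing before loading downstream
    df = df.dropna(subset=["order_id"]).drop_duplicates(subset=["order_id"])
    df["order_total"] = pd.to_numeric(df["order_total"], errors="coerce")

    print(df.describe())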

Good to have:

  • Knowledge of AWS IAM for managing secure access to data resources.
  • Familiarity with DevOps practices and automation tools like Terraform or CloudFormation.
  • Experience with data visualization tools like QuickSight, or integrating Redshift data with BI tools (Tableau, Power BI, etc.).
  • AWS certifications such as AWS Certified Data Analytics – Specialty or AWS Certified Solutions Architect are a plus.



EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.