Your key responsibilities
- Analysis and Solution Development: Analyzing business requirements and data properties (volume, speed, type, etc.), and developing appropriate solutions to ensure accurate data collection, processing, storage, and retrieval.
- Implementing ETL Processes: Designing and implementing complex ETL (Extract, Transform, Load) processes and workflows to process large-scale data efficiently.
- Deploying Data Pipelines: Deploying data pipelines and verifying that they meet business requirements.
- Optimizing Systems: Continuously improving system performance and efficiency, including regular testing, troubleshooting, and the integration of new data management and storage technologies.
Must-have skills:
- Hands-on data engineering experience.
- Applying object-oriented and functional programming styles to real-world Big Data engineering problems using Python/PySpark.
- Developing data pipelines to perform batch and real-time processing on structured and semi-structured data.
- Experience in data ingress and egress flows.
- Hands-on expertise in AWS cloud services such as S3, Lambda, EMR, Glue, Step Functions, etc.
- Experience with data processing patterns, distributed computing, and building applications for real-time and batch analytics.
- Handling large data sets and performing data wrangling, analysis, etc., using various SQL and NoSQL database technologies; experience in AWS RDS, Redshift, Athena, DynamoDB, etc.
- Experience working with at least one NoSQL data store: HBase, Cassandra, or MongoDB.
- Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.
- Demonstrable understanding of high-quality coding and testing practices.
- An appetite to learn new technologies and a drive for continual improvement.
- Knowledge of and skills in DevOps practices such as CI/CD and containerization; deployment experience is preferable.
- Understanding of security and authentication/authorization in AWS, such as IAM roles, security groups, and VPC setup.
Nice-to-have skills:
- Developing data pipelines to perform stream analytics on structured, semi-structured, and unstructured data.
- Experience with Kafka and NiFi.
- Experience in Java/Scala.
To qualify for the role, you must have
- A computer science degree or equivalent, with 3-7 years of industry experience.
- Working experience in an Agile-based delivery methodology (preferable).
- A flexible and proactive/self-motivated working style with strong personal ownership of problem resolution.
- Excellent communication skills (written and verbal, formal and informal).
- The ability to participate in all aspects of the Big Data solution delivery life cycle, including analysis, design, development, testing, production deployment, and support.
Ideally, you’ll also have
What we look for
- People with technical experience and enthusiasm to learn new things in this fast-moving environment
You will work on inspiring and meaningful projects. Our focus is education and coaching alongside practical experience to ensure your personal development. We value our employees, and you will be able to control your own development with an individual progression plan. You will quickly grow into a responsible role with challenging and stimulating assignments. Moreover, you will be part of an interdisciplinary environment that emphasizes high quality and knowledge exchange. Plus, we offer:
- Support, coaching and feedback from some of the most engaging colleagues around
- Opportunities to develop new skills and progress your career
- The freedom and flexibility to handle your role in a way that’s right for you
EY exists to build a better working world, helping to create long-term value for clients, people and society, and to build trust in the capital markets.