Your role and responsibilities
- As a Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in developing data solutions using the Spark framework with Python or Scala on Hadoop and the AWS Cloud Data Platform.
- Responsibilities:
- Build data pipelines to ingest, process, and transform data from files, streams, and databases. Process the data with Spark, Python, PySpark, Scala, and Hive, HBase, or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS.
- Develop efficient software code for multiple use cases built on the platform, leveraging the Spark framework with Python or Scala and Big Data technologies
- Develop streaming pipelines
- Work with Hadoop and AWS ecosystem components to implement scalable solutions that meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark, Kafka, and any cloud computing platform
Required education
Bachelor's Degree
Preferred education
Master's Degree
Required technical and professional expertise
- 5 to 7+ years of total experience in Data Management (DW, DL, Data Platform, Lakehouse) and Data Engineering
- Minimum 4 years of experience in Big Data technologies, with extensive data engineering experience in Spark with Python or Scala.
- Minimum 3 years of experience with Cloud Data Platforms on AWS; exposure to streaming solutions and message brokers such as Kafka.
- Experience in AWS EMR / AWS Glue / Databricks, AWS Redshift, DynamoDB
- Good to excellent SQL skills
Preferred technical and professional experience
- Certification in AWS and Databricks, or Cloudera Spark certified developer