To be successful in this role, you should have broad skills in infrastructure design, be comfortable working with large and complex data sets, and have experience building near real-time and batch data processing pipelines. You must be a self-starter, comfortable with ambiguity in a fast-paced and ever-changing environment, and able to think big while paying careful attention to detail.

Key job responsibilities
* Architect a scalable, reliable, and high-performing data processing pipeline on AWS, leveraging services such as EMR, Glue, Kinesis, S3, and Managed Airflow to build the data infrastructure (see the first sketch after this list).
* Ensure the platform can efficiently handle the ingestion, processing, and storage of massive data volumes.
* Write highly proficient SQL and Python code.
* Develop optimized ETL/ELT pipelines that transform raw data into analytical datasets.
* Implement data partitioning, bucketing, and indexing strategies to enable high-performance queries (see the second sketch after this list).
* Work closely with data scientists, software engineers, product managers, and business intelligence engineers (BIEs).
* Stay up to date with the latest AWS services and data engineering best practices.
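For concreteness, here is a minimal sketch of what the orchestration layer might look like on Managed Airflow, assuming Airflow 2.4+ with the Amazon provider package installed. The DAG id, EMR cluster id, S3 paths, and job names are hypothetical placeholders, not details from this posting.

```python
# Illustrative sketch only: an Airflow DAG that submits a Spark step to an
# existing EMR cluster and waits for it to finish. All IDs, names, and paths
# below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

SPARK_STEP = [
    {
        "Name": "events-etl",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://example-bucket/jobs/events_etl.py"],
        },
    }
]

with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # requires Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    # Submit the Spark job as a step on a pre-provisioned EMR cluster.
    add_step = EmrAddStepsOperator(
        task_id="submit_spark_job",
        job_flow_id="j-EXAMPLECLUSTER",  # placeholder EMR cluster id
        steps=SPARK_STEP,
    )

    # Block until the submitted step succeeds or fails.
    watch_step = EmrStepSensor(
        task_id="watch_spark_job",
        job_flow_id="j-EXAMPLECLUSTER",
        step_id="{{ task_instance.xcom_pull(task_ids='submit_spark_job', "
                "key='return_value')[0] }}",
    )

    add_step >> watch_step
```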
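Similarly, a minimal PySpark sketch of the kind of ETL step described above, writing a date-partitioned, user-bucketed analytical table. The bucket paths, table, and column names are hypothetical placeholders, and indexing strategies are out of scope for this example.

```python
# Illustrative sketch only: a PySpark job that cleans raw events and writes
# them with partitioning (for partition pruning) and bucketing (for join
# performance). All paths and names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("events-etl")
    .enableHiveSupport()  # needed to persist the bucketed table
    .getOrCreate()
)

# Read raw JSON events from S3 (placeholder bucket path).
raw = spark.read.json("s3://example-raw-bucket/events/")

# Basic transform: keep well-formed rows and project the analytical columns.
cleaned = (
    raw.where(raw.event_date.isNotNull())
       .select("event_date", "user_id", "event_type", "payload")
)

# Partition by date and bucket by user_id into 32 sorted buckets.
(
    cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")
        .bucketBy(32, "user_id")
        .sortBy("user_id")
        .saveAsTable("analytics.events_cleaned")
)
```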
Qualifications
- 5+ years of data engineering experience
- Experience with data modeling, data warehousing, and building ETL pipelines
- Experience with SQL
- Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js
- Experience mentoring team members on best practices
- Experience with big data technologies such as Hadoop, Hive, Spark, and EMR
- Experience operating large data warehouses