You should be an expert at implementing and operating stable, scalable data flow solutions from production systems into end-user-facing applications and reports. These solutions will be fault tolerant, self-healing and adaptive. You will work on solutions that present unique challenges of space, size and speed. You will implement data analytics using cutting-edge analytics patterns and technologies, including but not limited to AWS offerings such as EMR, Lambda, Kinesis, and Spectrum. You will extract huge volumes of structured and unstructured data from a variety of sources (relational, non-relational and NoSQL databases) and message streams, and construct complex analyses. You will write scalable code and tune its performance over billions of rows of data. You will implement data flow solutions that process data on Spark and Redshift and store it in both Redshift and file-based storage (S3) for reporting and ad hoc analysis. You should be detail-oriented and have an aptitude for solving unstructured problems. You should work in a self-directed environment, own tasks and drive them to completion.

Key job responsibilities
- Design and develop the pipelines required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Python and AWS big data technologies.
- Oversee and continually improve production operations, including optimizing data delivery, re-designing infrastructure for greater scalability, code deployments, bug fixes and overall release management and coordination.
- Establish and maintain best practices for the design, development and support of data integration solutions, including documentation.
- Read, write, and debug data processing and orchestration code written in Python, Scala, etc., following best coding standards (e.g. version controlled, code reviewed).
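For illustration, a minimal sketch of the kind of extract-transform-load pipeline described above, written in PySpark. The bucket paths, column names and job name are hypothetical placeholders, not part of the role description.

```python
# Illustrative PySpark ETL sketch: extract raw events from a hypothetical S3
# landing bucket, apply a simple transformation, and write Parquet back to S3
# partitioned by date for reporting and ad hoc analysis.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

# Extract: read raw JSON events (hypothetical path and schema).
raw = spark.read.json("s3://example-landing-bucket/orders/2024-01-01/")

# Transform: keep completed orders and derive a date partition column.
orders = (
    raw.filter(F.col("status") == "COMPLETED")
       .withColumn("order_date", F.to_date("created_at"))
       .select("order_id", "customer_id", "amount", "order_date")
)

# Load: write Parquet to a curated bucket, partitioned by date.
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-curated-bucket/orders/"))

spark.stop()
```

The curated Parquet data in S3 could then be queried through Redshift Spectrum external tables or loaded into Redshift with COPY, matching the reporting and ad hoc analysis targets mentioned above.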
Basic qualifications
- 3+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience building and operating highly available, distributed systems for data extraction, ingestion, and processing of large data sets
- Experience as a Data Engineer or in a similar role