Key job responsibilities
In this role, you will have the opportunity to display and develop your skills in the following areas:
- Develop and support ETL pipelines with robust monitoring and alarming
- Develop and optimize data tables using best practices for partitioning, compression, parallelization, etc.
- Build robust and scalable data integration (ETL) pipelines using SQL, Python, and AWS services such as Glue, Lambda, and Step Functions
- Implement data structures using best practices in data modeling, ETL/ELT processes, and SQL/Redshift
- Interface with business customers to gather requirements and deliver complete reporting solutions
- Explore and learn the latest AWS technologies to provide new capabilities and increase efficiencies
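To make the ETL responsibilities above concrete, here is a minimal, self-contained sketch of an extract/transform/load flow that groups records by a date partition key. All function names and sample records are illustrative assumptions, not part of any actual Amazon pipeline; a real implementation would read from S3 and write partitioned, compressed Parquet via Glue or PySpark.

```python
from collections import defaultdict

# Hypothetical ETL sketch. The source data, field names, and functions
# below are invented for illustration only.

def extract(source):
    """Pull raw records from an in-memory 'source' (stand-in for S3/Glue)."""
    return list(source)

def transform(records):
    """Normalize amounts and derive a partition key from the event timestamp."""
    out = []
    for r in records:
        out.append({
            "order_id": r["order_id"],
            "amount_usd": round(float(r["amount"]), 2),
            "partition_date": r["ts"][:10],  # YYYY-MM-DD prefix of ISO timestamp
        })
    return out

def load(records):
    """Group records by partition key, mimicking a date-partitioned table."""
    partitions = defaultdict(list)
    for r in records:
        partitions[r["partition_date"]].append(r)
    return dict(partitions)

raw = [
    {"order_id": 1, "amount": "19.990", "ts": "2024-05-01T09:30:00Z"},
    {"order_id": 2, "amount": "5.25", "ts": "2024-05-02T11:00:00Z"},
    {"order_id": 3, "amount": "7.40", "ts": "2024-05-01T17:45:00Z"},
]
partitions = load(transform(extract(raw)))
```

Partitioning on a low-cardinality key such as event date is what lets downstream queries prune data, which is the point of the "best practices for partitioning" responsibility above.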
About the team
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage you to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let that stop you from applying.

Why AWS

Work/Life Balance

Inclusive Team Culture

Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Qualifications
- Bachelor's degree in Computer Science, Engineering, Mathematics, or a related technical discipline
- 5+ years of industry experience in a data engineering-related field, with a solid background in manipulating, processing, and extracting value from large datasets
- Ability to write high-quality, maintainable, and robust code, often in SQL and Python
- 5+ years of data warehouse experience with Oracle, Redshift, Postgres, Snowflake, etc., with demonstrated strength in SQL, Python, PySpark, data modeling, ETL development, and data warehousing
- Extensive experience working with cloud services (AWS, Azure, or GCP) and a strong understanding of cloud databases (e.g., Redshift/Aurora/DynamoDB), compute engines (e.g., EMR/EC2), data streaming (e.g., Kinesis), and storage (e.g., S3)
- Experience or exposure using big data technologies (Hadoop, Hive, HBase, Spark, etc.)
- Fundamental understanding of version control software such as Git
- Experience with CI/CD, automated testing, and DevOps best practices
- Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
- Master's degree in Computer Science, Mathematics, Statistics, Economics, or another quantitative field
- 7+ years of experience in a data engineering-related field in a company with large, complex data sources
- Experience building and operating highly available, distributed systems for the extraction, ingestion, and processing of large datasets
- Experience using big data technologies (Hadoop, Hive, HBase, Spark, etc.)
- Experience working with AWS (Redshift, S3, EMR, Glue, Airflow, Kinesis, Step Functions)
- Hands-on experience in a scripting language (Bash, C#, Java, Python, TypeScript)
- Hands-on experience using ETL tools (SSIS, Alteryx, Talend)
- Background in non-relational databases or OLAP is a plus
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
- Strong analytical skills, 5+ years' experience with Python and Scala, and an interest in real-time data processing
- Proven success in communicating with users, other technical teams, and senior management to collect requirements, describe data modeling decisions and data engineering strategy