Provides data solutions by using software to process, store, and serve data to others. Tests data quality and optimizes data availability. Ensures that data pipelines are scalable, repeatable, and secure. Builds a deep analytical skill set by working with senior Data Engineers on a variety of internal and external data.
Skills: Intermediate AWS Glue, EMR, Redshift, Step Functions; foundational services such as IAM, CloudWatch, CloudFormation, Lambda, Secrets Manager, SageMaker; PySpark; SQL.
Qualifications:
Minimum of three years (typically three to six years) of experience in data engineering, data analytics, programming, database administration, or data management.
Bachelor's degree, BCA, or Master's degree, or an equivalent combination of training and experience.
Communication & Team Collaboration
- Strong verbal and written communication skills
- Highly flexible
- Accountability and end-to-end (E2E) ownership
Job Description:
- Writes ETL (extract, transform, load) processes, designs database systems, and develops tools for real-time and offline analytic processing.
- Troubleshoots software and processes for data consistency and integrity. Integrates data from a variety of sources for business partners to generate insight and make decisions.
- Translates business specifications into design specifications and code. Responsible for writing programs, ad hoc queries, and reports. Ensures that all code is well structured, includes sufficient documentation, and is easy to maintain and reuse.
- Partners with internal clients to gain a basic understanding of business functions and informational needs. Gains working knowledge of tools, technologies, and applications/databases in specific business areas and company-wide systems.
- Participates in all phases of solution development. Explains technical considerations at related meetings, including those with business clients.
- Experience building the specified AWS cloud architecture and supporting services and technologies (e.g., Glue, EMR, Redshift, Step Functions, and foundational services such as IAM, CloudWatch, CloudFormation, Lambda, Secrets Manager, SageMaker); see the boto3 sketch after this list.
- Experience building Spark data processing applications (Python, PySpark); a minimal PySpark example follows this list.
- Experience with Apache Airflow (Apache Iceberg preferred); an example DAG follows this list.
- Experience with SQL development and Tableau reporting (preferred).
- Experience building and leveraging Splunk dashboards.
- Experience with CI/CD pipeline tools such as Bamboo, Bitbucket, and GitHub, and with Terraform or CloudFormation scripts.
- Tests code thoroughly for accuracy of intended purpose, including regression testing. Reviews the end product with the client to ensure adequate understanding. Provides data analysis guidance as required.
- Provides tool and data support to business users and fellow team members.
- Good to have: experience with test automation and test-driven development practices.
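To illustrate the kind of ETL and Spark work described above, here is a minimal PySpark sketch. The S3 paths, column names, and transformation logic are illustrative assumptions only, not part of any actual pipeline for this role:

```python
# Minimal PySpark ETL sketch. Paths, columns, and filter logic are
# hypothetical placeholders for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV data (hypothetical path).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: cast types, drop invalid rows, stamp a load date.
clean = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount").isNotNull())
       .withColumn("load_date", F.current_date())
)

# Load: write partitioned Parquet for downstream analytics (hypothetical path).
clean.write.mode("overwrite").partitionBy("load_date").parquet(
    "s3://example-bucket/curated/orders/"
)
```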
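For the Airflow item, a minimal DAG sketch in Airflow 2.x style; the DAG id, schedule, and task callables are placeholders:

```python
# Minimal Airflow DAG sketch (Airflow 2.x imports). All names are
# illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract step")  # placeholder for a real extract task

def load():
    print("load step")  # placeholder for a real load task

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2  # run extract before load
```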
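And for the AWS foundational services, a short boto3 sketch that reads a stored credential from Secrets Manager and starts a Glue job run; the secret id and job name are hypothetical:

```python
import boto3

# Hypothetical names; real secret and job names would come from the project.
SECRET_ID = "example/redshift-credentials"
GLUE_JOB_NAME = "example-orders-etl"

secrets = boto3.client("secretsmanager")
glue = boto3.client("glue")

# Fetch a stored credential rather than hard-coding it.
secret = secrets.get_secret_value(SecretId=SECRET_ID)["SecretString"]

# Kick off a Glue job run, passing the secret id (not its value) as an argument.
run = glue.start_job_run(
    JobName=GLUE_JOB_NAME,
    Arguments={"--secret_id": SECRET_ID},
)
print("Started Glue job run:", run["JobRunId"])
```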
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.