Primary Skills / Must have:
- Should have strong programming skills in Python.
- Experience in creating large-scale data processing pipelines using a Python- and Spark-based framework.
- Experience working with different aspects of the Spark ecosystem, including Spark SQL, DataFrames, Datasets, and Streaming.
- Should possess strong SQL skills.
- Excellent understanding of the Unix ecosystem and experience in writing shell scripts.
- Excellent understanding of the Hive/Hadoop ecosystem.
- Solid understanding of data engineering concepts and best practices.
- Excellent understanding of job scheduling tools such as Autosys and TWS.
- Excellent problem-solving and analytical skills.
- Excellent verbal and written communication skills.
- Experience in optimizing large data loads.
Secondary Skills / Desired skills:
- Exposure to an Agile Development environment would be a plus.
- Strong understanding of the data warehousing domain.
- Ability to architect an ETL solution and data conversion strategy.
- Good understanding of dimensional modelling.
Roles and Responsibilities:
The responsibilities of the ETL/Python developer include:
- Designing and developing highly optimized and scalable ETL applications using Python and Spark.
- Undertaking end-to-end project delivery (from inception to post-implementation support), including reviewing and finalizing business requirements, creating functional specifications and/or system designs, and ensuring that the end solution meets business needs and expectations.
- Developing new transformation processes to load data from source to target, or tuning the performance of existing ETL code (mappings, sessions).
- Analyzing existing designs and interfaces and applying design modifications or enhancements.
- Coding and documenting data processing scripts and stored procedures.
- Providing business insights and analysis findings for ad hoc data requests.
- Testing software components and complete solutions (including debugging and troubleshooting) and preparing migration documentation.
- Providing reporting-line transparency through periodic updates on project or task status.
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.