Design, develop, and maintain ETL processes using Apache Spark and Python to ingest, transform, and load data from various sources
Implement data quality checks, monitoring, and error-handling mechanisms to uphold data integrity
Develop analytical and reporting models for our SaaS product, ensuring they meet the needs of large-scale enterprise customers
Partner with the Operations team to manage and scale our data pipelines on cloud platforms such as AWS, Azure, or GCP
Debug, troubleshoot, and fine-tune complex SQL queries to enhance performance
Improve all non-functional aspects of each feature of our SaaS application delivered on public and private clouds, including periodic performance profiling and capacity planning of critical systems
What you bring
4+ years of professional software development experience and 1+ years of experience with Python, Apache Spark, Databricks or other managed Spark platforms, and ETL processes, along with a passion for solving complex data challenges.
A track record of promoting a culture of high-performance Agile software engineering, continuous improvement, and customer satisfaction.
Familiarity with Big Data, NoSQL, streaming analytics, and predictive and prescriptive analytics is advantageous.
Familiarity with dimensional modelling techniques is advantageous.
Outstanding written and verbal communication skills.
Bachelor’s degree in Computer Science required; Master’s degree preferred.