Develop, maintain, and optimize data pipelines and workflows using SQL, S3, AWS Glue, AWS Lambda, PySpark, and AWS SageMaker (machine learning platform).
Exposure to generative AI (Gen AI) models.
SQL proficiency, including common table expressions (CTEs) and query performance tuning (Snowflake preferred).
Ensure seamless data integration with databases such as Aurora, DynamoDB, and Redshift.
Implement CI/CD pipelines and use scheduling tools such as Stonebranch to streamline data operations.
Perform data profiling and analysis, and present actionable insights to stakeholders.
Apply data warehousing and data modeling concepts, including 3NF, star, and snowflake schemas, to ensure scalable data architecture.
Leverage AWS SageMaker for machine learning model deployment and management in production environments.
Collaborate with cross-functional teams to ensure adherence to development best practices, including peer reviews, unit testing, and process optimization.