Job responsibilities:
- Provides recommendations and insight on data management, governance procedures, and intricacies applicable to the acquisition, maintenance, validation, quality, anomaly detection, and utilization of data
- Designs and delivers trusted data collection, storage, access, and analytics data platform solutions in a secure, stable, and scalable way
- Defines database back-up, recovery, and archiving strategy
- Generates advanced data models for one or more teams using firmwide tooling, linear algebra, statistics, and geometrical algorithms
- Approves data analysis tools and processes
- Creates functional and technical documentation supporting best practices
- Proactively identifies hidden problems and patterns in data and uses these insights to drive improvements to coding hygiene and system architecture
- Designs and develops end-to-end data pipelines using Spark SQL, Java, and AWS services, drawing on programming languages such as Java and Python, SQL and NoSQL databases, container orchestration services such as Kubernetes, and a variety of AWS tools and services
- Contributes to software engineering communities of practice and events that explore new and emerging technologies
- Evaluates and reports on access control processes to determine effectiveness of data asset security
- Adds to team culture of diversity, equity, inclusion, and respect
Required qualifications, capabilities, and skills:
- Formal training or certification on software engineering concepts and 5+ years of applied experience
- Experience in developing, debugging, and maintaining code in a large corporate environment with one or more modern programming languages and database querying languages
- Hands-on practical experience in developing Spark-based frameworks for end-to-end ETL, ELT, and reporting solutions using key components such as Spark SQL and Spark Streaming
- Proficiency in coding in one or more languages, such as Java or Python
- Experience with relational and NoSQL databases
- Cloud implementation experience with AWS, including:
  - AWS Data Services: Proficiency in Lake Formation, Glue ETL or EMR, S3, Glue Catalog, Athena, Kinesis or MSK, and Airflow or Lambda + Step Functions + EventBridge
  - Data De/Serialization: Expertise in at least two of the following formats: Parquet, Iceberg, Avro, JSON-LD
  - AWS Data Security: Good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, and Secrets Manager
- Proficiency in automation and continuous delivery methods
- Solid understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
Preferred qualifications, capabilities, and skills:
- Snowflake knowledge or experience
- In-depth knowledge of the financial services industry and its IT systems
- Experience building data lakes, data platforms, and data frameworks, and designing or building a Data as a Service API