We’re looking for a Senior Data Engineer who thrives on solving hard problems, shaping new capabilities, and delivering high-quality results in a fast-paced environment. You will be at the forefront of integrating LLM-powered solutions with robust backend systems, ensuring they scale securely and reliably to serve global customers.

Key job responsibilities
- Architect and implement a scalable, cost-optimized S3-based Data Lakehouse that unifies structured and unstructured data from disparate sources.
- Lead the strategic migration from our Redshift-centric architecture to a flexible lakehouse model.
- Establish metadata management with automated data classification and lineage tracking.
- Design and enforce standardized data ingestion patterns with built-in quality controls and validation gates.
- Architect a centralized metrics repository that serves as the single source of truth for all FBA metrics.
- Implement robust data quality frameworks with staging-first policies and automated validation pipelines.
- Design extensible metrics schemas that support complex analytical queries while optimizing for AI retrieval patterns.
- Develop intelligent orchestration for metrics generation workflows with comprehensive audit trails.
- Lead the design of semantic data models that balance analytical performance with AI retrieval requirements.
- Implement cross-domain federated query capabilities with sophisticated query optimization techniques.
- Design and implement hybrid search strategies combining dense vectors with sparse representations for optimal semantic retrieval.

Qualifications
- 7+ years of data engineering experience with demonstrated expertise in distributed systems at scale.
- Deep technical knowledge of AWS data services (Glue, Kinesis, Redshift, S3, Lambda) and infrastructure-as-code.
- Proven experience designing and implementing enterprise Data Lakehouse architectures and metrics repositories.
- Strong programming skills in Python and/or Scala with expertise in distributed data processing frameworks.
- Track record of building high-performance, scalable data pipelines supporting mission-critical business operations.
- Experience with vector databases, embedding models, or AI-adjacent data infrastructure.