
Cellebrite Senior Data Engineer 
Israel, Center District, Petah Tikva 
764970434


Position Overview:

You’ll work closely with data scientists, ML engineers, and product teams to build scalable, production-ready data infrastructure leveraging Apache Spark, AWS services, and Infrastructure-as-Code.

Key Responsibilities:

  • Influence Data Architecture: Design scalable, secure data platforms using Spark (EMR), Glue, and AWS event-driven services.
  • Develop Scalable Pipelines: Build batch & streaming ETL/ELT pipelines with Spark EMR, Athena, Iceberg, EKS, Lambda.
  • Drive Innovation: Introduce patterns like data mesh, serverless analytics, and schema-aware pipelines.
  • Cross-Team Collaboration: Work with ML, backend, and product teams to deliver data-powered solutions.
  • Operational Excellence: Apply observability, cost control, performance tuning, and CI/CD automation using CloudWatch, Step Functions, Terraform/CDK.
  • Security: Implement AWS best practices for IAM, encryption, compliance, and auditability.

Experience:

  • 8+ years of hands-on experience in data engineering, with proven responsibility for designing, developing, and maintaining large-scale, distributed data systems in cloud-native environments (preferably AWS).
  • End-to-end ownership of complex data architectures – from data ingestion to processing, storage, and delivery in production-grade systems.
  • Deep understanding of data modeling, data quality, and pipeline performance optimization.

Technical Expertise:

  • Apache Spark: Expertise in writing efficient Spark jobs using PySpark, with experience running workloads on AWS EMR and/or Glue for large-scale ETL and analytical tasks.
  • AWS Services: Strong hands-on experience with S3, Lambda, Glue, Step Functions, Kinesis, and Athena, including building event-driven and serverless data pipelines.
  • Solid experience in building and maintaining both batch and real-time (streaming) data pipelines and integrating them into production systems.
  • Infrastructure as Code (IaC): Proficient in using Terraform, AWS CDK, or SAM to automate deployment and manage scalable data infrastructure.
  • Python as a primary development language (with bonus points for TypeScript experience).
  • Comfortable working in agile, fast-paced environments, with strong debugging, testing, and performance-tuning capabilities.
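As a flavor of the event-driven, serverless pipeline work described above, here is a minimal sketch of an S3-triggered Lambda handler in Python. All names (bucket, key paths, the `handler` function itself) are illustrative assumptions, not Cellebrite's actual code; in practice such a handler would hand each object off to a downstream step such as a Glue job or an EMR Spark submission.

```python
import json
import urllib.parse


def handler(event, context=None):
    """Illustrative S3-event Lambda handler: collect the bucket/key of
    each newly created object so a downstream ETL step could be
    triggered per object. Names and structure are hypothetical."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # S3 event notifications URL-encode object keys ('+' for spaces)
        key = urllib.parse.unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key:
            objects.append({"bucket": bucket, "key": key})
    # Return a Lambda-style response; a real pipeline would enqueue or
    # invoke the next processing stage here instead.
    return {"statusCode": 200, "body": json.dumps(objects)}
```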

Mindset & Skills:

  • Strong focus on system design and code quality
  • End-to-end ownership, from architecture to monitoring
  • Automation-oriented, with a preference for simplicity
  • Excellent communication and mentoring abilities

Nice to Have:

  • AWS certifications (e.g., Solutions Architect, Data Engineer).
  • Experience with ML pipelines or AI-enhanced analytics.
  • Familiarity with data governance, data mesh, or self-service platforms.
  • Domain experience in cybersecurity, law enforcement, or regulated industries.

✅ Competitive compensation & full benefits.

✅ Work on mission-critical systems that protect lives.

✅ Continuous learning & real career growth opportunities.