Key Responsibilities:
Design and implement data ingestion, transformation, and cleansing pipelines using PySpark, SQL, and Python/Java.
Work on structured and unstructured datasets stored in HDFS, Hive, Parquet, or cloud-based storage.
Optimize existing data workflows and jobs for performance, scalability, and reliability.
Support batch and streaming data processing across Big Data platforms (e.g., Hadoop, Spark, Hive, Kafka).
Integrate and process data from multiple sources including APIs, flat files, relational databases, and cloud-native services.
Apply data modeling, partitioning, and file format best practices for efficient storage and querying.
Implement monitoring, logging, and alerting for production pipelines and participate in on-call rotation if required.
Document pipeline logic, data lineage, and schema changes to ensure data transparency and auditability.
Collaborate with data analysts, data scientists, and product owners to translate business needs into scalable data solutions.
Assist in proof-of-concept efforts for new technologies and data integration strategies.
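The ingest → cleanse → transform flow described above can be sketched in miniature. This is a framework-free illustration only — in the role itself PySpark DataFrames would replace the pure-Python pieces, and all field names here (id, name, amount) are invented for the example:

```python
import csv
import io

def ingest(raw: str):
    """Parse raw CSV text into dict rows (stand-in for reading a real source)."""
    return list(csv.DictReader(io.StringIO(raw)))

def cleanse(rows):
    """Drop rows missing an id and normalise whitespace in the name field."""
    return [{**r, "name": r["name"].strip()} for r in rows if r.get("id")]

def transform(rows):
    """Cast amount to float and derive a flag column."""
    out = []
    for r in rows:
        amount = float(r["amount"])
        out.append({"id": r["id"], "name": r["name"],
                    "amount": amount, "is_large": amount > 100})
    return out

raw = "id,name,amount\n1,  alice ,50\n,bob,70\n2,carol,150\n"
result = transform(cleanse(ingest(raw)))
```

The same three-stage shape (read, filter/clean, derive) carries over directly to a Spark job, where each stage becomes a DataFrame operation.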
Technical Skills Required:
2–5 years of experience in a data engineering, ETL development, or big data role.
Strong programming experience in Python (or Java) for data manipulation and automation.
Advanced proficiency in SQL (window functions, joins, CTEs, optimization techniques).
Experience working with Apache Spark (PySpark) in a distributed environment.
Hands-on experience with Hadoop ecosystem tools (Hive, HDFS, Oozie, etc.).
Familiarity with Git, Jenkins, Airflow, or other CI/CD and orchestration tools.
Exposure to cloud platforms (AWS Glue/EMR, Azure Data Factory, GCP Dataflow) is a plus.
Knowledge of basic ML workflows (feature engineering, model inputs/outputs) is desirable but not mandatory.
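To illustrate the level of SQL proficiency listed above (CTEs plus window functions), here is a small self-contained example run through Python's built-in sqlite3 module; the sales table and its columns are invented for the sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('north', 100), ('north', 300), ('south', 200), ('south', 50);
""")

# A CTE aggregates per-region totals; a window function then ranks
# the regions by that total without a second GROUP BY.
query = """
WITH region_totals AS (
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
)
SELECT region, total,
       RANK() OVER (ORDER BY total DESC) AS rnk
FROM region_totals
ORDER BY rnk;
"""
rows = conn.execute(query).fetchall()
```

Note that SQLite supports window functions from version 3.25 onward, so this runs unchanged on any recent Python installation.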
Soft Skills & Communication:
Strong verbal and written communication skills; able to articulate technical concepts to business stakeholders.
Able to document processes, architecture diagrams, and data dictionaries with clarity.
Demonstrates strong interpersonal skills, working well with cross-functional teams in a collaborative Agile/DevOps environment.
Provides informal guidance or mentoring to junior developers and contributes to code reviews and technical discussions.
Proactive in identifying data quality issues, bottlenecks, and process gaps, with a problem-solving mindset.
Education:
Bachelor’s degree in Computer Science, Data Engineering, or a related discipline; or equivalent experience required.