Essential Responsibilities:
Lead the development, optimization, and maintenance of ETL pipelines to handle large volumes of data from multiple sources for advanced machine learning models
Build and optimize distributed data processing systems using big data frameworks and technologies
Maintain and improve existing data infrastructure, ensuring high availability and fault tolerance
Collaborate with engineers from other sites, data scientists, and business stakeholders to understand data requirements and deliver appropriate solutions
Minimum Qualifications:
Strong proficiency in Python, Java, or Scala
Extensive experience with Apache Spark (Spark SQL, Spark Streaming, PySpark)
Hands-on experience with the Hadoop ecosystem (HDFS, YARN, Hive, HBase)
Experience with cloud-based data platforms (Google BigQuery)
Experience with relational databases (e.g., PostgreSQL, MySQL) and/or NoSQL databases (e.g., MongoDB)
Experience with time-series databases (InfluxDB)
Experience with version control systems (Git) and CI/CD practices
Familiar with Linux environments; able to perform troubleshooting and write automation scripts (Shell/Python)
Knowledge of RESTful API development and HTTP client libraries
Preferred Qualifications:
Experience in performance tuning of big data solutions
Experience in building GenAI-based solutions
Strong problem-solving skills and attention to detail
Experience working in agile development environments
Excellent communication and collaboration skills