Master's degree or foreign equivalent in Computer Science, IT, Data Science, Computer Engineering, or a related field and 4 years of experience in the job offered or a related occupation.
4 years of experience with each of the following skills is required:
Applying data principles, data architecture, and data modeling, with strong SQL, HiveQL, or Pig Latin.
Large-scale dataset development for multiple use cases, including analytics, machine learning, or search.
3 years of experience with each of the following skills is required:
Experience in large-scale data processing using Java, Scala, or Python.
Experience with open-source data storage formats, including Apache Hive, Apache Parquet, or Apache Iceberg.
Experience with large-scale data analysis and data quality using tools including pandas, Jupyter Notebook, or Superset.
Experience building distributed data processing applications using Apache Spark or Apache Pig.
Experience with data transformation and enrichment for large-scale data using Apache Spark or Apache Flink.
Experience with orchestration engines, including Apache Oozie or Apache Airflow, for large-scale data processing.
2 years of experience with each of the following skills is required:
Experience developing features for machine learning models using tools including scikit-learn, TensorFlow, or PyTorch.
Experience in machine learning or data mining for data curation and improving corpus quality.