About the job
Become an AI software engineer with a data engineering focus, on a team that will play a critical role in infusing Red Hat products with generative AI. In this combined role, you will have two focus areas:
Collaborate with other software engineers to design and develop our generative AI platform; and
Collaborate with data scientists to build and maintain high-leverage data sets containing Red Hat’s unique experience and expertise, to be used for fine-tuning large language models and creating high-quality RAG databases. Your data sets will influence, and be influenced by, Red Hat infrastructure built from the ground up to support the fine-tuning process.
You’ll not only build the data sets; you will also participate in the design, deployment, configuration, and optimization of the project’s code and infrastructure. The ideal candidate will therefore be excited about the data, the science, and the code.
What you will do
Participate in architectural design and technology choices, setting and meeting a high bar for quality that will drive adoption by a very diverse group of internal customers.
Participate in developing features, fixing bugs, mitigating security threats, reviewing code, and writing automated tests for the project.
Design the systems, integrations, and processes required to achieve the best fine-tuning results, including selection and integration of data sources, data pre-processing, and subsequent quality evaluation.
Design, build, and maintain scalable data pipelines for extracting, transforming, and loading (ETL) data from internal Red Hat systems into the LLM training process (see the pipeline sketch after this list).
Develop and optimize databases to ensure efficient data storage and retrieval.
Design and develop data warehousing solutions to support large scale data storage.
Use Python for data manipulation, automation, and analysis, and ensure high-quality data is used as input for model fine-tuning and RAG building.
Contribute to the entire stack, from active participation in the fine-tuning process to the implementation and ongoing optimization of the designed systems.
Collaborate with other team members (data scientists, software engineers) as well as other teams to deliver a best-in-class solution and maintain it.
Work in a fast-paced, agile, globally distributed environment of talented engineers.
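To make the ETL and data-quality bullets above concrete, here is a minimal sketch of one pipeline step, assuming pandas with pyarrow available for Parquet output; the paths, file layout, and field names (doc_id, source, text) are purely illustrative placeholders, not taken from any real Red Hat system:

    import json
    from pathlib import Path

    import pandas as pd  # pandas + pyarrow assumed for Parquet output

    # Illustrative locations; a real pipeline would read from internal systems.
    RAW_DIR = Path("raw_exports")
    OUTPUT_PARQUET = Path("curated/knowledge_base.parquet")

    def extract(raw_dir: Path) -> list[dict]:
        """Load raw JSON exports (one document per file)."""
        return [json.loads(p.read_text()) for p in raw_dir.glob("*.json")]

    def transform(records: list[dict]) -> pd.DataFrame:
        """Normalize fields and drop low-quality rows before fine-tuning / RAG ingestion."""
        df = pd.DataFrame(records)
        df["text"] = df["text"].str.strip()
        df = df.dropna(subset=["text"])
        df = df[df["text"].str.len() > 100]       # filter near-empty documents
        df = df.drop_duplicates(subset=["text"])  # exact-duplicate removal
        return df[["doc_id", "source", "text"]]   # hypothetical schema

    def load(df: pd.DataFrame, target: Path) -> None:
        """Persist the curated data set as Parquet for downstream consumers."""
        target.parent.mkdir(parents=True, exist_ok=True)
        df.to_parquet(target, index=False)

    if __name__ == "__main__":
        load(transform(extract(RAW_DIR)), OUTPUT_PARQUET)

In practice the transform step is where most of the quality work described above would live (deduplication, filtering, and per-source cleanup), with the Parquet output feeding both fine-tuning and RAG indexing.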
What you will bring
Bachelor's degree in computer science, or equivalent related work experience.
Experience in data engineering, preferably in AI/ML contexts.
Experience with Python software development.
Strong self-motivation, problem solving and organizational skills.
Collaborative attitude and willingness to share ideas openly.
Excellent English written and verbal communication skills.
Ability to quickly learn and use new tools and technologies.
The following is considered a plus
Experience with AI and machine learning platforms, tools, and frameworks, such as TensorFlow, PyTorch, LLaMA.cpp, and Kubeflow.
Familiarity with LLM sampling parameters such as temperature, top-k, and repeat penalty, and with metrics and methodologies for evaluating LLM outputs (see the sketch after this list).
Understanding of LLM architectures, training processes, and data requirements
Experience with various vector store technologies and their applications in AI
Experience with Cloud Native Technologies and Platforms (e.g. Kubernetes)
Understanding of data lakehouse concepts and architectures
Experience with agile development, CI/CD systems and DevOps methodology
Experience with big data storage formats and technologies, such as Parquet, Avro, and S3.
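As an illustration of the sampling parameters mentioned above (temperature, top-k, repeat penalty), here is a minimal sketch using the llama-cpp-python bindings for LLaMA.cpp; the model path and prompt are placeholders, and the specific values are examples only, not recommendations:

    from llama_cpp import Llama  # Python bindings for llama.cpp (assumed installed)

    # Placeholder model path; any local GGUF checkpoint would do.
    llm = Llama(model_path="models/example.gguf")

    # temperature controls randomness, top_k restricts sampling to the k most
    # likely tokens, and repeat_penalty discourages repeating recent tokens.
    result = llm(
        "Summarize what a RAG database is in one sentence.",
        max_tokens=64,
        temperature=0.7,
        top_k=40,
        repeat_penalty=1.1,
    )
    print(result["choices"][0]["text"])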