You will contribute to building a system to do this for Capital One models, accelerating the move from fully trained models to deployable model artifacts that are ready to fuel business decisioning. You will also help build an observability platform to monitor the models and platform components.
● Architect and develop full stack solutions for monitoring, logging, and managing Generative AI and machine learning workflows and models.
● Architect, build and deploy well-managed core APIs and SDKs for observability of LLMs and proprietary Foundation Models including training, pre-training, fine-tuning and prompting.
● Work with model and platform teams to build systems that ingest large amounts of model and feature metadata and runtime metrics to build an observability platform and to make governance decisions to ensure ethical use, data integrity, and compliance with industry standards for Gen-AI.
● Partner with product and design teams to develop and integrate advanced observability tools tailored to Gen-AI.
● Leverage cloud-based architectures and technologies to deliver solutions for platform users providing deep insights into model performance, data flow, and system health.
● Collaborate as part of a cross-functional Agile team with data scientists, ML engineers, and other stakeholders to understand requirements and translate them into scalable and maintainable solutions.
● Use programming languages like Python, Scala, or Java
● Leverage continuous integration and continuous deployment best practices, including test automation and monitoring, to ensure successful deployments of machine learning models and application code.
● Master's Degree in Computer Science or a related field
● 12+ years of experience in software engineering and solution architecture
● At least 8 years of experience designing and building data intensive solutions using distributed computing
● At least 8 years of experience programming with Python, Go, or Java
● Proficiency in observability tools such as Prometheus, Grafana, ELK Stack, or similar, with a focus on adapting them for Gen AI systems.
● Excellent knowledge of OpenTelemetry and prior experience in building SDKs and APIs.
● Excellent communication skills, capable of articulating complex technical concepts to diverse audiences and driving cross-functional initiatives.
● Experience developing and deploying ML platform solutions in a public cloud such as AWS, Azure, or Google Cloud Platform.
If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation, please contact Capital One Recruiting at 1-800-304-9102 or via email at . All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodations.