About the role:
If this kind of working environment sounds exciting to you, and if you understand that Engineering is about building the most effective and elegant solution within a given set of constraints, consider applying for this position. But hold on, you’d best check the position requirements first :)
What you'll be doing:
- Designing, building and maintaining the ML infrastructure that allows Forter’s models to make billions of real-time decisions every year.
- Handling distributed data processing pipelines to support model development.
- Acting as a consultant to researchers, data scientists and expert analysts, enabling them to research new models faster and with greater precision by providing cutting-edge tooling.
- Expanding our ML infrastructure so that it is scalable, quick and efficient to bring diverse models to production and to monitor their performance and drift over time.
- Expanding the pool of internal customers able to use ML at Forter, working with them to understand their needs and helping them make the most of the infrastructure we provide.
- Acting as an advocate for MLOps, continually improving our processes and raising our standards.
What you'll need:
- 2+ years of experience with large-scale data processing, ideally with Apache Spark.
- 5+ years of experience developing complex software projects (Python / Ruby / Go / etc.)
- Motivation to understand the needs of internal users, provide them with great tooling and teach them how to use it.
- Experience working with public clouds (AWS / GCP / Azure)
- Fluency in written and spoken English
Projects you'll work on:
We have a ton of important work to do, which is why we’re hiring! Our projects change all the time, of course, but here are a few that we’ve either done in the past or are planning for the near future, to give you an idea of the types of work we do:
- Build data engineering pipelines to support ML projects and their complex data requirements.
- Develop reusable infrastructure and methodology that lets us bring new models to production faster, without reinventing the wheel for each new business use case.
- Design and deliver our Data Scientists’ research environment, for instance by providing experiment tracking, distributed hyperparameter search and great EDA tooling.
- Find solutions for effectively monitoring our models’ performance and context drift. Fraud prevention presents unique challenges here; most ‘ground truth’ labels arrive months after the prediction, and for transactions we decline they never arrive at all.
- Provide tools to quickly assess the impact of new features before bringing them to production.
- Make it trivial for our analysts to retrain models and get the newly trained models into production.
It'd be really cool if you also:
- Have familiarity with machine learning concepts and frameworks.
- Are familiar with Databricks or Airflow.
- Are comfortable in a containerized environment.