Understand SQL at an advanced level and are able to review SQL statements from other teams
Have experience with distributed computing frameworks, such as Hadoop or Spark
Have experience with data pipelines and workflow management tools (for example, Airbyte)
Are able to write easy-to-maintain code and documentation in at least one language. (We’re a Go shop.)
Know how to leverage CI/CD systems to deploy and make changes to an existing data platform infrastructure
Are able to use infrastructure-as-code tools to create self-healing systems. (We use Terraform)
Understand the problem domain of cloud-native infrastructure for data services, and keep your knowledge up to date as the industry evolves
Can lead small-to-medium-sized projects independently, while keeping your Lead (Manager) and various stakeholders up to date
Can own the infrastructure of our data platform, and use your knowledge to keep it stable, performant, durable, and secure
Are always willing to jump in and use your knowledge and skills to resolve issues, hopefully before our users even notice
What would make you stand out:
Love the challenge of dealing with large amounts of fast-moving data
In addition to advanced SQL knowledge, you’re familiar with MQL (MongoDB Query Language)
You know several programming languages, including Go, Java and Python
Have an open mind and a willingness to adapt infrastructure designs over time as our needs change and grow
Are able to work with container orchestration systems, such as Kubernetes, even if you aren’t (yet!) an expert in it
Have an understanding of telemetry data. We use OpenTelemetry and like to make things as visible as we can
You are well-versed in data reliability and integrity best practices
You maintain a working knowledge of the data-focused offerings from our cloud providers, and are able to make recommendations as new technologies emerge
You have a deep knowledge of Linux and TCP/IP networking
You can help our team win the game of “Apache Project or Fictional Dinosaur-Like Cartoon Character?”
Other things you might want to know:
We’re a distributed team. Our Platform team is located mostly in the EDT and PDT time zones, but we work with other teams all over the world
Our team is remote-first. We use tools like Slack and Zoom to work together. We try to get together on occasion, but our day-to-day is all remote. (If you live close to one of our offices and would like to use it, that’s okay, too!)
We generate a lot of data every day. Our internal CI system runs millions of tests daily
One of the challenges we’re facing is figuring out how to take those millions of daily test results and determine which tests are the most beneficial to run
Our customers are all internal to MongoDB. We’re in the nice position where we can easily talk to our users
You’d have a chance to join our team at the very early stages of our data platform. We have a big dream and mission, and you’d get to help us design and implement it along the way
You’d get a chance to look at some very large MongoDB instances