Expoint - all jobs in one place

Tesla: Staff Software Engineer, Observability Data Engineering
United States, Texas, Austin 
306185638

10.04.2025
What to Expect

In this role, you will focus on building components and infrastructure for a Tesla-scale metrics collection and querying service. You will build large-scale data ingestion, storage, and retrieval systems. These systems will implement monitoring and alerting for mission-critical infrastructure across Tesla’s global network, and they provide the observability essential to maintaining, scaling, and troubleshooting Tesla’s complex internal systems. You will work in a fast-paced team with a group of world-class engineers and data scientists to help build and deliver a performant, resilient, and reliable service. You will be part of the team that is leading and driving core technology and systems innovation in this area, creating a positive impact on the business and on the observability industry as a whole. You will have end-to-end responsibilities across the entire engineering domain.

What You’ll Do
  • Design, implement and test new features in multiple projects and languages
  • Maintain existing code in these projects, including bug fixes, performance improvements and refactoring
  • Deliver code that meets the team's testability, readability, performance, observability and other quality standards
  • Debug and fix issues in production and development environments, including, but not limited to, performance problems, logic bugs, data issues and operational problems such as networking or database issues
  • Contribute to infrastructure improvements, automate busywork, simplify deployments and take advantage of Tesla's existing infrastructure
  • Interact with internal customers to onboard them, clarify requirements, collect feedback and support field issues
  • Provide mentorship and feedback to teammates, both junior and senior, in the form of code reviews, design reviews, brainstorming sessions, etc.
  • Participate in an informal on-call rotation and respond to incidents in a timely manner, resolving issues to minimize downtime and impact on users
What You’ll Bring
  • 10+ years of industry experience in software development, with a minimum of 8 years in Internet-scale distributed systems or data platform system software engineering
  • Proficiency in at least two general-purpose programming languages such as Python, R, C/C++, Rust, Java, or Go. Expertise in Go is an advantage
  • Familiarity with Prometheus, PromQL, Grafana and OpenTelemetry is an advantage
  • Expert knowledge of distributed OLAP databases, SQL query design, optimization, and data analytics
  • A self-driven individual contributor and an excellent team player with a startup mentality is preferred
  • Superb communication skills, with the ability to influence at all levels of the organization, are essential to success