Expoint - all jobs in one place

Walmart Distinguished Software Engineer 
United States, California, Sunnyvale 
976150906

08.05.2024

What you'll do...

Job Summary:

We are looking for an experienced Distinguished Engineer, AI Systems, to help us build the foundation of our Generative AI platform and its ecosystem of services. You will work on a wide range of initiatives, whether that's designing robust, secure infrastructure, building large-scale distributed training clusters, deploying LLMs on GPU instances for real-time use cases, or supporting cutting-edge AI research and development, all in our public cloud infrastructure. You will work with a team of AI engineers and researchers to envision the target state of our platform while helping to design and implement key services.

What you’ll do:

  • Design and build fault-tolerant infrastructure to support long-running, large-scale training tasks that are resilient to the failure of individual nodes, using containers and checkpointing libraries.
  • Design and build infrastructure for serving large ML models in our public cloud.
  • Produce roadmaps across the technical scope, including consulting on objectives and key results across teams, reviewing designs, participating in and resolving technical discussions, and driving engineering investments for Core ML.
  • Design and implement benchmarks to measure the performance of software systems within a Generative AI Platform and make recommendations on technology selection.
  • Develop applications that leverage LLMs and foundation models (FMs), such as conversational AI.
  • Design and implement platform capabilities to support MLOps for foundation models.
  • Algorithm optimization: Optimize AI algorithms and models for improved performance, efficiency, accuracy, and scalability. Apply techniques such as hyperparameter tuning, feature engineering, and model selection to enhance the quality and reliability of AI systems.
  • Collaborate with cross-functional teams: Work closely with researchers, data scientists, and other stakeholders to understand their needs, gather requirements, and develop AI-driven solutions. Collaborate with domain experts to ensure that AI models and systems align with regulatory and industry standards.
  • Documentation and reporting: Prepare technical documentation, reports, and presentations to effectively communicate AI methodologies, results, and recommendations to both technical and non-technical audiences. Maintain clear and concise documentation of AI models, code, and processes.
  • Stay updated with the latest advancements: Follow new developments in AI, machine learning, and industry trends. Stay informed about regulatory guidelines, data privacy, and ethical considerations related to AI applications in the retail domain.


What you’ll bring:

  • Bachelor's degree in Computer Science, Computer Engineering or a technical field
  • At least 9 years of experience designing and building distributed computing/HPC and large-scale ML systems
  • At least 6 years of experience developing AI/ML algorithms in Python or C/C++
  • At least 3 years of experience with the full ML development lifecycle using open-source AI/ML frameworks and the public cloud


Preferred Qualifications:

  • Master's degree or PhD in Engineering, Computer Science, a related technical field, or equivalent practical experience with a focus on modern AI techniques.
  • Experience designing large-scale distributed platforms and/or systems in cloud environments such as Azure or GCP.
  • Experience architecting cloud systems for security, availability, performance, scalability, and cost.
  • Experience with delivering very large models through the MLOps life cycle from exploration to serving.
  • Experience with building GPU clusters in the public cloud with tightly-coupled storage and networking.
  • Experience with the complete stack for distributed training of large models, including ML compilers, distributed training frameworks, and ML development frameworks such as PyTorch, TensorFlow, and Lightning.
  • Experience with the GenAI technology stack, including frameworks for prompt engineering, guardrails for GenAI applications, and LLM fine-tuning.
  • Experience working with vector databases and other data infrastructure required to efficiently support Generative AI training pipelines and production applications.
  • Experience training and maintaining large language models.
  • Research publications in top peer-reviewed conferences, or industry-recognized open-source contributions, in the areas of neural networks, distributed training, and SysML.





Benefits: Beyond our great compensation package, you can receive incentive awards for your performance. Other great perks include 401(k) match, stock purchase plan, paid maternity and parental leave, PTO, multiple health plans, and much more.

The above information has been designed to indicate the general nature and level of work performed in the role. It is not designed to contain or be interpreted as a comprehensive inventory of all responsibilities and qualifications required of employees assigned to this job. The full Job Description can be made available as part of the hiring process.

You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable.

For information about PTO, see

Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.

For information about benefits and eligibility, see

Sunnyvale, California US-04396: The annual salary range for this position is $169,000.00-$338,000.00
Bentonville, Arkansas US-10735: The annual salary range for this position is $130,000.00-$260,000.00

Additional compensation includes annual or quarterly performance bonuses.
Additional compensation for certain positions may also include:

  • Stock

Minimum Qualifications...

Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.

Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 6 years’ experience in software engineering or related area.
Option 2: 8 years’ experience in software engineering or related area.

Preferred Qualifications...

Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.

840 W CALIFORNIA AVE, SUNNYVALE, CA 94086-4828, United States of America