You will play a key role in designing, implementing, and optimizing ML solutions for highly constrained compute environments. This cross-disciplinary role blends expertise in embedded systems, computer architecture, and machine learning to unlock new applications in areas such as IoT, wearables, robotics, and autonomous systems.

RESPONSIBILITIES:
- Design and implement embedded ML pipelines on microcontrollers and custom SoCs with tight compute, memory, and power constraints.
- Optimize and quantize deep learning models for real-time inference on edge platforms.
- Develop and maintain low-level firmware in C/C++ to integrate ML models with custom hardware accelerators and sensors.
- Conduct performance benchmarking, memory profiling, and bottleneck analysis across embedded platforms.
- Collaborate closely with ML researchers, hardware architects, and product engineers to co-design efficient ML solutions from model training to deployment.
- Evaluate new edge ML techniques, compilers, and libraries (e.g., TVM, TFLite Micro, CMSIS-NN) and toolchains to advance the team's capabilities.
- Contribute to the overall system architecture with a deep understanding of embedded compute, memory hierarchies, and data flow optimization.