Job Description

Your mission will be to translate cutting-edge research into production-ready solutions, focusing on model compression, system optimizations, and agentic capabilities such as function calling and tool orchestration. Experience designing secure and reliable agentic workflows, including guardrails and safe tool invocation, is a strong plus.

What You’ll Do

Optimize LLMs and multimodal models for on-device deployment

Investigate, develop, and apply advanced quantization (8-bit, 4-bit, mixed-precision), pruning, and distillation techniques to derive optimized models for NXP NPU targets.

Accelerate inference performance

Investigate, develop, and implement system optimizations, such as speculative decoding and other efficient decoding algorithms, tailored for edge environments.

Engineer agentic AI capabilities for tiny agents

Investigate methodologies for enhancing the performance of small language models to...

Ready to Apply?

Take the next step in your AI career. Submit your application to microTECH Global LTD today.
