Job Description

About the Role

miniByte is hiring Deep Learning Model Optimization Engineers to build, train, and optimize state-of-the-art deep learning models for high-performance production deployment. This role sits at the intersection of research and systems engineering, with a strong focus on inference efficiency across GPUs and edge devices.

Key Responsibilities

  • Design and implement deep learning models (CNNs, Transformers, hybrid architectures).
  • Build scalable training pipelines and distributed training workflows.
  • Apply model compression techniques: quantization, pruning, and knowledge distillation.
  • Optimize inference using TensorRT, ONNX Runtime, OpenVINO, or TVM.
  • Profile and analyze performance bottlenecks using GPU profiling tools.
  • Develop custom CUDA/C++ kernels when required.
  • Benchmark latency, throughput, and accuracy across hardware platforms.
  • Collaborate on model deployment using Triton Inference Server.
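To give candidates a flavor of the compression work listed above, here is a minimal sketch of symmetric per-tensor int8 quantization in pure Python (no framework assumed; function names are illustrative, not part of any specific library):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization.

    Maps a list of floats onto integers in [-127, 127] using a single
    scale factor derived from the largest absolute value in the tensor.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes and the scale."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    recovered = dequantize_int8(q, scale)
    # Round-trip error is bounded by half the quantization step (scale / 2).
    print(q, scale, recovered)
```

In production this idea is applied per channel rather than per tensor, and tools such as TensorRT or ONNX Runtime handle calibration and kernel selection; the sketch only shows the underlying mapping.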

Ready to Apply?

Take the next step in your AI career. Submit your application to miniByte today.

Submit Application