Job Description
45-day contract (full-time), potential long-term role
About the Role
We are building a real-time AI system using open-source LLMs. Your job is to install and optimize the backend deep learning infrastructure. You will NOT work on business logic, only the engine.
Responsibilities
- Install, configure, and optimize DeepSeek R1 / V3 models
- Deploy a vLLM or LM Studio inference server
- Build a FastAPI backend to expose custom LLM APIs
- GPU optimization & quantization (AWQ, GPTQ, FP8)
- Manage model weights, tokenizers, streaming endpoints
- Implement secure API key access
- Work closely with a system architect (CTO-level guidance provided)
Job Specification
Qualifications
- Strong Python + FastAPI skills
- Experience with vLLM / TGI / Ollama / LM Studio
- Deep learning fundamentals (PyTorch)
- Knowledge of GPU e...
Ready to Apply?
Take the next step in your AI career. Submit your application to Kasipan today.