Fullstack AI Engineer (Onsite, Lahore, PKR Salary)

Pakistan · Lahore · mid

Machine Learning Engineer · Data

Requirements

4 years of experience as a fullstack or backend engineer
Strong proficiency in Python and JavaScript/TypeScript
Experience with FastAPI / Django / Node.js and React / Next.js
Solid understanding of distributed systems and async architectures
Hands-on experience deploying LLMs such as GPT-4/4.1, Claude, LLaMA, Mistral, Mixtral
Experience serving models using vLLM, Triton, TGI, or similar frameworks
Strong understanding of transformer models and inference trade-offs
Experience with embeddings, vector search, and RAG architectures
Experience with AWS, GCP, or Azure (GPU workloads preferred)
Strong Docker and Kubernetes experience
Familiarity with CI/CD pipelines for ML systems
Experience with observability tools (Prometheus, Grafana, OpenTelemetry)
Experience with multimodal AI (audio, video, image models)
Experience optimizing LLM inference costs at scale
Startup or high-growth environment experience
Prior work on AI-first or AI-native products

Responsibilities

Deploy and optimize LLMs (open-source and commercial) for production use
Implement inference optimization techniques (quantization, batching, caching, distillation)
Build and maintain RAG pipelines (embeddings, vector databases, retrieval strategies)
Evaluate and improve model quality (latency, accuracy, hallucination reduction, cost)
Implement prompt management, versioning, and A/B testing
Design and develop scalable APIs for AI-driven features
Deploy and manage model-serving infrastructure (Docker, Kubernetes, GPUs)
Optimize hardware utilization for inference workloads
Implement monitoring, logging, and alerting for AI services
Ensure security, data privacy, and compliance across AI pipelines
Build internal tools and user-facing interfaces for AI workflows
Integrate LLM services into web and mobile applications
Work closely onsite with product managers, designers, and data teams
Rapidly prototype, test, and iterate on AI-powered features