Onit
Senior AI Engineer

Pune · Full-time · Senior
Data Science · Machine Learning Engineer · AI Engineer · Data · Data & AI

Overview

We’re seeking a Senior AI Engineer to design and ship production-grade agentic AI systems that automate complex workflows end-to-end. This is a hands-on, senior role with significant technical ownership. You’ll work closely with the Chief Architect, product, engineering, and domain experts to translate ambiguous, high-impact problems into reliable AI-driven user experiences.
What success looks like:
Ship AI capabilities that measurably improve user outcomes (quality, time saved, throughput)
Build systems that are reliable by design: evals, observability, safety, and cost/latency controls from day one
Iterate quickly using a tight loop of instrument → evaluate → improve → deploy 
Agentic AI Feature & Workflow Development
  • Build and integrate AI-driven features using LLM APIs (OpenAI / Azure OpenAI, Anthropic, Gemini on Vertex AI)
  • Design and implement tool-using agents (structured function calling, schema validation, retries, fallbacks)
  • Build multi-agent workflows when appropriate (e.g., planner/worker, reviewer/critic, specialist routing) and know when a simpler architecture is better
  • Create agentic workflows such as document understanding, extraction, reasoning over evidence, task automation, and multi-step decision support
  • Own context engineering end-to-end:
  • dynamic context assembly (retrieval + state + tool outputs)
  • context budgeting and compression/summarization
  • grounding strategies to reduce hallucinations and improve consistency
  • Implement retrieval-augmented generation (RAG) and search workflows using off-the-shelf vector stores and embedding services 
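To make the responsibilities above concrete, here is a minimal Python sketch of a tool-using agent loop with schema validation, retries, and a bounded fallback. It is illustrative only, not Onit's actual stack: the LLM call is stubbed out (in production it would be an OpenAI/Anthropic/Gemini function-calling request), and the `lookup_invoice` tool and its schema are made-up examples.

```python
import json

# Illustrative tool registry: each tool declares its required argument
# names (a minimal stand-in for full JSON-schema validation).
TOOLS = {
    "lookup_invoice": {
        "required": {"invoice_id"},
        "fn": lambda args: {"invoice_id": args["invoice_id"], "status": "paid"},
    }
}

def validate(tool_name, args):
    """Reject calls to unknown tools or calls missing required arguments."""
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    missing = TOOLS[tool_name]["required"] - args.keys()
    if missing:
        raise ValueError(f"missing args: {missing}")

def run_agent(model_call, user_msg, max_steps=3):
    """Agent loop: ask the model, execute validated tool calls, feed
    validation errors back as retries, and fall back safely if the
    step budget runs out."""
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        reply = model_call(history)  # stubbed LLM call
        if reply["type"] == "final":
            return reply["content"]
        try:
            validate(reply["tool"], reply["args"])
            result = TOOLS[reply["tool"]]["fn"](reply["args"])
            history.append({"role": "tool", "content": json.dumps(result)})
        except ValueError as err:
            # Surface the error so the model can retry with corrected args.
            history.append({"role": "system", "content": f"tool error: {err}"})
    return "Sorry, I couldn't complete that request."  # bounded fallback

# Scripted stand-in for the model: one tool call, then a final answer.
def fake_model(history):
    if any(m["role"] == "tool" for m in history):
        return {"type": "final", "content": "Invoice INV-7 is paid."}
    return {"type": "tool", "tool": "lookup_invoice", "args": {"invoice_id": "INV-7"}}

print(run_agent(fake_model, "Is invoice INV-7 paid?"))
```

The same loop shape extends to planner/worker or reviewer/critic multi-agent setups; the fallback branch is what keeps behavior predictable when the model exhausts its step budget.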
Evaluation, Quality & Iteration (Core)
  • Establish evaluation frameworks for accuracy, reliability, and output quality
  • Build task-specific eval suites: golden datasets, adversarial cases, regression tests, and rubric-based scoring
  • Set up automated evaluation pipelines and release gates (CI/CD-friendly) tied to prompt/model/version changes
  • Define and monitor online metrics (e.g., task success rate, human override rate, safety flags, latency, cost) and run experiments/A-B tests where appropriate
  • Use LLM-as-judge responsibly: calibrate, validate, and pair with human labels when needed 
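As a sketch of the evaluation discipline described above, the snippet below wires a golden dataset into a pass-rate calculation and a CI-friendly release gate. Everything here is illustrative: the golden cases, the 95% threshold, and `system_under_test` (a stand-in for the real LLM pipeline) are all made up for the example.

```python
# Made-up golden dataset: exact-match cases for regression testing.
GOLDEN = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3 * 3", "expected": "9"},
]

def system_under_test(prompt):
    # Stand-in for the real LLM pipeline being evaluated.
    answers = {"2 + 2": "4", "capital of France": "Paris", "3 * 3": "9"}
    return answers.get(prompt, "I don't know")

def run_eval(cases, fn):
    """Score each golden case by exact match; return the pass rate."""
    passed = sum(fn(c["input"]) == c["expected"] for c in cases)
    return passed / len(cases)

def release_gate(pass_rate, threshold=0.95):
    """CI/CD gate: block a prompt/model/version change that regresses."""
    return pass_rate >= threshold

rate = run_eval(GOLDEN, system_under_test)
print(f"pass rate: {rate:.0%}, gate: {'PASS' if release_gate(rate) else 'FAIL'}")
```

In practice exact match would be replaced or supplemented by rubric-based or LLM-as-judge scoring, with the judge itself calibrated against human labels.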
Engineering, Integration & Observability
  • Develop scalable backend services and APIs that incorporate AI functionality
  • Integrate AI pipelines into existing cloud, microservices, and event-driven architectures
  • Implement observability and analytics for all AI features (tracing, evaluations, prompt versioning, cost tracking). Example tooling: Langfuse and/or OpenTelemetry-compatible stacks
  • Ensure reliability, uptime, performance, and security of AI services
  • Build internal tooling for evaluation, testing, prompt/version management, and safe deployment
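The observability bullets above can be hand-rolled as a thin tracing wrapper, sketched below. Tools like Langfuse or an OpenTelemetry stack provide this out of the box; this version is purely illustrative, the token price is a made-up number, and `fake_model` stands in for a real LLM call.

```python
import hashlib
import json
import time

PRICE_PER_1K_TOKENS = 0.002  # illustrative figure, not a real model price

def prompt_version(template):
    """Version prompts by content hash so every trace is attributable
    to an exact prompt revision."""
    return hashlib.sha256(template.encode()).hexdigest()[:8]

def traced_call(template, user_input, model_fn):
    """Wrap a model call, recording latency, token usage, and cost."""
    start = time.perf_counter()
    output, tokens = model_fn(template.format(q=user_input))
    span = {
        "prompt_version": prompt_version(template),
        "latency_ms": round((time.perf_counter() - start) * 1000, 2),
        "tokens": tokens,
        "cost_usd": round(tokens / 1000 * PRICE_PER_1K_TOKENS, 6),
        "output": output,
    }
    print(json.dumps(span))  # in production: export to a trace backend
    return output

# Stubbed model: echoes the prompt and reports a fixed token count.
fake_model = lambda prompt: (f"answer to: {prompt}", 42)
traced_call("Q: {q}\nA:", "what is RAG?", fake_model)
```

Emitting one structured span per call is what makes the cost, latency, and quality dashboards (and the online metrics above) possible without extra instrumentation later.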
Product & Collaboration
  • Partner with product managers, designers, the Chief Architect, and domain SMEs to shape AI-first solutions
  • Rapidly prototype concepts and iterate based on user feedback and measurable eval results
  • Translate business problems into well-structured AI workflows without requiring ML model training
  • Document system behavior, known failure modes, and operational playbooks 
Governance & Safety
  • Implement guardrails, checks, and fallback logic for safe and predictable AI behavior
  • Help define and follow compliance, privacy, and responsible AI guidelines
  • Design for safe tool execution (bounded actions, permissions, escalation paths, human-in-the-loop review)
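A minimal sketch of safe tool execution as listed above: an allowlist of permitted actions, per-action bounds, and escalation to human review for anything outside them. The action names and dollar limit are invented for illustration.

```python
# Illustrative allowlist: the agent may refund up to $100 on its own
# and send emails; everything else escalates to a human.
ALLOWED_ACTIONS = {
    "refund": {"max_amount": 100},
    "send_email": {},
}

def execute(action, params, human_review_queue):
    """Run an action if it is allowlisted and within bounds; otherwise
    queue it for human-in-the-loop review instead of failing silently."""
    if action not in ALLOWED_ACTIONS:
        human_review_queue.append((action, params))
        return "escalated: action not permitted for the agent"
    limits = ALLOWED_ACTIONS[action]
    if "max_amount" in limits and params.get("amount", 0) > limits["max_amount"]:
        human_review_queue.append((action, params))
        return "escalated: amount exceeds agent's limit"
    return f"executed {action}"  # real system: invoke the tool here

queue = []
print(execute("refund", {"amount": 50}, queue))   # within bounds
print(execute("refund", {"amount": 500}, queue))  # over limit, escalates
print(execute("delete_db", {}, queue))            # not allowlisted, escalates
```

The key design choice is that out-of-bounds requests escalate rather than error out, so the human review queue becomes the pressure valve for ambiguous or high-risk actions.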
Core Strengths (Required)
  • Strong software engineering background (Python preferred) and experience shipping backend services
  • Deep hands-on experience building agentic LLM systems from first principles: agent loops, tool interfaces, planning/replanning, memory/state, and failure handling
  • Strong context engineering ability: retrieval strategies, routing, grounding, context budgeting, and long-context tradeoffs
  • Strong evaluation discipline: golden datasets, regression gating, automated eval pipelines, and online monitoring
  • Practical experience with LLM APIs (OpenAI/Azure OpenAI/Anthropic/Gemini) and AI orchestration frameworks
  • Excellent debugging, systems thinking, and problem decomposition skills
  • Comfortable operating in fast-paced, ambiguous environments with high ownership 
Signals We Value
  • You’ve shipped an LLM/agent system in production and can clearly explain:
  • the failure modes you discovered
  • the evals you built to catch regressions
  • how you improved cost/latency while increasing quality
  • how you monitored and iterated safely over time 
  • You keep up with industry developments (model releases, frameworks, best practices) and can translate them into pragmatic improvements
Nice to Have
  • Experience with cloud platforms (AWS and/or GCP), microservices, and event-driven systems
  • Experience with observability stacks (OpenTelemetry, Datadog, Honeycomb) and AI-specific tooling (e.g., Langfuse, Braintrust, HumanLoop, W&B Weave)
  • Experience with workflow orchestration for long-running jobs (Temporal, Celery, Airflow)
  • Experience building enterprise AI features (permissions, auditability, compliance constraints)
  • Experience with safety/policy layers (PII handling, prompt injection defenses, sandboxed tool execution) 
What We Offer
  • Build core AI capabilities that directly impact users and product strategy
  • Work on cutting-edge, real-world agentic systems—focused on applied engineering (no model training required)
  • High ownership, fast iteration cycles, and strong cross-functional collaboration
  • Competitive compensation and opportunities for rapid advancement 
First Milestone
  • Ship one production agent workflow end-to-end with:
  • tracing + observability
  • an offline eval suite with regression gates
  • cost/latency targets and monitoring
  • documented failure modes and fallback path 
Listing Details

    Posted: March 23, 2026
    First seen: March 27, 2026
    Last seen: April 24, 2026

About Onit

    Build better ways to work for your unique business.

    Employees: 750
    Founded: 2011
    Domain: onit.com
    Source: Lever