Caseware

AI Test Architect

Colombia | Bogotá | Remote | Full Time - Permanent | mid
Data Science | QA & Testing | Test Architect

Caseware is one of Canada's original Fintech companies. We have led the global audit and accounting software industry for over 30 years, with more than 500,000 users across 130 countries and software available in 16 languages. While you might not have heard of us (yet), over 36,000 accounting and audit professionals list Caseware as a skill on their LinkedIn profiles!

Why This Role Matters

As a leader in cloud-native SaaS, we are accelerating our shift to an AI-first future by embedding generative AI and autonomous agents across our platform to deliver smarter, faster user experiences. We are looking for a visionary AI Test Architect to build the next-generation "Quality Intelligence" platform: one that leverages generative AI for automated test creation, self-healing execution, predictive defect analytics, and rigorous validation of the AI features we build in-house for our global audience.
 
As our foundational AI Test Architect, you'll design scalable, ethical frameworks that ensure reliability, safety, and compliance while accelerating release velocity (targeting 30-50% faster cycles through AI-augmented testing). Your work will reduce risk in production AI agents, minimize hallucinations, bias, and security exposures, and empower the entire engineering organization to adopt AI-augmented quality practices that supplement our mature traditional frameworks. This high-impact role sits at the intersection of Platform Engineering, AI, and Quality, shaping how we build trustworthy intelligence at scale.
 
📍 Location: This is a fully remote position located in Colombia. 
 
You will be reporting to:
 
Contact:
Maira Russo - Senior Talent Acquisition Partner

Responsibilities

1. AI-Driven Quality Strategy & Architecture
  • Architect a comprehensive "Quality Intelligence" platform using generative AI to predict defect hotspots, intelligently optimize regression suites, auto-generate tests, and enable self-healing automation.
  • Define an enterprise-wide AI-first testing strategy, including non-deterministic evaluation paradigms, continuous monitoring for drift and hallucination, and integration across the full SDLC.
  • Establish governance for ethical AI testing, aligning with emerging standards.

2. LLM & Agent Evaluation Frameworks
  • Design and implement advanced benchmarks, red teaming protocols, and adversarial testing for internal AI agents and generative features, focusing on hallucination rates, bias/fairness, prompt injection, jailbreaks, and goal misalignment.
  • Build evaluation pipelines with statistical rigor (e.g., multi-trial runs, LLM-as-judge, human-in-the-loop) using tools like Langfuse, LangSmith, DeepEval, RAGAS, or Arize Phoenix for metrics such as faithfulness, context precision, and safety compliance.
  • Architect harnesses for agentic workflows, tool-calling, planning, multi-agent simulations, and post-deployment observability.

3. Infrastructure & Automation Architecture
  • Partner with DevOps to embed AI-based testing into GitHub-based CI/CD pipelines (e.g., AI-generated tests, predictive flakiness detection, automated gating with quality signals).
  • Lead the design of self-healing test frameworks (integrating AI plugins with Playwright/Cypress or similar) that adapt to UI/model changes with minimal maintenance.
  • Architect synthetic data generation, golden dataset maintenance, and AI-powered data masking solutions to enable privacy-compliant, high-fidelity testing at scale.

4. Cross-Functional Leadership & Evangelism
  • Collaborate with product, data science, ML engineering, and security teams to influence AI feature design with quality guardrails from day one.
  • Evangelize and mentor: upskill traditional QA engineers into AI-augmented testers through workshops, playbooks, and communities of practice.
  • Drive adoption of AI quality best practices organization-wide, including metrics dashboards for DORA + AI-specific indicators (e.g., hallucination rate, red team success rate, self-healing coverage).

5. Observability, Metrics & Continuous Evolution
  • Define and implement AI-specific quality telemetry (e.g., drift detection, faithfulness scoring, compliance incidents) integrated with tools like Langfuse.
  • Establish feedback loops for model iteration, A/B testing guardrails, and proactive risk mitigation in production.

Key Challenges
  • Building reliable evaluation for non-deterministic, agentic AI in a fast-moving SaaS landscape. 
  • Scaling self-healing and generative test automation without introducing new flakiness or security debt. 
  • Balancing innovation speed with rigorous red teaming and ethical safeguards for customer-facing AI. 

Success Measures
  • Launch the "Quality Intelligence" platform foundation with AI-augmented pipelines covering more than 70% of critical paths.
  • Establish red teaming and red-teaming-as-code processes that reduce high-severity AI risks by more than 40%.
  • Upskill more than 50% of QA/engineering teams on AI testing fundamentals and deliver measurable velocity and safety gains.
  • Accuracy baseline: Establish a baseline faithfulness score of 90%+ for all RAG-powered features.

Qualifications
  • 8+ years in Quality Engineering/Test Architecture within cloud-native SaaS environments, with 2+ years focused on AI/ML/LLM testing and validation. 
  • Deep expertise in AWS (serverless, microservices, IaC with Terraform/CloudFormation) and GitHub CI/CD ecosystems. 
  • Proficiency architecting LLM-based applications and testing frameworks (LangChain/LangGraph/LangSmith strongly preferred; equivalents acceptable). 
  • Mastery of modern automation (Playwright, Cypress) with hands-on experience integrating self-healing AI plugins or generative test tools. 
  • Strong programming skills in JavaScript/TypeScript and/or Python; solid understanding of foundational AI concepts (transformers, embeddings, RAG, evaluation trade-offs). 
  • Experience with LLM evaluation tools like Bedrock Evaluations, Prompt Management, Guardrails, DeepEval, RAGAS, Arize Phoenix, Langfuse. 
  • Experience with red teaming frameworks/tools (Cobalt Strike, Sliver, Nmap) and knowledge of adversarial testing methodologies is a bonus.
  • Proven leadership: Mentoring teams, defining standards, and driving cross-functional change in ambiguous, high-growth settings. 
  • Bachelor's/Master's in Computer Science, AI/ML, or equivalent; relevant certifications a strong plus. 
  • Strong English language communication and collaboration skills

Benefits
  • Indefinite-term employment contract (contrato a término indefinido) with all legal benefits
  • Prepaid medical plan
  • Life insurance and funeral assistance
  • Internet allowance
  • Home office stipend
  • Competitive compensation, above the market average
  • 100% remote work environment and an excellent work-life balance
  • Opportunity to work for a growing global SaaS leader
  • A culture that promotes independence, innovation, trust, and accountability
  • Open space to be creative, innovative, and strategize for the future
  • Mentorship by a highly experienced professional 
  • Training budget: we want you to grow
  • 5 Personal Time Off days per year
  • Sick leave top-up to 100% of salary, paid by the employer from day 3 to day 90
  • Recognition Award: additional paid time off in recognition of the corresponding year of service
  • Upgraded vacation starting at 5 years of service

Listing Details

Posted: January 30, 2026
First seen: March 26, 2026
Last seen: April 24, 2026

Caseware

Caseware International Inc. is a global leader in developing audit, assurance, financial reporting, and data analytics software for accounting firms, corporations, and governments. Founded in 1988, the Toronto-based company provides cloud-enabled solutions to over 500,000 users in 130 countries.

Employees: 750
Founded: 1988