Specializes in testing AI-powered products, covering model outputs, prompts, and intelligent features. Designs evaluation frameworks that measure the accuracy, reliability, and safety of AI systems.