AI Evaluation Specialist | LLM QA & Prompt Engineering | AI Data Quality & Model Reasoning Analysis
I review AI-generated responses for accuracy, logical reasoning, hallucinations, and real-world applicability. Using structured evaluation frameworks, I identify weaknesses in model outputs and provide detailed feedback to improve reliability and performance.
I stress-test AI systems with simulated real-world scenarios to surface reasoning errors, hallucinations, and edge-case failures. I document where the model breaks down and provide structured feedback to improve its reliability, safety, and performance.
AI Data & QA Specialist with experience in LLM evaluation, prompt engineering, hallucination detection, and rubric-based model assessment. I review AI outputs for accuracy, reasoning quality, compliance, and real-world applicability, and I’ve supported annotation and evaluation projects across text, image, and multimodal systems. My background includes side-by-side evaluation, scenario design, quality assurance, and evidence-based feedback to improve model reliability and performance.