
Evaluating AI Systems: Testing LLMs, RAG, and Agents Kindle Edition

$5.99 cheaper than the new price!!

Free shipping for purchases over $99 ( Details )
Free cash-on-delivery fees for purchases over $99
Please note that the sales price and tax displayed may differ between online and in-store. Also, the product may be out of stock in-store.
New  $9.99

Product details

Management number 220491396
Release Date 2026/05/03
List Price $4.00
Model Number 220491396

The definitive guide to testing AI systems that actually work.

Most AI systems ship without meaningful evaluation. Teams eyeball a few responses, declare the system "good enough," and push to production. Then quality degrades, hallucinations appear, and nobody knows why.

Evaluating AI Systems is a practical, technical guide to building evaluation frameworks for LLMs, RAG pipelines, and AI agents. Written by Alex Merced, Head of Developer Relations at Dremio and author of multiple technical books, it covers the full evaluation lifecycle from dataset generation to production monitoring.

What you will learn:
Understand why traditional software testing fails for AI and what to do instead
Build golden evaluation datasets that accurately measure system quality
Implement prompt testing with tools like DeepEval, RAGAS, and promptfoo
Design evaluation metrics for correctness, faithfulness, relevance, and safety
Detect and measure hallucinations with automated pipelines
Use LLM-as-judge patterns with bias mitigation and multi-model consensus
Build regression testing that catches quality degradation before users do
Deploy production monitoring with drift detection and quality alerting
Evaluate multi-step agent workflows with tool use accuracy metrics
Manage evaluation costs with tiered strategies from smoke tests to deep expert reviews

Written with verified specifications for GPT-5.4, Claude Sonnet 4.6, and Gemini 3.1 Pro throughout. Every technique is immediately applicable to production AI systems.

For AI engineers: Build evaluation pipelines that prevent quality incidents.
For QA engineers: Apply testing discipline to the most untestable systems you have ever worked with.
For engineering managers: Make informed quality decisions with data, not gut feeling.

XRay Not Enabled
Edition 1st
Language English
File size 10.3 MB
Page Flip Enabled
Publisher Alex Merced Books
Word Wise Not Enabled
Print length 372 pages
Accessibility Learn more
Screen Reader Supported
Publication date March 16, 2026
Enhanced typesetting Enabled
