AI Systems Need Meta-Knowledge to Avoid Wrong Answers

March 23, 2026 · 3 min read

When AI systems retrieve information to answer questions, they often produce fluent but incorrect responses, not because they fail to find relevant text, but because they cannot determine if that text applies to the specific situation at hand. This issue, termed the applicability problem, is particularly critical in mature organizations where policies vary by factors like region, product version, or user status. The research identifies that the core failure in retrieval-augmented generation (RAG) systems is structural, stemming from a lack of explicit knowledge about when and how different pieces of information should be used.

To address this, the paper proposes a meta-knowledge layer, which involves creating detailed descriptions for each knowledge base within an organization. These descriptions, called manifests, specify what each knowledge base is for, when it applies, what inputs are required, how queries should be formed, and where its boundaries lie. This approach moves beyond simple partitioning of data into separate indexes or namespaces, focusing instead on making scope rules and preconditions inspectable at runtime to prevent systems from blending incompatible information.
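One way to picture such a manifest is as a small, structured record the runtime can inspect before any retrieval happens. The sketch below is illustrative only; the field names (`purpose`, `applies_when`, `required_inputs`, `query_template`, `out_of_scope`) are assumptions mirroring the five roles the paper describes, not the paper's own schema:

```python
from dataclasses import dataclass, field

@dataclass
class Manifest:
    """Agent-readable contract describing one knowledge base (illustrative sketch)."""
    kb_name: str
    purpose: str                 # what the knowledge base is for
    applies_when: dict           # preconditions on the query context
    required_inputs: list        # context keys that must be known before retrieval
    query_template: str          # how queries to this KB should be phrased
    out_of_scope: list = field(default_factory=list)  # explicit boundaries

# Hypothetical manifest for the retail warranty knowledge base
retail_warranty = Manifest(
    kb_name="retail-warranty",
    purpose="Warranty terms for appliances bought through standard retail channels",
    applies_when={"purchase_channel": "retail"},
    required_inputs=["purchase_channel", "product_category"],
    query_template="warranty policy for {product_category}",
    out_of_scope=["utility-program purchases", "troubleshooting steps"],
)
```

Because the contract is plain data rather than prose buried in a prompt, the runtime can inspect it, log which conditions held, and explain why a knowledge base was or was not consulted.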

The methodology centers on the concept of a manifest as an agent-readable contract that enforces applicability before retrieval begins. Unlike ordinary metadata, which might include document dates or tags, a manifest guides the runtime in reasoning about whether a knowledge base is appropriate for a given query. For example, in a customer support scenario, a manifest would ensure that the system consults only the retail warranty knowledge base if the user's toaster was purchased through standard channels, avoiding confusion with utility-program policies or troubleshooting requirements.
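The toaster check can be sketched as a single applicability test: every precondition declared in the manifest must match the query context, and a missing required input means the system cannot yet decide. The function name `is_applicable` and the context keys here are hypothetical:

```python
def is_applicable(manifest: dict, context: dict) -> bool:
    """Return True only if every precondition in the manifest matches the query context."""
    # A missing required input means we cannot decide yet, so the KB does not apply.
    for key in manifest["required_inputs"]:
        if key not in context:
            return False
    # Every declared precondition must hold exactly.
    return all(context.get(k) == v for k, v in manifest["applies_when"].items())

# Minimal stand-in for a retail-warranty manifest (illustrative)
retail_warranty = {
    "required_inputs": ["purchase_channel"],
    "applies_when": {"purchase_channel": "retail"},
}

# A toaster bought at retail matches; one from a utility program does not.
print(is_applicable(retail_warranty, {"purchase_channel": "retail"}))   # True
print(is_applicable(retail_warranty, {"purchase_channel": "utility"}))  # False
```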

The analysis shows that without this meta-layer, systems can produce what the paper calls franken-answers: fluent responses that combine topically relevant but contextually incompatible evidence. The toaster example illustrates this: a naive system might retrieve passages from multiple policy branches, leading to an answer that no real workflow would honor. With manifests, the system first checks applicability conditions, such as verifying the user's purchase channel, ensuring that only authoritative and current information from the correct knowledge base is used.
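The check-before-retrieve flow could be organized as a router that filters knowledge bases by applicability and, when the context needed to decide is missing, asks for it rather than blending incompatible branches. All names below are hypothetical, assuming manifests are plain dicts of preconditions:

```python
def route(manifests: list, context: dict) -> dict:
    """Select only knowledge bases whose conditions hold; never blend policy branches."""
    # If any required input is unknown, clarify with the user before retrieving anything.
    missing = sorted({k for m in manifests
                      for k in m["required_inputs"] if k not in context})
    if missing:
        return {"action": "ask_user", "for": missing}
    applicable = [m["kb_name"] for m in manifests
                  if all(context.get(k) == v for k, v in m["applies_when"].items())]
    return {"action": "retrieve", "from": applicable}

manifests = [
    {"kb_name": "retail-warranty", "required_inputs": ["purchase_channel"],
     "applies_when": {"purchase_channel": "retail"}},
    {"kb_name": "utility-program", "required_inputs": ["purchase_channel"],
     "applies_when": {"purchase_channel": "utility"}},
]

print(route(manifests, {}))                              # asks for purchase_channel first
print(route(manifests, {"purchase_channel": "retail"}))  # retrieves only from retail-warranty
```

The design choice worth noting is that an empty or ambiguous context yields a clarifying question, not a merged answer drawn from both policy branches.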

This approach has significant implications for agentic systems, where wrong knowledge can lead to incorrect actions rather than just inaccurate text. For instance, an operations agent tasked with restarting a service must verify environment and permissions before selecting a runbook, while a billing agent should confirm account details before making changes. The paper emphasizes that the principle remains the same: agents should act only when applicability conditions are satisfied, not merely because they found semantically related procedures.
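The same gating applies to actions: a runbook executes only after its declared preconditions are checked against the live context. The runbook structure, precondition keys, and function below are a hypothetical sketch, not the paper's implementation:

```python
def run_runbook(runbook: dict, context: dict) -> str:
    """Execute an operational procedure only if its applicability conditions hold."""
    for key, expected in runbook["preconditions"].items():
        if context.get(key) != expected:
            # Refuse rather than act on a semantically similar but inapplicable procedure.
            return f"refused: precondition {key!r} not satisfied"
    return f"executing: {runbook['name']}"

# Hypothetical runbook for the operations-agent example
restart_runbook = {
    "name": "restart-checkout-service",
    "preconditions": {"environment": "staging", "role": "ops-admin"},
}

print(run_runbook(restart_runbook, {"environment": "staging", "role": "ops-admin"}))
print(run_runbook(restart_runbook, {"environment": "prod", "role": "ops-admin"}))
```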

However, the research acknowledges limitations, noting that manifests require ongoing maintenance and can drift out of sync with underlying knowledge bases. They do not eliminate the need for good retrieval, source management, or human judgment. Despite these costs, the paper argues that enforcing explicit applicability constraints is often less brittle than relying on hidden logic in prompts or heuristics, offering a more disciplined way to handle real-world constraints in AI systems.