Three pieces of vocabulary dominate every conversation Baraa has with clients about AI products in 2026: RAG, MCP, and tool use. They are related but not the same. Each one solves a specific problem. Each one has its own failure modes. This post is Baraa's working walkthrough - the patterns that have survived multiple production projects, the mistakes Baraa has stopped making, and the small disciplines that keep agentic AI systems from collapsing into confident nonsense.
Retrieval-Augmented Generation has become the default architecture for any AI product that needs to answer questions about a specific knowledge base - your company docs, your product catalog, your legal corpus, your customer support tickets. The recipe is simple in outline: chunk the corpus, embed each chunk, store the embeddings in a vector database, retrieve the most relevant chunks at query time, stuff them into the prompt, and let the model generate an answer grounded in the retrieved context.
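The recipe above can be sketched end to end. This is a toy illustration only: the hashed bag-of-words function stands in for a real embedding model, a plain Python list stands in for the vector database, and the corpus and query are invented.

```python
# Toy RAG sketch: chunk -> embed -> store -> retrieve -> build prompt.
# The "embedding" is a hashed bag-of-words, not a real model.
import math
import zlib
from collections import Counter

DIMS = 512

def embed(text: str) -> list[float]:
    # Hash each token into a slot of a fixed-size, L2-normalized vector.
    vec = [0.0] * DIMS
    for token, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(token.encode()) % DIMS] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Index time: chunk the corpus, embed each chunk, store the pairs.
corpus = [
    "Refunds are processed within 5 business days.",
    "Shipping is free on orders over 50 dollars.",
    "Support is available by email around the clock.",
]
index = [(chunk, embed(chunk)) for chunk in corpus]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Query time: rank every chunk by similarity, keep the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# The retrieved chunks get stuffed into the prompt as grounding context.
context = retrieve("when are refunds processed?")
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Swapping the toy pieces for a real embedding model and vector store does not change the shape of this loop, which is why the details (chunking, ranking, prompt assembly) are where the work lives.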
Baraa has built RAG pipelines from this recipe many times. Each time, the interesting work is in the details, not the outline.
RAG is not the answer to every problem. If the user's query needs computation, RAG cannot help. If the user's query needs to take an action - book a flight, send an email, query a live database - RAG cannot help. That is where tool use comes in.
Tool use turns the LLM from an answering machine into an agent. The model sees a list of functions ("tools") it can call. When the user's request matches a tool, the model emits a structured call (function name, arguments). The system executes the tool. The result goes back into the conversation. The model continues, possibly calling more tools, until it has enough information to answer.
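That loop can be simulated in a few lines. The model here is a scripted stub (the weather tool, its output, and the message shapes are all invented for illustration); in production, the tool call comes from the LLM's structured function-calling output.

```python
# Schematic tool-use loop: model emits a structured call, the system
# executes it, the result goes back into the conversation, repeat.
import json

# Invented example tool; in production this hits a real API.
TOOLS = {
    "get_weather": lambda args: {"city": args["city"], "temp_c": 21},
}

def fake_model(messages):
    # Stub model: request a tool call, then answer once a result exists.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather",
                              "arguments": {"city": "Riyadh"}}}
    result = json.loads([m for m in messages if m["role"] == "tool"][-1]["content"])
    return {"content": f"It is {result['temp_c']} C in {result['city']}."}

def run(user_text: str) -> str:
    messages = [{"role": "user", "content": user_text}]
    while True:
        reply = fake_model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]                       # final answer
        result = TOOLS[call["name"]](call["arguments"])   # execute the tool
        messages.append({"role": "tool", "content": json.dumps(result)})
```

The important part is the shape: the model never executes anything itself, and the loop only ends when the model stops asking for tools.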
Baraa's patterns for designing tools start with the definition itself: a description that says when to use the tool and, just as importantly, when not to; a strict schema; and bounded arguments. A simple sketch of how Baraa wires up a tool definition (paraphrased pseudocode):
{
  "name": "search_orders",
  "description": "Search the user's order history. Use only when the user asks about their own past orders. Do not use for general product search.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "Free-text search."},
      "status": {"type": "string", "enum": ["pending", "shipped", "delivered", "cancelled"]},
      "limit": {"type": "integer", "minimum": 1, "maximum": 20, "default": 5}
    },
    "required": ["query"]
  }
}
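A schema is only useful if the system enforces it before dispatch. This is a minimal hand-rolled check against the schema above, standing in for a real JSON Schema validator; it covers only required fields, enums, and integer bounds.

```python
# Minimal validation of a tool call's arguments against the
# search_orders schema, before the call is executed.
SCHEMA = {
    "required": ["query"],
    "properties": {
        "query": {"type": "string"},
        "status": {"type": "string",
                   "enum": ["pending", "shipped", "delivered", "cancelled"]},
        "limit": {"type": "integer", "minimum": 1, "maximum": 20, "default": 5},
    },
}

def validate(args: dict) -> list[str]:
    # Return a list of problems; an empty list means the call is safe to run.
    errors = []
    for field in SCHEMA["required"]:
        if field not in args:
            errors.append(f"missing required field: {field}")
    for field, value in args.items():
        spec = SCHEMA["properties"].get(field)
        if spec is None:
            errors.append(f"unknown field: {field}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{field} must be one of {spec['enum']}")
        if spec.get("type") == "integer":
            if not isinstance(value, int) or not (spec["minimum"] <= value <= spec["maximum"]):
                errors.append(f"{field} out of range")
    return errors
```

Models do emit malformed calls under real traffic, so the errors are worth feeding back to the model rather than crashing the turn.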
Model Context Protocol is the piece that ties tools and resources to clients in a standardized way. Instead of every AI client implementing its own custom integration with every data source, MCP defines a protocol: servers expose tools, resources, and prompts; clients consume them; the contract is universal. Baraa has been building and consuming MCP servers since the protocol was published, and the productivity gain is real.
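Concretely, MCP runs over JSON-RPC 2.0: a client discovers a server's tools with a `tools/list` request and invokes one with `tools/call`. The shapes below are a paraphrase of that exchange, reusing the search_orders example; the payload values are invented.

```python
# Paraphrased wire-level shape of an MCP tool exchange (JSON-RPC 2.0).
# Client -> server: ask what tools this server exposes.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Server -> client: each tool carries a name, description, and input schema.
list_response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"tools": [{
        "name": "search_orders",
        "description": "Search the user's order history.",
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]},
    }]},
}

# Client -> server: invoke a tool with structured arguments.
call_request = {
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "search_orders", "arguments": {"query": "headphones"}},
}
```

Because every client speaks this same contract, a server written once works in Claude, an IDE, or an internal agent without per-client glue.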
The patterns Baraa uses with MCP are the same tool-design rules as above, served behind a standard contract.
The hardest agentic AI systems Baraa builds are the ones that take multiple steps. The user asks for something, the agent plans a sequence of tool calls, executes them, hits an error, recovers, continues. The model is doing real reasoning across turns, and the failure modes multiply.
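A sketch of that loop with the two guardrails that matter most: a hard cap on steps, and tool errors fed back as observations instead of crashing the run. The plan here is a scripted sequence for illustration; in a real agent, each next call comes from the model.

```python
# Step-capped execution of a tool-call sequence with error recovery.
def run_agent(plan, tools, max_steps=8):
    history = []
    for step, (name, args) in enumerate(plan):
        if step >= max_steps:
            # Hard stop: an agent that cannot finish must not loop forever.
            history.append(("abort", "step cap reached"))
            break
        try:
            history.append((name, tools[name](args)))
        except Exception as exc:
            # Surface the failure to the next step instead of dying.
            history.append((name, f"error: {exc}"))
    return history
```

The history doubles as the trace: when the agent misbehaves, this record of calls and results is the first thing to read.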
Baraa's discipline for these systems: cap the number of steps, log every tool call and result, and feed errors back to the model as observations it can recover from.
This is the most common production issue Baraa sees: the agent answers fluently and is wrong. Baraa's debugging order is fixed: trace first, retrieval second, prompts third, the model last.
RAG, MCP, and tool use are not exotic anymore. They are the working vocabulary of every serious AI product Baraa builds. The skill is not in knowing what they are - it is in knowing the small disciplines that keep them working under real traffic. Baraa hopes the patterns above save you a few weekends.
For more on how Baraa builds AI products, see the post on agentic AI in Arabic, the Baraa AI overview, the Baraa agentic AI page, the glossary, or the hire page if you want to talk about a project. Browse more posts on the blog.
Baraa picks RAG when the knowledge changes (docs, product catalog, support tickets) and citations matter. Fine-tuning is the right call when you need a stable style, format, or domain behavior the base model cannot reproduce. Most production systems Baraa ships use RAG with prompt engineering, and reach for fine-tuning only when prompts stop being enough.
Baraa picks MCP when the same tools need to be reused across multiple clients (Claude, Cursor, internal agents) or when you want a clean separation between the agent and the data source. Custom function calling stays appropriate for one-off integrations tightly coupled to a single product. The protocol overhead pays off the second client.
Every RAG answer Baraa ships includes citations back to the source chunks, and the model is instructed to refuse if it cannot cite. Baraa returns chunk IDs alongside the text so the UI can render real anchors. This single discipline is the most effective hallucination control Baraa has adopted in years of building these systems.
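That refusal rule is mechanically checkable. Below is a sketch of the check: an answer passes only if it cites at least one chunk ID and every cited ID was actually retrieved. The `[[chunk:...]]` marker syntax is an invented convention for this example, not a standard.

```python
# Reject an answer unless every citation points at a retrieved chunk.
import re

def check_citations(answer: str, retrieved_ids: set[str]) -> bool:
    cited = set(re.findall(r"\[\[chunk:([\w-]+)\]\]", answer))
    # Must cite something, and must cite only chunks we actually retrieved.
    return bool(cited) and cited <= retrieved_ids
```

Running this gate before showing the answer turns "refuse if you cannot cite" from an instruction the model might ignore into a property the system enforces.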
When an agent answers confidently but wrongly, Baraa reads the trace first: which tools were called and what they returned. If RAG was involved, Baraa checks whether the right chunks came back. Then the system prompt and tool descriptions get scrutinized. The model itself is the last suspect, not the first, and the fix is almost always a different prompt.
Baraa uses RAG to answer questions about a knowledge base, function calling to take actions in a single product, and MCP to expose those tools across multiple clients in a portable way. Most serious products combine all three: RAG for the read path, tools (often via MCP) for the write path, and a step-capped agent stitching them together.