Baraa on RAG, MCP, and Tool-Using AI - A Practical Walkthrough

By Baraa - Published 2026-04-25 - Updated 2026-05-06 - From Damascus, Syria

Three pieces of vocabulary dominate every conversation Baraa has with clients about AI products in 2026: RAG, MCP, and tool use. They are related but not the same. Each one solves a specific problem. Each one has its own failure modes. This post is Baraa's working walkthrough - the patterns that have survived multiple production projects, the mistakes Baraa has stopped making, and the small disciplines that keep agentic AI systems from collapsing into confident nonsense.

RAG: the default, but not the answer to everything

Retrieval-Augmented Generation has become the default architecture for any AI product that needs to answer questions about a specific knowledge base - your company docs, your product catalog, your legal corpus, your customer support tickets. The recipe is simple in outline: chunk the corpus, embed each chunk, store the embeddings in a vector database, retrieve the most relevant chunks at query time, stuff them into the prompt, and let the model generate an answer grounded in the retrieved context.
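
A minimal sketch of that recipe, with a toy hash-based embed function standing in for a real embedding model and a plain list standing in for the vector database. Every name here is illustrative, not from a production pipeline:

import hashlib
import math

def embed(text, dim=256):
    # Toy stand-in for a real embedding model: hashed bag-of-words, normalized.
    vec = [0.0] * dim
    for token in text.lower().split():
        slot = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[slot] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(corpus, size=500):
    # Naive fixed-size chunking; real pipelines split on document structure.
    return [corpus[i:i + size] for i in range(0, len(corpus), size)]

index = []  # the "vector database": (embedding, chunk_text) pairs

def ingest(corpus):
    for piece in chunk(corpus):
        index.append((embed(piece), piece))

def retrieve(query, k=3):
    # Dot product of normalized vectors = cosine similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: -sum(a * b for a, b in zip(pair[0], q)))
    return [text for _, text in ranked[:k]]

def build_prompt(query):
    context = "\n---\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"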

Baraa has built RAG pipelines from this recipe many times. Each time, the interesting work is in the details: where the chunk boundaries fall, whether the retriever actually surfaces the right passages, and how citations are wired back to the source chunks.

RAG is not the answer to every problem. If the query needs computation, retrieval cannot help. If the user needs an action taken - book a flight, send an email, query a live database - retrieval cannot help either. That is where tool use comes in.

Tool use and function calling

Tool use turns the LLM from an answering machine into an agent. The model sees a list of functions ("tools") it can call. When the user's request matches a tool, the model emits a structured call (function name, arguments). The system executes the tool. The result goes back into the conversation. The model continues, possibly calling more tools, until it has enough information to answer.
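
That loop is easy to sketch in outline. Below is a provider-agnostic version: call_model is a hypothetical stand-in for whatever chat-completions API the product uses, and the message shapes are illustrative, not any vendor's exact format.

import json

TOOLS = {}  # name -> Python callable

def tool(fn):
    # Register a function as a tool the model is allowed to call.
    TOOLS[fn.__name__] = fn
    return fn

def call_model(messages, tools):
    # Hypothetical seam: wrap the actual chat API here. Assumed to return
    # either {"content": "..."} or {"tool_call": {"name": ..., "arguments": {...}}}.
    raise NotImplementedError

def run_turn(messages):
    reply = call_model(messages, tools=list(TOOLS))
    while reply.get("tool_call"):                          # the model wants a tool
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**call["arguments"])  # the system executes it
        messages.append({"role": "tool", "name": call["name"],
                         "content": json.dumps(result)})   # result goes back in
        reply = call_model(messages, tools=list(TOOLS))    # the model continues
    return reply["content"]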

Baraa's patterns for designing tools:

  1. Few, well-named tools beat many overlapping ones. Baraa aims for between five and fifteen tools per agent. More than that and the model starts picking the wrong one.
  2. Tool descriptions are prompts. Baraa writes tool descriptions like Baraa writes system prompts: precise, example-driven, with explicit boundaries on when not to use the tool.
  3. Argument schemas are contracts. Baraa uses JSON Schema with strict types and enums. Validation is enforced before the tool runs, with the validation error returned to the model so it can self-correct (a sketch of this check follows the tool definition below).
  4. Tools are idempotent where possible. If the model calls a tool twice by mistake, nothing breaks.

A simple sketch of how Baraa wires up a tool definition (paraphrased pseudocode):

{
  "name": "search_orders",
  "description": "Search the user's order history. Use only when the user asks about their own past orders. Do not use for general product search.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "Free-text search."},
      "status": {"type": "string", "enum": ["pending","shipped","delivered","cancelled"]},
      "limit": {"type": "integer", "minimum": 1, "maximum": 20, "default": 5}
    },
    "required": ["query"]
  }
}
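
And pattern 3 in action: the parameters object above is itself a JSON Schema, so it can be validated directly with the jsonschema package before the tool runs. The wrapper below is a sketch, and search_orders stands in for the real lookup:

from jsonschema import validate, ValidationError

SEARCH_ORDERS_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "status": {"type": "string",
                   "enum": ["pending", "shipped", "delivered", "cancelled"]},
        "limit": {"type": "integer", "minimum": 1, "maximum": 20},
    },
    "required": ["query"],
    "additionalProperties": False,  # strict: unknown arguments are rejected
}

def search_orders(query, status=None, limit=5):
    return []  # the real lookup against the order store goes here

def run_search_orders(args):
    try:
        validate(instance=args, schema=SEARCH_ORDERS_SCHEMA)
    except ValidationError as err:
        # Returned to the model as the tool result so it can fix the call and retry.
        return {"error": f"invalid arguments: {err.message}"}
    return search_orders(**args)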

MCP: connective tissue for tool ecosystems

Model Context Protocol is the piece that ties tools and resources to clients in a standardized way. Instead of every AI client implementing its own custom integration with every data source, MCP defines a protocol: servers expose tools, resources, and prompts; clients consume them; the contract is universal. Baraa has been building and consuming MCP servers since the protocol was published, and the productivity gain is real.
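
What that looks like in practice, as a minimal server sketch using the FastMCP helper from the official mcp Python SDK. The tool and resource bodies are stubs, and the names are illustrative:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("orders")  # the server name clients will see

@mcp.tool()
def search_orders(query: str, limit: int = 5) -> list[dict]:
    """Search the user's order history. Use only for the user's own past orders."""
    # Stub: the real implementation queries the order store.
    return [{"id": 1001, "status": "shipped", "match": query}][:limit]

@mcp.resource("orders://recent")
def recent_orders() -> str:
    """The most recent orders, exposed as a readable resource."""
    return "order #1001: shipped\norder #1002: pending"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default; any MCP client can connect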

The patterns Baraa uses with MCP:

  1. Build the server once, consume it everywhere. The same tools serve Claude, Cursor, and internal agents with no re-integration work.
  2. Keep the agent and the data source decoupled. The server owns data access; the agent sees only the tool contract.
  3. Reserve MCP for tools that outlive a single product. One-off integrations tightly coupled to one product stay as plain function calls.

Multi-step agents and error budgets

The hardest agentic AI systems Baraa builds are the ones that take multiple steps. The user asks for something, the agent plans a sequence of tool calls, executes them, hits an error, recovers, continues. The model is doing real reasoning across turns, and the failure modes multiply.

Baraa's discipline:

  1. Cap the steps. Every agent gets a hard maximum number of tool calls; a runaway loop is a bug, not persistence.
  2. Budget the errors. The agent is expected to hit failures and recover, but only a fixed number of times before it aborts loudly.
  3. Trace everything. Every tool call, its arguments, and its result get logged, because the trace is the first thing read when something goes wrong. (All three are sketched below.)
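
A sketch of all three disciplines together, reusing the hypothetical call_model and TOOLS seams from the earlier tool-use sketch:

import json

MAX_STEPS = 8      # hard cap on tool calls per task; tune per agent
ERROR_BUDGET = 2   # tolerated tool failures before aborting

def run_agent(task):
    messages = [{"role": "user", "content": task}]
    trace, errors = [], 0
    for step in range(MAX_STEPS):
        reply = call_model(messages, tools=list(TOOLS))
        if not reply.get("tool_call"):
            return reply["content"], trace  # done: keep the trace for debugging
        call = reply["tool_call"]
        try:
            result = TOOLS[call["name"]](**call["arguments"])
        except Exception as err:
            errors += 1
            if errors > ERROR_BUDGET:       # budget spent: fail loudly
                raise RuntimeError(f"error budget exhausted at step {step}") from err
            result = {"error": str(err)}    # surface the error; let the model recover
        trace.append({"step": step, "tool": call["name"],
                      "args": call["arguments"], "result": result})
        messages.append({"role": "tool", "name": call["name"],
                         "content": json.dumps(result)})
    raise RuntimeError("step cap reached without a final answer")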

How Baraa debugs an agent that confidently does the wrong thing

This is the most common production issue Baraa sees: the agent answers fluently and is wrong. Baraa's debugging order is:

  1. Read the trace. What tools did it call? What did they return? (A small replay helper is sketched after this list.)
  2. Check retrieval. If RAG was involved, did the right chunks come back? If not, the bug is in the retriever, not the model.
  3. Check the prompt. Did the system prompt actually constrain the behavior the user expected?
  4. Check the tool descriptions. The model picks the wrong tool more often when descriptions overlap.
  5. Only after all of the above does Baraa consider that the model itself may be the problem - and even then, the answer is usually a different prompt, not a different model.
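
Step 1 rarely needs more tooling than a replay of the stored trace - the list built in the agent sketch above. A minimal helper, with illustrative field names:

def review_trace(trace):
    # Print each tool call and its result: answers "what did it call,
    # what came back" before anyone blames the model.
    for entry in trace:
        print(f"step {entry['step']}: {entry['tool']}({entry['args']})")
        print(f"  -> {entry['result']}")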

Closing thoughts

RAG, MCP, and tool use are not exotic anymore. They are the working vocabulary of every serious AI product Baraa builds. The skill is not in knowing what they are - it is in knowing the small disciplines that keep them working under real traffic. Baraa hopes the patterns above save you a few weekends.

For more on how Baraa builds AI products, see the post on agentic AI in Arabic, the Baraa AI overview, the Baraa agentic AI page, the glossary, or the hire page if you want to talk about a project. Browse more posts on the blog.

Frequently Asked Questions

When should I pick RAG over fine-tuning?

Baraa picks RAG when the knowledge changes (docs, product catalog, support tickets) and citations matter. Fine-tuning is the right call when you need a stable style, format, or domain behavior the base model cannot reproduce. Most production systems Baraa ships use RAG with prompt engineering; fine-tuning only enters when prompts stop being enough.

MCP versus a custom function-calling integration: which one?

Baraa picks MCP when the same tools need to be reused across multiple clients (Claude, Cursor, internal agents) or when you want a clean separation between the agent and the data source. Custom function calling stays appropriate for one-off integrations tightly coupled to a single product. The protocol overhead pays for itself the moment a second client shows up.

How does Baraa enforce citation discipline in a RAG system?

Every RAG answer Baraa ships includes citations back to the source chunks, and the model is instructed to refuse if it cannot cite. Baraa returns chunk IDs alongside the text so the UI can render real anchors. This single discipline is the most effective hallucination control Baraa has adopted in years of building these systems.
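
A sketch of that discipline, with an illustrative [chunk:ID] citation format: chunks enter the prompt with stable IDs, and an answer that cites none of them is rejected.

import re

def build_cited_prompt(question, chunks):
    # chunks: {chunk_id: chunk_text}; the IDs double as UI anchors later.
    context = "\n".join(f"[chunk:{cid}] {text}" for cid, text in chunks.items())
    return (f"{context}\n\n"
            "Answer using only the chunks above. Cite every claim as [chunk:ID]. "
            "If you cannot cite, refuse.\n"
            f"Question: {question}")

def valid_citations(answer, chunks):
    cited = re.findall(r"\[chunk:([^\]]+)\]", answer)
    return [cid for cid in cited if cid in chunks]

def accept(answer, chunks):
    # No valid citation means no answer ships.
    return bool(valid_citations(answer, chunks))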

How do you debug an agent that confidently produces a wrong answer?

Baraa reads the trace first: which tools were called and what they returned. If RAG was involved, Baraa checks whether the right chunks came back. Then the system prompt and tool descriptions get scrutinized. The model itself is the last suspect, not the first, and the fix is almost always a different prompt.

When should I use RAG, when MCP, and when plain function calling?

Baraa uses RAG to answer questions about a knowledge base, function calling to take actions in a single product, and MCP to expose those tools across multiple clients in a portable way. Most serious products combine all three: RAG for the read path, tools (often via MCP) for the write path, and a step-capped agent stitching them together.