Building an AI Agent with Tool Calling: Lessons from Production
Function calling sounds simple in the docs and breaks in fascinating ways once real users hit it. Here's what I learned shipping an LLM agent to production.
Tool calling — letting an LLM invoke functions in your codebase — is the unlock that turns chatbots into agents. The OpenAI and Anthropic SDKs make the first demo trivial. The second week is harder.
Lesson 1: Schemas are a UX problem, not a typing problem
The model picks tools based on the description, not the name. get_user with a vague description gets called for everything. fetch_user_by_email(email: string) with "Use only when the user explicitly mentions an email address" gets called correctly 95% of the time.
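A minimal sketch of what that looks like as a tool definition, using the OpenAI-style "tools" schema format (the tool name and wording are from the example above; the exact dict shape is illustrative):

```python
# Illustrative tool schema in the OpenAI-style "tools" format.
fetch_user_by_email = {
    "type": "function",
    "function": {
        "name": "fetch_user_by_email",
        # The description does the heavy lifting: it tells the model
        # *when* to call this tool, not just what it does.
        "description": (
            "Look up a single user account. Use only when the user "
            "explicitly mentions an email address."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "email": {
                    "type": "string",
                    "description": "The exact email address the user mentioned.",
                }
            },
            "required": ["email"],
        },
    },
}
```

The payoff is in the "Use only when…" clause: the model reads it at every turn, so it acts as a routing rule, not documentation.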
Lesson 2: Force a thinking step
Before any tool call, ask the model to output a one-sentence plan. It cuts wrong-tool errors in half.
system: Before calling any tool, write a single line:
PLAN: <why this tool, with these args>

Lesson 3: Cap the agent loop
An agent will happily loop forever fetching slightly different versions of the same data. Hard-cap it at 8 turns, and log every run that goes past 4 turns; that's where the bugs live.
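A sketch of that capped loop. Here `call_model` and `run_tool` are hypothetical stand-ins for your SDK call and tool dispatcher, and the message shape is simplified:

```python
import logging

MAX_TURNS = 8   # hard cap: the loop never runs longer than this
WARN_TURNS = 4  # anything past this is worth a log line

def run_agent(messages, call_model, run_tool):
    """Agent loop with a hard turn cap.

    call_model(messages) -> {"tool_call": ... or None, "content": str}
    run_tool(tool_call)  -> str
    Both are placeholders for your actual SDK wiring.
    """
    for turn in range(1, MAX_TURNS + 1):
        if turn > WARN_TURNS:
            # Long runs are where the bugs live; capture them.
            logging.warning("agent loop at turn %d", turn)
        reply = call_model(messages)
        if reply.get("tool_call") is None:
            return reply["content"]  # model answered; we're done
        result = run_tool(reply["tool_call"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError(f"agent exceeded {MAX_TURNS} turns")
```

The cap is a `range`, not a `while True` with a counter, so there is no way to forget the exit condition.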
Lesson 4: Treat tool errors as conversation
Don't throw on a bad tool call. Return the error message as the tool result. The model will retry with corrected arguments more often than you'd expect.
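A sketch of that pattern, assuming tool calls arrive with JSON-encoded arguments and a `registry` dict mapping tool names to callables (both hypothetical):

```python
import json

def execute_tool_call(tool_call, registry):
    """Run a tool and return its result as a string.

    On any failure, return the error text as the tool result instead
    of raising, so the model can see what went wrong and retry.
    registry maps tool names to plain Python callables (hypothetical).
    """
    name = tool_call["name"]
    try:
        args = json.loads(tool_call["arguments"])
        result = registry[name](**args)
        return json.dumps(result)
    except Exception as exc:
        # The error message becomes the tool result the model reads.
        return f"Error calling {name}: {exc}"
```

Because bad JSON, a missing argument, and a downstream failure all take the same path, the model gets a consistent signal it can act on.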
An agent is just a while-loop with a model in it. Most of your work is shaping that loop.