AI Agent Series — Ran Wei

Module 5: Building the Agent Loop

Building the core observe-think-act loop.

1. The Observe → Think → Act Cycle

At its core, every AI agent is a loop. The agent observes the current state (user input, tool results, memory), thinks by sending that context to an LLM for reasoning, and acts by either calling a tool or returning a final response. This cycle repeats until the task is complete.

This pattern is known as the ReAct loop (Reasoning + Acting), introduced in the 2022 paper by Yao et al. The same pattern underlies modern agent frameworks — LangChain, CrewAI, the OpenAI Agents SDK, and Claude's tool use — regardless of their specific APIs.

ANALOGY

Think of the agent loop like a chef preparing a complex dish. The chef (1) observes the current state of ingredients and the recipe, (2) thinks about what to do next ("the onions are soft enough, time to add garlic"), and (3) acts by performing the next step. After each action, the chef observes the result and decides the next move. The dish is done when the chef decides no more steps are needed.

Here is the flow in pseudocode:

# The universal agent loop pattern
messages = [user_input]

while True:
    # THINK: Send context to LLM
    response = llm.generate(messages)

    # DECIDE: Does the LLM want to use a tool?
    if response.wants_tool_call:
        # ACT: Execute the tool
        tool_result = execute_tool(response.tool_call)

        # OBSERVE: Add result to context for next iteration
        messages.append(response)         # LLM's reasoning
        messages.append(tool_result)      # Tool's output
        continue                          # Loop back to THINK

    else:
        # DONE: LLM has a final answer
        return response.text

The key insight is that the LLM itself decides when to stop. It is not the developer hard-coding "call tool A, then tool B, then respond." The LLM dynamically chooses its actions based on the task and intermediate results. This is what makes agents flexible — the same loop can handle "What's the weather?" (one tool call) or "Research and summarise the top 5 competitors" (many tool calls and reasoning steps).

NOTE

The agent loop is conceptually simple, but the details matter enormously. How you handle errors, how you manage the growing message history, and how you prevent infinite loops are what separate a toy demo from a production agent.
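One of those details, error handling, can be sketched right away: rather than letting a tool exception crash the loop, catch it and return the error text as the observation, so the LLM can see what went wrong and retry on the next iteration. The `execute_tool` dispatcher below is a stand-in stub for illustration, not a real implementation:

```python
def execute_tool(tool_name: str, tool_input: dict) -> str:
    # Stub dispatcher standing in for real tool implementations
    if tool_name == "get_weather":
        return f"Weather in {tool_input['city']}: sunny"
    raise KeyError(f"Unknown tool '{tool_name}'")

def safe_execute_tool(tool_name: str, tool_input: dict) -> str:
    """Run a tool, converting any exception into an error string
    the LLM can observe and recover from on the next iteration."""
    try:
        return execute_tool(tool_name, tool_input)
    except Exception as e:
        # Returning the error as text lets the LLM adjust its arguments
        # instead of the whole run crashing.
        return f"Error executing {tool_name}: {type(e).__name__}: {e}"
```

The design choice here is deliberate: an agent loop should almost never raise on a tool failure, because the LLM is often able to route around it — by retrying with different input or by choosing another tool.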

How the Loop Looks at the API Level

Both OpenAI and Anthropic signal tool use through their response objects. The pattern is the same:

| Step | OpenAI Signal | Anthropic Signal |
| --- | --- | --- |
| LLM wants a tool | finish_reason == "tool_calls" | stop_reason == "tool_use" |
| LLM is done | finish_reason == "stop" | stop_reason == "end_turn" |
| Tool call details | message.tool_calls[0] | content block with type == "tool_use" |
| Sending a tool result back | role: "tool" message | role: "user" message with a type: "tool_result" block |
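These differences can be folded behind one small helper. The sketch below maps each provider's stop signal to a single "keep looping?" boolean; the provider strings are my own convention, not part of either SDK:

```python
def wants_tool_call(provider: str, stop_reason: str) -> bool:
    """Map each provider's stop signal to a single 'keep looping?' boolean."""
    signals = {
        "anthropic": "tool_use",   # stop_reason on the Messages API
        "openai": "tool_calls",    # finish_reason on Chat Completions
    }
    if provider not in signals:
        raise ValueError(f"Unknown provider: {provider}")
    return stop_reason == signals[provider]
```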

2. Minimal Agent — Anthropic

Let us build a minimal but complete agent using the Anthropic SDK. This agent can check the weather — a simple example, but it demonstrates the full observe-think-act loop with real API calls.

Step 1: Define Tools

First, define the tools the agent can use. Each tool needs a name, description, and input schema. The description is critical — it tells the LLM when to use the tool:

import anthropic

client = anthropic.Anthropic()

# Define the tools available to the agent
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city. Use this when the user asks about weather, temperature, or conditions in a specific location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g. 'London' or 'Tokyo'"
                }
            },
            "required": ["city"]
        }
    }
]

Step 2: Implement the Tool

In a real application, this function would call a weather API. For now, we use a stub:

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool and return the result as a string."""
    if tool_name == "get_weather":
        city = tool_input["city"]
        # In production, call a real weather API here
        return f"Current weather in {city}: 22°C, partly cloudy, humidity 65%"
    else:
        return f"Error: Unknown tool '{tool_name}'"

Step 3: The Agent Loop

Now the core loop that ties everything together:

def run_agent(user_message: str, max_steps: int = 10) -> str:
    """Run the agent loop until a final answer is produced."""
    print(f"\n{'='*50}")
    print(f"User: {user_message}")
    print(f"{'='*50}")

    messages = [{"role": "user", "content": user_message}]

    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")

        # THINK: Send context to the LLM
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            system="You are a helpful assistant with access to tools.",
            tools=tools,
            messages=messages
        )

        print(f"Stop reason: {response.stop_reason}")

        # CHECK: Does the LLM want to call a tool?
        if response.stop_reason == "tool_use":
            # Find the tool_use block in the response
            tool_block = next(
                b for b in response.content if b.type == "tool_use"
            )
            print(f"Tool call: {tool_block.name}({tool_block.input})")

            # ACT: Execute the tool
            result = execute_tool(tool_block.name, tool_block.input)
            print(f"Tool result: {result}")

            # OBSERVE: Add both the LLM response and tool result to history
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": result
                }]
            })
            continue  # Back to THINK

        # DONE: LLM produced a final text response
        final_text = next(
            b.text for b in response.content if b.type == "text"
        )
        print(f"\nFinal answer: {final_text}")
        return final_text

    return "Error: Max steps reached without a final answer."


# Run it!
run_agent("What's the weather like in Tokyo?")

TIP

The for step in range(max_steps) loop is the heart of every agent, and the bound is deliberate: it prevents infinite loops. Within that bound, the LLM decides when to stop by returning stop_reason == "end_turn" instead of "tool_use".

Expected Output

==================================================
User: What's the weather like in Tokyo?
==================================================

--- Step 1 ---
Stop reason: tool_use
Tool call: get_weather({'city': 'Tokyo'})
Tool result: Current weather in Tokyo: 22°C, partly cloudy, humidity 65%

--- Step 2 ---
Stop reason: end_turn

Final answer: The current weather in Tokyo is 22°C with partly cloudy skies
and 65% humidity.

Minimal Agent — OpenAI Version

The same pattern with the OpenAI API, so you can compare the syntax differences:

from openai import OpenAI
import json

client = OpenAI()

# OpenAI tool format wraps each tool in a "function" type
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

def run_agent_openai(user_message: str, max_steps: int = 10) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_message}
    ]

    for step in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o",
            tools=tools,
            messages=messages
        )
        choice = response.choices[0]

        if choice.finish_reason == "tool_calls":
            tool_call = choice.message.tool_calls[0]
            args = json.loads(tool_call.function.arguments)
            result = execute_tool(tool_call.function.name, args)

            # OpenAI requires appending the assistant message first
            messages.append(choice.message)
            # Then the tool result with matching tool_call_id
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
            continue

        return choice.message.content

    return "Max steps reached."

3. Safety: Preventing Runaway Agents

An agent loop that lacks proper safeguards can spiral out of control — making dozens of API calls, consuming your budget, and producing nonsensical results. Safety mechanisms are not optional; they are a core part of agent design.

WARNING

A runaway agent can consume your entire API budget in minutes. A 20-step loop with GPT-4o, each step sending a growing conversation history, can easily cost $5–$20 per run. Without limits, a bug that causes infinite looping could cost hundreds of dollars before you notice.
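To make this concrete, here is a rough cost estimator for a multi-step run. The per-million-token prices are illustrative placeholders, not any provider's current rates, and the helper name is my own:

```python
def estimate_run_cost(steps: int, avg_input_tokens: int, avg_output_tokens: int,
                      input_price_per_m: float = 2.50,    # illustrative $/1M input tokens
                      output_price_per_m: float = 10.00   # illustrative $/1M output tokens
                      ) -> float:
    """Estimate the dollar cost of an agent run.

    Note: the input context grows each step because the full message
    history is re-sent, so avg_input_tokens should account for that.
    """
    input_cost = steps * avg_input_tokens * input_price_per_m / 1_000_000
    output_cost = steps * avg_output_tokens * output_price_per_m / 1_000_000
    return round(input_cost + output_cost, 4)

# A 20-step run re-sending ~10k tokens of history per step:
print(estimate_run_cost(steps=20, avg_input_tokens=10_000, avg_output_tokens=500))  # prints 0.6
```

The lesson the arithmetic teaches: cost scales with steps × history size, not just steps, which is why long-running agents get expensive quickly.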

Essential Safety Measures

Max Step Limit

Always cap the number of loop iterations. Start with max_steps=10 and adjust based on your use case. Most tasks complete in 3–5 steps.

Token Budget

Track cumulative token usage across all steps. Abort the run if total tokens exceed a threshold (e.g., 50,000 tokens).

Step Logging

Log every step: the tool called, the input, the result, and the token count. This is essential for debugging and cost analysis.

Timeout

Set a wall-clock timeout for the entire agent run. If the agent takes longer than 60 seconds, something is probably wrong.

Implementing Safety in Code

import time

def run_safe_agent(user_message: str, max_steps: int = 10,
                   max_tokens: int = 50000, timeout: int = 60) -> str:
    """Agent loop with comprehensive safety measures."""
    messages = [{"role": "user", "content": user_message}]
    total_tokens = 0
    start_time = time.time()

    for step in range(max_steps):
        # Safety check: timeout
        elapsed = time.time() - start_time
        if elapsed > timeout:
            print(f"TIMEOUT: Agent exceeded {timeout}s limit at step {step + 1}")
            return "Error: Agent timed out."

        # THINK
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            system="You are a helpful assistant.",
            tools=tools,
            messages=messages
        )

        # Safety check: token budget
        step_tokens = response.usage.input_tokens + response.usage.output_tokens
        total_tokens += step_tokens
        print(f"Step {step + 1}: {step_tokens} tokens (total: {total_tokens})")

        if total_tokens > max_tokens:
            print(f"BUDGET: Exceeded {max_tokens} token limit")
            return "Error: Token budget exceeded."

        # Normal loop logic continues...
        if response.stop_reason == "tool_use":
            tool_block = next(
                b for b in response.content if b.type == "tool_use"
            )
            result = execute_tool(tool_block.name, tool_block.input)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": tool_block.id,
                    "content": result
                }]
            })
            continue

        final = next(b.text for b in response.content if b.type == "text")
        print(f"\nCompleted in {step + 1} steps, {total_tokens} tokens, "
              f"{time.time() - start_time:.1f}s")
        return final

    return "Error: Max steps reached."

Detecting Infinite Loops

A subtler failure mode is when the agent keeps calling the same tool with the same input, getting the same result, and never making progress. You can detect this by tracking recent tool calls:

def detect_loop(messages: list, lookback: int = 4) -> bool:
    """Check if the agent is stuck calling the same tool repeatedly."""
    recent_tool_calls = []
    for msg in messages[-lookback * 2:]:  # Check last N exchanges
        if isinstance(msg.get("content"), list):
            for block in msg["content"]:
                if isinstance(block, dict) and block.get("type") == "tool_result":
                    recent_tool_calls.append(block.get("content", ""))

    # If all recent tool results are identical, we're probably looping
    if len(recent_tool_calls) >= 3 and len(set(recent_tool_calls)) == 1:
        return True
    return False
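A complementary check inspects the calls rather than the results: flag the run when the same tool is invoked with identical input several times in a row. This sketch assumes the loop appends a (tool_name, serialized_input) tuple to a call_history list after each execution; the helper name and threshold are my own:

```python
def detect_repeated_calls(call_history: list[tuple[str, str]],
                          threshold: int = 3) -> bool:
    """Return True if the last `threshold` tool calls are identical.

    call_history holds (tool_name, serialized_input) tuples, appended
    by the loop each time a tool is executed.
    """
    if len(call_history) < threshold:
        return False
    recent = call_history[-threshold:]
    # A single unique entry among the recent calls means no progress
    return len(set(recent)) == 1
```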

NOTE

In production agents, you should also implement rate limiting (e.g., no more than 30 API calls per minute), cost alerts (email notification if daily spend exceeds $10), and kill switches (a way to immediately halt all running agents). These operational concerns become critical as you scale from development to production use.
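The first of these, rate limiting, can be sketched with a sliding window of call timestamps. This is a toy in-process illustration under my own naming; real deployments typically enforce limits in a shared store such as Redis so that all agent workers see the same budget:

```python
import time
from collections import deque
from typing import Optional

class RateLimiter:
    """Allow at most `max_calls` within a sliding `window` of seconds."""

    def __init__(self, max_calls: int = 30, window: float = 60.0):
        self.max_calls = max_calls
        self.window = window
        self.calls: deque = deque()  # timestamps of allowed calls

    def allow(self, now: Optional[float] = None) -> bool:
        """Record an attempted call; return True if it is within budget."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window
        while self.calls and now - self.calls[0] > self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True
```

In the agent loop, you would call limiter.allow() before each LLM request and sleep or abort when it returns False.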

Up Next

Module 6 — Tool Use & Function Calling