Chapter 8: LangGraph (Stateful Agent Workflows)

What you will have by the end of this chapter

A complete ReAct agent built as a StateGraph, with nodes, conditional edges, and tool calling
An agent built with create_react_agent(), the fastest way to build a tool-using agent
Working checkpointing: durable memory with MemorySaver and SqliteSaver
A human-in-the-loop workflow: an agent that pauses, asks for human approval, and continues
A multi-agent system: a supervisor agent that manages worker agents through sub-graphs
Active LangSmith tracing: every LLM call, tool execution, and decision is recorded
A decision matrix: when to choose LangGraph and when to choose another framework

What you will be able to do after this chapter

Build an AI agent as a graph with StateGraph, nodes, conditional edges, and checkpointing
Use create_react_agent() to build tool-calling agents quickly
Add durable memory (short-term and long-term) with different checkpointing backends
Build human-in-the-loop workflows with interrupt() and Command
Create multi-agent systems with the supervisor and sub-graph patterns

Before you start

Previous chapters: Chapter 1 (what an agent is, the ReAct loop), Chapter 2 (architectures), Chapter 3 (Tool Calling), Chapter 7 (OpenAI Agents SDK)
What you need: Python 3.11+, at least one API key (Anthropic / OpenAI / Google), a code editor (VS Code / Cursor)
Required knowledge: intermediate Python, familiarity with TypedDict / Pydantic, basic async/await
Estimated time: 5-6 hours (including exercises)
Estimated API cost: $5-15

Your project: the thread that runs through the course

In Chapters 5-7 you built agents with the Anthropic, Vercel, and OpenAI SDKs, each with a different approach. In this chapter you move to LangGraph and its "agents as graphs" approach. Instead of a simple loop, you represent the agent as a state machine with nodes, edges, and checkpoints. This approach fits complex systems, durable execution, and processes that need a human in the loop particularly well. The tools you built in Chapter 3 plug directly into your graph. In Chapter 9 you will meet CrewAI, a high-level approach built around "crews" of agents.

Glossary: Chapter 8

Term (English)	Hebrew	Explanation
LangGraph	לאנגגרף	LangChain's framework for building agents as state-based graphs. Reached v1.0 in October 2025 and is one of the most battle-tested frameworks for production
StateGraph	גרף מצבים	The central class in LangGraph. Defines a graph with a state type, nodes, and edges. Accepts a TypedDict or Pydantic model as the schema
State	מצב	The data that flows through the graph, defined as a TypedDict with fields such as messages, tools_output, retry_count. Every node receives the state and updates it
Node	צומת	A Python function that performs an action (LLM call, tool execution, logic). Receives the state and returns a state update
Edge	קשת	A connection between nodes. Can be regular (A → B) or conditional (A → B/C/D depending on a condition)
Conditional Edge	קשת מותנית	An edge that routes to a different node according to a routing function. Enables branching and loops in the graph
Checkpointing	שמירת מצב	Automatic saving of the state at every step. Enables recovery, time-travel debugging, and human-in-the-loop
MemorySaver	שומר זיכרון	An in-memory checkpointer: fast and meant for development. Not suitable for production (lost on restart)
interrupt()	הפסקה	A function that pauses the graph and hands control back to a human. The graph is saved in a checkpoint and waits for `Command(resume=...)`
Command	פקודה	An object that resumes a paused graph run. Can carry resume data, state updates, and goto (routing)
create_react_agent()	יצירת סוכן ReAct	A pre-built function that returns a ready ReAct agent. Three arguments are enough: model, tools, prompt
LangSmith	לאנגסמית׳	LangChain's observability platform. Tracing, evaluation, datasets, monitoring. Free up to 5K traces/month
Sub-Graph	תת-גרף	A graph embedded inside another graph. Enables multi-agent systems in which each agent is an independent sub-graph
LangServe	לאנגסרב	A deployment tool that exposes a LangChain runnable, including a compiled LangGraph app, as a REST API. Adds automatic endpoints for invoke, stream, and batch

Beginner · 15 minutes · Free

Overview and philosophy: why LangGraph

LangGraph is one of the most widely used and mature frameworks for building production AI agents. It is part of the LangChain ecosystem (LangChain has been one of the most widely used libraries for LLM applications since 2022), but it solves a different problem. Where LangChain provides composable components (chains, prompts, parsers), LangGraph provides an orchestration framework: a way to build, manage, and deploy long-running stateful agents.

The LangGraph philosophy:

Agents are state machines: an AI agent is essentially a state machine. It has state (the data it holds), nodes (the actions it performs), and edges (the transitions between actions). Graphs are the natural way to represent this
Durable execution: agents survive failures, can run for hours or days, and resume exactly where they stopped
Maximum control: unlike frameworks that hide the logic, LangGraph gives you full control over every step, every transition, every retry
Provider-agnostic: works with Claude, GPT, Gemini, and others through LangChain chat models

LangGraph in numbers (March 2026)

Version: LangGraph v1.0, the first stable release, published in October 2025
License: MIT, free for commercial use
Companies in production: Klarna, Uber, and LinkedIn, among others; a common choice for serious deployments
LangSmith: Free (5K traces/month), Plus ($39/seat/month, 10K traces), Enterprise (custom)
LangGraph Cloud: $0.001 per node execution (the first 100K are free)
Ecosystem: LangChain (chains) + LangGraph (agents) + LangSmith (observability) + LangServe (deployment)

The LangChain controversy, and how LangGraph addressed it

LangChain drew significant criticism in 2023-2024 for over-abstraction: unnecessary layers that made debugging and control harder. Developers complained that LangChain code was "magic" that was hard to follow.

LangGraph is the direct answer to that criticism:

No magic: every node is a plain Python function. No special decorators, no complex inheritance
Explicit control: you define exactly which nodes run, in what order, and when. There is no "automatic" behavior you did not ask for
Low-level + high-level: you can build a graph from scratch or use pre-built agents such as create_react_agent()
LangChain optional: LangGraph can work without the LangChain integrations, so you can call any LLM client from inside a node

Do it now · 5 minutes

Install LangGraph and its dependencies:

# Create a new project
mkdir langgraph-agent && cd langgraph-agent
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install
pip install langgraph langchain-anthropic langchain-openai langchain-google-genai
pip install langchain-core langsmith python-dotenv
# SqliteSaver (used later in this chapter) ships in a separate package
pip install langgraph-checkpoint-sqlite

# .env file
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
echo "OPENAI_API_KEY=sk-..." >> .env
echo "LANGSMITH_API_KEY=lsv2_..." >> .env
echo "LANGSMITH_TRACING=true" >> .env

Check that python --version reports 3.11 or higher.

Beginner · 15 minutes · Hands-on

LangChain Fundamentals: a quick overview

Before diving into LangGraph, you need the LangChain basics, because the examples in this chapter use LangChain chat models and messages. If you already know LangChain, skip ahead to "LangGraph Core".

Chat Models

LangChain provides a uniform wrapper for all LLM providers:

Python: Chat Models in LangChain

from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI

# Same interface, different providers
claude = ChatAnthropic(model="claude-sonnet-4-5-20250929")
gpt = ChatOpenAI(model="gpt-4o")
gemini = ChatGoogleGenerativeAI(model="gemini-2.5-flash")

# All of them support the same methods
response = claude.invoke("Explain agents in one sentence.")
print(response.content)
# Example output (yours will differ):
# "AI agents are software systems that use LLMs to
#  reason about and autonomously execute multi-step tasks."

Messages

LangChain uses structured message objects:

Python: Message Types

from langchain_core.messages import (
    SystemMessage,
    HumanMessage,
    AIMessage,
    ToolMessage
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What's the weather in Tel Aviv?"),
    AIMessage(content="Let me check...", tool_calls=[{
        "id": "call_1",
        "name": "get_weather",
        "args": {"city": "Tel Aviv"}
    }]),
    ToolMessage(content='{"temp": 28, "condition": "sunny"}',
                tool_call_id="call_1"),
    AIMessage(content="It's 28C and sunny in Tel Aviv!")
]

Do it now · 5 minutes

Build a simple chain (prompt → LLM → output):

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import SystemMessage, HumanMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

messages = [
    SystemMessage(content="You are a marketing expert. Answer in Hebrew."),
    HumanMessage(content="Give me 3 tips for SaaS landing pages.")
]

response = model.invoke(messages)
print(response.content)

Check that you get an answer in Hebrew. If you do, LangChain is working.

Intermediate · 25 minutes · Hands-on

LangGraph Core: State, Nodes, Edges

This is the most important part of the chapter. Once you understand the three concepts (State, Nodes, Edges), you can build any agent in LangGraph.

State: the data that flows through the graph

The State is a TypedDict (or Pydantic model) that defines all the data the graph holds and passes between nodes. Every node receives the current state and returns an update.

Python: defining the State

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # messages: a list of messages with a reducer
    # add_messages appends new messages (it does not replace the list)
    messages: Annotated[list, add_messages]

    # Additional fields as needed
    current_tool: str
    retry_count: int
    final_answer: str

Critical point: Reducers

Note the Annotated[list, add_messages]. The reducer (add_messages) defines how a field is updated when a node returns a new value. Without a reducer, the new value replaces the old one. With add_messages, new messages are appended to the list. That is why the message list grows with every step instead of being overwritten.
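To make the reducer rule concrete, here is a minimal plain-Python sketch of how a state update could be merged. It is an illustration only, not LangGraph's implementation: `apply_update` and `append_list` are hypothetical names, and the real add_messages does more than concatenate (for example, it replaces a message that arrives with an ID already in the list).

```python
from typing import Any, Callable, Dict


def append_list(current: list, update: list) -> list:
    """Reducer: concatenate the update onto the existing list."""
    return current + update


def apply_update(
    state: Dict[str, Any],
    update: Dict[str, Any],
    reducers: Dict[str, Callable[[Any, Any], Any]],
) -> Dict[str, Any]:
    """Merge a node's partial update into the state (hypothetical helper).

    Fields with a reducer are combined; all other fields are overwritten.
    """
    new_state = dict(state)
    for key, value in update.items():
        reducer = reducers.get(key)
        if reducer is not None and key in state:
            new_state[key] = reducer(state[key], value)
        else:
            new_state[key] = value
    return new_state


state = {"messages": ["hi"], "retry_count": 0}
reducers = {"messages": append_list}

# A node returns only the fields it wants to change.
state = apply_update(
    state, {"messages": ["hello back"], "retry_count": 1}, reducers
)
print(state)
```

The `messages` field grows because it has a reducer, while `retry_count` is simply replaced.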

Nodes: the actions

Every Node is a plain Python function that receives the state and returns a dict of updates:

Python: defining Nodes

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import ToolMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

def call_llm(state: AgentState) -> dict:
    """Node: sends the messages to the LLM and gets a reply."""
    response = model.invoke(state["messages"])
    # Return a dict; the add_messages reducer appends the reply
    return {"messages": [response]}

def process_tool(state: AgentState) -> dict:
    """Node: executes a tool call and returns the result."""
    last_message = state["messages"][-1]
    tool_call = last_message.tool_calls[0]

    # Run the tool (simplified for the example).
    # execute_tool is a placeholder for your own dispatch function.
    result = execute_tool(tool_call["name"], tool_call["args"])

    tool_msg = ToolMessage(
        content=str(result),
        tool_call_id=tool_call["id"]
    )
    return {"messages": [tool_msg]}

Edges: the connections

Edges connect nodes and define the flow:

Python: building a complete Graph

from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, START, END

# Create the graph
workflow = StateGraph(AgentState)

# Add nodes
workflow.add_node("llm", call_llm)
workflow.add_node("tools", process_tool)

# Regular edge: START → llm
workflow.add_edge(START, "llm")

# Conditional edge: llm → tools or END
def should_use_tool(state: AgentState) -> str:
    """Routing function: decides where to go next."""
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"  # There is a tool call: go to the tools node
    return END          # No tool call: we are done

workflow.add_conditional_edges("llm", should_use_tool)

# Regular edge: tools → llm (back into the loop)
workflow.add_edge("tools", "llm")

# Compile!
app = workflow.compile()

# Run
result = app.invoke({
    "messages": [HumanMessage(content="What's 2+2?")],
    "current_tool": "",
    "retry_count": 0,
    "final_answer": ""
})

Framework: "Graph Mental Model", or how to think about agents as graphs

Concept	Analogy	In LangGraph	Example
State	The agent's memory	`TypedDict`	messages, tool results, flags
Node	A step in the process	Python function	call_llm, execute_tool, validate
Edge	A transition between steps	`add_edge()`	tools → llm
Conditional Edge	A decision: left or right?	`add_conditional_edges()`	if tool_calls: tools, else: END
Checkpoint	Save game	`MemorySaver`	saving at every step, recovery, time-travel

Rule of thumb: if you can draw the agent as a flowchart, you can build it in LangGraph. Every box is a node, every arrow is an edge, every diamond is a conditional edge.

State Design Patterns: how to design State well

Designing the State is the most important decision in LangGraph. A poorly designed State leads to convoluted code, bugs, and painful debugging. Here are three proven patterns:

Pattern 1: Minimal State (only what you need)

Python: Minimal State for a simple agent

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

# Minimal state: only messages and a status
class SimpleAgentState(TypedDict):
    messages: Annotated[list, add_messages]
    status: str  # "working" | "done" | "error"

Pattern 2: Rich State (an agent with a complex workflow)

Python: Rich State with metadata and tracking

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class WorkflowState(TypedDict):
    # === Core ===
    messages: Annotated[list, add_messages]

    # === Workflow tracking ===
    current_phase: str        # "research" | "draft" | "review" | "done"
    iteration_count: int      # loop guard: the routing logic stops at 10
    error_count: int          # consecutive error counter

    # === Data ===
    research_results: list    # research findings
    draft_content: str        # draft content
    human_feedback: str       # human feedback

    # === Metadata ===
    user_id: str
    session_id: str
    started_at: str           # timestamp
    total_tokens_used: int    # cost tracking

Pattern 3: Custom Reducers (full control over updates)

Sometimes add_messages is not enough. You can write your own reducers:

Python: Custom Reducers

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

def increment(current: int, update: int) -> int:
    """Reducer that adds (instead of replacing); useful for counters.
    Equivalent to operator.add."""
    return current + update

def keep_last_n(n: int):
    """Reducer factory: keeps only the last N items of a list."""
    def reducer(current: list, update: list) -> list:
        combined = current + update
        return combined[-n:]  # keep only the last N
    return reducer

class SmartState(TypedDict):
    messages: Annotated[list, add_messages]
    # Adds (does not replace): every node that returns
    # tool_calls_count=1 increases the counter
    tool_calls_count: Annotated[int, increment]
    # Keeps only the 5 most recent results (sliding window)
    recent_results: Annotated[list, keep_last_n(5)]
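
Because reducers are ordinary functions, they can be exercised directly without building a graph. The following sketch repeats the two reducers from the block above and feeds them a few updates by hand; the batch values are made up for illustration.

```python
def increment(current: int, update: int) -> int:
    """Counter reducer: add the update to the current value."""
    return current + update


def keep_last_n(n: int):
    """Reducer factory: keep only the last n items after appending."""
    def reducer(current: list, update: list) -> list:
        return (current + update)[-n:]
    return reducer


window = keep_last_n(3)

# Simulate three nodes each returning a batch of results.
recent = []
for batch in (["a", "b"], ["c"], ["d", "e"]):
    recent = window(recent, batch)

# Simulate three nodes each reporting one tool call.
count = 0
for _ in range(3):
    count = increment(count, 1)

print(recent, count)
```

Testing reducers in isolation like this is a cheap way to catch off-by-one mistakes in a sliding window before they end up inside a checkpointed state.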

The golden rule: do not put anything in the State that does not have to be there

Every field in the State goes through checkpointing, serialization, and updates at every step. A State with 20 fields, most of them empty, is pure overhead. Start minimal: messages plus one or two fields. Add fields only when there is a real need.

Do it now · 10 minutes

On paper (or a whiteboard), draw the graph of a ReAct agent:

START → LLM Node
LLM Node → Diamond: "Are there tool calls?"
Yes → Tools Node → back to LLM Node
No → END

That is all it takes: 2 nodes, 1 conditional edge, and 2 regular edges (START → LLM and Tools → LLM). Every ReAct agent, in any framework, is essentially this graph.
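The flowchart can also be written as a plain Python loop, which may help before moving to the real API. This is a sketch under stated assumptions: `fake_llm` is a scripted stand-in for a model (it asks for one tool call, then answers), messages are plain dicts, and none of these names come from LangGraph.

```python
END = "__end__"


def fake_llm(state: dict) -> dict:
    """Scripted stand-in for a model: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in state["messages"]):
        call = {"name": "add", "args": {"a": 2, "b": 2}}
        return {"messages": [{"role": "ai", "content": "", "tool_calls": [call]}]}
    tool_result = state["messages"][-1]["content"]
    return {"messages": [{"role": "ai",
                          "content": f"The answer is {tool_result}",
                          "tool_calls": []}]}


def run_tools(state: dict) -> dict:
    """Tools node: execute the single requested tool call."""
    call = state["messages"][-1]["tool_calls"][0]
    result = call["args"]["a"] + call["args"]["b"]
    return {"messages": [{"role": "tool", "content": str(result)}]}


def route(state: dict) -> str:
    """The diamond: go to tools if the last message has tool calls."""
    return "tools" if state["messages"][-1].get("tool_calls") else END


NODES = {"llm": fake_llm, "tools": run_tools}


def run(question: str, max_steps: int = 10):
    state = {"messages": [{"role": "user", "content": question}]}
    visited = []
    current = "llm"                      # START → llm
    for _ in range(max_steps):           # guard against endless loops
        visited.append(current)
        update = NODES[current](state)
        state["messages"] = state["messages"] + update["messages"]
        # Conditional edge after llm; regular edge tools → llm.
        current = route(state) if current == "llm" else "llm"
        if current == END:
            break
    return state, visited


final_state, visited = run("What's 2+2?")
print(visited)
print(final_state["messages"][-1]["content"])
```

LangGraph replaces the hand-written loop with a compiled graph, which is what makes checkpointing and interrupts possible at every step.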

Intermediate · 25 minutes · Hands-on

Building a ReAct Agent as a Graph, from scratch

Now we build the ReAct agent you just drew, this time with real tools and the model's own tool calling:

Python: a complete ReAct Agent with LangGraph

from typing import TypedDict, Annotated
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
import ast
import operator

# === Step 1: Define the State ===
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

# === Step 2: Define the Tools ===
@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    weather_data = {
        "tel aviv": "28C, sunny, humidity 65%",
        "jerusalem": "22C, partly cloudy",
        "haifa": "25C, sunny, sea breeze",
    }
    return weather_data.get(city.lower(), f"No data for {city}")

@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression.
    Supports: +, -, *, /, ** (power), % (modulo).
    Example: '15 * 340 / 100' or '2 ** 10'"""
    # Safe math evaluation using ast module
    allowed_operators = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
        ast.Mod: operator.mod,
        ast.USub: operator.neg,
    }
    try:
        tree = ast.parse(expression, mode='eval')
        def safe_eval(node):
            if isinstance(node, ast.Expression):
                return safe_eval(node.body)
            elif isinstance(node, ast.Constant):
                return node.value
            elif isinstance(node, ast.BinOp):
                left = safe_eval(node.left)
                right = safe_eval(node.right)
                op = allowed_operators.get(type(node.op))
                if op is None:
                    raise ValueError(
                        f"Unsupported operator: {type(node.op).__name__}")
                return op(left, right)
            elif isinstance(node, ast.UnaryOp):
                operand = safe_eval(node.operand)
                op = allowed_operators.get(type(node.op))
                if op is None:
                    raise ValueError(
                        f"Unsupported operator: {type(node.op).__name__}")
                return op(operand)
            else:
                raise ValueError(
                    f"Unsupported expression: {type(node).__name__}")
        result = safe_eval(tree)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

tools = [get_weather, calculate]
tools_by_name = {t.name: t for t in tools}

# === Step 3: Define the Model and bind the tools ===
model = ChatAnthropic(model="claude-sonnet-4-5-20250929")
model_with_tools = model.bind_tools(tools)

# === Step 4: Define the Nodes ===
def call_model(state: AgentState) -> dict:
    """Node: calls the LLM with the current messages."""
    response = model_with_tools.invoke(state["messages"])
    return {"messages": [response]}

def call_tools(state: AgentState) -> dict:
    """Node: runs every tool the model asked for."""
    last_message = state["messages"][-1]
    results = []
    for tc in last_message.tool_calls:
        tool_fn = tools_by_name[tc["name"]]
        result = tool_fn.invoke(tc["args"])
        results.append(
            ToolMessage(content=str(result), tool_call_id=tc["id"])
        )
    return {"messages": results}

# === Step 5: Routing Function ===
def should_continue(state: AgentState) -> str:
    last = state["messages"][-1]
    if hasattr(last, "tool_calls") and last.tool_calls:
        return "tools"
    return END

# === Step 6: Build the graph ===
workflow = StateGraph(AgentState)
workflow.add_node("agent", call_model)
workflow.add_node("tools", call_tools)
workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue)
workflow.add_edge("tools", "agent")

app = workflow.compile()

# === Step 7: Run ===
result = app.invoke({
    "messages": [
        HumanMessage(content="What's the weather in Tel Aviv? "
                     "Also, what's 15 * 340 / 100?")
    ]
})

# Print the final answer
print(result["messages"][-1].content)
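
The calculate tool above avoids eval() by walking the parsed AST and allowing only whitelisted operators. Below is a standalone sketch of the same idea that can be tested without an LLM. It makes two deliberate changes that are assumptions of this sketch, not part of the chapter's tool: it accepts only int and float constants (so string literals are rejected), and it leaves out `**`, because very large exponents can take a long time to compute.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg}


def safe_arithmetic(expression: str):
    """Evaluate a basic arithmetic expression without eval()."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only plain numbers; bool and str constants are rejected.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"Unsupported syntax: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


print(safe_arithmetic("15 * 340 / 100"))

# Function calls, attribute access, and names never reach an operator.
try:
    safe_arithmetic("__import__('os').system('ls')")
except ValueError as exc:
    print("rejected:", exc)
```

Inside a tool you would still wrap the call in try/except and return the error text as a string, as the chapter's version does, so that the model can read the failure and retry.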

What happens here, step by step:

START → agent: the user's message is sent to Claude
agent: Claude recognizes that it needs 2 tools, get_weather and calculate
should_continue: are there tool_calls? Yes → go to tools
tools: both tools run and return ToolMessages
tools → agent: back to Claude with the results
agent: Claude writes the final answer
should_continue: no tool_calls → END

Do it now · 15 minutes

Copy the code above into a file named react_agent.py and run it:

python react_agent.py

Try different queries:

"What's 2 + 2?" (calculate tool only)
"How's the weather in Haifa?" (weather tool only)
"What's the weather in Jerusalem and what's 100 * 3.14?" (both tools)
"Tell me a joke" (no tools, straight to END)

Notice the pattern: queries that need no tools go agent → END directly.

Beginner · 10 minutes · Hands-on

Pre-Built Agents: create_react_agent

You built a ReAct agent from scratch in 40+ lines. create_react_agent() does the same thing in a single call with three arguments:

Python: create_react_agent in one call

from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import InMemorySaver

@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    weather_data = {
        "tel aviv": "28C, sunny",
        "new york": "15C, cloudy",
        "london": "12C, rainy",
    }
    return weather_data.get(
        location.lower(),
        f"No weather data for {location}"
    )

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for '{query}': [mock results]"

# === That's it: one call ===
agent = create_react_agent(
    model="anthropic:claude-sonnet-4-5-20250929",
    tools=[get_weather, search_web],
    prompt="You are a helpful assistant. Answer in Hebrew when asked in Hebrew.",
    # Without a checkpointer the thread_id below has no effect
    # and the second call would not remember the first.
    checkpointer=InMemorySaver(),
)

# Run with memory
config = {"configurable": {"thread_id": "session-1"}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "מזג אוויר בתל אביב?"}]},
    config
)
print(result["messages"][-1].content)

# Continue the conversation on the same thread
result2 = agent.invoke(
    {"messages": [{"role": "user", "content": "ומה לגבי לונדון?"}]},
    config
)
print(result2["messages"][-1].content)

When to use a custom graph and when to use a pre-built agent

LangGraph offers two paths: create_react_agent, a ready-made agent that covers most common cases, and a custom StateGraph for when you need full control over the flow. The table below helps you choose between them by use case.

Scenario	create_react_agent	Custom StateGraph
Simple agent with tools	✓
Quick prototype	✓
Complex routing logic		✓
Multi-step validation		✓
Custom retry logic		✓
Multi-agent workflows		✓
Human-in-the-loop at specific steps		✓

Rule of thumb: start with create_react_agent(). Move to a custom graph only when you need control that the function does not provide.

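Customizing create_react_agent: advanced options

create_react_agent() accepts more than model and tools. Here are the most useful parameters:

Python: create_react_agent with advanced options

from pydantic import BaseModel, Field
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

class AgentAnswer(BaseModel):
    """Schema for the structured final answer."""
    answer: str = Field(description="The main answer to the user's question")
    sources: list[str] = Field(description="List of sources or tools used")
    confidence: str = Field(description="high, medium, or low")

def keep_recent_messages(state: dict) -> dict:
    """Runs before every LLM call: pass only the last 20 messages.
    Returning llm_input_messages leaves the stored history untouched."""
    return {"llm_input_messages": state["messages"][-20:]}

agent = create_react_agent(
    model="anthropic:claude-sonnet-4-5-20250929",
    tools=[get_weather, search_web, calculate],

    # System prompt: the instructions for the agent
    prompt="You are a helpful assistant for Israeli businesses. "
           "Answer in the user's language. Be concise but thorough. "
           "Always verify data with tools before answering.",

    # Checkpointer: memory across calls on the same thread
    checkpointer=MemorySaver(),

    # Response format: structured output (a Pydantic model or JSON schema)
    response_format=AgentAnswer,

    # Pre-processing: a hook that runs before every LLM call
    pre_model_hook=keep_recent_messages,
)

# Run with config
config = {"configurable": {"thread_id": "biz-session-1"}}
result = agent.invoke(
    {"messages": [{"role": "user", "content": "מחיר Claude Sonnet?"}]},
    config
)
print(result["messages"][-1].content)
print(result["structured_response"])  # an AgentAnswer instance

Note the pre_model_hook: it is an important pattern for managing the context window. Instead of sending the entire history to the LLM (which can reach thousands of messages), you send only the last 20, which saves tokens and can improve accuracy. Older LangGraph releases exposed a state_modifier parameter for this purpose; it was superseded by prompt and pre_model_hook, so check which one your installed version supports. A plain slice can also cut between a tool call and its result, which some providers reject; the trim_messages helper in langchain_core.messages is one way to handle that case.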
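The trimming idea can be sketched independently of any framework. The function below is a hypothetical helper operating on plain dict messages, not a LangGraph API: it always keeps system messages, keeps the last N other messages, and drops a leading tool result whose originating tool call fell outside the window.

```python
def trim_history(messages: list, max_messages: int = 20) -> list:
    """Keep system messages plus the most recent non-system messages."""
    if max_messages <= 0:
        raise ValueError("max_messages must be positive")

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    window = rest[-max_messages:]

    # A tool result without its tool call confuses most chat APIs,
    # so drop orphaned tool messages at the start of the window.
    while window and window[0]["role"] == "tool":
        window = window[1:]

    return system + window


history = [{"role": "system", "content": "You are helpful."}] + [
    {"role": "user" if i % 2 == 0 else "assistant", "content": str(i)}
    for i in range(30)
]
trimmed = trim_history(history, max_messages=20)
print(len(trimmed), trimmed[1]["content"], trimmed[-1]["content"])
```

Counting messages is the simplest policy. Counting tokens is usually closer to what the model's limit actually measures, at the cost of needing a tokenizer.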

Intermediate · 20 minutes · Hands-on

Memory and Checkpointing

Automatic checkpointing is one of LangGraph's strongest capabilities. When the graph is compiled with a checkpointer, the state is saved at every step. This enables:

Recovery: if the agent crashes, it resumes from the last saved step
Time-travel debugging: go back to any point in the history and inspect what happened
Human-in-the-loop: pause, ask for approval, continue
Conversation memory: an agent that remembers earlier turns
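A toy model may help show why one mechanism yields all four capabilities. The sketch below is plain Python, not LangGraph's checkpointer: `ToyCheckpointer` and `run_turn` are hypothetical names. It stores a snapshot per step under a thread ID, which is enough to resume a thread, isolate threads from each other, and look at an earlier step.

```python
import copy


class ToyCheckpointer:
    """Stores one state snapshot per step, grouped by thread ID."""

    def __init__(self):
        self._history = {}  # thread_id -> list of snapshots

    def save(self, thread_id: str, state: dict) -> None:
        self._history.setdefault(thread_id, []).append(copy.deepcopy(state))

    def latest(self, thread_id: str):
        history = self._history.get(thread_id, [])
        return copy.deepcopy(history[-1]) if history else None

    def at_step(self, thread_id: str, step: int) -> dict:
        """Time travel: return the snapshot saved at a given step."""
        return copy.deepcopy(self._history[thread_id][step])


def run_turn(cp: ToyCheckpointer, thread_id: str, user_text: str) -> dict:
    """Load the thread's last state, add a message, save a new snapshot."""
    state = cp.latest(thread_id) or {"messages": []}
    state["messages"].append(user_text)
    cp.save(thread_id, state)
    return state


cp = ToyCheckpointer()
run_turn(cp, "user-123", "My name is Nadav.")
same_thread = run_turn(cp, "user-123", "What's my name?")
other_thread = run_turn(cp, "user-456", "What's my name?")

print(len(same_thread["messages"]))    # the thread kept its history
print(len(other_thread["messages"]))   # a new thread starts clean
print(cp.at_step("user-123", 0))       # state as of the first step
```

Swapping the in-memory dict for a database table is, conceptually, the difference between the backends described in this section.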

Checkpointing Backends

Backend	Class	Best for	Persistence
In-Memory	`MemorySaver`	Development, testing	Lost on restart
SQLite	`SqliteSaver`	Single server, MVP	Local file
PostgreSQL	`PostgresSaver`	Production	Remote database

Python: Checkpointing with MemorySaver

from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

# Create a checkpointer
memory = MemorySaver()

# Build the graph (same AgentState, call_model, call_tools,
# and should_continue as in the ReAct agent above)
workflow = StateGraph(AgentState)
workflow.add_node("agent", call_model)
workflow.add_node("tools", call_tools)
workflow.add_edge(START, "agent")
workflow.add_conditional_edges("agent", should_continue)
workflow.add_edge("tools", "agent")

# Compile with the checkpointer!
app = workflow.compile(checkpointer=memory)

# Thread 1: first conversation
config1 = {"configurable": {"thread_id": "user-123"}}
result1 = app.invoke(
    {"messages": [HumanMessage(content="My name is Nadav.")]},
    config1
)

# Continue the conversation on the same thread
result2 = app.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config1
)
print(result2["messages"][-1].content)
# e.g. "Your name is Nadav."  (the agent remembers)

# Thread 2: a new conversation (no memory of thread 1)
config2 = {"configurable": {"thread_id": "user-456"}}
result3 = app.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config2
)
print(result3["messages"][-1].content)
# e.g. "I don't know your name."  (new thread = clean slate)

Python: Checkpointing with SQLite (persistent)

# Requires: pip install langgraph-checkpoint-sqlite
import sqlite3
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.sqlite import SqliteSaver

# Connect to SQLite; the data is stored in a file.
# check_same_thread=False lets the graph use the connection
# from worker threads.
conn = sqlite3.connect("agent_memory.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Compile with the SQLite checkpointer
app = workflow.compile(checkpointer=checkpointer)

# Now the agent picks up where it left off
# even after the program restarts.
config = {"configurable": {"thread_id": "persistent-session"}}
result = app.invoke(
    {"messages": [HumanMessage(
        content="Remember: project deadline is April 1st"
    )]},
    config
)

# --- restart program ---
# The agent still remembers:
result = app.invoke(
    {"messages": [HumanMessage(content="When is the deadline?")]},
    config
)
# e.g. "The project deadline is April 1st."

Time-Travel Debugging with Checkpoints

One of the biggest advantages of checkpointing is that you can "go back in time" and see the state at any point in the run. It turns agent debugging from "what on earth happened?" into "I know exactly what happened at step 3".

Python: Time-Travel Debugging

from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

# A normal run
config = {"configurable": {"thread_id": "debug-session"}}
result = app.invoke(
    {"messages": [HumanMessage(content="Weather in Tel Aviv and calculate 15*23")]},
    config
)

# === Time travel: inspect every checkpoint ===
# get_state_history yields snapshots newest first,
# so reverse the list to get chronological order.
history = list(reversed(list(app.get_state_history(config))))

print(f"Total checkpoints: {len(history)}")

for i, snapshot in enumerate(history):
    msgs = snapshot.values.get("messages", [])
    print(f"\n--- Checkpoint {i} ---")
    print(f"  Next node(s): {snapshot.next}")
    print(f"  Messages count: {len(msgs)}")
    # Show the last message at each checkpoint
    if msgs:
        print(f"  Last message: {str(msgs[-1].content)[:100]}...")

# === Replay: run again from a specific checkpoint ===
# Useful for "what would have happened if..." experiments.
# Invoking with a checkpoint's config forks a new branch from it.
earlier = history[2]  # the third checkpoint in chronological order
result_replay = app.invoke(
    {"messages": [HumanMessage(content="Actually, check Jerusalem too")]},
    earlier.config,  # carries thread_id and checkpoint_id
)

PostgreSQL Checkpointer: the production pattern

MemorySaver is fine for development, but production needs real persistence. Here is the full setup with PostgreSQL:

Python: PostgreSQL Checkpointer for production

# Requires: pip install langgraph-checkpoint-postgres "psycopg[binary]"
import asyncio
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

async def create_production_agent():
    """Full production setup with PostgreSQL."""

    # PostgreSQL connection string (replace with your own credentials)
    DB_URI = "postgresql://user:pass@localhost:5432/agent_db"

    async with AsyncPostgresSaver.from_conn_string(DB_URI) as checkpointer:
        # Create the tables (needed the first time)
        await checkpointer.setup()

        # Compile with the PostgreSQL checkpointer
        app = workflow.compile(checkpointer=checkpointer)

        # Run; the state is saved in PostgreSQL
        config = {"configurable": {"thread_id": "prod-user-123"}}
        result = await app.ainvoke(
            {"messages": [HumanMessage(content="Hello!")]},
            config
        )

        # A later call on the same thread_id, even from a new
        # process, sees the saved history.
        result2 = await app.ainvoke(
            {"messages": [HumanMessage(content="What did I say?")]},
            config
        )

        return result2

# Run
asyncio.run(create_production_agent())

Checkpointer Decision Matrix

Development: MemorySaver. Fast, zero setup.
MVP/Testing: SqliteSaver. Persistent, no database server needed.
Production: AsyncPostgresSaver. Scales out, reliable, and the data can be queried with SQL.
Rule of thumb: if you already have PostgreSQL in your environment, use it. If not, SqliteSaver is usually enough for small workloads (as a rough guide, up to about 1,000 sessions per day on a single server).

Do it now · 10 minutes

Add checkpointing to the ReAct agent you built:

Create a MemorySaver and pass it to compile()
Run a conversation with a thread_id and tell the agent "My name is [your name]"
On the same thread, ask "What's my name?"
Open a new thread and ask again; confirm that the agent does not remember

Once that works, you understand checkpointing.

Intermediate · 20 minutes · Hands-on

Human-in-the-Loop: interrupt and Command

One of the main reasons to choose LangGraph is its built-in human-in-the-loop support. The agent can pause at any point in the graph, show information to a person, wait for approval or changes, and continue.

interrupt(): pausing the graph

The interrupt() function pauses the graph and returns a payload to the caller. The graph is saved in a checkpoint and waits for Command(resume=...).

Python: Email Agent with Human Review

from typing import Annotated, Literal, TypedDict

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.types import Command, interrupt

model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

class EmailState(TypedDict):
    messages: Annotated[list, add_messages]
    email_draft: str
    approved: bool

def draft_email(state: EmailState) -> dict:
    """Node: the LLM writes an email draft."""
    response = model.invoke(state["messages"])
    return {"email_draft": response.content}

def human_review(
    state: EmailState,
) -> Command[Literal["send_email", "__end__"]]:
    """Node: pauses for human approval."""
    # interrupt() surfaces the draft and waits for the reply.
    # The value passed to Command(resume=...) is returned here.
    decision = interrupt({
        "draft": state["email_draft"],
        "action": "Please review this email. "
                  "Resume with {'approved': True} (optionally with "
                  "'edited_draft') or {'approved': False}."
    })

    if decision.get("approved"):
        # Approved: continue to sending, using the edited text if given
        edited = decision.get(
            "edited_draft", state["email_draft"]
        )
        return Command(
            update={"email_draft": edited, "approved": True},
            goto="send_email"
        )
    else:
        # Rejected: end the run without sending
        return Command(
            update={"approved": False},
            goto=END
        )

def send_email(state: EmailState) -> dict:
    """Node: sends the email (simulated)."""
    print(f"Sending email: {state['email_draft'][:100]}...")
    return {}

# Build the graph
workflow = StateGraph(EmailState)
workflow.add_node("draft", draft_email)
workflow.add_node("review", human_review)
workflow.add_node("send_email", send_email)

workflow.add_edge(START, "draft")
workflow.add_edge("draft", "review")
# No edge out of "review": the Command it returns does the routing
workflow.add_edge("send_email", END)

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

# === Step 1: first run; the graph pauses at review ===
config = {"configurable": {"thread_id": "email-1"}}
result = app.invoke(
    {"messages": [HumanMessage(
        content="Draft an email to the team about the Q2 launch"
    )]},
    config
)
# result["__interrupt__"] contains the payload with the draft

# === Step 2: human approval; the graph continues ===
result = app.invoke(
    Command(resume={"approved": True}),
    config
)
# Prints "Sending email: ..."

interrupt() requires a Checkpointer

interrupt() works only when the graph is compiled with a checkpointer. The reason: the graph has to save its state in order to wait for the human reply and continue from where it paused. Without a checkpointer you get an error.
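The pause-and-resume mechanics can be modeled in a few lines of plain Python. The sketch below is a simplification built on one assumption taken from the LangGraph documentation as I understand it: on resume, the interrupted node runs again from its beginning, and interrupt() then returns the resume value instead of pausing. `Runner` and `Interrupted` are hypothetical names for this illustration only.

```python
_MISSING = object()


class Interrupted(Exception):
    """Raised to unwind the node when no resume value is available."""

    def __init__(self, payload):
        super().__init__("waiting for human input")
        self.payload = payload


class Runner:
    """Toy runner: saves the input state and re-runs the node on resume."""

    def __init__(self, node):
        self.node = node
        self.saved_state = None      # plays the role of the checkpoint
        self._resume = _MISSING

    def interrupt(self, payload):
        if self._resume is _MISSING:
            raise Interrupted(payload)
        value, self._resume = self._resume, _MISSING
        return value

    def start(self, state: dict) -> dict:
        self.saved_state = dict(state)
        return self._run()

    def resume(self, value) -> dict:
        if self.saved_state is None:
            raise RuntimeError("nothing to resume: no saved state")
        self._resume = value
        return self._run()

    def _run(self) -> dict:
        try:
            result = self.node(dict(self.saved_state), self.interrupt)
            return {"status": "done", "state": result}
        except Interrupted as pause:
            return {"status": "interrupted", "payload": pause.payload}


calls = []


def review_node(state: dict, interrupt) -> dict:
    calls.append("node started")     # side effect before the interrupt
    decision = interrupt({"draft": state["draft"]})
    state["approved"] = bool(decision.get("approved"))
    return state


runner = Runner(review_node)
first = runner.start({"draft": "Hello team"})
second = runner.resume({"approved": True})
print(first["status"], second["status"], len(calls))
```

The practical consequence is that code placed before interrupt() should be safe to run twice. Sending an email or charging a card belongs in a later node, after approval.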

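Command: resuming a run with updates

Command is a multi-purpose object that can carry up to three things at once. The caller passes resume when re-invoking a paused graph; a node returns update and goto to change the state and choose the next node.

Parameter	What it does	Example
`resume`	Supplies the value returned by the waiting `interrupt()`	`resume={"approved": True}`
`update`	Updates fields in the state	`update={"email_draft": "edited text"}`
`goto`	Routes to a specific node	`goto="send_email"`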

דפוסים מתקדמים ל-Human-in-the-Loop

דפוס 1: Approval עם Timeout — אישור עם מגבלת זמן

בפרודקשן, לא תמיד מישהו זמין לאשר. הוספת timeout מאפשרת fallback אוטומטי:

Python — Human approval עם timeout ו-fallback

from typing import Literal
from langgraph.types import Command, interrupt

def human_approval_with_timeout(
    state: dict,
) -> Command[Literal["execute_action", "notify_rejection"]]:
    """
    Ask for human approval.
    The timeout fields in the payload describe the default policy; the
    caller that resumes the graph is responsible for enforcing them.
    """
    action = state.get("pending_action", {})
    risk_level = action.get("risk", "low")

    # Low risk: auto-approve, no human needed
    if risk_level == "low":
        return Command(
            update={"approved": True},
            goto="execute_action"
        )

    # Medium/high risk: ask for approval
    decision = interrupt({
        "action": action,
        "risk_level": risk_level,
        "message": f"Action requires approval (risk: {risk_level})",
        "timeout_seconds": 300,  # 5 minutes
        "default_on_timeout": "reject" if risk_level == "high" else "approve"
    })

    if decision.get("approved"):
        return Command(
            update={"approved": True, "approved_by": decision.get("user", "unknown")},
            goto="execute_action"
        )
    else:
        reason = decision.get("reason", "No reason provided")
        return Command(
            update={"approved": False, "rejection_reason": reason},
            goto="notify_rejection"
        )
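
Because interrupt() does not expire by itself, the process that resumes the thread has to decide when to stop waiting for a person. The sketch below is one possible caller-side helper, assuming the caller records when the approval was requested. The function name resolve_decision and the "timeout-policy" user label are hypothetical, not part of LangGraph. It returns the dict you would pass to Command(resume=...), or None while the approval window is still open.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def resolve_decision(
    requested_at: datetime,
    now: datetime,
    timeout_seconds: int,
    default_on_timeout: str,
    human_decision: Optional[dict] = None,
) -> Optional[dict]:
    """Return the resume value for the paused graph, or None to keep waiting."""
    if human_decision is not None:
        return human_decision  # a person answered; their decision wins
    if now - requested_at < timedelta(seconds=timeout_seconds):
        return None  # still inside the approval window
    # Window expired: apply the default policy carried in the interrupt payload.
    return {
        "approved": default_on_timeout == "approve",
        "user": "timeout-policy",  # hypothetical label for audit logs
        "reason": f"No response within {timeout_seconds}s",
    }


start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
still_waiting = resolve_decision(start, start + timedelta(seconds=60), 300, "reject")
expired = resolve_decision(start, start + timedelta(seconds=301), 300, "reject")
print(still_waiting, expired)
```

A scheduler or background job could call this helper periodically and, once it returns a dict, resume the thread with that value.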

Pattern 2: Multi-step approval

Critical actions may need approval from several people, for example a money transfer above a certain amount:

Python: Multi-step approval chain

def multi_step_approval(state: dict) -> Command:
    """
    Approval chain:
    1. Direct manager (every action)
    2. Senior manager (above $1000)
    3. CFO (above $10000)
    """
    amount = state.get("amount", 0)
    approvals_needed = ["manager"]
    if amount > 1000:
        approvals_needed.append("senior_manager")
    if amount > 10000:
        approvals_needed.append("cfo")

    current_approvals = state.get("approvals", [])
    remaining = [
        a for a in approvals_needed
        if a not in current_approvals
    ]

    if not remaining:
        # All approvals have been collected
        return Command(
            update={"fully_approved": True},
            goto="execute_action"
        )

    next_approver = remaining[0]
    decision = interrupt({
        "requires_approval_from": next_approver,
        "amount": amount,
        "current_approvals": current_approvals,
        "remaining_approvals": remaining[1:],
        "action_summary": state.get("action_summary", "")
    })

    if decision.get("approved"):
        updated_approvals = current_approvals + [next_approver]
        return Command(
            update={"approvals": updated_approvals},
            goto="multi_step_approval"  # loop back to check whether more approvals are needed
        )
    else:
        return Command(
            update={"rejected_by": next_approver},
            goto="notify_rejection"
        )
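
To make the thresholds concrete, here is a small worked example of the same approval rules as plain functions, without the graph. The helper names approvals_needed and next_approver are illustrative only; the amounts and roles come from the docstring above.

```python
from typing import List, Optional


def approvals_needed(amount: float) -> List[str]:
    """Roles that must approve an action of the given amount."""
    needed = ["manager"]
    if amount > 1000:
        needed.append("senior_manager")
    if amount > 10000:
        needed.append("cfo")
    return needed


def next_approver(amount: float, approvals: List[str]) -> Optional[str]:
    """First required role that has not approved yet, or None when done."""
    for role in approvals_needed(amount):
        if role not in approvals:
            return role
    return None


# Walk a $15,000 transfer through the chain, one approval per round.
approvals: List[str] = []
order: List[str] = []
while (role := next_approver(15000, approvals)) is not None:
    order.append(role)
    approvals.append(role)  # simulate an "approved" decision
print(order)
```

Each loop iteration corresponds to one interrupt() round in the graph version: the node pauses, one approver answers, and the node runs again to see who is next.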

Do it now: 15 minutes

Build an Email Agent with human review:

Copy the code above into a file named email_agent.py
Run it and inspect the draft in result["__interrupt__"]
Send Command(resume={"approved": True}) and verify that the "email" is sent
Run it again and reject (approved: False), then verify that the graph ends

Intermediate 25 minutes Hands-on

Multi-Agent: Sub-Graphs and Supervisor

LangGraph supports multi-agent systems by composing graphs:

Sub-graphs: each agent is an independent graph embedded in a parent graph
Supervisor pattern: a managing agent decides who works on what
Swarm pattern: agents hand control to each other dynamically

Supervisor Pattern

Python: Supervisor Agent with Researcher + Writer

from typing import Annotated, Literal, TypedDict
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.types import Command

class TeamState(TypedDict):
    messages: Annotated[list, add_messages]
    next_worker: str
    research_output: str
    final_output: str

# Model ID is an assumption; check your provider's current model list
model = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# The Literal annotation tells LangGraph which nodes this Command can route to
def supervisor(state: TeamState) -> Command[Literal["researcher", "writer", "__end__"]]:
    """Supervisor: decides which worker runs next."""
    sys_msg = SystemMessage(content="""You are a project supervisor.
Based on the conversation, decide the next step:
- If research is needed, respond with: ROUTE:researcher
- If writing is needed (after research), respond with: ROUTE:writer
- If the task is complete, respond with: ROUTE:FINISH

Always start with researcher, then writer, then FINISH.""")

    response = model.invoke([sys_msg] + state["messages"])
    content = response.content

    if "ROUTE:researcher" in content:
        return Command(
            update={"messages": [response]},
            goto="researcher"
        )
    elif "ROUTE:writer" in content:
        return Command(
            update={"messages": [response]},
            goto="writer"
        )
    else:
        return Command(
            update={"messages": [response]},
            goto=END
        )

def researcher(state: TeamState) -> dict:
    """Researcher: performs the research and returns findings."""
    sys_msg = SystemMessage(content="""You are a research analyst.
Research the topic thoroughly and provide key findings,
data points, and insights. Be factual and detailed.""")

    response = model.invoke([sys_msg] + state["messages"])
    return {
        "messages": [response],
        "research_output": response.content
    }

def writer(state: TeamState) -> dict:
    """Writer: writes content based on the research."""
    sys_msg = SystemMessage(content=f"""You are a professional writer.
Based on this research:
{state.get('research_output', 'No research available')}

Write a compelling, well-structured article. Use Hebrew if
the original request was in Hebrew.""")

    response = model.invoke([sys_msg] + state["messages"])
    return {
        "messages": [response],
        "final_output": response.content
    }

# Build the graph
workflow = StateGraph(TeamState)
workflow.add_node("supervisor", supervisor)
workflow.add_node("researcher", researcher)
workflow.add_node("writer", writer)

workflow.add_edge(START, "supervisor")
workflow.add_edge("researcher", "supervisor")
workflow.add_edge("writer", "supervisor")

app = workflow.compile()

# Run
result = app.invoke({
    "messages": [HumanMessage(
        content="Write an article about AI agents in Israeli startups"
    )]
})

print(result["final_output"])

What happens here:

supervisor receives the request and routes to researcher
researcher performs the research and returns findings
supervisor sees that research exists and routes to writer
writer writes an article based on the research
supervisor sees that the task is complete: FINISH
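
The routing steps above rely on substring checks such as "ROUTE:writer" in content, which can misfire when the model adds extra text or when the message content arrives as a list of content blocks instead of a string. The sketch below shows one stricter way to extract the marker. parse_route, VALID_ROUTES and the fallback to "FINISH" are illustrative choices, not part of LangGraph.

```python
import re

VALID_ROUTES = {"researcher", "writer", "FINISH"}
ROUTE_PATTERN = re.compile(r"ROUTE:(\w+)")


def parse_route(content) -> str:
    """Extract the routing marker from model output; default to FINISH."""
    # Chat model content may be a plain string or a list of content blocks.
    if isinstance(content, list):
        content = " ".join(
            block.get("text", "") if isinstance(block, dict) else str(block)
            for block in content
        )
    match = ROUTE_PATTERN.search(content)
    if match and match.group(1) in VALID_ROUTES:
        return match.group(1)
    return "FINISH"  # unknown or missing marker: stop instead of guessing


print(parse_route("Research is needed first. ROUTE:researcher"))
print(parse_route([{"type": "text", "text": "ROUTE:writer"}]))
print(parse_route("ROUTE:editor"))
```

Defaulting to FINISH on an unrecognized marker trades completeness for safety: the graph ends early instead of looping on malformed output.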

Cost limits in multi-agent systems

Every agent makes its own LLM call. Supervisor + Researcher + Writer means at least 5 LLM calls (the supervisor is called 3 times). With Claude Sonnet that is roughly $0.10-0.30 per task; with Opus, $0.50-1.50. Always set a retry limit and a budget.
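
The arithmetic behind an estimate like this can be sketched as follows. The token counts per call and the per-million-token prices are placeholder assumptions for illustration, not quoted prices; substitute your provider's current rates and the token usage you observe in your traces.

```python
import math

# (input_tokens, output_tokens) per LLM call; illustrative numbers only
calls = [
    (3000, 20),    # supervisor, 1st routing decision
    (2000, 2500),  # researcher
    (3000, 20),    # supervisor, 2nd routing decision
    (5000, 3000),  # writer
    (3000, 20),    # supervisor, final decision
]


def estimate_cost(calls, input_price_per_mtok, output_price_per_mtok):
    """Total cost in dollars for a list of (input, output) token counts."""
    total_in = sum(i for i, _ in calls)
    total_out = sum(o for _, o in calls)
    return (
        total_in * input_price_per_mtok + total_out * output_price_per_mtok
    ) / 1_000_000


# Assumed prices: $3 per million input tokens, $15 per million output tokens
cost = estimate_cost(calls, input_price_per_mtok=3.0, output_price_per_mtok=15.0)
budget = 0.50  # hypothetical per-task budget in dollars
print(f"Estimated cost: ${cost:.4f}, within budget: {cost <= budget}")
```

Most of the cost comes from output tokens and from the supervisor re-reading the growing message history on every pass.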

Swarm Pattern: dynamic handoff of control

In the Supervisor Pattern, one agent manages all the others. In the Swarm Pattern, agents hand control to each other, and each agent decides who works next. This fits cases where the flow is not known in advance:

Python: Swarm Pattern with handoff

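from typing import Annotated, Literal, TypedDict
from langchain_core.messages import AIMessage, SystemMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.types import Command

# `model` is the ChatAnthropic instance from the supervisor example

class SwarmState(TypedDict):
    messages: Annotated[list, add_messages]
    current_agent: str
    handoff_count: int  # handoff cap; prevents loops

def sales_agent(state: SwarmState) -> Command[Literal["support_agent", "billing_agent", "__end__"]]:
    """Sales agent: handles sales questions."""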
    sys_msg = SystemMessage(content="""You are a sales agent.
If the user asks about pricing or products, help them.
If they ask about technical issues, hand off to: HANDOFF:support
If they ask about billing, hand off to: HANDOFF:billing
Otherwise, respond directly.""")

    response = model.invoke([sys_msg] + state["messages"])
    content = response.content

    # Check for a handoff marker
    if "HANDOFF:support" in content:
        return Command(
            update={
                "messages": [response],
                "current_agent": "support",
                # Increment instead of resetting to 1, so the cap can trigger
                "handoff_count": state.get("handoff_count", 0) + 1
            },
            goto="support_agent"
        )
    elif "HANDOFF:billing" in content:
        return Command(
            update={
                "messages": [response],
                "current_agent": "billing",
                "handoff_count": state.get("handoff_count", 0) + 1
            },
            goto="billing_agent"
        )

    # No handoff: answer directly
    return Command(
        update={"messages": [response]},
        goto=END
    )

def support_agent(state: SwarmState) -> Command[Literal["billing_agent", "__end__"]]:
    """Support agent: handles technical issues."""
    # Handoff cap: prevents ping-pong between agents
    if state.get("handoff_count", 0) >= 3:
        return Command(
            update={"messages": [
                AIMessage(content="Transferring you to a human agent: too many handoffs.")
            ]},
            goto=END
        )

    sys_msg = SystemMessage(content="""You are a technical support agent.
Help with technical issues. If you can't help and it's a billing
issue, hand off to: HANDOFF:billing""")

    response = model.invoke([sys_msg] + state["messages"])
    content = response.content

    if "HANDOFF:billing" in content:
        return Command(
            update={
                "messages": [response],
                "current_agent": "billing",
                # Increment instead of resetting to 1, so the cap can trigger
                "handoff_count": state.get("handoff_count", 0) + 1
            },
            goto="billing_agent"
        )

    return Command(
        update={"messages": [response]},
        goto=END
    )

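# billing_agent is not shown in the original; this minimal version is an
# assumption that mirrors support_agent and always ends the run.
def billing_agent(state: SwarmState) -> Command[Literal["__end__"]]:
    """Billing agent: handles invoices and payments."""
    sys_msg = SystemMessage(content="You are a billing agent. Help with invoices and payments.")
    response = model.invoke([sys_msg] + state["messages"])
    return Command(update={"messages": [response]}, goto=END)

# Build the graph
workflow = StateGraph(SwarmState)
workflow.add_node("sales_agent", sales_agent)
workflow.add_node("support_agent", support_agent)
workflow.add_node("billing_agent", billing_agent)

# Entry point: always start with sales
workflow.add_edge(START, "sales_agent")

app = workflow.compile(checkpointer=MemorySaver())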

When to use Supervisor and when to use Swarm?

Criterion	Supervisor Pattern	Swarm Pattern
Control	Centralized: the supervisor decides everything	Distributed: each agent decides
Predictability	High: the supervisor defines the order	Low: the flow is dynamic
Cost	The supervisor is called at every step, which is expensive	No routing overhead: only the relevant agents run
Debugging	Easy: follow the supervisor log	Hard: you have to trace the handoffs
Best suited for	Defined workflows (research → write → edit)	Customer service, dynamic routing

Exercise 1: ReAct Agent from scratch, 30 minutes

Goal: build a complete ReAct Agent with StateGraph, tools, and checkpointing.

Define an AgentState with messages, retry_count, and total_tool_calls
Create 3 tools: get_weather, search_news (mock), calculate
Build a call_model node and a call_tools node
Add a safety check: if retry_count > 5, stop (END) even if there are pending tool calls
Add MemorySaver and verify that the agent remembers conversations

Success criteria: the agent handles queries with and without tools, remembers names, and never enters an infinite loop.

Cost: ~$0.50

Exercise 2: Human-in-the-Loop Email Agent, 25 minutes

Goal: build an agent that writes emails and asks for human approval before sending.

Build a graph with 4 nodes: draft, review, edit, send
In review: use interrupt() to present the draft
If the person approved, go to send. If they asked for changes, go to edit → review
Add an edit counter: after 3 rounds of edits, send automatically

Success criteria: the agent pauses for approval, allows editing, and resumes correctly.

Cost: ~$0.50

Intermediate 15 minutes Hands-on

LangSmith: Observability and Evaluation

LangSmith is LangChain's observability platform. It lets you see everything the agent does: every LLM call, every tool execution, every decision.

Setup: three environment variables

Python: enabling LangSmith tracing

# Option 1: environment variables (recommended)
# In .env:
# LANGSMITH_API_KEY=lsv2_pt_...
# LANGSMITH_TRACING=true
# LANGSMITH_PROJECT=my-agent-project

# Option 2: in code (local experiments only; never commit a real key)
import os
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "lsv2_pt_..."
os.environ["LANGSMITH_PROJECT"] = "my-agent-project"

# That's it: every LangGraph/LangChain call is now traced
# Open https://smith.langchain.com to view the traces
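
A silent typo in one of these variables means no traces appear, so it can help to check the setup before the agent starts. The helper below is a minimal sketch; missing_langsmith_vars is a hypothetical name, and treating all three variables as required is an assumption that is stricter than LangSmith itself, which falls back to a default project when none is set.

```python
import os

REQUIRED_VARS = ("LANGSMITH_TRACING", "LANGSMITH_API_KEY", "LANGSMITH_PROJECT")


def missing_langsmith_vars(env=None):
    """Return the names of tracing variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"LANGSMITH_TRACING": "true", "LANGSMITH_API_KEY": ""}
print(missing_langsmith_vars(example_env))
```

Calling missing_langsmith_vars() with no argument checks the real process environment, for example at application startup.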

What you will see in LangSmith:

Trace view: every step in the graph (nodes, LLM calls, tool executions) on a visual timeline
Token usage: how many tokens each call consumed (input + output)
Latency: how long each step took
Error tracking: errors, retries, failures
Cost estimation: estimated cost per trace

Custom Metrics in LangSmith

Beyond automatic tracing, you can attach custom metadata to every trace, which is useful for analyzing performance over time:

Python: Custom metadata and tags in LangSmith

from langsmith import traceable

@traceable(
    name="customer_support_agent",
    tags=["production", "tier-1"],
    metadata={
        "team": "support",
        "version": "1.3.2",
    }
)
def run_support_agent(query: str, user_id: str) -> dict:
    """Agent run with full metadata in LangSmith."""
    config = {
        "configurable": {"thread_id": f"support-{user_id}"},
        # Metadata that will appear in LangSmith
        "metadata": {
            "user_id": user_id,
            "query_length": len(query),
            "language": "he" if any(
                "\u0590" <= c <= "\u05FF" for c in query
            ) else "en",
        },
        # Tags: used for filtering in the dashboard
        "tags": [
            f"user:{user_id}",
            "channel:web",
        ],
    }

    result = app.invoke(
        {"messages": [HumanMessage(content=query)]},
        config
    )

    return {
        "response": result["messages"][-1].content,
        "steps": len(result["messages"]),
    }

# In LangSmith you can now filter:
# - by user_id
# - by language (Hebrew/English)
# - by channel
# - by version

Quality Testing Datasets: ongoing quality checks

An agent that works today may stop working tomorrow because of a model change, data drift, or accumulated bugs. Regular runs against a test dataset are the defense:

Python: Weekly quality check with LangSmith

from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Golden dataset: 20+ test cases with expected answers
# Create it once, update it when you add features
DATASET_NAME = "weather-agent-golden-v2"

def weekly_quality_check():
    """Weekly run: verifies that the agent still works."""

    def agent_fn(inputs: dict) -> dict:
        # If app was compiled with a checkpointer, also pass a config with a thread_id
        result = app.invoke({
            "messages": [HumanMessage(content=inputs["query"])]
        })
        # Collect the names of the tools the agent actually called,
        # so the evaluator does not have to guess from the final text
        tools_used = [
            tc["name"]
            for msg in result["messages"]
            for tc in (getattr(msg, "tool_calls", None) or [])
        ]
        return {
            "output": result["messages"][-1].content,
            "tools_used": tools_used,
        }

    # Check: did the agent use the expected tool?
    def check_tool_usage(run, example):
        expected_tool = example.outputs.get("expected_tool")
        if not expected_tool:
            return {"key": "tool_check", "score": True}
        used_tool = expected_tool in run.outputs.get("tools_used", [])
        return {"key": "tool_check", "score": used_tool}

    # Check: is the answer in the expected language?
    def check_language(run, example):
        expected_lang = example.outputs.get("expected_language", "en")
        output = run.outputs.get("output", "")
        has_hebrew = any("\u0590" <= c <= "\u05FF" for c in output)
        if expected_lang == "he":
            return {"key": "correct_language", "score": has_hebrew}
        return {"key": "correct_language", "score": not has_hebrew}

    results = evaluate(
        agent_fn,
        data=DATASET_NAME,
        evaluators=[check_tool_usage, check_language],
    )

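    # Compute the pass rate from the per-example evaluator scores
    # (assumes each result row exposes its evaluation results)
    scores = [
        bool(res.score)
        for row in results
        for res in row["evaluation_results"]["results"]
        if res.key == "tool_check"
    ]
    pass_rate = sum(scores) / len(scores) if scores else 0.0
    print(f"Weekly quality check pass rate: {pass_rate:.0%}")

    if pass_rate < 0.8:
        print("ALERT: Agent quality dropped below 80%!")
        # In production: send an alert to Slack/email

weekly_quality_check()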

Quality Testing as CI/CD

Mature teams run quality tests as part of the CI/CD pipeline. Every PR that changes a prompt, the tools, or the graph runs the test dataset automatically. If the pass rate drops, the PR does not merge. This prevents regressions and keeps quality high over time.

Evaluation: checking agent quality

Python: Evaluation with LangSmith
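
The gate itself can be as small as a function that turns evaluator scores into a process exit code, which is what most CI systems use to pass or fail a job. This is a sketch under assumptions: the 0.8 threshold mirrors the 80% alert in the weekly check, and ci_exit_code is a hypothetical name.

```python
def pass_rate(scores) -> float:
    """Fraction of truthy scores; 0.0 for an empty run."""
    scores = list(scores)
    return sum(bool(s) for s in scores) / len(scores) if scores else 0.0


def ci_exit_code(scores, threshold: float = 0.8) -> int:
    """0 lets the pipeline continue; 1 fails the job."""
    return 0 if pass_rate(scores) >= threshold else 1


good_run = [True] * 17 + [False] * 3   # 17 of 20 test cases pass
bad_run = [True] * 15 + [False] * 5    # 15 of 20 test cases pass
print(ci_exit_code(good_run), ci_exit_code(bad_run))
```

In a CI script, the result would typically be passed to sys.exit() so that a failing gate blocks the merge. Treating an empty run as a failure is a deliberate choice: a broken dataset should not pass silently.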

from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Create a dataset with test cases
dataset = client.create_dataset("weather-agent-tests")
client.create_examples(
    inputs=[
        {"query": "What's the weather in Tel Aviv?"},
        {"query": "Is it raining in London?"},
        {"query": "Compare weather in NYC and SF"},
    ],
    outputs=[
        {"must_contain": "Tel Aviv"},
        {"must_contain": "London"},
        {"must_contain": ["NYC", "SF"]},
    ],
    dataset_id=dataset.id
)

# Target function for the evaluation
def run_agent(inputs: dict) -> dict:
    result = app.invoke({
        "messages": [HumanMessage(content=inputs["query"])]
    })
    return {"output": result["messages"][-1].content}

# Run the evaluation
results = evaluate(
    run_agent,
    data=dataset.name,
    evaluators=[
        lambda run, example: {
            "key": "contains_expected",
            "score": all(
                term in run.outputs["output"]
                for term in (
                    [example.outputs["must_contain"]]
                    if isinstance(
                        example.outputs["must_contain"], str
                    )
                    else example.outputs["must_contain"]
                )
            )
        }
    ]
)

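# Compute the pass rate from the per-example scores
# (assumes each result row exposes its evaluation results)
scores = [
    bool(res.score)
    for row in results
    for res in row["evaluation_results"]["results"]
    if res.key == "contains_expected"
]
print(f"Pass rate: {sum(scores) / max(len(scores), 1):.0%}")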
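
An inline lambda evaluator is hard to test. One option is to write the same check as a named function and exercise it locally before running a full evaluation. The sketch below uses SimpleNamespace objects as stand-ins for the run and example objects; that stand-in is an assumption made for offline testing, and only the .outputs attribute used by the evaluator is modeled.

```python
from types import SimpleNamespace


def contains_expected(run, example) -> dict:
    """Score True when every expected term appears in the agent output."""
    expected = example.outputs["must_contain"]
    terms = [expected] if isinstance(expected, str) else list(expected)
    output = run.outputs.get("output", "")
    return {
        "key": "contains_expected",
        "score": all(term in output for term in terms),
    }


# Offline check with stand-in objects (no LangSmith connection needed)
run = SimpleNamespace(outputs={"output": "NYC is 18C and SF is 15C today."})
example = SimpleNamespace(outputs={"must_contain": ["NYC", "SF"]})
result = contains_expected(run, example)
print(result)
```

The named function can then be passed in the evaluators list in place of the lambda.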

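Do it now: 10 minutes

Enable LangSmith tracing:

Sign up at smith.langchain.com (free)
Create an API key under Settings
Add the 3 environment variables to .env
Run the ReAct Agent you built and open the LangSmith dashboard
Click the trace to see every node, every LLM call, and every tool

The moment you see the whole agent loop laid out visually is the moment you understand why observability is a necessity and not a luxury.

Intermediate 15 minutes Hands-on

Deployment: LangServe and LangGraph Platform

LangServe: a REST API in minutes

LangServe exposes a LangChain runnable, including a compiled LangGraph graph, as a REST API. LangServe is in maintenance mode and LangChain recommends LangGraph Platform for new LangGraph deployments, so treat this as the quick option for local testing:

Python: Deployment with LangServe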

from fastapi import FastAPI
from langserve import add_routes

# The agent you built
app_agent = workflow.compile(checkpointer=memory)

# FastAPI app
api = FastAPI(title="AI Agent API")

# LangServe adds the endpoints automatically
add_routes(api, app_agent, path="/agent")

# You now have:
# POST /agent/invoke     - single run
# POST /agent/stream     - streaming
# POST /agent/batch      - multiple runs
# GET  /agent/playground - test UI

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(api, host="0.0.0.0", port=8000)

Deployment Checklist for LangGraph

Before you deploy a LangGraph agent to production, make sure every item is checked:

#	Item	Importance
1	Checkpointer configured (PostgreSQL, not MemorySaver)	Critical
2	`recursion_limit` set in the config (recommended: 15-25)	Critical
3	Loop detection in the routing function	Critical
4	LangSmith tracing enabled with a project name	Strongly recommended
5	API keys in environment variables (not in code)	Critical
6	Error handling in every node: try/except with a fallback	Critical
7	Rate limiting on the external endpoint	Recommended
8	Quality test dataset with 20+ test cases	Strongly recommended
9	Cost alert configured with the LLM providers	Recommended
10	Health check endpoint that verifies the DB and the LLM are reachable	Recommended

LangGraph Platform: Managed Hosting

Feature	Self-Hosted	LangGraph Platform
Persistent state	Requires a DB	Built-in
Scaling	Manual	Auto-scale
Cron jobs	Requires a scheduler	Built-in
Double-texting	You manage it	Handled
Streaming	Requires setup	Built-in
Pricing	Infrastructure costs	$0.001/node execution

Do it now: 10 minutes

Deploy the agent as a REST API:

# Install
pip install langserve fastapi uvicorn

# Create server.py with the code above
python server.py

# Test (the graph was compiled with a checkpointer, so a thread_id is required)
curl -X POST http://localhost:8000/agent/invoke \
  -H "Content-Type: application/json" \
  -d '{"input": {"messages": [{"role": "user", "content": "Hello!"}]},
       "config": {"configurable": {"thread_id": "demo-1"}}}'

Open http://localhost:8000/agent/playground to get an automatic test UI.

Intermediate 15 minutes Hands-on

Debugging LangGraph: practical techniques

Debugging AI agents is not like debugging regular software. The agent is not deterministic: the same query can produce different results. LangGraph ships with built-in tools that make debugging manageable:

Technique 1: Graph Visualization

Python: drawing the graph automatically

from IPython.display import Image, display

# In a Jupyter Notebook / IPython session
app = workflow.compile(checkpointer=memory)

# Renders a PNG image of the graph
graph_image = app.get_graph().draw_mermaid_png()
display(Image(graph_image))

# Alternative: Mermaid text (can be pasted into mermaid.live)
mermaid_text = app.get_graph().draw_mermaid()
print(mermaid_text)
# Simplified illustration of the output (the real text also includes styling):
# graph LR
#   __start__ --> agent
#   agent --> tools
#   agent --> __end__
#   tools --> agent

Once a graph grows complex (5+ nodes), visualization is the first tool to reach for. "What does the graph do?" is much easier to answer when you can see it.

Technique 2: Step-by-Step Streaming

Python: streaming each node separately

config = {"configurable": {"thread_id": "debug-1"}}

# stream_mode="updates": shows only what changed at each step
for event in app.stream(
    {"messages": [HumanMessage(content="Weather in Tel Aviv?")]},
    config,
    stream_mode="updates"
):
    for node_name, state_update in event.items():
        print(f"\n{'='*50}")
        print(f"NODE: {node_name}")
        print(f"{'='*50}")

        # Show new messages only
        if "messages" in state_update:
            for msg in state_update["messages"]:
                msg_type = type(msg).__name__
                content = str(msg.content)[:200]
                print(f"  [{msg_type}] {content}")

                # Show tool calls, if any
                if hasattr(msg, "tool_calls") and msg.tool_calls:
                    for tc in msg.tool_calls:
                        print(f"    -> Tool: {tc['name']}({tc['args']})")

Technique 3: Node-Level Debugging

When you know that a specific node misbehaves, you can test it in isolation:

Python: testing a node in isolation

from langchain_core.messages import AIMessage, HumanMessage

# Instead of running the whole graph, test a single node
test_state = {
    "messages": [
        HumanMessage(content="What's 15 * 23?"),
    ],
    "retry_count": 0,
}

# Call the node function directly
result = call_model(test_state)
print("Model response:", result)

# Test the routing function
route = should_continue(test_state)
print(f"Route decision: {route}")

# Test tool execution by simulating an LLM response with a tool call
tool_state = {
    "messages": [
        AIMessage(content="", tool_calls=[{
            "id": "test-1",
            "name": "get_weather",
            "args": {"city": "Tel Aviv"}
        }])
    ]
}
tool_result = call_tools(tool_state)
print("Tool result:", tool_result)

Technique 4: Retry and Loop Detection

The most common problem in LangGraph is infinite loops. Here is a pattern that detects and prevents them:

Python: loop detection in the routing function

from collections import Counter

def should_continue_safe(state: AgentState) -> str:
    """Routing function with loop protection."""
    last = state["messages"][-1]
    retry_count = state.get("retry_count", 0)

    # Guard 1: iteration cap
    if retry_count >= 10:
        print(f"[SAFETY] Max iterations ({retry_count}) reached. Stopping.")
        return END

    # Guard 2: detect the same tool call being repeated
    if len(state["messages"]) >= 4:
        recent_tool_calls = []
        for msg in state["messages"][-6:]:
            if hasattr(msg, "tool_calls") and msg.tool_calls:
                for tc in msg.tool_calls:
                    recent_tool_calls.append(
                        f"{tc['name']}:{sorted(tc['args'].items())}"
                    )

        # The same tool call 3 times in the recent window means a loop
        counts = Counter(recent_tool_calls)
        for call, count in counts.items():
            if count >= 3:
                print(f"[SAFETY] Loop detected: {call} called {count} times")
                return END

    # Guard 3: token budget (if you track it in the state)
    total_tokens = state.get("total_tokens_used", 0)
    if total_tokens > 50000:
        print(f"[SAFETY] Token budget exceeded: {total_tokens}")
        return END

    # Regular routing logic
    if hasattr(last, "tool_calls") and last.tool_calls:
        return "tools"
    return END
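
The repeated-call guard can be exercised without a model or a graph. The sketch below re-implements only that guard over plain dict messages, which is an assumption made for offline testing, since the real state holds message objects. The function name repeated_tool_call is hypothetical.

```python
from collections import Counter


def repeated_tool_call(messages, window=6, threshold=3):
    """Return the signature of a tool call repeated too often, else None."""
    signatures = []
    for msg in messages[-window:]:
        for tc in msg.get("tool_calls", []):
            # Same signature format as the routing function: name + sorted args
            signatures.append(f"{tc['name']}:{sorted(tc['args'].items())}")
    for signature, count in Counter(signatures).items():
        if count >= threshold:
            return signature
    return None


call = {"name": "get_weather", "args": {"city": "Tel Aviv"}}
ai_msg = {"role": "ai", "tool_calls": [call]}
tool_msg = {"role": "tool", "content": "error: timeout"}

# tool, ai, tool, ai, tool, ai -> the same call appears 3 times in the window
looping = [tool_msg, ai_msg, tool_msg, ai_msg, tool_msg, ai_msg]
healthy = [tool_msg, ai_msg, tool_msg, ai_msg]
print(repeated_tool_call(looping), repeated_tool_call(healthy))
```

Sorting the arguments makes the signature independent of key order, so the same call is recognized even if the model emits its arguments in a different order.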

Debugging Checklist: 5 steps

When an agent does not behave as expected, go through this list in order:

Visualize: draw the graph and verify that the edges are correct
Stream: run with stream_mode="updates" and see what each node returns
Checkpoints: inspect the state at each checkpoint and find the faulty step
Isolate: test the faulty node in isolation with a prepared state
LangSmith: open the trace and look at tokens, latency, and errors

Do it now: 10 minutes

Add loop detection to your ReAct Agent:

Replace the routing function with should_continue_safe from above
Add retry_count to the state with an additive reducer, for example Annotated[int, operator.add], and have the model node return {"retry_count": 1} on every pass
Test: send a prompt that causes a repeated tool call and verify that the agent stops after 3 repetitions
Run with stream_mode="updates" and watch every step

Beginner 10 minutes Free

LangGraph versus other frameworks

Framework: "When to Use What", LangGraph versus the rest

Criterion	LangGraph	Claude Agent SDK	Vercel AI SDK	CrewAI
Language	Python (+ TS)	Python (+ TS)	TypeScript	Python
Approach	Graph-based	Loop-based	Hooks + Agents	Role-based teams
Control	Maximum	High	Medium	Low to medium
Checkpointing	Built-in, multiple backends	Not built-in	Not built-in	Basic
Human-in-the-loop	Native (interrupt/Command)	Tool-level	needsApproval	Limited
Multi-agent	Sub-graphs, supervisor	Manual	Manual	Native (core feature)
Observability	LangSmith (built-in)	Third-party	OpenTelemetry	Basic logging
Learning curve	High	Low	Medium	Low
Production-ready	Most mature	New	Mature	Maturing
Best suited for	Complex workflows, regulated	Claude-centric apps	Full-stack TypeScript	Multi-agent teams

Bottom line:

LangGraph: when you need maximum control, durable execution, or work in regulated industries
Claude Agent SDK: when you work with Claude and want simplicity
Vercel AI SDK: when the team is TypeScript-first
CrewAI: when you want multi-agent teams quickly (Chapter 9)

Exercise 3: Supervisor Multi-Agent System, 35 minutes

Goal: build a multi-agent system with a supervisor that manages a researcher and a writer.

Build 3 agents as nodes: supervisor, researcher, writer
The supervisor decides who works; add logic that always starts with the researcher
The researcher gets 2 tools: search_web and analyze_data
The writer receives the researcher's output and writes an article
Add checkpointing and LangSmith tracing
Run: "Write a market analysis of AI agents in Israel"

Success criteria: the supervisor routes correctly, the researcher performs research, and the writer produces an article. The whole flow is visible in LangSmith.

Cost: ~$1

Exercise 4 (advanced): Full-Stack LangGraph Agent, 45 minutes

Goal: combine everything from this chapter: custom graph + memory + human-in-the-loop + LangSmith + deployment.

Build a custom StateGraph with 5 nodes: classify (intent detection), search, analyze, draft_response, human_review
Add conditional edges: classify → search (if information is needed) or draft_response (if not)
In human_review: call interrupt() for approval (high-risk actions only)
Add SqliteSaver for durable memory
Enable LangSmith tracing
Deploy as a REST API with LangServe

Success criteria: the agent runs as a REST API with persistent memory, human approval, and full tracing in LangSmith.

Cost: ~$3

Beginner 10 minutes Free

Common mistakes and how to avoid them

Mistake 1: forgetting the reducer on messages

What happens: messages is declared as a plain list, without Annotated[list, add_messages].

Why it is a problem: every node that returns messages replaces the whole list instead of appending to it. The agent "forgets" earlier messages.

Solution: always use Annotated[list, add_messages] for the messages field. The reducer guarantees that new messages are appended to the list.
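
The difference between "append" and "replace" can be seen in a few lines of plain Python. The sketch below is a simplified model of how a state update is merged, not LangGraph's actual implementation: it uses operator.add as a stand-in for add_messages (which additionally handles message IDs), and apply_update is a hypothetical helper.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class AgentState(TypedDict):
    messages: Annotated[list, operator.add]    # reducer: concatenate lists
    retry_count: Annotated[int, operator.add]  # reducer: add the increment
    status: str                                # no reducer: last write wins


def apply_update(state: dict, update: dict, schema=AgentState) -> dict:
    """Merge a node's partial update into the state, honoring reducers."""
    hints = get_type_hints(schema, include_extras=True)
    new_state = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in state:
            new_state[key] = reducer(state[key], value)  # combine old and new
        else:
            new_state[key] = value  # plain field: overwrite
    return new_state


s0 = {"messages": ["hi"], "retry_count": 0, "status": "new"}
s1 = apply_update(s0, {"messages": ["hello"], "retry_count": 1, "status": "running"})
s2 = apply_update(s1, {"retry_count": 1})
print(s1["messages"], s2["retry_count"], s2["status"])
```

With the reducer, the node only returns the new message and the history is preserved. Without it, as with the status field, whatever the node returns becomes the entire value.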

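Mistake 2: no stop condition in the loop

What happens: a conditional edge routes tools → agent → tools → agent... with no limit.

Why it is a problem: the model can keep calling tools indefinitely. Each loop iteration costs one LLM call plus one tool execution, so 20 iterations means 20 LLM calls over a context that keeps growing, which can reach $5-20+ with Opus.

Solution: add retry_count to the state and check it in the routing function, or set recursion_limit in the config: {"recursion_limit": 15}. The limit counts graph steps (the default is 25), and LangGraph raises GraphRecursionError when it is exceeded.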

Mistake 3: interrupt() without a checkpointer

What happens: the code calls interrupt(), but the graph was compiled without a checkpointer.

Why it is a problem: interrupt() saves the state and waits. Without a checkpointer there is nowhere to save it, so you get an error.

Solution: if you use interrupt(), a checkpointer is mandatory. Even MemorySaver() is enough during development.

Mistake 4: MemorySaver in production

What happens: the agent is deployed to production with MemorySaver.

Why it is a problem: MemorySaver stores checkpoints in memory (RAM). When the process restarts, everything is lost. Scaling out is also impossible, because every instance has its own memory.

Solution: in production, use PostgresSaver (recommended) or SqliteSaver (single server).

Mistake 5: over-engineering, a graph where a loop is enough

What happens: a developer builds a complex StateGraph with 8 nodes for an agent that only answers questions.

Why it is a problem: LangGraph is a powerful tool for problems that need it. A simple agent with 2-3 tools? create_react_agent() is enough. An agent that does not need checkpointing? Claude Agent SDK is simpler.

Solution: start simple. create_react_agent() → custom graph → multi-agent. Move up a level of complexity only when there is a real need.

Work routine: Chapter 8

Frequency	Task	Time
Daily	Review traces in LangSmith: run times, errors, token usage	5 min
Daily	Check agent failures: failed nodes, retries, timeouts	5 min
Weekly	Run the evaluation suite and verify that the agent still answers the test cases correctly	15 min
Weekly	Check for new LangGraph versions: `pip install --upgrade langgraph`	5 min
Monthly	Review the graph architecture: are there nodes that can be removed or merged?	20 min
Monthly	Cost audit: check the average cost per agent run in LangSmith	10 min

If you do only one thing from this chapter: 15 minutes

Build a ReAct agent with create_react_agent(), one tool (such as weather), and MemorySaver. Run two conversations with the same thread_id: in the first, tell the agent your name; in the second, ask "What is my name?". Once you see an agent that remembers earlier conversations through checkpointing, you will understand why durable state is the killer feature of LangGraph.

Test yourself: 5 questions

What are the three core concepts of LangGraph and how do they relate? (Hint: State, Nodes, Edges)
What is the difference between add_edge() and add_conditional_edges()? Give an example of when you would use each. (Hint: unconditional vs routing function)
What does a reducer such as add_messages do, and what happens without it? (Hint: append vs replace)
Explain the flow interrupt() → checkpoint → Command(resume=...). Why is a checkpointer mandatory? (Hint: saving state while waiting)
When would you choose create_react_agent() and when a custom StateGraph? Give 2 examples of each. (Hint: simplicity vs control)

Got 4 out of 5? Excellent, you are ready for Chapter 9.

Chapter summary

In this chapter you learned to build AI agents as graphs with LangGraph, the most battle-tested framework for production. We started with LangChain fundamentals (Chat Models, Messages), covered the three core concepts of LangGraph (State, Nodes, Edges), built a ReAct Agent from scratch with StateGraph, and then saw that the same thing takes 3 lines with create_react_agent(). We added checkpointing for durable memory and human-in-the-loop with interrupt() and Command, and built a multi-agent system with the supervisor pattern. Finally, we connected LangSmith for full observability and deployed the agent as a REST API with LangServe.

The key point: LangGraph gives you maximum control and durable execution at the price of a steeper learning curve. If your project requires complex human-in-the-loop flows, persistent memory, regulated-industry compliance, or complex multi-agent coordination, LangGraph is the right tool.

In the next chapter (Chapter 9) you will meet CrewAI: Multi-Agent Teams. It is a completely different approach: instead of building graphs, you define a "crew" of agents with roles, goals, and backstories, and let them work together.

Checklist: Chapter 8 summary

I understand the philosophy of LangGraph: agents as graphs, durable execution, maximum control
I can define an AgentState as a TypedDict with Annotated reducers
I can create nodes as regular Python functions that receive and return state
I understand the difference between add_edge() and add_conditional_edges()
I built a ReAct Agent from scratch with StateGraph, tools, and a routing function
I built a ReAct Agent with create_react_agent() in 3 lines
I added MemorySaver checkpointing and verified that the agent remembers conversations
I know the 3 checkpointing backends: Memory, SQLite, PostgreSQL
I built a human-in-the-loop workflow with interrupt() and Command(resume=...)
I built a multi-agent system with the supervisor pattern
I enabled LangSmith tracing and saw traces in the dashboard
I know LangSmith evaluation: datasets and evaluators
I deployed an agent as a REST API with LangServe
I worked through at least 3 of the 4 exercises
I answered 4 of the 5 questions in "Test yourself"