Chapter 18: Build -- Code Review & DevOps Agent

What you will have at the end of this chapter

A Code Review Agent that reads PR diffs, identifies bugs and style issues, and responds with constructive comments
A Security Scanner that detects exposed secrets, SQL injection, XSS, and vulnerable dependencies
A Test Generation Agent that automatically generates unit tests for code changed in a PR
A Deployment Assistant that runs pre-deploy checks, monitors the deployment, and suggests a rollback when needed
A Production Monitor that detects anomalies, analyzes incidents, and runs runbooks
Full integration with GitHub Actions and Slack
A context system that learns from review history and knows the project's conventions
An evaluation framework with clear metrics: precision, recall, false positive rate, developer satisfaction
Final deliverable: a production-ready, GitHub-integrated DevOps agent

What you will be able to do after this chapter

Build a code review agent that analyzes PR diffs and provides actionable comments on correctness, security, and performance
Implement a security scanner that detects OWASP Top 10 issues, exposed secrets, and vulnerable dependencies
Create a test generation agent that produces unit tests with pytest / Jest for new code
Design an automated deployment workflow with pre-deploy checks, monitoring, and rollback detection
Integrate all the agents into a CI/CD pipeline with GitHub Actions and webhook handling

Before you start

Previous chapters: Chapter 2 (Architecture), Chapter 3 (MCP), Chapter 5 (Claude SDK), Chapter 8 (LangGraph), Chapter 13 (Multi-Agent), Chapter 14 (Safety & Guardrails)
What you will need: Python 3.11+ and/or Node.js 18+, an Anthropic API key, a GitHub account with a repo for experiments, the Git CLI
Required knowledge: Git workflows (branching, PRs, merging), basic familiarity with CI/CD, tool calling and agent loops
Estimated time: 5-7 hours (including building all the agents and the exercises)
Estimated API cost: $10-25 (PR analysis, test generation, monitoring queries)

Your project -- the through-line of the course

In Chapter 17 you built a Marketing Agent that analyzes and manages campaigns and generates content. Now you move into developer territory: a DevOps Agent that speeds up the development process itself. This project combines everything you have learned: multi-agent orchestration from Chapter 13, guardrails from Chapter 14, tool use from Chapter 11, and MCP from Chapter 3. In Chapter 19 you will take everything you have built so far and move it to real production: Deploy to Production.

Glossary -- Chapter 18

Term (English)	Hebrew	Explanation
Pull Request (PR)	בקשת משיכה	A request to merge changes from one branch into another. The main entry point for the review agent
Diff	הבדלים	The differences between two versions of the code, usually shown as green lines (additions) and red lines (deletions)
Code Review	סקירת קוד	The process in which a developer (or an agent) checks another developer's code before it enters the codebase
CI/CD	אינטגרציה ופריסה רציפים	Continuous Integration / Continuous Deployment -- automation of build, test, and deploy
GitHub Actions	--	The CI/CD system built into GitHub. Workflows run on events such as PR opened, push, or schedule
Webhook	--	An HTTP callback sent when an event occurs. GitHub sends a webhook when a PR is opened, a push happens, and so on
OWASP Top 10	--	A list of the ten most critical security risks in web applications, widely used as an industry reference for security
Secret Detection	זיהוי סודות	Scanning code for API keys, passwords, and tokens that were committed to the repository by mistake
Dependency Scanning	סריקת תלויות	Checking third-party libraries (npm, pip) for known vulnerabilities (CVEs)
Rollback	חזרה לגרסה קודמת	Returning to a stable version when a new deployment causes problems. It must be fast and automatic
Runbook	ספר הפעלה	A step-by-step guide for handling a known incident. The agent can execute runbooks automatically
SLA (Service Level Agreement)	הסכם רמת שירות	A commitment on uptime, latency, and response time. The monitoring agent tracks SLA compliance
MCP (Model Context Protocol)	--	A standard protocol for connecting LLMs to external tools. The DevOps agent uses the GitHub MCP server
Anomaly Detection	זיהוי חריגות	Automatic detection of unusual patterns: a spiking error rate, rising latency, an odd traffic pattern
False Positive	התראת שווא	When the agent reports a problem that does not really exist. Too many false positives and developers start ignoring the agent
Inline Comment	הערה בשורה	A review comment that points at a specific line of code. The most effective way to give feedback on a PR

Architecture and system design

intermediate | 30 minutes | concept + practice

Before writing a single line of code, we need to understand what we are building and why. The DevOps Agent is not a single agent -- it is a system of specialist agents working together under one orchestrator.

The problem we are solving

Problem	Today	With the DevOps Agent
Code review waits	A PR stays open 1-3 days until a reviewer gets to it	Initial review within 2 minutes of the PR being opened
Security blind spots	Security review only before a release, if at all	Automatic security scan on every PR
Test coverage gaps	"We'll write tests later" (spoiler: we won't)	Tests suggested automatically for new code
Deployment anxiety	"Who deploys on a Friday?" -- nobody	Pre-deploy checks, monitoring, auto-rollback
Incident response	Alert at 3 a.m., whoever is on call tries to work out what happened	The agent analyzes logs, identifies the root cause, and suggests a fix

Architecture: Router + Specialist Agents

The architecture is based on the Router Pattern from Chapter 13 (Multi-Agent). An event arrives from GitHub, the router identifies the event type, and dispatches it to the matching specialist agent:

# Architecture Overview -- DevOps Agent System
#
# GitHub Event (webhook)
#     |
#     v
# +---------------------------+
# |     Event Router           |
# |  (classify and dispatch)   |
# +----------+----------------+
#            |
#     +------+----------+--------------+
#     v      v          v              v
# +-------+ +--------+ +-----------+ +------------+
# |Review | |Security| |Test Gen   | |Deploy      |
# |Agent  | |Scanner | |Agent      | |Assistant   |
# +-------+ +--------+ +-----------+ +------------+
#     |         |           |              |
#     v         v           v              v
# +---------------------------------------------+
# |          GitHub API (comments, checks)       |
# |          Slack (notifications)               |
# |          Metrics API (monitoring)            |
# +---------------------------------------------+

Tech Stack

Component	Technology	Why
LLM	Claude Sonnet 4.6	Excellent price/quality ratio, 1M context for large PRs
Agent SDK	Claude Agent SDK (Python/TS)	Built-in tool calling, streaming, guardrails
GitHub Integration	GitHub MCP Server + REST API	MCP for reading repos, REST API for writing comments and checks
CI/CD	GitHub Actions	native integration, event-driven
Notifications	Slack API	Messages to the team about review results and deploy status
Monitoring	Datadog / Grafana API	Metrics, logs, alerting
State	SQLite / Redis	Review history, learning, configuration

Do it now: 5 minutes

Open a new repo on GitHub (or use an existing one). Create a small PR with 2-3 changes: a new function, a small fix, and a config change. This PR will serve as your test case throughout the chapter.

Defining the Event Router

The Router is the central point. It receives GitHub webhook events and routes each one to the appropriate agent:

# Python - Event Router
import anthropic
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EventType(Enum):
    PR_OPENED = "pull_request.opened"
    PR_UPDATED = "pull_request.synchronize"
    PR_REVIEW_REQUESTED = "pull_request.review_requested"
    DEPLOY_TRIGGERED = "deployment"
    ALERT_FIRED = "alert"
    COMMENT_CREATED = "issue_comment.created"

@dataclass
class GitHubEvent:
    event_type: EventType
    repo: str
    payload: dict
    pr_number: Optional[int] = None

class DevOpsRouter:
    """Event Router -- routes GitHub events to specialist agents."""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.review_agent = CodeReviewAgent()
        self.security_agent = SecurityAgent()
        self.test_agent = TestGenerationAgent()
        self.deploy_agent = DeployAssistant()
        self.monitor_agent = MonitorAgent()

    async def route(self, event: GitHubEvent) -> dict:
        """Route event to the appropriate agent."""
        routing_table = {
            EventType.PR_OPENED: self._handle_pr,
            EventType.PR_UPDATED: self._handle_pr,
            EventType.PR_REVIEW_REQUESTED: self._handle_pr,
            EventType.DEPLOY_TRIGGERED: self._handle_deploy,
            EventType.ALERT_FIRED: self._handle_alert,
            EventType.COMMENT_CREATED: self._handle_comment,
        }
        handler = routing_table.get(event.event_type)
        if not handler:
            # Return the enum's string value so the result stays JSON-serializable.
            return {"status": "unknown_event", "event": event.event_type.value}
        return await handler(event)

    async def _handle_pr(self, event: GitHubEvent) -> dict:
        """PR event -- run review, security, and test generation concurrently."""
        import asyncio
        # gather() schedules all three coroutines at once and waits for all of them.
        # Note: the agents below call the synchronous anthropic.Anthropic client, which
        # blocks the event loop during each LLM call; anthropic.AsyncAnthropic avoids that.
        review, security, tests = await asyncio.gather(
            self.review_agent.review(event.repo, event.pr_number),
            self.security_agent.scan(event.repo, event.pr_number),
            self.test_agent.generate(event.repo, event.pr_number),
        )
        return {"review": review, "security": security, "tests": tests}

    async def _handle_deploy(self, event: GitHubEvent) -> dict:
        return await self.deploy_agent.handle_deploy(event)

    async def _handle_alert(self, event: GitHubEvent) -> dict:
        return await self.monitor_agent.analyze_alert(event)

    async def _handle_comment(self, event: GitHubEvent) -> dict:
        comment = event.payload.get("comment", {}).get("body", "")
        if "@devops-agent" in comment.lower():
            return await self.review_agent.respond_to_comment(event.repo, event.pr_number, comment)
        return {"status": "ignored", "reason": "not addressed to bot"}
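The router above assumes a `GitHubEvent` has already been built from the incoming webhook. As a minimal sketch of that earlier step: GitHub sends the event name in the `X-GitHub-Event` header and an HMAC-SHA256 signature of the raw body in `X-Hub-Signature-256`. The function names `verify_signature` and `event_key` are hypothetical helpers, not part of the chapter's code, and the key format only covers events that carry an `action` field, such as `pull_request` and `issue_comment`.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(expected, signature_header)


def event_key(event_header: str, payload: dict) -> str:
    """Build the '<event>.<action>' string used by EventType, e.g. 'pull_request.opened'."""
    action = payload.get("action")
    return f"{event_header}.{action}" if action else event_header
```

A webhook handler would reject the request when `verify_signature` returns False, and otherwise look up `EventType(event_key(header, payload))` before calling `DevOpsRouter.route`.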

Framework: The DevOps Agent Routing Matrix

For each GitHub event, run the right agents:

GitHub Event	Review Agent	Security Scanner	Test Gen	Deploy Agent	Monitor
PR Opened	V	V	V	--	--
PR Updated (push)	V (incremental)	V	V (new files)	--	--
PR Comment @bot	V (respond)	on request	on request	--	--
PR Merged	--	--	--	V (pre-deploy)	--
Deploy Started	--	--	--	V (monitor)	V (baseline)
Alert Fired	--	--	--	V (rollback?)	V (diagnose)

Key principle: PR events trigger review + security + tests in parallel. Deploy events trigger the deploy assistant + monitor in sequence.
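One way to keep the matrix and the code in sync is to store the matrix as data and derive the execution plan from it. The sketch below is illustrative only: the event keys, agent names, and the `plan` function are hypothetical, the "on request" cells are left out, and the order of agents for alerts simply follows the column order of the table.

```python
# Rows mirror the routing matrix; keys and agent names are illustrative.
ROUTING_MATRIX: dict[str, list[str]] = {
    "pr_opened": ["review", "security", "test_gen"],
    "pr_updated": ["review", "security", "test_gen"],
    "pr_comment": ["review"],  # security / test_gen only on request
    "pr_merged": ["deploy"],
    "deploy_started": ["deploy", "monitor"],
    "alert_fired": ["deploy", "monitor"],
}

# PR events fan out in parallel; all other events run their agents in order.
PARALLEL_EVENTS = {"pr_opened", "pr_updated"}


def plan(event: str) -> dict:
    """Return which agents to run for an event and how to schedule them."""
    agents = ROUTING_MATRIX.get(event, [])
    mode = "parallel" if event in PARALLEL_EVENTS else "sequential"
    return {"agents": agents, "mode": mode}
```

Adding a new event type then means adding one row rather than another handler method.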

Code Review Agent -- automated code review

advanced | 45 minutes | practice

The Code Review Agent is the heart of the system. It reads the PR diff, understands the context, and produces review comments that are specific, actionable, and constructive.

Principles of a good review

Principle	Bad example	Good example
Specific	"This code is not good"	"On line 42, users.find() without an index will cause a full table scan over 100K+ records"
Actionable	"The error handling needs improvement"	"Add try/catch around the API call on line 15. Handle timeout (408) and rate limit (429) separately"
Constructive	"Why did you do it this way?"	"Alternative approach: use a Map instead of scanning an array to get O(1) lookup"
Prioritized	50 comments about every little thing	3-5 substantive comments with a severity (critical / suggestion / nit)

Review categories

Category	What is checked	Default severity
Correctness	Bugs, faulty logic, race conditions, null dereferences	Critical / High
Security	SQL injection, XSS, secrets, permissions	Critical
Performance	N+1 queries, missing indexes, unnecessary allocations	Medium
Readability	naming, complexity, comments, code organization	Low / Nit
Test Coverage	Missing tests, uncovered edge cases	Medium
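The "Prioritized" principle and the default severities can be enforced in code rather than left to the prompt alone. A minimal sketch, assuming comments arrive as dictionaries with a `severity` key; the `prioritize` function and the default cap of 5 are illustrative choices:

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3, "nit": 4}


def prioritize(comments: list[dict], max_comments: int = 5) -> list[dict]:
    """Sort comments most-severe first and keep at most max_comments of them."""
    ranked = sorted(
        comments,
        # Unknown or missing severities sort after every known level.
        key=lambda c: SEVERITY_ORDER.get(c.get("severity"), len(SEVERITY_ORDER)),
    )
    return ranked[:max_comments]
```

Because `sorted` is stable, comments with the same severity keep the order in which the model produced them.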

Implementation in Python with the Anthropic SDK

# Python - Code Review Agent (Full Implementation)
import anthropic
import json
import re
from dataclasses import dataclass, field
import httpx

@dataclass
class ReviewComment:
    file: str
    line: int
    body: str
    severity: str  # "critical", "high", "medium", "low", "nit"
    category: str  # "correctness", "security", "performance", "readability", "testing"

@dataclass
class ReviewResult:
    summary: str
    comments: list[ReviewComment] = field(default_factory=list)
    approved: bool = False
    score: int = 0  # 0-100

class CodeReviewAgent:
    """Code Review Agent -- analyzes PRs and generates review comments."""

    SYSTEM_PROMPT = """You are an expert code reviewer. Review PR diffs and provide
constructive, specific, actionable feedback.

## Categories: correctness, security, performance, readability, test coverage
## Rules:
- Be SPECIFIC: reference exact file names and line numbers
- Be ACTIONABLE: suggest a concrete fix
- PRIORITIZE: critical issues first. Max 5-7 comments
- Use severity: critical, high, medium, low, nit

## Output: JSON with summary, approved (bool), score (0-100), comments array"""

    def __init__(self, github_token: str = None):
        self.client = anthropic.Anthropic()
        self.http = httpx.AsyncClient(
            headers={"Authorization": f"token {github_token}"} if github_token else {}
        )

    async def review(self, repo: str, pr_number: int) -> ReviewResult:
        """Full PR review."""
        pr_data = await self._fetch_pr_data(repo, pr_number)
        diff = await self._fetch_pr_diff(repo, pr_number)
        files = await self._fetch_changed_files(repo, pr_number)
        context = await self._fetch_repo_context(repo)
        review = await self._analyze_with_claude(pr_data, diff, files, context)
        await self._post_review(repo, pr_number, review)
        return review

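    async def _fetch_pr_data(self, repo: str, pr_number: int) -> dict:
        """PR metadata (title, author, additions, deletions) as JSON."""
        resp = await self.http.get(f"https://api.github.com/repos/{repo}/pulls/{pr_number}")
        resp.raise_for_status()
        return resp.json()

    async def _fetch_pr_diff(self, repo: str, pr_number: int) -> str:
        """Same endpoint as the metadata call; the diff media type returns a unified diff."""
        resp = await self.http.get(
            f"https://api.github.com/repos/{repo}/pulls/{pr_number}",
            headers={"Accept": "application/vnd.github.v3.diff"}
        )
        resp.raise_for_status()
        return resp.text

    async def _fetch_changed_files(self, repo: str, pr_number: int) -> list[dict]:
        """Changed files with per-file patches (first page of results only)."""
        resp = await self.http.get(f"https://api.github.com/repos/{repo}/pulls/{pr_number}/files")
        resp.raise_for_status()
        return resp.json()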

    async def _fetch_repo_context(self, repo: str) -> str:
        context_parts = []
        for fname in ["README.md", "CONTRIBUTING.md", ".eslintrc.json", "pyproject.toml"]:
            try:
                resp = await self.http.get(
                    f"https://api.github.com/repos/{repo}/contents/{fname}",
                    headers={"Accept": "application/vnd.github.v3.raw"})
                if resp.status_code == 200:
                    context_parts.append(f"=== {fname} ===\n{resp.text[:2000]}")
            except Exception:
                pass
        return "\n\n".join(context_parts)

    async def _analyze_with_claude(self, pr_data, diff, files, context) -> ReviewResult:
        user_message = f"""Review this Pull Request.
## PR: {pr_data.get('title', 'N/A')} by {pr_data.get('user', {}).get('login', 'N/A')}
Files changed: {len(files)} | +{pr_data.get('additions', 0)} / -{pr_data.get('deletions', 0)}
## Context\n{context[:3000]}
## Diff\n{diff[:50000]}"""

        response = self.client.messages.create(
            model="claude-sonnet-4-6-20260325", max_tokens=4096,
            system=self.SYSTEM_PROMPT,
            messages=[{"role": "user", "content": user_message}])

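        text = response.content[0].text
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            # The model sometimes wraps the JSON in a Markdown code fence; try to unwrap it.
            m = re.search(r'```(?:json)?\s*\n(.*?)\n```', text, re.DOTALL)
            try:
                data = json.loads(m.group(1)) if m else {}
            except json.JSONDecodeError:
                data = {}
            if not data:
                data = {"summary": "Parse failed", "comments": []}

        # Keep only the fields ReviewComment defines and skip incomplete comments,
        # so an unexpected key in the model output does not raise TypeError.
        fields = ("file", "line", "body", "severity", "category")
        comments = [ReviewComment(**{k: c[k] for k in fields})
                    for c in data.get("comments", []) if all(k in c for k in fields)]

        return ReviewResult(
            summary=data.get("summary", ""), score=data.get("score", 0),
            approved=data.get("approved", False), comments=comments)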

    async def _post_review(self, repo, pr_number, review):
        # Inline comments must point at lines that are part of the PR diff;
        # otherwise the API call fails (typically with HTTP 422).
        gh_comments = [{"path": c.file, "line": c.line,
            "body": f"[{c.severity.upper()}] **{c.category}**: {c.body}"}
            for c in review.comments]
        event = "APPROVE" if review.approved else "COMMENT"
        resp = await self.http.post(
            f"https://api.github.com/repos/{repo}/pulls/{pr_number}/reviews",
            json={"body": f"## Code Review\n{review.summary}\n**Score:** {review.score}/100",
                  "event": event, "comments": gh_comments})
        resp.raise_for_status()  # surface rejected reviews instead of failing silently

Do it now: 15 minutes

Take the CodeReviewAgent above and run it on a real PR. You will need a GitHub Personal Access Token. Compare the agent's review with the review you would have given. What did it catch that you did not? What did it miss?
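One thing to watch for when running on a large PR: the Python agent cuts the diff at 50,000 characters (`diff[:50000]`), so files that appear late in the diff are never reviewed. A possible alternative, sketched here under the assumption that the diff is in standard `git diff` format, is to split it per file and pack the pieces into batches that each fit the budget. The names `split_diff_by_file` and `pack_chunks` are hypothetical.

```python
import re


def split_diff_by_file(diff: str) -> dict[str, str]:
    """Split a unified diff into {path: per-file diff text}."""
    chunks: dict[str, str] = {}
    # Each file section in git diff output starts with a "diff --git" line.
    for section in re.split(r"(?m)^(?=diff --git )", diff):
        match = re.match(r"diff --git a/(.+?) b/(.+)", section)
        if match:
            chunks[match.group(2)] = section
    return chunks


def pack_chunks(chunks: dict[str, str], budget: int = 50_000) -> list[str]:
    """Greedily group file diffs into batches of at most `budget` characters.

    A single file larger than the budget still gets its own (oversized) batch.
    """
    batches: list[str] = []
    current = ""
    for section in chunks.values():
        if current and len(current) + len(section) > budget:
            batches.append(current)
            current = ""
        current += section
    if current:
        batches.append(current)
    return batches
```

Each batch could then be sent to the model as a separate review request and the resulting comments merged before posting.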

Implementation in TypeScript with the Anthropic SDK

// TypeScript - Code Review Agent with Anthropic SDK + Tool Use
import Anthropic from "@anthropic-ai/sdk";
import { Octokit } from "@octokit/rest";

const anthropic = new Anthropic();
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

const tools: Anthropic.Tool[] = [
  { name: "get_pr_diff", description: "Fetch PR diff",
    input_schema: { type: "object" as const, properties: {
      owner: { type: "string" }, repo: { type: "string" },
      pull_number: { type: "number" } }, required: ["owner", "repo", "pull_number"] } },
  { name: "get_changed_files", description: "List changed files in PR",
    input_schema: { type: "object" as const, properties: {
      owner: { type: "string" }, repo: { type: "string" },
      pull_number: { type: "number" } }, required: ["owner", "repo", "pull_number"] } },
  { name: "post_review", description: "Post inline review comments on PR",
    input_schema: { type: "object" as const, properties: {
      owner: { type: "string" }, repo: { type: "string" },
      pull_number: { type: "number" }, body: { type: "string" },
      comments: { type: "array", items: { type: "object", properties: {
        path: { type: "string" }, line: { type: "number" }, body: { type: "string" } } } },
      event: { type: "string", enum: ["APPROVE", "REQUEST_CHANGES", "COMMENT"] }
    }, required: ["owner", "repo", "pull_number", "body", "comments", "event"] } },
];

async function executeTool(name: string, input: Record<string, any>): Promise<string> {
  switch (name) {
    case "get_pr_diff": {
      const { data } = await octokit.pulls.get({ owner: input.owner, repo: input.repo,
        pull_number: input.pull_number, mediaType: { format: "diff" } });
      return String(data).substring(0, 50000); }
    case "get_changed_files": {
      const { data } = await octokit.pulls.listFiles({ owner: input.owner,
        repo: input.repo, pull_number: input.pull_number });
      return JSON.stringify(data.map(f => ({ filename: f.filename, status: f.status,
        additions: f.additions, deletions: f.deletions, patch: f.patch?.substring(0, 5000) }))); }
    case "post_review": {
      await octokit.pulls.createReview({ owner: input.owner, repo: input.repo,
        pull_number: input.pull_number, body: input.body, comments: input.comments,
        event: input.event });
      return JSON.stringify({ success: true }); }
    default: return JSON.stringify({ error: `Unknown tool: ${name}` });
  }
}

// Agent loop -- tool calling until done
async function reviewPR(owner: string, repo: string, prNumber: number) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: `Review PR #${prNumber} in ${owner}/${repo}.
Fetch diff, analyze, post inline comments. Max 5-7 comments. Use [CRITICAL], [HIGH], [MEDIUM], [NIT].` }];

  let continueLoop = true;
  let turns = 0; // safety cap so a confused model cannot loop forever
  while (continueLoop && turns++ < 10) {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-6-20260325", max_tokens: 4096,
      system: "You are an expert code reviewer. Fetch PR data, analyze, post review.",
      tools, messages });

    messages.push({ role: "assistant", content: response.content });
    const toolCalls = response.content.filter(
      (b): b is Anthropic.ToolUseBlock => b.type === "tool_use");

    if (toolCalls.length === 0 || response.stop_reason === "end_turn") { continueLoop = false; break; }

    const results: Anthropic.ToolResultBlockParam[] = [];
    for (const tc of toolCalls) {
      const result = await executeTool(tc.name, tc.input as Record<string, any>);
      results.push({ type: "tool_result", tool_use_id: tc.id, content: result });
    }
    messages.push({ role: "user", content: results });
  }
}

Do it now: 10 minutes

Compare the two approaches: Python (direct flow) versus TypeScript (agentic tool-calling loop). Write down 3 substantive differences. In which approach is the agent "smarter"? (Hint: in the TypeScript version, the agent decides for itself when to call tools.)

Security Analysis -- vulnerability scanning

advanced | 35 minutes | practice

Security review is the use case where the agent adds the most value. Human developers focus on functionality, so security issues fall between the cracks. An agent that automatically scans every PR is the first to see new code.

4 layers of security scanning

Layer	What is checked	Tool	Example
1. Secret Detection	API keys, passwords, tokens	Regex + LLM	`AWS_SECRET = "AKIA..."`
2. Code Vulnerabilities	SQL injection, XSS, path traversal, SSRF	LLM + OWASP patterns	`query(f"SELECT * WHERE id={user_input}")`
3. Dependency Scanning	Known vulnerabilities in packages (CVEs)	npm audit / pip-audit + LLM	lodash 4.17.20 -- CVE-2021-23337
4. Configuration Issues	CORS permissive, debug mode, weak crypto	LLM config analysis	`CORS: { origin: "*" }`

# Python - Security Scanner Agent
import json
import re
from dataclasses import dataclass

import anthropic

@dataclass
class SecurityFinding:
    severity: str       # "critical", "high", "medium", "low"
    category: str       # "secret", "vulnerability", "dependency", "config"
    file: str
    line: int
    title: str
    description: str
    recommendation: str
    cwe: str = ""

class SecurityAgent:
    """Security Scanner -- scans PR diffs for vulnerabilities."""

    SECRET_PATTERNS = [
        (r'(?i)(api[_-]?key|api[_-]?secret)\s*[=:]\s*["\']?[a-zA-Z0-9]{20,}', "API Key/Secret"),
        (r'(?i)(password|passwd|pwd)\s*[=:]\s*["\'][^"\']+["\']', "Hardcoded password"),
        (r'AKIA[0-9A-Z]{16}', "AWS Access Key ID"),
        (r'(?i)(sk-[a-zA-Z0-9]{20,})', "OpenAI-style secret key (sk-...)"),
        (r'sk_(?:live|test)_[a-zA-Z0-9]{16,}', "Stripe secret key"),
        (r'ghp_[a-zA-Z0-9]{36}', "GitHub Personal Access Token"),
        (r'-----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----', "Private key"),
    ]

    VULN_PATTERNS = {
        "sql_injection": {
            "patterns": [r'f".*SELECT.*\{.*\}"', r"f'.*SELECT.*\{.*\}'",
                         r'execute\s*\(\s*[f"\'`].*\{'],
            "title": "Potential SQL Injection", "cwe": "CWE-89", "severity": "critical"},
        "xss": {
            # innerHTML and dangerouslySetInnerHTML cover common DOM-manipulation sinks
            "patterns": [r'document\.write\s*\(', r'\.innerHTML\s*=',
                         r'dangerouslySetInnerHTML', r'v-html\s*='],
            "title": "Potential Cross-Site Scripting (XSS)", "cwe": "CWE-79", "severity": "high"},
        "path_traversal": {
            "patterns": [r'open\s*\(\s*(?:request|req|user|input)'],
            "title": "Potential Path Traversal", "cwe": "CWE-22", "severity": "high"},
        "ssrf": {
            "patterns": [r'requests\.get\s*\(\s*(?:user|input|request)'],
            "title": "Potential SSRF", "cwe": "CWE-918", "severity": "high"},
    }

    def __init__(self):
        self.client = anthropic.Anthropic()

    async def scan(self, repo: str, pr_number: int) -> list[SecurityFinding]:
        diff = await self._get_diff(repo, pr_number)
        findings = self._regex_scan(diff)
        findings.extend(await self._llm_security_scan(diff))
        return self._deduplicate(findings)

    def _regex_scan(self, diff: str) -> list[SecurityFinding]:
        findings = []
        current_file, line_num = "", 0
        for line in diff.split("\n"):
            if line.startswith("+++ b/"): current_file = line[6:]; line_num = 0; continue
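            if line.startswith("@@"):
                m = re.search(r'\+(\d+)', line)
                # The hunk header gives the first new-file line number; subtract 1
                # because line_num is incremented before each line is examined.
                if m: line_num = int(m.group(1)) - 1
                continue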
            if not line.startswith("+") or line.startswith("+++"):
                if not line.startswith("-"): line_num += 1
                continue
            added = line[1:]; line_num += 1

            for pattern, desc in self.SECRET_PATTERNS:
                if re.search(pattern, added):
                    # Do not echo the matched line: the finding may be posted as a PR comment.
                    findings.append(SecurityFinding(severity="critical", category="secret",
                        file=current_file, line=line_num, title="Secret Detected",
                        description=f"{desc} found on this line (value redacted)",
                        recommendation="Remove secret. Rotate key. Use env vars.", cwe="CWE-798"))

            for vtype, vinfo in self.VULN_PATTERNS.items():
                for pat in vinfo["patterns"]:
                    if re.search(pat, added):
                        # Parameterized queries only apply to SQL; other types get generic advice.
                        advice = ("Use parameterized queries." if vtype == "sql_injection"
                                  else "Validate or escape untrusted input.")
                        findings.append(SecurityFinding(severity=vinfo["severity"],
                            category="vulnerability", file=current_file, line=line_num,
                            title=vinfo["title"], description=f"Pattern: {added.strip()[:80]}",
                            recommendation=f"Review for {vtype}. {advice}",
                            cwe=vinfo["cwe"]))
        return findings

    async def _llm_security_scan(self, diff: str) -> list[SecurityFinding]:
        response = self.client.messages.create(
            model="claude-sonnet-4-6-20260325", max_tokens=3000,
            system="You are a security expert. Find: business logic flaws, race conditions, "
                   "insecure deserialization, info leakage, missing validation. "
                   "Return JSON array of findings or []. Only confident findings.",
            messages=[{"role": "user", "content": f"Review:\n{diff[:30000]}"}])
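        try:
            m = re.search(r'\[.*\]', response.content[0].text, re.DOTALL)
            if m:
                allowed = {"severity", "file", "line", "title", "description", "recommendation", "cwe"}
                # Drop unknown keys (including a model-supplied "category") so the
                # constructor does not raise TypeError on a duplicate or unexpected argument.
                return [SecurityFinding(category="vulnerability",
                                        **{k: v for k, v in i.items() if k in allowed})
                        for i in json.loads(m.group(0))]
        except (json.JSONDecodeError, TypeError, AttributeError, IndexError):
            pass  # unparseable or incomplete model output: report no LLM findings
        return []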

    def _deduplicate(self, findings):
        seen, unique = set(), []
        # .get() with a default keeps an unexpected severity (e.g. "info") from raising KeyError
        for f in sorted(findings, key=lambda x: {"critical":0,"high":1,"medium":2,"low":3}.get(x.severity, 4)):
            key = (f.file, f.line, f.category)
            if key not in seen: seen.add(key); unique.append(f)
        return unique
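The scanner above covers layers 1 and 2 of the table; layer 3 (dependency scanning) starts by knowing which packages a PR adds. A minimal sketch of that first step for Python projects, assuming dependencies are pinned with `==` in files named `requirements*.txt`: the function name `added_requirements` is hypothetical, and the resulting list would still need to be checked by a real scanner such as pip-audit.

```python
import re

# An added line of the form "+package==version" (trailing comments are ignored).
PIN = re.compile(r"^\+\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([A-Za-z0-9.*+!_-]+)")


def added_requirements(diff: str) -> list[tuple[str, str, str]]:
    """Return (file, package, version) for pinned lines added to requirements files."""
    found = []
    current_file = ""
    for line in diff.splitlines():
        if line.startswith("+++ b/"):
            current_file = line[6:]
            continue
        # Assumption: requirements files are named requirements*.txt
        name = current_file.rsplit("/", 1)[-1]
        if not (name.startswith("requirements") and name.endswith(".txt")):
            continue
        m = PIN.match(line)
        if m:
            found.append((current_file, m.group(1).lower(), m.group(2)))
    return found
```

Only added lines are reported, so a PR that merely removes or leaves a dependency untouched produces no entries.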

Do it now: 10 minutes

Create a test file with 3 deliberate vulnerabilities: (1) a hardcoded API key, (2) SQL injection via an f-string, (3) XSS-prone DOM manipulation. Add it to a PR and check that the SecurityAgent finds all of them.

Test Generation Agent -- automatic test generation

advanced | 35 minutes | practice

Developers do not like writing tests. The test generation agent produces a first draft: instead of spending 30-60 minutes starting from scratch, the developer edits existing tests in 5-10 minutes.

Test generation strategies

Strategy	What is generated	Example
Happy Path	The function works with valid input	`test_create_user_success()`
Edge Cases	Empty, null, extreme values	`test_create_user_empty_name()`
Error Cases	Errors are handled correctly	`test_create_user_duplicate_email()`
Boundary Values	0, -1, MAX_INT	`test_pagination_zero_page()`
Integration	Interaction between components	`test_user_signup_sends_email()`

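# Python - Test Generation Agent
import anthropic

class TestGenerationAgent:
    SYSTEM_PROMPT = """You are an expert test engineer. Generate comprehensive unit tests.
Rules: test every new/modified function, cover happy/edge/error/boundary cases,
use pytest (Python) or Jest (TS/JS), include mocking, 3-5 tests per function."""

    def __init__(self):
        self.client = anthropic.Anthropic()

    # _get_changed_files and _get_file_content are assumed to wrap the same GitHub
    # REST calls CodeReviewAgent uses (pulls/{n}/files and contents/{path}).

    async def generate(self, repo, pr_number) -> dict:
        files = await self._get_changed_files(repo, pr_number)
        test_files = {}
        for f in files:
            if self._is_testable(f["filename"]):
                content = await self._get_file_content(repo, f["filename"])
                tests = await self._generate_tests(f["filename"], content, f.get("patch", ""))
                if tests: test_files[self._test_filename(f["filename"])] = tests
        return {"test_files": test_files, "summary": f"Generated tests for {len(test_files)} files"}

    def _is_testable(self, fn):
        # Skip files that are already tests, then accept known source extensions.
        if any(x in fn for x in ["test_", "_test.", ".test.", "spec."]): return False
        return any(fn.endswith(e) for e in [".py", ".ts", ".js", ".tsx", ".jsx"])

    def _test_filename(self, fn):
        """Assumed naming: a.py -> test_a.py; Pay.ts -> Pay.test.ts (same directory)."""
        directory, _, name = fn.rpartition("/")
        if name.endswith(".py"):
            name = f"test_{name}"
        else:
            stem, _, ext = name.rpartition(".")
            name = f"{stem}.test.{ext}"
        return f"{directory}/{name}" if directory else name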
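The system prompt asks for tests on "every new/modified function", but the agent sends the whole file. One way to narrow the request is to read function names out of the patch first. The sketch below is a rough heuristic under stated assumptions: it only sees definitions whose `def` or `function` line was itself added, so edits inside an existing function body are missed, and `added_functions` is a hypothetical helper.

```python
import re

# Added lines that define a Python function or a JS/TS function declaration.
PY_DEF = re.compile(r"^\+\s*(?:async\s+)?def\s+([A-Za-z_]\w*)\s*\(")
JS_DEF = re.compile(r"^\+\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)\s*\(")


def added_functions(patch: str) -> list[str]:
    """Return names of functions defined on added lines, in order, without duplicates."""
    names: list[str] = []
    for line in patch.splitlines():
        m = PY_DEF.match(line) or JS_DEF.match(line)
        if m and m.group(1) not in names:
            names.append(m.group(1))
    return names
```

The resulting list could be added to the user message so the model focuses its 3-5 tests per function on the code that actually changed.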

Example output -- a test file the agent generates:

# Generated by DevOps Agent -- Tests for user_service.py
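import pytest
from unittest.mock import AsyncMock
from user_service import UserService

# The tests below are coroutines; this module-level marker requires the pytest-asyncio plugin.
pytestmark = pytest.mark.asyncio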

@pytest.fixture
def user_service():
    return UserService(db=AsyncMock())

class TestCreateUser:
    async def test_success(self, user_service):
        """Happy path: valid user creation."""
        user_service.db.insert.return_value = {"id": "usr_123", "name": "Yael Cohen"}
        result = await user_service.create_user(name="Yael Cohen", email="yael@example.co.il")
        assert result["name"] == "Yael Cohen"

    async def test_duplicate_email(self, user_service):
        """Error: duplicate email raises ValueError."""
        user_service.db.insert.side_effect = ValueError("duplicate key")
        with pytest.raises(ValueError, match="duplicate"):
            await user_service.create_user(name="Test", email="existing@example.com")

    async def test_empty_name(self, user_service):
        """Edge case: empty name rejected."""
        with pytest.raises(ValueError, match="name"):
            await user_service.create_user(name="", email="test@example.com")

    async def test_invalid_email(self, user_service):
        """Boundary: invalid email format rejected."""
        with pytest.raises(ValueError, match="email"):
            await user_service.create_user(name="Test", email="not-an-email")

Generating tests in TypeScript/Jest

The agent is not limited to Python. Here is an example of code the agent generates for a TypeScript service with Jest. Note the use of mocks, type safety, and async/await, which fit the ecosystem:

// Generated by DevOps Agent -- Tests for PaymentService.ts
import { PaymentService } from '../services/PaymentService';
import { StripeClient } from '../clients/StripeClient';

// Mock the Stripe client
jest.mock('../clients/StripeClient');
const MockStripeClient = StripeClient as jest.MockedClass<typeof StripeClient>;

describe('PaymentService', () => {
  let service: PaymentService;
  let mockStripe: jest.Mocked<StripeClient>;

  beforeEach(() => {
    mockStripe = new MockStripeClient() as jest.Mocked<StripeClient>;
    service = new PaymentService(mockStripe);
  });

  describe('processPayment', () => {
    it('should process valid payment and return confirmation', async () => {
      // Happy path
      mockStripe.charge.mockResolvedValue({ id: 'ch_123', status: 'succeeded' });
      const result = await service.processPayment({ amount: 5000, currency: 'ILS', customerId: 'cus_1' });
      expect(result.status).toBe('succeeded');
      expect(mockStripe.charge).toHaveBeenCalledWith(expect.objectContaining({ amount: 5000 }));
    });

    it('should throw on negative amount', async () => {
      // Boundary: negative amount
      await expect(service.processPayment({ amount: -100, currency: 'ILS', customerId: 'cus_1' }))
        .rejects.toThrow('Amount must be positive');
    });

    it('should throw on zero amount', async () => {
      // Boundary: zero amount
      await expect(service.processPayment({ amount: 0, currency: 'ILS', customerId: 'cus_1' }))
        .rejects.toThrow('Amount must be positive');
    });

    it('should handle Stripe timeout gracefully', async () => {
      // Error case: external service timeout
      mockStripe.charge.mockRejectedValue(new Error('Request timeout'));
      const result = await service.processPayment({ amount: 5000, currency: 'ILS', customerId: 'cus_1' });
      expect(result.status).toBe('failed');
      expect(result.error).toContain('timeout');
    });

    it('should handle card declined', async () => {
      // Error case: business logic error
      mockStripe.charge.mockRejectedValue({ type: 'card_error', code: 'card_declined' });
      const result = await service.processPayment({ amount: 5000, currency: 'ILS', customerId: 'cus_1' });
      expect(result.status).toBe('declined');
    });

    it('should enforce maximum transaction amount', async () => {
      // Boundary: max amount
      await expect(service.processPayment({ amount: 10_000_001, currency: 'ILS', customerId: 'cus_1' }))
        .rejects.toThrow('exceeds maximum');
    });
  });
});

Smart Test Generation -- choosing a strategy by code type

Not all code needs the same tests. The agent identifies the type of code and adapts its generation strategy:

Code type	Main strategy	Special emphasis	Example test
API Endpoint	Request/Response testing	Status codes, validation, auth	`test_get_user_returns_404_when_not_found()`
Business Logic	State transitions	Edge cases, invariants	`test_order_cannot_ship_when_unpaid()`
Data Processing	Input/Output pairs	Empty data, malformed input	`test_parse_csv_with_missing_columns()`
Integration	Mock external services	Timeouts, retries, error responses	`test_payment_retries_on_timeout()`
Configuration	Validation tests	Missing required fields, invalid values	`test_config_rejects_empty_db_url()`

# Python - Smart Test Strategy Selection
import re

class TestGenerationAgent:
    # ... (init and generate from above)

    def _select_strategy(self, filename: str, content: str) -> str:
        """Choose test generation strategy based on code type."""
        def has(*patterns: str) -> bool:
            # The keywords are regular expressions, so they need re.search;
            # a plain `in` check would look for the literal text "class.*Model".
            return any(re.search(p, content) for p in patterns)

        strategies = []
        # Detect API endpoints
        if has(r"@app\.route", r"@router\.", r"app\.get\(", r"app\.post\("):
            strategies.append("api_endpoint")
        # Detect data models
        if has(r"class.*Model", r"dataclass", r"\binterface\s", r"(?m)^\s*(?:export\s+)?type\s+\w+"):
            strategies.append("data_validation")
        # Detect business logic
        if has(r"\belif\b", r"\bstate\b", r"\bstatus\b", r"\btransition"):
            strategies.append("state_machine")
        # Detect external integrations
        if has(r"requests\.", r"httpx\.", r"fetch\(", r"axios\."):
            strategies.append("integration_mock")
        return ", ".join(strategies) if strategies else "general"

    async def _generate_tests(self, filename, content, patch) -> str:
        strategy = self._select_strategy(filename, content)
        response = self.client.messages.create(
            model="claude-sonnet-4-6-20260325", max_tokens=4000,
            system=f"""{self.SYSTEM_PROMPT}
Strategy: {strategy}
- api_endpoint: test all HTTP methods, status codes, request validation, auth
- data_validation: test required fields, types, constraints, serialization
- state_machine: test all valid transitions, invalid transitions, edge states
- integration_mock: mock external calls, test timeouts, retries, error handling
- general: happy path, edge cases, error cases, boundary values""",
            messages=[{"role": "user",
                "content": f"Generate tests for:\n## {filename}\n```\n{content}\n```\n## Changes:\n{patch}"}])
        return response.content[0].text

Do it now (10 minutes)

Take a function from your own project that has no tests. Give it to Claude together with the system prompt. Do the tests run? Run pytest/jest. How many passed? What needs fixing?

Deployment Assistant Agent

advanced | 35 minutes | practice

Deployment is the moment when code changes turn into real problems. The Deploy Assistant helps in 3 stages: pre-deploy checks, deploy monitoring, and rollback detection.

Pre-Deploy Checklist

#	Check	Automated?	What is checked
1	Tests passing	Yes	All test suites pass
2	No security issues	Yes	No critical/high findings
3	Code reviewed	Yes	At least one reviewer approved
4	No merge conflicts	Yes	Branch up to date
5	Version bumped	Semi	Version has been updated
6	Changelog updated	Semi	CHANGELOG.md is up to date
7	DB migrations	Manual	Backwards compatible?
8	Feature flags	Manual	Configured correctly?

# Python - Deployment Assistant
class DeployAssistant:
    async def pre_deploy_check(self, repo, branch="main") -> dict:
        checks = {}
        ci = await self._check_ci_status(repo, branch)
        checks["ci_passing"] = ci["all_passing"]
        security = await self._check_security(repo, branch)
        checks["no_critical_security"] = security["critical_count"] == 0
        checks["approved"] = (await self._check_approval(repo, branch))["is_approved"]
        checks["up_to_date"] = await self._check_up_to_date(repo, branch)
        checks["deploy_ready"] = all(checks.values())
        return checks

    async def monitor_deploy(self, environment="production", duration_minutes=15) -> dict:
        import asyncio
        baseline = await self._get_baseline_metrics(environment)
        alerts = []
        # Poll once a minute; stop early on the first critical anomaly
        for minute in range(duration_minutes):
            await asyncio.sleep(60)
            current = await self._get_current_metrics(environment)
            anomalies = self._detect_anomalies(baseline, current)
            if anomalies:
                alerts.extend(anomalies)
                critical = [a for a in anomalies if a["severity"] == "critical"]
                if critical:
                    return {"status": "rollback_recommended", "reason": critical[0]["description"],
                            "alerts": alerts}
        # Return the collected non-critical alerts too, so they are not silently dropped
        return {"status": "healthy", "duration_minutes": duration_minutes, "alerts": alerts}

    def _detect_anomalies(self, baseline, current):
        anomalies = []
        if current.get("error_rate", 0) > baseline.get("error_rate", 1) * 3:
            anomalies.append({"severity": "critical", "metric": "error_rate",
                "description": f"Error rate spiked to {current['error_rate']:.1%}"})
        if current.get("p99_latency", 0) > baseline.get("p99_latency", 100) * 2:
            anomalies.append({"severity": "high", "metric": "p99_latency",
                "description": f"P99 latency: {current['p99_latency']}ms"})
        if current.get("rps", 0) < baseline.get("rps", 1) * 0.5:
            anomalies.append({"severity": "critical", "metric": "request_volume",
                "description": f"Request volume dropped to {current['rps']} rps"})
        return anomalies
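
To make the thresholds concrete, here is a minimal sketch that applies the same three rules to hand-picked numbers. The free function `detect_anomalies` and the sample metrics are illustrative assumptions, not part of the agent above.

```python
def detect_anomalies(baseline: dict, current: dict) -> list[dict]:
    """Apply the three threshold rules: error rate x3, P99 latency x2, traffic below half."""
    anomalies = []
    if current["error_rate"] > baseline["error_rate"] * 3:
        anomalies.append({"severity": "critical", "metric": "error_rate"})
    if current["p99_latency"] > baseline["p99_latency"] * 2:
        anomalies.append({"severity": "high", "metric": "p99_latency"})
    if current["rps"] < baseline["rps"] * 0.5:
        anomalies.append({"severity": "critical", "metric": "request_volume"})
    return anomalies


# Hypothetical numbers: error rate went from 1% to 5%, latency and traffic moved only a little
baseline = {"error_rate": 0.01, "p99_latency": 200, "rps": 100}
after_deploy = {"error_rate": 0.05, "p99_latency": 350, "rps": 90}

found = detect_anomalies(baseline, after_deploy)
print([a["metric"] for a in found])  # only the error-rate rule fires
```

One caveat worth checking in a real system: with a baseline error rate of exactly 0, any single error crosses the `* 3` threshold, so a small absolute floor on the threshold may be needed.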

Rollback Automation

When anomaly detection flags a critical problem, the Deploy Assistant should roll back automatically -- or at least propose a rollback and wait for human approval. Here is an implementation:

# Python - Rollback Manager
import anthropic
import httpx

class RollbackManager:
    """Manages deployment rollbacks with safety checks."""

    def __init__(self, github_token: str, slack_webhook: str | None = None):
        # This client attaches the GitHub token to every request -- use it for GitHub calls only
        self.http = httpx.AsyncClient(
            headers={"Authorization": f"token {github_token}"})
        self.slack_webhook = slack_webhook
        self.client = anthropic.Anthropic()

    async def evaluate_rollback(self, environment: str, anomalies: list) -> dict:
        """Decide whether to rollback based on anomaly severity."""
        critical = [a for a in anomalies if a["severity"] == "critical"]
        high = [a for a in anomalies if a["severity"] == "high"]

        if len(critical) >= 1:
            action = "auto_rollback"
            reason = f"{len(critical)} critical anomalies detected"
        elif len(high) >= 3:
            action = "auto_rollback"
            reason = f"{len(high)} high-severity anomalies detected"
        elif len(high) >= 1:
            action = "suggest_rollback"
            reason = f"{len(high)} high-severity anomalies -- review recommended"
        else:
            action = "monitor"
            reason = "No critical issues, continuing monitoring"

        decision = {"action": action, "reason": reason, "anomalies": anomalies}

        if action == "auto_rollback":
            result = await self._execute_rollback(environment)
            decision["rollback_result"] = result
            await self._notify_team(environment, decision)
        elif action == "suggest_rollback":
            await self._notify_team(environment, decision)

        return decision

    async def _execute_rollback(self, environment: str) -> dict:
        """Execute rollback to previous stable version."""
        previous = await self._get_previous_deployment(environment)
        if not previous:
            return {"status": "failed", "reason": "No previous deployment found"}
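        # Trigger redeployment of previous version
        resp = await self.http.post(
            f"https://api.github.com/repos/{previous['repo']}/deployments",
            json={"ref": previous["sha"], "environment": environment,
                  "description": f"Rollback from {previous['current_sha'][:7]}",
                  "auto_merge": False, "required_contexts": []})
        # Report failure instead of claiming the rollback started when the API rejected the call
        if resp.status_code >= 400:
            return {"status": "failed", "reason": f"GitHub API returned {resp.status_code}"}
        return {"status": "initiated", "previous_sha": previous["sha"],
                "rolled_back_sha": previous["current_sha"]}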

    async def _notify_team(self, environment: str, decision: dict):
        """Send Slack notification about rollback decision."""
        if not self.slack_webhook:
            return
        emoji = {"auto_rollback": "🚨", "suggest_rollback": "⚠️", "monitor": "👀"}
        # Use a separate client: self.http adds the GitHub Authorization header to every
        # request, and that token must not be sent to Slack
        async with httpx.AsyncClient() as slack:
            await slack.post(self.slack_webhook, json={"text":
                f"{emoji.get(decision['action'], '📋')} *Deploy {decision['action']}* "
                f"({environment})\n{decision['reason']}"})
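
The decision thresholds are easier to reason about, and to unit test, as a pure function with no network calls. The sketch below mirrors the rules in `evaluate_rollback`. The function name `decide_action` and the `require_approval` switch, which downgrades an automatic rollback to a suggestion that waits for a human, are assumptions added for illustration.

```python
def decide_action(anomalies: list[dict], require_approval: bool = False) -> str:
    """Return 'auto_rollback', 'suggest_rollback' or 'monitor' for a list of anomalies."""
    critical = sum(1 for a in anomalies if a["severity"] == "critical")
    high = sum(1 for a in anomalies if a["severity"] == "high")

    if critical >= 1 or high >= 3:
        action = "auto_rollback"
    elif high >= 1:
        action = "suggest_rollback"
    else:
        action = "monitor"

    # Hypothetical safety switch: never roll back without a human in the loop
    if require_approval and action == "auto_rollback":
        action = "suggest_rollback"
    return action


print(decide_action([{"severity": "critical"}]))
print(decide_action([{"severity": "high"}]))
print(decide_action([{"severity": "critical"}], require_approval=True))
```

Keeping the policy separate from `_execute_rollback` means the thresholds can be changed and tested without touching the GitHub or Slack code paths.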

Canary Deployment Support

Instead of "all or nothing", the deploy agent can manage canary deployments -- route 5% of traffic to the new version, monitor it, and increase the share gradually:

Stage	Traffic	Duration	Criteria to proceed
1. Canary	5%	10 minutes	Error rate < baseline * 1.5
2. Partial	25%	15 minutes	P99 latency < baseline * 1.3
3. Majority	75%	15 minutes	No critical alerts
4. Full	100%	--	All metrics stable

# Python - Canary Deployment Controller
class CanaryController:
    STAGES = [
        {"name": "canary",   "traffic_pct": 5,   "duration_min": 10},
        {"name": "partial",  "traffic_pct": 25,  "duration_min": 15},
        {"name": "majority", "traffic_pct": 75,  "duration_min": 15},
        {"name": "full",     "traffic_pct": 100, "duration_min": 0},
    ]

    async def run_canary(self, environment, new_version, baseline_metrics):
        for stage in self.STAGES:
            await self._set_traffic_split(environment, new_version, stage["traffic_pct"])
            if stage["duration_min"] == 0:
                return {"status": "complete", "version": new_version}
            # Monitor for the stage duration
            result = await self._monitor_stage(
                environment, baseline_metrics, stage["duration_min"])
            if result["status"] != "healthy":
                await self._set_traffic_split(environment, new_version, 0)  # rollback
                return {"status": "rolled_back", "failed_stage": stage["name"],
                        "reason": result.get("reason")}
        return {"status": "complete", "version": new_version}
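
The controller above delegates the pass/fail decision to `_monitor_stage`, which is not shown. As a rough sketch, the per-stage criteria from the table could be encoded like this. The names `STAGE_CRITERIA` and `stage_passes` and the `critical_alerts` metric key are assumptions made for this example.

```python
from typing import Callable

Metrics = dict[str, float]

# One promotion rule per stage, taken from the criteria column of the table
STAGE_CRITERIA: dict[str, Callable[[Metrics, Metrics], bool]] = {
    "canary": lambda base, cur: cur["error_rate"] < base["error_rate"] * 1.5,
    "partial": lambda base, cur: cur["p99_latency"] < base["p99_latency"] * 1.3,
    "majority": lambda base, cur: cur.get("critical_alerts", 0) == 0,
}


def stage_passes(stage: str, baseline: Metrics, current: Metrics) -> bool:
    """Return True when the stage's criterion holds; stages without a rule always pass."""
    check = STAGE_CRITERIA.get(stage)
    return check(baseline, current) if check else True


baseline = {"error_rate": 0.01, "p99_latency": 200}
print(stage_passes("canary", baseline, {"error_rate": 0.012, "p99_latency": 210}))
print(stage_passes("partial", baseline, {"error_rate": 0.012, "p99_latency": 270}))
```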

Do it now (10 minutes)

Write a pre-deploy checklist for your project: at least 8 items, each classified as automated / semi / manual. Think about your rollback strategy too -- within how many minutes must a full rollback complete?

Production Monitoring Agent

advanced | 30 minutes | practice

An alert arrives at 3 a.m. The Monitoring Agent runs the analysis automatically -- it reads metrics, correlates them with recent deploys, searches the logs, and proposes a diagnosis.

# Python - Production Monitoring Agent
import json

class MonitorAgent:
    SYSTEM_PROMPT = """You are a senior SRE analyzing a production incident.
Provide: 1) Root Cause Analysis (confidence: high/medium/low),
2) Impact Assessment, 3) Immediate Actions, 4) Resolution Steps, 5) Prevention."""

    RUNBOOKS = {
        "high_error_rate": [
            "1. Check for recent deploys (last 2 hours)",
            "2. Compare error signatures before/after deploy",
            "3. If new errors: rollback to previous version",
            "4. Check DB connection pool -- exhaustion causes 500s",
            "5. Check memory/CPU -- OOM kills cause cascading failures"],
        "high_latency": [
            "1. Check database query times",
            "2. Check external API response times",
            "3. Check CPU/memory -- scale if above 90%",
            "4. Check for lock contention in database",
            "5. Check CDN/cache hit rates"],
        "service_down": [
            "1. Check pod status: kubectl get pods -n production",
            "2. Check for OOM kills: kubectl describe pod",
            "3. Check crash loops: kubectl logs --previous",
            "4. If all down: check node health, cloud provider",
            "5. Attempt restart: kubectl rollout restart"],
    }

    async def analyze_alert(self, event) -> dict:
        alert_type = event.get("alert_type", "unknown")
        service = event.get("service", "unknown")
        context = await self._gather_context(service, alert_type)
        runbook = self.RUNBOOKS.get(alert_type)

        response = self.client.messages.create(
            model="claude-sonnet-4-6-20260325", max_tokens=2000,
            system=self.SYSTEM_PROMPT,
            messages=[{"role": "user", "content":
                f"Service: {service}, Alert: {alert_type}\n"
                f"Metrics: {json.dumps(context['current_metrics'])}\n"
                f"Baseline: {json.dumps(context['baseline_metrics'])}\n"
                f"Deploys: {json.dumps(context['recent_deploys'])}\n"
                f"Logs: {chr(10).join(context['recent_logs'])}\n"
                f"Runbook: {chr(10).join(runbook) if runbook else 'None'}"}])
        return {"analysis": response.content[0].text, "service": service}

Context Gathering -- The Information the Agent Needs

The quality of an incident analysis depends on the quantity and quality of the information the agent receives. Here is the function that collects all the required context:

# Python - Monitoring Context Gathering
import os
import time

class MonitorAgent:
    # ... (SYSTEM_PROMPT and RUNBOOKS from above)

    async def _gather_context(self, service: str, alert_type: str) -> dict:
        """Gather all relevant context for incident analysis."""
        import asyncio
        # Fetch all context in parallel for speed
        metrics_task = asyncio.create_task(self._get_metrics(service))
        baseline_task = asyncio.create_task(self._get_baseline(service))
        deploys_task = asyncio.create_task(self._get_recent_deploys(service))
        logs_task = asyncio.create_task(self._get_error_logs(service))
        dependencies_task = asyncio.create_task(self._check_dependencies(service))

        return {
            "current_metrics": await metrics_task,
            "baseline_metrics": await baseline_task,
            "recent_deploys": await deploys_task,
            "recent_logs": await logs_task,
            "dependency_status": await dependencies_task,
        }

    async def _get_metrics(self, service: str) -> dict:
        """Fetch current metrics from monitoring system (Datadog/Grafana)."""
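        # Example: Datadog API query. Simplified -- a real implementation would issue
        # one query per metric instead of reading every value from a single response
        resp = await self.http.get(
            "https://api.datadoghq.com/api/v1/query",
            params={"from": int(time.time()) - 900, "to": int(time.time()),
                    "query": f"avg:http.request.duration{{service:{service}}}"},
            headers={"DD-API-KEY": os.environ["DD_API_KEY"],
                     # The query endpoint also needs an application key
                     # (the DD_APP_KEY variable name is an assumption)
                     "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"]})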
        data = resp.json()
        return {
            "error_rate": self._extract_latest(data, "error_rate"),
            "p50_latency": self._extract_latest(data, "p50"),
            "p99_latency": self._extract_latest(data, "p99"),
            "rps": self._extract_latest(data, "request_count"),
            "cpu_percent": self._extract_latest(data, "cpu"),
            "memory_percent": self._extract_latest(data, "memory"),
        }

    async def _check_dependencies(self, service: str) -> list[dict]:
        """Check health of dependent services (DB, cache, external APIs)."""
        deps = self._get_service_dependencies(service)
        results = []
        for dep in deps:
            try:
                resp = await self.http.get(dep["health_url"], timeout=5.0)
                # A 4xx/5xx answer means the dependency responded but is not healthy
                results.append({"name": dep["name"],
                    "status": "healthy" if resp.status_code < 400 else "unhealthy",
                    "http_status": resp.status_code,
                    "latency_ms": resp.elapsed.total_seconds() * 1000})
            except Exception as e:
                results.append({"name": dep["name"], "status": "unhealthy", "error": str(e)})
        return results
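
During an incident, one of the data sources is often the thing that is broken. The sketch below shows one way to fetch everything in parallel while tolerating individual failures, using `asyncio.gather(..., return_exceptions=True)`. The `gather_context` helper and the two fake fetchers are hypothetical stand-ins for the `_get_*` methods above.

```python
import asyncio
from typing import Awaitable, Callable


async def gather_context(fetchers: dict[str, Callable[[], Awaitable]]) -> dict:
    """Run all fetchers concurrently; a failing fetcher yields an error entry, not a crash."""
    names = list(fetchers)
    results = await asyncio.gather(
        *(fetchers[name]() for name in names), return_exceptions=True)
    context = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            context[name] = {"error": f"{type(result).__name__}: {result}"}
        else:
            context[name] = result
    return context


async def fake_metrics() -> dict:
    return {"error_rate": 0.02}


async def fake_logs() -> list:
    raise TimeoutError("log backend timed out")


context = asyncio.run(gather_context(
    {"current_metrics": fake_metrics, "recent_logs": fake_logs}))
print(context)
```

With this shape the agent still gets the metrics even when the log backend is down, and the error string itself becomes useful context for the analysis.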

Incident Notification Pipeline

Analyzing an incident is not enough on its own -- the right people also need to be notified, on the right channel, at the right urgency level:

# Python - Incident Notification Pipeline
import os

class IncidentNotifier:
    """Routes incident notifications to the right channels."""

    SEVERITY_ROUTING = {
        "critical": {
            "channels": ["slack_oncall", "pagerduty"],
            "mention": "@oncall-team",
            "color": "#FF0000"
        },
        "high": {
            "channels": ["slack_engineering"],
            "mention": "@engineering",
            "color": "#FF8C00"
        },
        "medium": {
            "channels": ["slack_engineering"],
            "mention": "",
            "color": "#FFD700"
        },
    }

    async def notify(self, severity: str, analysis: dict, service: str):
        routing = self.SEVERITY_ROUTING.get(severity, self.SEVERITY_ROUTING["medium"])
        message = self._format_incident_message(severity, analysis, service)
        for channel in routing["channels"]:
            if channel == "pagerduty":
                await self._trigger_pagerduty(severity, message, service)
            elif channel.startswith("slack_"):
                await self._send_slack(channel, message, routing)

    def _format_incident_message(self, severity, analysis, service) -> str:
        return (f"*[{severity.upper()}] Incident on {service}*\n"
                f"*Root Cause:* {analysis.get('root_cause', 'Investigating')}\n"
                f"*Impact:* {analysis.get('impact', 'Unknown')}\n"
                f"*Actions:* {analysis.get('immediate_actions', 'See thread')}\n"
                f"*Recent Deploy:* {analysis.get('related_deploy', 'None detected')}")

    async def _trigger_pagerduty(self, severity, message, service):
        await self.http.post("https://events.pagerduty.com/v2/enqueue",
            json={"routing_key": os.environ["PAGERDUTY_KEY"],
                  "event_action": "trigger",
                  "payload": {"summary": message[:1024], "severity": severity,
                              "source": service, "component": "devops-agent"}})

Do it now (10 minutes)

Write a runbook for a common incident in your project: 5-7 clear steps. For each step: what to check, the criteria, and what to do. Also consider who should be notified, and at what urgency level.
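
A note on the `_trigger_pagerduty` call in the notifier: the PagerDuty Events API v2 accepts only `critical`, `error`, `warning`, and `info` as `payload.severity`, so the agent's own levels need translating if anything other than `critical` is ever routed there. A minimal sketch follows. The mapping of `high` to `error` and `medium` to `warning` is an assumption, as is the helper name `build_pagerduty_event`.

```python
# Agent severity -> PagerDuty Events API v2 severity (the mapping itself is an assumption)
PAGERDUTY_SEVERITY = {"critical": "critical", "high": "error", "medium": "warning"}


def build_pagerduty_event(routing_key: str, severity: str, message: str, service: str) -> dict:
    """Build a trigger event whose severity is always a value PagerDuty accepts."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": message[:1024],  # summary is capped at 1024 characters
            "severity": PAGERDUTY_SEVERITY.get(severity, "info"),
            "source": service,
            "component": "devops-agent",
        },
    }


event = build_pagerduty_event("dummy-key", "high", "Error rate spike on checkout", "checkout")
print(event["payload"]["severity"])
```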

CI/CD Integration and GitHub Actions

intermediate | 30 minutes | practice

In production, the agents need to run automatically on every PR. GitHub Actions is the natural way to do that.

# .github/workflows/devops-agent.yml
name: DevOps Agent Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
permissions:
  pull-requests: write
  contents: read
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - run: pip install anthropic httpx
      - name: Run DevOps Agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: python scripts/run_review.py --repo "${{ github.repository }}" --pr "${{ github.event.pull_request.number }}"

Webhook Setup -- Configuring a GitHub Webhook Step by Step

If you want the agent to run from your own server (instead of GitHub Actions), you will need to configure a GitHub Webhook. This is the full process:

Step 1: Configure the webhook in GitHub

# You can configure the webhook through the API:
curl -X POST \
  -H "Authorization: token YOUR_GITHUB_TOKEN" \
  -H "Accept: application/vnd.github.v3+json" \
  https://api.github.com/repos/OWNER/REPO/hooks \
  -d '{
    "name": "web",
    "active": true,
    "events": ["pull_request", "issue_comment", "deployment", "deployment_status"],
    "config": {
      "url": "https://your-server.com/webhook",
      "content_type": "json",
      "secret": "YOUR_WEBHOOK_SECRET",
      "insecure_ssl": "0"
    }
  }'

# Or through the UI:
# 1. Settings -> Webhooks -> Add webhook
# 2. Payload URL: https://your-server.com/webhook
# 3. Content type: application/json
# 4. Secret: YOUR_WEBHOOK_SECRET (same as in your server)
# 5. Events: "Let me select individual events" -> Pull requests, Issue comments, Deployments

Step 2: Webhook Server with Signature Verification

// TypeScript - Complete Webhook Server
import express from "express";
import crypto from "crypto";

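const app = express();
// Keep the raw request bytes: the HMAC must be computed over the exact payload GitHub sent.
// Re-serializing with JSON.stringify(req.body) can produce different bytes and a false mismatch.
app.use(express.json({
  verify: (req, _res, buf) => { (req as any).rawBody = buf; },
}));

// Signature verification middleware
function verifySignature(req: express.Request, res: express.Response, next: express.NextFunction) {
  const sig = req.headers["x-hub-signature-256"] as string;
  if (!sig) return res.status(401).send("Missing signature");

  const expected = "sha256=" + crypto.createHmac("sha256", process.env.WEBHOOK_SECRET!)
    .update((req as any).rawBody).digest("hex");

  const sigBuf = Buffer.from(sig);
  const expectedBuf = Buffer.from(expected);
  // timingSafeEqual throws when the lengths differ, so check the length first
  if (sigBuf.length !== expectedBuf.length || !crypto.timingSafeEqual(sigBuf, expectedBuf))
    return res.status(401).send("Invalid signature");

  next();
}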

// Health check endpoint (useful for monitoring)
app.get("/health", (req, res) => res.json({ status: "ok", uptime: process.uptime() }));

// Main webhook handler
app.post("/webhook", verifySignature, async (req, res) => {
  // IMPORTANT: Respond immediately, then process async
  // GitHub expects a response within 10 seconds, agent processing takes longer
  res.status(200).send("OK");

  const event = req.headers["x-github-event"] as string;
  const action = req.body.action;
  const repo = req.body.repository?.full_name;
  const pr = req.body.pull_request?.number;

  console.log(`[Webhook] ${event}.${action} on ${repo} PR #${pr}`);

  try {
    // Route to appropriate handler
    switch (`${event}.${action}`) {
      case "pull_request.opened":
      case "pull_request.synchronize":
      case "pull_request.reopened":
        await handlePullRequest(repo, pr, req.body);
        break;
      case "issue_comment.created":
        if (req.body.issue?.pull_request && req.body.comment?.body.includes("@devops-agent"))
          await handleBotMention(repo, req.body.issue.number, req.body.comment);
        break;
      case "deployment.created":
        await handleDeployment(repo, req.body.deployment);
        break;
    }
  } catch (err) {
    console.error(`[Webhook] Error processing ${event}.${action}:`, err);
    // Don't throw -- we already sent 200. Log and alert instead.
  }
});

app.listen(3000, () => console.log("Webhook server running on :3000"));
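
For readers building the webhook server in Python, the same check can be sketched with the standard library only. The function name `verify_github_signature` is an assumption. The important detail is that the HMAC is computed over the raw request bytes, before any JSON parsing.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(secret: str, raw_body: bytes,
                            signature_header: Optional[str]) -> bool:
    """Check the X-Hub-Signature-256 header against an HMAC-SHA256 of the raw body."""
    if not signature_header:
        return False
    expected = "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time and handles inputs of different length
    return hmac.compare_digest(expected.encode(), signature_header.encode())


body = b'{"action": "opened"}'
good_signature = "sha256=" + hmac.new(b"s3cret", body, hashlib.sha256).hexdigest()

print(verify_github_signature("s3cret", body, good_signature))
print(verify_github_signature("s3cret", body + b" ", good_signature))
```

In a web framework, pass the unparsed request body (the bytes as received) as `raw_body`.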

Step 3: Running with ngrok for local development

# For local development, expose your server via ngrok:
# 1. Install ngrok: https://ngrok.com/download
ngrok http 3000

# 2. Copy the https URL (e.g., https://abc123.ngrok-free.app)
# 3. Set it as the webhook URL in GitHub
# 4. Test by opening a PR

# For production, deploy to:
# - Railway / Render / Fly.io (simple, $5-10/month)
# - Cloud Run (auto-scale, pay per request)
# - Your existing VPS / Kubernetes cluster

Slack Notification Integration

The agent produces reviews and security findings, but without notifications nobody will know about them. Slack is the natural channel for team messages. Here is a full integration that adapts the message to the event type:

# Python - Slack Notification Helper
import httpx

class SlackNotifier:
    """Send formatted notifications to Slack channels."""

    def __init__(self, webhook_url: str):
        self.webhook_url = webhook_url
        self.http = httpx.AsyncClient()

    async def notify_review_complete(self, repo: str, pr_number: int, review: dict):
        """Notify team when code review is complete."""
        score = review.get("score", 0)
        color = "#36a64f" if score >= 80 else "#ff9900" if score >= 60 else "#ff0000"
        comment_count = len(review.get("comments", []))
        critical = sum(1 for c in review.get("comments", []) if c.get("severity") == "critical")

        blocks = [
            {"type": "header", "text": {"type": "plain_text",
                "text": f"Code Review: {repo} PR #{pr_number}"}},
            {"type": "section", "fields": [
                {"type": "mrkdwn", "text": f"*Score:* {score}/100"},
                {"type": "mrkdwn", "text": f"*Comments:* {comment_count}"},
                {"type": "mrkdwn", "text": f"*Critical Issues:* {critical}"},
                {"type": "mrkdwn", "text": f"*Status:* {'Approved' if review.get('approved') else 'Changes Requested'}"},
            ]},
            {"type": "actions", "elements": [
                {"type": "button", "text": {"type": "plain_text", "text": "View PR"},
                 "url": f"https://github.com/{repo}/pull/{pr_number}"}
            ]}
        ]
        await self.http.post(self.webhook_url, json={"blocks": blocks})

    async def notify_security_findings(self, repo: str, pr_number: int, findings: list):
        """Alert on security findings -- only critical and high."""
        critical_findings = [f for f in findings if f.severity in ("critical", "high")]
        if not critical_findings: return  # Don't spam for low-severity
        text = (f"*Security Alert: {repo} PR #{pr_number}*\n"
                f"Found {len(critical_findings)} critical/high issues:\n"
                + "\n".join(f"  - [{f.severity.upper()}] {f.title} in `{f.file}:{f.line}`"
                    for f in critical_findings[:5]))
        await self.http.post(self.webhook_url, json={"text": text})

Framework: GitHub Actions vs Webhook Server -- Decision Matrix

Dimension	GitHub Actions	Webhook Server
Setup	Low -- YAML file	High -- server, hosting, SSL
Cost	Free tier (2000 min/month)	$5-20/month hosting
Latency	15-30s (cold start)	2-5s (always running)
State	Limited	Full (database, memory)
Customization	Medium	Full

Recommendation: start with GitHub Actions. Move to a webhook server only if you need low latency or state management.

Do it now (15 minutes)

Create a workflow YAML in your repo. Add ANTHROPIC_API_KEY as a secret. Open a PR and verify that the workflow runs.

Context, Learning, and Continuous Improvement

intermediate | 25 minutes | concept + practice

A review agent that doesn't know the codebase is like an external reviewer brought in for a single day. It will catch generic bugs, but it will miss context such as "we always use a Result type, not exceptions" or "camelCase for the API, snake_case for the DB."

3 Layers of Context

Layer	Source	When
Repository	README, CONTRIBUTING, .eslintrc	Every review (cached)
Historical	Past reviews, common issues	Every review (DB)
Session	The specific PR, comments	Every review (specific)

# Python - Context Manager
import sqlite3

class ReviewContextManager:
    def __init__(self, db_path="devops_agent.db"):
        self.db = sqlite3.connect(db_path)
        self.db.executescript("""
            CREATE TABLE IF NOT EXISTS review_history (
                id INTEGER PRIMARY KEY, repo TEXT, pr_number INTEGER,
                file_path TEXT, comment_body TEXT, severity TEXT, category TEXT,
                human_agreed BOOLEAN DEFAULT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
            CREATE TABLE IF NOT EXISTS learned_rules (
                id INTEGER PRIMARY KEY, repo TEXT, rule_text TEXT,
                source TEXT, confidence REAL DEFAULT 0.5, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
        """)

    def record_human_feedback(self, repo, comment_id, agreed):
        self.db.execute("UPDATE review_history SET human_agreed=? WHERE id=?", (agreed, comment_id))
        if not agreed:
            row = self.db.execute("SELECT comment_body, category FROM review_history WHERE id=?", (comment_id,)).fetchone()
            if row:
                self.db.execute("INSERT INTO learned_rules (repo, rule_text, source, confidence) VALUES (?,?,?,?)",
                    (repo, f"AVOID: {row[0][:200]} ({row[1]})", "human_correction", 0.8))
        self.db.commit()

    def get_context(self, repo):
        rules = self.db.execute("SELECT rule_text FROM learned_rules WHERE repo=? AND confidence>0.5 ORDER BY confidence DESC LIMIT 20", (repo,)).fetchall()
        return "## Learned Rules\n" + "\n".join(f"- {r[0]}" for r in rules) if rules else ""

Do it now (5 minutes)

Think of 5 conventions in your project that an external reviewer would not know. Write them down as rules for the agent.

Evaluation -- Is the Agent Actually Good?

intermediate | 25 minutes | concept + practice

Is the agent actually useful, or is it generating noise? Evaluation is what separates "an impressive tool" from "a tool that changes the workflow."

4 Critical Metrics

Metric	What is measured	Target	How
Precision	What share of the agent's comments are correct?	>80%	Human labels each comment valid/FP
Recall	What share of the real issues were found?	>60%	Compare to human review
False Positive Rate	What share of the agent's comments are false positives?	<20%	Human reviews comments
Dev Satisfaction	Are the reviews helpful?	>70%	Survey: helpful Y/N
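
The first three metrics can be made concrete with a small worked example. The counts below are invented for illustration: out of 20 agent comments, humans confirmed 16 and rejected 4, and the human review found 9 further issues that the agent missed. Note that "false positive rate" as used in this chapter is the share of the agent's comments that were wrong, which makes it the complement of precision.

```python
def review_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Compute precision, recall and the share of agent comments that were false positives."""
    flagged = true_positives + false_positives      # everything the agent commented on
    real_issues = true_positives + false_negatives  # everything a human considers a real issue
    return {
        "precision": true_positives / flagged if flagged else 0.0,
        "recall": true_positives / real_issues if real_issues else 0.0,
        "false_positive_share": false_positives / flagged if flagged else 0.0,
    }


# Hypothetical month: 16 confirmed comments, 4 rejected, 9 real issues missed
metrics = review_metrics(true_positives=16, false_positives=4, false_negatives=9)
print(metrics)
```

Here precision is 16/20, recall is 16/25, and the false-positive share is 4/20, so this hypothetical agent would just meet the precision and recall targets in the table.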

# Python - Evaluation Framework
import sqlite3

class ReviewEvaluator:
    def __init__(self, db_path="devops_agent.db"):
        self.db = sqlite3.connect(db_path)

    def calculate_metrics(self, repo, days=30):
        stats = self.db.execute("""
            SELECT COUNT(*), SUM(CASE WHEN human_agreed=1 THEN 1 ELSE 0 END),
                   SUM(CASE WHEN human_agreed=0 THEN 1 ELSE 0 END)
            FROM review_history WHERE repo=? AND created_at>datetime('now',?)
        """, (repo, f"-{days} days")).fetchone()
        total, agreed, disagreed = stats[0], stats[1] or 0, stats[2] or 0
        reviewed = agreed + disagreed  # only comments that a human has labeled
        # Both ratios are over labeled comments, so false_positive_rate == 1 - precision
        return {"precision": agreed/reviewed if reviewed else 0,
                "false_positive_rate": disagreed/reviewed if reviewed else 0,
                "labeled_comments": reviewed, "total_comments": total}

Evaluation Report -- An Automated Monthly Report

The agent should produce a monthly evaluation report that shows trends over time. Here is an implementation that analyzes the data and produces a summary:

# Python - Monthly Evaluation Report Generator
class EvaluationReporter:
    def __init__(self, db_path="devops_agent.db"):
        self.db = sqlite3.connect(db_path)

    def generate_monthly_report(self, repo: str) -> dict:
        """Generate comprehensive monthly evaluation report."""
        metrics_30d = self._calculate_period_metrics(repo, 30)
        metrics_7d = self._calculate_period_metrics(repo, 7)
        metrics_prev_30d = self._calculate_period_metrics(repo, 60, offset=30)

        # Category breakdown -- which types of comments are most accurate?
        category_stats = self.db.execute("""
            SELECT category,
                   COUNT(*) as total,
                   SUM(CASE WHEN human_agreed=1 THEN 1 ELSE 0 END) as agreed,
                   SUM(CASE WHEN human_agreed=0 THEN 1 ELSE 0 END) as disagreed
            FROM review_history
            WHERE repo=? AND created_at>datetime('now','-30 days')
                AND human_agreed IS NOT NULL
            GROUP BY category ORDER BY total DESC
        """, (repo,)).fetchall()

        # Top false positive patterns -- what to fix in prompt
        fp_patterns = self.db.execute("""
            SELECT category, comment_body, COUNT(*) as count
            FROM review_history
            WHERE repo=? AND human_agreed=0 AND created_at>datetime('now','-30 days')
            GROUP BY category, substr(comment_body, 1, 100)
            ORDER BY count DESC LIMIT 5
        """, (repo,)).fetchall()

        return {
            "period": "30 days",
            "overall": metrics_30d,
            "trend": {
                "precision_delta": metrics_30d["precision"] - metrics_prev_30d["precision"],
                "fp_rate_delta": metrics_30d["false_positive_rate"] - metrics_prev_30d["false_positive_rate"],
            },
            "last_7_days": metrics_7d,
            "by_category": [{"category": c[0], "total": c[1],
                "precision": c[2]/(c[2]+c[3]) if (c[2]+c[3]) else 0} for c in category_stats],
            "top_fp_patterns": [{"category": p[0], "pattern": p[1][:100], "count": p[2]}
                for p in fp_patterns],
            "recommendations": self._generate_recommendations(metrics_30d, category_stats, fp_patterns),
        }

    def _generate_recommendations(self, metrics, categories, fp_patterns) -> list[str]:
        recs = []
        if metrics["precision"] < 0.8:
            recs.append("Precision below 80% -- review prompt instructions, add 'only confident findings'")
        if metrics["false_positive_rate"] > 0.2:
            recs.append(f"FP rate above 20% -- top FP category: {fp_patterns[0][0] if fp_patterns else 'unknown'}")
        for cat in categories:
            cat_precision = cat[2]/(cat[2]+cat[3]) if (cat[2]+cat[3]) else 0
            if cat_precision < 0.5 and cat[1] > 5:
                recs.append(f"Category '{cat[0]}' has {cat_precision:.0%} precision -- consider disabling or retuning")
        return recs

A/B Testing of Prompt Variations

To improve the agent systematically, run A/B tests on prompt variations. Without a controlled comparison there is no reliable way to know whether a prompt change actually improves results:

Experiment	Variant A (control)	Variant B (test)	Metric
Comment limit	"Max 5-7 comments"	"Max 3-5 comments, critical only"	Precision, developer satisfaction
Context depth	Diff only	Full file + diff	False positive rate
Severity filter	All severities	Critical + High only	Signal-to-noise ratio
Tone	Direct ("Fix this")	Suggestive ("Consider...")	Developer satisfaction
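
One way to run such an experiment is to assign each PR to a variant deterministically and then compare precision per variant from the human labels. The sketch below is an assumption about how this could be wired up. `assign_variant` and `precision_by_variant` are hypothetical helpers, and hashing the PR identity is just one simple assignment scheme.

```python
import hashlib


def assign_variant(repo: str, pr_number: int, experiment: str) -> str:
    """Deterministically map a PR to 'A' or 'B' so re-runs of the same PR use the same prompt."""
    digest = hashlib.sha256(f"{experiment}:{repo}#{pr_number}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def precision_by_variant(labels: list[tuple[str, bool]]) -> dict[str, float]:
    """labels holds (variant, human_agreed) pairs; returns the agreed share per variant."""
    totals: dict[str, list[int]] = {}
    for variant, agreed in labels:
        counts = totals.setdefault(variant, [0, 0])
        counts[0] += int(agreed)  # comments the human agreed with
        counts[1] += 1            # all labeled comments for this variant
    return {variant: agreed / total for variant, (agreed, total) in totals.items()}


variant = assign_variant("acme/api", 42, "comment-limit")
print(variant)
print(precision_by_variant([("A", True), ("A", False), ("B", True), ("B", True)]))
```

With only a handful of labeled comments per variant the difference is mostly noise, so collect enough reviews before drawing conclusions.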

Israeli Dev Teams -- Special Considerations

Commit language: some developers write commit messages in Hebrew. The agent needs to understand both languages
"Move fast" culture: at most 5 comments, substantive ones only
Security: 8200/Unit 81 alumni expect high-grade security scanning
Time zones: teams with US/EU offices gain the most from agent reviews

Common Mistakes -- and How to Avoid Them

beginner | 15 minutes | concept

Mistake 1: "Review Everything" -- an agent that comments on everything

What happens: 15-20 comments, most of them nit-picks.

Why it's bad: alert fatigue. Developers stop reading.

Fix: at most 5-7 comments. Prioritize: critical > high > medium.
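
The cap can also be enforced in code after the model responds, rather than relying on the prompt alone. A minimal sketch, assuming each comment is a dict with a `severity` key (the names `SEVERITY_ORDER` and `select_comments` are hypothetical):

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def select_comments(comments: list[dict], max_comments: int = 7) -> list[dict]:
    """Keep the most severe comments first and drop everything beyond the cap."""
    ranked = sorted(
        comments,
        # Unknown severities sort last; sorted() is stable, so ties keep their original order
        key=lambda c: SEVERITY_ORDER.get(c["severity"], len(SEVERITY_ORDER)),
    )
    return ranked[:max_comments]


comments = [
    {"severity": "medium", "body": "Consider extracting this helper"},
    {"severity": "critical", "body": "SQL built with string concatenation"},
    {"severity": "low", "body": "Typo in comment"},
    {"severity": "high", "body": "Missing await on async call"},
]
print([c["severity"] for c in select_comments(comments, max_comments=2)])
```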

טעות 2: "Hallucinated Bugs" -- bugs שלא קיימים

מה קורה: Agent מדווח bug שלא שם -- למשל null check שכבר קיים מחוץ ל-diff.

למה רע: False positives הורסים אמון.

פתרון: תמיד תנו קובץ מלא, לא רק diff. בprompt: "Only report confident issues."

טעות 3: Agent בלי Context -- "Outsider Reviewer"

מה קורה: Agent מציע patterns שמנוגדים לarchitecture decisions קיימים.

למה רע: מפתחים מאבדים אמון.

פתרון: Context system -- README, CONTRIBUTING, learned rules. עדכון חודשי.

Mistake 4: A Security Scanner with Too Many False Positives

What happens: it flags every DOM manipulation, every eval, every f-string, even when they are safe.

Why it's bad: "The Boy Who Cried Wolf". When a real finding shows up, nobody will notice.

Fix: combine regex with an LLM. Regex finds candidates, the LLM filters out FPs. Only findings that both agree on are reported.
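
The two-stage idea can be sketched as follows. The three patterns are illustrative only, and `llm_confirms` is a hypothetical callable standing in for the LLM judgment (in practice a model call that returns True when the candidate is a real issue). Neither is the chapter's actual scanner.

```python
import re
from typing import Callable

# Illustrative patterns only; a real scanner would need a much broader set.
CANDIDATE_PATTERNS = {
    "eval_call": re.compile(r"""\beval\s*\("""),
    "sql_fstring": re.compile(r"""execute\s*\(\s*f["']"""),
    "inner_html": re.compile(r"""\.innerHTML\s*="""),
}


def find_candidates(source: str) -> list[dict]:
    """Stage 1: cheap regex pass that over-reports on purpose."""
    candidates = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in CANDIDATE_PATTERNS.items():
            if pattern.search(line):
                candidates.append({"rule": rule, "line": line_no, "code": line.strip()})
    return candidates


def scan(source: str, llm_confirms: Callable[[dict, str], bool]) -> list[dict]:
    """Stage 2: report only the candidates that the LLM judge also confirms."""
    return [c for c in find_candidates(source) if llm_confirms(c, source)]
```

Because the LLM only sees lines the regex already flagged, the number of model calls stays proportional to the number of candidates rather than to the size of the codebase.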

Work Routine -- Chapter 18

Frequency	Task	Time
Daily	Check that the agent ran on PRs. Any errors?	2 min
Weekly	Review 5 reviews: precision? FP rate? developer reactions?	15 min
Weekly	Check security findings: is everything handled?	10 min
Monthly	Evaluation report. Update the prompt and context	30 min
Monthly	Review generated tests: how many made it into the codebase?	20 min
Quarterly	ROI: reviewer hours saved, bugs caught, satisfaction	1 hour

If You Do Only One Thing from This Chapter (30 minutes)

Build the basic Code Review Agent. Take CodeReviewAgent, configure a GitHub token, and run it on one real PR. You will see inline comments in GitHub within minutes, not days. Once it works, add security and test generation as phase 2.

Exercises

Exercise 1: Code Review Agent on Real PRs (60 minutes)

Build a CodeReviewAgent with these tools: git diff parser, file reader, LLM analyzer
Run it on 5 real PRs from your own repo (or an open-source repo)
For each PR, record what the agent found, what the human reviewer found, and where they overlapped
Compute precision (how many of the findings are relevant), recall (how many real bugs were caught), and FP rate
Refine the prompt until you reach precision above 75% and FP rate below 25%

Deliverable: a working code review agent + a comparison table against human review.
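
For the comparison table, the three metrics can be computed from two sets of finding identifiers. This is a sketch under two assumptions: findings are keyed by a hypothetical "path:line" string, and FP rate means the share of reported findings that turned out to be wrong, which matches the paired targets above (precision above 75%, FP rate below 25%). The function name `review_metrics` is illustrative.

```python
def review_metrics(agent_findings: set[str], real_issues: set[str]) -> dict[str, float]:
    """Compare agent findings with issues confirmed by a human reviewer."""
    tp = len(agent_findings & real_issues)  # reported and real
    fp = len(agent_findings - real_issues)  # reported but not real
    fn = len(real_issues - agent_findings)  # real but missed
    reported = tp + fp
    return {
        "precision": tp / reported if reported else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Assumption: FP rate is measured against reported findings.
        "false_positive_rate": fp / reported if reported else 0.0,
    }
```

Under this definition precision and FP rate always sum to 1 whenever the agent reports anything, so recall is the metric that adds independent information to the table.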

Exercise 2: Security Scanner with Planted Vulnerabilities (45 minutes)

Create a new branch with 10 planted vulnerabilities: 5 secrets (API keys in code), 3 code vulnerabilities (SQL injection, XSS, path traversal), 2 configuration issues (open CORS, debug mode)
Run the security agent on the branch
Measure recall: how many of the 10 vulnerabilities were detected?
Add 5 lines that look suspicious but are safe, and measure the FP rate
Tune the threshold until you get recall above 80% with FP rate below 30%

Deliverable: a calibrated security scanner + a precision/recall report.

Exercise 3: End-to-End CI/CD Integration (60 minutes)

Create a GitHub Actions workflow file that runs the agent on every new PR
Configure secrets in GitHub: an API key for the model, a GitHub token for posting comments
Configure the agent to write inline comments on problematic lines in the PR
Open a test PR and verify that the agent runs automatically and responds
Add a Slack/Telegram notification when the agent finds a HIGH-severity issue

Deliverable: a working workflow YAML + a PR with automated comments from the agent.

Exercise 4: Full DevOps Agent (90 minutes)

Combine all the agents from the previous exercises into a single agent with an Event Router
Define a response for each event: PR Opened (review + security + tests), Merged (pre-deploy check), Deploy (monitor for 15 minutes), Alert (analysis + runbook)
Build a simple dashboard that shows how many PRs were scanned, how many findings were reported, and the average response time
Run it on 3 real PRs and document the results

Deliverable: an integrated DevOps agent with GitHub Actions + a dashboard.
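
The event-to-agent mapping from the exercise can be expressed as a small dispatch table. This is a minimal sketch, not the chapter's Event Router. The event names (`pr_opened`, `merged`, `deploy`, `alert`) and the `make_agent` stand-ins are assumptions, and the real handlers would call the agents you built in the earlier exercises.

```python
from typing import Callable

Handler = Callable[[dict], str]


class EventRouter:
    """Maps an event type to the ordered list of agents that should handle it."""

    def __init__(self) -> None:
        self._routes: dict[str, list[Handler]] = {}

    def on(self, event_type: str, *handlers: Handler) -> None:
        self._routes.setdefault(event_type, []).extend(handlers)

    def dispatch(self, event_type: str, payload: dict) -> list[str]:
        """Run every handler registered for the event, in registration order."""
        return [handler(payload) for handler in self._routes.get(event_type, [])]


def make_agent(name: str) -> Handler:
    """Stand-in for a real agent: returns a label instead of doing any work."""
    def handle(payload: dict) -> str:
        return f"{name}:{payload.get('id', '?')}"
    return handle


router = EventRouter()
router.on("pr_opened", make_agent("review"), make_agent("security"), make_agent("tests"))
router.on("merged", make_agent("pre_deploy_check"))
router.on("deploy", make_agent("monitor"))
router.on("alert", make_agent("incident_analysis"), make_agent("runbook"))
```

Unknown event types dispatch to an empty list instead of raising, which is a reasonable default for a webhook receiver that may see events it does not care about.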

Check Yourself -- 5 Questions

What is the difference between a review agent that uses the diff only and one that gets the full file + diff? Give an example of an FP caused by diff-only review.
Describe the 4 layers of security scanning. Why is regex alone not enough?
What are the 3 strategies for generating tests? Why is the happy path alone not enough?
What is the difference between GitHub Actions and a webhook server? When is each appropriate?
What are precision and false positive rate? Why is a high FP rate more dangerous than low recall?

Got 4 out of 5? Excellent, you are ready for Chapter 19.

Chapter Summary

In this chapter you built a complete DevOps Agent system. You started with the architecture: an Event Router dispatching to 4 specialist agents. You built a Code Review Agent that analyzes 5 categories and posts inline comments. You added a Security Scanner with 4 layers: secrets, vulnerabilities, dependencies, config. You built a Test Generation Agent with pytest/Jest. You created a Deployment Assistant with a pre-deploy checklist and anomaly detection. You built a Production Monitor with incident analysis and runbooks. You integrated everything into CI/CD with GitHub Actions and a webhook server. You added a Context system that learns from history and feedback. You built an Evaluation framework with 4 metrics.

The key point: a good DevOps Agent finds the right issues, in a way developers trust. Precision > Recall. Quality > Quantity. 5 accurate comments are worth more than 50 with 30% FPs.

In Chapter 19 you will move all the agents to production: hosting, monitoring, scaling, cost optimization.

Checklist -- Chapter 18 Summary

I understand the architecture: Event Router + 4 Specialist Agents
I built a Code Review Agent with inline comments in GitHub
I understand the 5 review categories: correctness, security, performance, readability, testing
I built a Security Scanner with 4 layers
I built a Test Generation Agent with pytest / Jest
I built a Deployment Assistant with a pre-deploy checklist
I built a Production Monitor with incident analysis
I integrated the agents into GitHub Actions
I understand GitHub Actions vs Webhook Server
I built a Context system that learns from history and feedback
I understand the 4 evaluation metrics
I know the common mistakes: over-commenting, hallucinated bugs, no context, FP overload
I know how to integrate Slack notifications
I ran an evaluation: precision > 75%, FP rate < 25%
Deliverable: GitHub-integrated DevOps agent