AI एजेंट सुरक्षा: Production एजेंट Fleets के लिए Threat Model

हर AI एजेंट framework आपको एजेंट्स build करने के लिए tools देता है। लगभग कोई भी आपको उन्हें contain करने के लिए tools नहीं देता। जब एक एजेंट APIs call कर सकता है, web browse कर सकता है, code execute कर सकता है, और databases access कर सकता है, सुरक्षा प्रश्न यह नहीं है कि क्या कुछ गलत हो सकता है — यह है कि blast radius कैसा दिखता है जब होता है।

AI एजेंट सुरक्षा स्वायत्त एजेंट्स को constrain करने का practice है ताकि एक compromised, misconfigured, या misbehaving एजेंट credentials leak न करे, data exfiltrate न करे, budgets drain न करे, या privileges escalate न करे। OpenLegion इसे एक core architectural concern मानता है, add-on नहीं। हर एजेंट vault-proxied credentials, प्रति-एजेंट बजट controls, और एक permission matrix के साथ एक isolated कंटेनर में चलता है — सभी default रूप से enabled।

अपनी खुद की LLM API keys लाएँ। Model usage पर कोई markup नहीं।

AI एजेंट सुरक्षा क्या है?

AI एजेंट सुरक्षा उन controls को encompass करती है जो स्वायत्त AI एजेंट्स को नुकसान पहुँचाने से रोकती हैं — चाहे credential leakage, prompt injection, resource abuse, data exfiltration, या excessive agency के माध्यम से। इसमें runtime isolation, credential management, cost enforcement, permission controls, और infrastructure level पर applied input validation शामिल हैं।

TL;DR

Threat real है। Research दिखाता है कि AI deployments वाले 77% organizations ने 2024 में security incidents का अनुभव किया। केवल 5% अपने AI सुरक्षा measures में विश्वास व्यक्त करते हैं।
चार primary threats: Credential leakage, prompt injection, resource abuse (denial of wallet), और data exfiltration। प्रत्येक को एक भिन्न mitigation चाहिए।
कोई बड़ा framework अंतर्निहित सुरक्षा प्रदान नहीं करता। Public documentation के आधार पर, LangGraph, CrewAI, AutoGen, और OpenClaw सभी credentials के लिए environment variables पर rely करते हैं बिना native isolation या बजट enforcement के।
OpenLegion का six-layer defense: कंटेनर isolation, कंटेनर hardening, credential separation (vault proxy), permission enforcement, input validation, और unicode sanitization — सभी default रूप से enabled।
Secure AI एजेंट्स BYO keys के साथ संभव हैं — vault proxy model का अर्थ है आपकी keys trusted zone में रहती हैं और एजेंट्स एक proxy के माध्यम से interact करते हैं जो कच्चे secrets कभी expose नहीं करता।

AI एजेंट्स के लिए Threat Model

Threat 1: Credential leakage

क्या होता है। एक एजेंट जिसके पास API keys तक access है — environment variables, configuration files, या in-context passing के माध्यम से — उन keys को prompt injection, logging, error messages, या malicious tool calls के माध्यम से leak करता है।

कितना common। 2026 की शुरुआत में published research ने पाया कि 3,984 scanned एजेंट skills में से 283 (7.1%) में critical credential-handling flaws थे, plaintext में LLM context के माध्यम से API keys और passwords pass करते थे। अलग से, 76 skills में credential theft के लिए designed deliberately malicious payloads थे। High-profile incidents में शामिल हैं एक xAI employee द्वारा GitHub पर API key leak करना जो 60+ private LLMs तक दो महीने के लिए access प्रदान करती थी, और एक popular LLM platform में एक vulnerability जो एक unauthenticated endpoint के माध्यम से API keys expose करती थी।

OpenLegion इसे कैसे mitigate करता है। OpenLegion एक vault proxy के माध्यम से vault-proxied credentials का उपयोग करता है। API keys Mesh Host (Zone 2) में stored हैं। जब एक एजेंट को एक external API call करने की आवश्यकता है, request vault proxy के माध्यम से route करती है, जो credential को network layer पर inject करती है। एजेंट कच्ची key को कभी नहीं देखता, log नहीं करता, या उसकी memory access नहीं रखता। एक पूरी तरह compromised एजेंट भी credentials extract नहीं कर सकता क्योंकि वे एजेंट के कंटेनर में कभी present नहीं होते।

Threat 2: Prompt injection

क्या होता है। एक attacker उस content में malicious instructions embed करता है जिसे एजेंट process करता है — web pages, documents, emails, database records, user inputs। एजेंट injected instructions का पालन करता है इसके intended task के बजाय (या इसके अतिरिक्त)।

कितना common। Prompt injection security audits के दौरान assessed 73% से अधिक production AI deployments में दिखाई देता है। OpenAI ने December 2025 में कहा कि prompt injection "is unlikely to ever be fully solved।" OWASP इसे LLM applications के लिए #1 vulnerability rank करता है। Real-world incidents में शामिल हैं एक browser एजेंट जिसे hidden instructions के माध्यम से 150 seconds में credentials चुराने के लिए tricked किया गया, और enterprise RAG systems जहाँ public documents में malicious content ने एजेंट्स को proprietary data leak करने का कारण बनाया।

OpenLegion इसे कैसे mitigate करता है। OpenLegion multiple layers में defense in depth apply करता है। Unicode sanitization invisible characters (bidi overrides, tag characters, zero-width characters) को 56 choke points पर strip करता है content के LLM context तक पहुँचने से पहले — इन characters का सामान्यतः injected instructions hide करने के लिए उपयोग किया जाता है। Input validation path traversal को रोकता है और safe condition evaluation enforce करता है। कंटेनर isolation blast radius सीमित करता है: यहाँ तक कि अगर एक एजेंट successfully injected है, यह केवल अपने sandboxed कंटेनर को अपनी scoped permissions के साथ access कर सकता है। यह अन्य एजेंट्स के data, credential vault, या host system को access नहीं कर सकता।

कोई system prompt injection से पूर्ण immunity guarantee नहीं कर सकता। OpenLegion का approach attack surface को minimize करना और damage को contain करना है।

Threat 3: Resource abuse (Denial of Wallet)

क्या होता है। एक एजेंट recursive loop में प्रवेश करता है, अत्यधिक API calls करता है, या जितना आवश्यक है उससे कहीं अधिक resources consume करने के लिए manipulated होता है। Multi-agent systems में, यह compounds करता है — एक 5-agent workflow एक single एजेंट से 5x costs करता है, और एक runaway loop किसी के notice करने से पहले minutes में सैकड़ों dollars burn कर सकता है।

कितना common। यह OWASP LLM10:2025 (Unbounded Consumption) के रूप में listed है। अधिकांश cloud billing systems automatically charges नहीं रोकते जब budgets exceeded होते हैं — alerts fire करते हैं, लेकिन meter चलता रहता है। CrewAI और LangGraph users की community reports token-burning loops का वर्णन करती हैं जिन्होंने 10x expected budgets consume किए।

OpenLegion इसे कैसे mitigate करता है। Hard cutoff के साथ प्रति-एजेंट दैनिक और मासिक बजट controls। Fleet में प्रत्येक एजेंट के पास real time में tracked अपना token बजट है। जब limit पहुँच जाती है, orchestration layer उस specific एजेंट को halt करती है। बाकी workflow gracefully जारी रहता है या pause होता है। कोई "soft warning" नहीं जो ignore हो — cutoff infrastructure level पर enforced है।

Threat 4: Data exfiltration

क्या होता है। एक एजेंट को sensitive data attacker-controlled endpoint पर भेजने के लिए manipulated किया जाता है। Techniques में शामिल हैं: एजेंट को URL parameters में data encode करने का instruct करना (जो logged होते हैं या link previews के माध्यम से भेजे जाते हैं), एजेंट के browser का उपयोग करके attacker-controlled pages visit करना, या tool calls का exploit करके external APIs को data forward करना।

कितना common। Zero-click exfiltration techniques messaging platforms में operating एजेंट्स के विरुद्ध demonstrate किए गए हैं (जहाँ link previews automatically URLs fetch करते हैं), enterprise collaboration tools, और code repositories। Banking एजेंट्स पर research ने data exfiltration attacks के लिए लगभग 20% success rates दिखाए।

OpenLegion इसे कैसे mitigate करता है। Container-level network isolation restrict करता है कि प्रत्येक एजेंट कौन से external endpoints पर पहुँच सकता है। Permission matrix प्रति एजेंट allowed tools, files, और mesh operations define करता है। Outbound requests controlled channels के माध्यम से route करते हैं। Credential isolation (एजेंट के पास exfiltrate करने के लिए कोई credentials नहीं) और fleet-model coordination (जो हर action log करता है) के साथ combined, exfiltration के लिए attack surface unrestricted network access के साथ shared process spaces में चलने वाले एजेंट्स की तुलना में significantly कम है।

Threat 5: Sandbox escape

क्या होता है। एक एजेंट या इसका executed code अपने कंटेनर से बाहर निकलता है और host system, अन्य containers, या orchestration layer तक access प्राप्त करता है। Container escape vulnerabilities regularly discover होती हैं — November 2025 में कई high-severity runC CVEs disclosed किए गए जो बड़े cloud providers में Docker और Kubernetes को affect करते हैं।

OpenLegion इसे कैसे mitigate करता है। कंटेनर hardening: non-root execution (UID 1000), no-new-privileges flag, configurable memory limits (384MB default), configurable CPU limits (0.15 default), और containers के बीच कोई shared filesystem नहीं। प्रत्येक एजेंट को अपना /data volume मिलता है। चार-zone trust model (plus एक operator-or-internal tier) का अर्थ है कि यहाँ तक कि अगर एक एजेंट अपने कंटेनर से escape करता है, यह एक zone में land करता है जिसकी credential vault या अन्य एजेंट्स के containers तक कोई direct access नहीं है। मजबूत isolation की आवश्यकता वाले environments के लिए, architecture Docker Sandbox microVMs को support करता है।

Threat 6: Supply chain attacks

क्या होता है। Malicious code एजेंट skills, MCP tool servers, shared configurations, या framework dependencies के माध्यम से introduce किया जाता है। Malicious MCP servers npm पर legitimate services को impersonate करते हुए पाए गए हैं। Crowdsourced configuration files hidden LLM-triggered prompts के साथ weaponize किए गए हैं।

OpenLegion इसे कैसे mitigate करता है। OpenLegion zero external framework dependencies का उपयोग करता है — कोई LangChain नहीं, कोई Redis नहीं, कोई Kubernetes नहीं। Core pure Python + SQLite है। MCP tool servers supported हैं लेकिन permission matrix के माध्यम से sandboxed। Fleet-model coordination का अर्थ है कि tool calls workflow definition में explicitly declared हैं, runtime पर dynamically discovered नहीं — unexpected tool injection के लिए surface को कम करता है।

OpenLegion में AI एजेंट Isolation कैसे काम करता है

OpenLegion का चार-zone trust model plus एक operator-or-internal tier हर deployment को distinct security boundaries में separate करता है:

Zone 0 — Untrusted External Input। Users या third parties से आने वाली कोई भी चीज़: CLI, Telegram, Discord, Slack, WhatsApp, और webhook endpoints। Inputs Zone 2 में प्रवेश करने से पहले prompt-injection guards के माध्यम से validated और sanitized होते हैं।

Zone 1 — Sandboxed Agent Containers (Untrusted)। प्रत्येक एजेंट अपने स्वयं के Docker कंटेनर में एक isolated FastAPI instance के रूप में चलता है। प्रत्येक कंटेनर के पास अपना /data volume, अपना memory database (SQLite + vector search), configurable resource caps (384MB RAM / 0.15 CPU default), non-root execution (UID 1000), cap_drop=ALL, no-new-privileges, एक read-only root filesystem है, और Docker socket, credential vault, या अन्य एजेंट्स के containers तक कोई access नहीं है।

Zone 2 — Mesh Host (Trusted)। Credentials तक access वाला एकमात्र component। Blackboard (shared state + WAL), PubSub router, Credential Vault (blind injection proxy), ACL matrix, Container Manager, Cost Tracker, और Browser Service (:8500 पर प्रति-एजेंट Camoufox) चलाता है। यह zone hardened है और एजेंट code को expose नहीं है।

Zone 2.5 — Operator-or-Internal। Operator एजेंट या internal mesh tooling के लिए available reserved control-plane operations — fleet management, एजेंट edits, permission grants (Operator can_spawn या can_use_wallet grant नहीं कर सकता)।

Zone 3 — Loopback-Only Internal। सबसे-restricted tier: endpoints जिन्हें x-mesh-internal: 1 header और एक loopback source IP दोनों चाहिए। केवल mesh-internal coordination calls के लिए उपयोग।

इस architecture का अर्थ है कि Zone 1 में एक compromised एजेंट Zone 2 (credentials) या अन्य Zone 1 containers (अन्य एजेंट्स के data) तक नहीं पहुँच सकता। किसी भी single एजेंट compromise का blast radius उस एजेंट के sandbox तक contained है।

AI एजेंट Credential Management: Vault Proxy बनाम Environment Variables

AI एजेंट frameworks में सबसे common credential management pattern environment variables है। आपकी API key एक .env file में बैठती है या OAI_CONFIG_LIST के माध्यम से pass होती है। एजेंट process इसे directly पढ़ता है। इसका अर्थ है:

Key एजेंट के memory space में मौजूद है
एक prompt injection attack एजेंट को key print या exfiltrate करने का instruct कर सकता है
Logs, error messages, और debug output key contain कर सकते हैं
यदि एजेंट compromised है, attacker के पास सभी injected credentials तक direct access है

OpenLegion का vault proxy इस architecture को fundamentally बदलता है। API keys Mesh Host के Credential Vault (Zone 2) में stored हैं। जब एक एजेंट को एक authenticated API call करनी है, यह vault proxy को request भेजता है। Proxy network layer पर credential inject करता है, authenticated call करता है, और एजेंट को result return करता है। एजेंट कच्ची key को कभी नहीं देखता, store करता है, या उसकी memory access नहीं रखता।

यह vault-proxied credentials है — HashiCorp Vault जैसे enterprise secret management systems द्वारा उपयोग किया जाने वाला समान principle, लेकिन अलग infrastructure की आवश्यकता के बजाय AI एजेंट orchestration layer में built।

Containerized AI एजेंट्स: Process-Level Isolation पर्याप्त क्यों नहीं है

कई frameworks isolation का कुछ form प्रदान करते हैं, लेकिन implementation details मायने रखते हैं:

Framework	Isolation approach	वास्तव में क्या isolated है	क्या shared है
OpenLegion	प्रति एजेंट Docker कंटेनर (अनिवार्य)	Process, filesystem, network, memory, credentials	कुछ नहीं — एजेंट्स पूरी तरह isolated हैं
OpenClaw	Docker कंटेनर (optional)	Process, filesystem	Docker socket default रूप से mounted; host network accessible
LangGraph	अंतर्निहित नहीं	N/A	सब कुछ — एजेंट्स Python process share करते हैं
CrewAI	CodeInterpreter के लिए Docker	Code execution output	Agent processes Python runtime share करते हैं
AutoGen	Code execution के लिए Docker	Code execution output	Agent processes Python runtime share करते हैं

Critical distinction: OpenLegion एजेंट को ही एक कंटेनर में isolate करता है। अन्य frameworks जो Docker isolation प्रदान करते हैं typically केवल code execution output को isolate करते हैं — एजेंट process, इसकी memory, और इसकी credential access shared रहती है। इसका अर्थ है कि एक prompt injection जो LangGraph या CrewAI में एक एजेंट को compromise करता है, shared process में सभी credentials और state तक access रखता है। OpenLegion में, समान compromise credential access के बिना एक single sandboxed कंटेनर तक contained है।

AI एजेंट Cost Controls: सुरक्षा के रूप में Budget Enforcement

Cost controls केवल financial governance नहीं हैं — वे एक security mechanism हैं। Unlimited tokens consume करने वाला एक runaway एजेंट एक resource abuse attack है, चाहे malicious prompt injection या एजेंट के reasoning loop में एक simple bug द्वारा triggered।

OpenLegion का बजट enforcement orchestrator level पर काम करता है:

प्रत्येक एजेंट के पास एक configurable दैनिक और मासिक token बजट है
Token usage Zone 2 में Cost Tracker द्वारा real time में tracked है
जब एक एजेंट अपनी limit hit करता है, orchestrator एक hard cutoff issue करता है — एजेंट halted है
बाकी workflow pipeline gracefully जारी रहता है या pause होता है
Cost data प्रति-एजेंट breakdowns के साथ fleet dashboard में visible है

कोई अन्य बड़ा AI एजेंट framework इस capability को अंतर्निहित प्रदान नहीं करता, लिखने के समय public documentation के आधार पर।

Compliance और Audit Considerations

OpenLegion उन environments के लिए designed है जिन्हें compliance controls चाहिए, including:

Request tracing: Auditable fleet-model coordination का अर्थ है कि हर workflow step explicit और traceable है। अंतर्निहित request tracing system real-time observability के लिए task transitions, tool calls, और token expenditure record करता है। Blackboard (shared state) एजेंट्स के बीच coordination context प्रदान करता है।
Auditable fleet-model coordination: fleet-model coordination (blackboard + pub/sub + handoff) execution से पहले audit किया जा सकता है — आप system चलाए बिना data, permissions, और एजेंट interactions के complete flow को verify कर सकते हैं।
Data isolation: Dedicated /data volumes के साथ प्रति-एजेंट containers ensure करते हैं कि एक एजेंट द्वारा processed sensitive data अन्य एजेंट्स के लिए accessible नहीं है।
Air-gap support: कोई external services नहीं (कोई Redis नहीं, कोई Kubernetes नहीं, कोई cloud services आवश्यक नहीं) का अर्थ है OpenLegion on-premises environments में चल सकता है।

Important: OpenLegion वर्तमान में SOC 2, ISO 27001, HIPAA, या अन्य compliance certifications नहीं रखता। Architecture इन आवश्यकताओं वाले environments को support करने के लिए built है, लेकिन certification आपके deployment, configuration, और organizational controls का function है — केवल framework का नहीं।

Default रूप से secure एजेंट्स deploy करें।

शुरू करें Docs पढ़ें

अक्सर पूछे जाने वाले प्रश्न

AI एजेंट सुरक्षा का क्या अर्थ है?

AI एजेंट सुरक्षा controls का set है जो स्वायत्त AI एजेंट्स को credential leakage, prompt injection, resource abuse, data exfiltration, sandbox escape, या excessive agency के माध्यम से नुकसान पहुँचाने से रोकता है। यह runtime isolation (sandboxing एजेंट्स), credential management (key exposure रोकना), cost enforcement (runaway spending रोकना), permission controls (एजेंट्स क्या कर सकते हैं इसे सीमित करना), और input validation (malicious inputs filter करना) में फैला है।

आप API keys के साथ AI एजेंट्स को कैसे secure करते हैं?

सबसे secure approach vault-proxied credentials है: API keys एक vault में store करें जिसे एजेंट्स directly access नहीं कर सकते। जब एक एजेंट को एक authenticated call करनी हो, request एक proxy के माध्यम से route करती है जो network layer पर credential inject करता है। एजेंट कच्ची key को कभी नहीं देखता। OpenLegion इसे चार-zone trust model (plus एक operator-or-internal tier) के Zone 2 में अपने vault proxy के माध्यम से implement करता है। सबसे कम secure (और सबसे common) approach environment variables है, जहाँ keys एजेंट की memory में मौजूद हैं और prompt injection, logging, या error output के माध्यम से leaked हो सकती हैं।

AI एजेंट isolation कैसे काम करता है?

एजेंट isolation का अर्थ है प्रत्येक एजेंट को अपने sandboxed environment में चलाना — separate process, filesystem, network namespace, और memory space। OpenLegion में, प्रत्येक एजेंट configurable resource limits (384MB RAM, 0.15 CPU default), non-root execution, और कोई shared filesystem के साथ एक dedicated Docker कंटेनर में चलता है। इसका अर्थ है कि एक compromised एजेंट अन्य एजेंट्स के data, credential vault, या host system को access नहीं कर सकता। यह उन frameworks से भिन्न है जहाँ एजेंट्स एक Python process share करते हैं और एक-दूसरे की memory को access कर सकते हैं।

AI एजेंट्स को बजट / cost controls क्यों चाहिए?

स्वायत्त एजेंट्स recursive loops में प्रवेश कर सकते हैं, अत्यधिक API calls कर सकते हैं, या जितना आवश्यक है उससे कहीं अधिक resources consume करने के लिए manipulated हो सकते हैं। बजट controls के बिना, एक single runaway एजेंट minutes में सैकड़ों dollars tokens में drain कर सकता है। Multi-agent systems में यह compounds करता है — प्रत्येक एजेंट risk को गुणा करता है। OpenLegion orchestrator level पर hard cutoffs के साथ प्रति-एजेंट दैनिक और मासिक budgets enforce करता है, किसी भी single एजेंट को unbounded cost का कारण बनने से रोकता है।

क्या secure AI एजेंट्स BYO keys के साथ संभव हैं?

हाँ। BYO (Bring Your Own) key model वास्तव में proper architecture के साथ अधिक secure है। OpenLegion में, आपकी keys Mesh Host के Credential Vault में store हैं और network layer पर एक vault proxy के माध्यम से inject की जाती हैं। एजेंट्स कच्ची keys कभी नहीं देखते। यह आपको full cost transparency देता है (आप exactly देखते हैं कि प्रत्येक एजेंट प्रत्येक provider के साथ क्या खर्च करता है), provider flexibility (प्रति एजेंट models swap करें), और समान credential isolation guarantees इसकी परवाह किए बिना कि आप कौन सा provider उपयोग करते हैं। अपनी खुद की LLM API keys लाएँ। Model usage पर कोई markup नहीं।

AI एजेंट्स के लिए OWASP Top 10 क्या है?

OWASP ने December 2025 में Agentic Applications के लिए Top 10 publish किया। #1 risk Agent Goal Hijacking है — जहाँ एक attacker एक एजेंट को user के intended goals से भिन्न goals का पीछा करने के लिए manipulate करता है। अन्य top risks में शामिल हैं credential leakage, excessive agency (एजेंट्स अपने scope से परे actions ले रहे हैं), और supply chain vulnerabilities (malicious tools या plugins)। OpenLegion इन्हें vault-proxied credentials, कंटेनर isolation, permission matrices, और fleet-model coordination (blackboard + pub/sub + handoff) के माध्यम से address करता है।

OpenLegion सुरक्षा पर OpenClaw से कैसे तुलना करता है?

Public documentation के आधार पर, OpenLegion कठोर security defaults प्रदान करता है। OpenClaw के default local deployment को Docker socket mounting (broad host access प्रदान करता है) की आवश्यकता है, इसके security analyzer में consistent activation के साथ reported issues हुए हैं, और यह एजेंट process के लिए accessible configuration में credentials store करता है। OpenLegion अनिवार्य isolated containers में एजेंट्स चलाता है, vault-proxied credentials के लिए एक vault proxy का उपयोग करता है, प्रति-एजेंट budgets enforce करता है, और कई choke points पर unicode sanitization apply करता है। विस्तृत comparison के लिए, OpenLegion बनाम OpenClaw देखें।

AI एजेंट्स पर कौन से compliance frameworks apply होते हैं?

Key frameworks में शामिल हैं OWASP Top 10 for LLM Applications (2025) और Agentic Applications (2026), NIST AI Risk Management Framework (आगामी AI Agent Standards के साथ), ISO/IEC 42001 (AI management systems), EU AI Act (enforcement August 2026 में शुरू), और आपके domain के आधार पर HIPAA, SOC 2, और SOX जैसे industry-specific regulations। OpenLegion का architecture उन environments के लिए designed है जिन्हें ये controls चाहिए लेकिन यह स्वयं certifications नहीं रखता।

Include करने के लिए आंतरिक Links

Anchor Text	Destination
AI एजेंट प्लेटफ़ॉर्म	/learn/ai-agent-platform
AI एजेंट orchestration	/learn/ai-agent-orchestration
AI एजेंट frameworks comparison	/learn/ai-agent-frameworks
AI एजेंट सुरक्षा	/learn/ai-agent-security
OpenClaw विकल्प	/openclaw-alternative
OpenLegion बनाम OpenClaw	/comparison/openclaw
Documentation	/docs
GitHub	https://github.com/openlegion-ai/openlegion