# Behavioral Guardrails

Behavioral guardrails are contextual controls that adapt to workflow intent, preventing AI misuse while enabling productive operation. They fall into three categories:
- **Reasoning Boundaries**: limit what the AI can reason about based on context
- **Output Integrity**: ensure output quality, accuracy, and safety
- **Behavioral Drift**: detect deviation from expected patterns
## Reasoning Boundaries

Control the scope of AI reasoning:

```python
from duragraph.governance import ReasoningBoundary

# Topic restrictions
topic_guard = ReasoningBoundary(
    name="support_topics",
    allowed_topics=["billing", "technical_support", "account_management"],
    blocked_topics=["competitor_products", "investment_advice", "medical_guidance"],
)

# Knowledge cutoffs
knowledge_guard = ReasoningBoundary(
    name="verified_only",
    require_grounding=True,  # Must cite sources
    speculation_allowed=False,
    knowledge_sources=["product_docs", "faq_database"],
)
```

The same restriction can be declared in YAML:

```yaml
guardrail:
  type: topic_restriction
  config:
    allowed:
      - billing_inquiries
      - product_features
      - account_management
    blocked:
      - competitor_comparisons
      - legal_advice
      - medical_recommendations
    action_on_violation: redirect  # redirect, block, warn
```
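However the runtime classifies topics, enforcement reduces to a membership check plus the configured action. A minimal sketch of the `redirect` behavior; the `classify_topic` and `check_topic` helpers here are illustrative stand-ins, not duragraph APIs, and a real system would use a proper classifier rather than keyword matching:

```python
# Hypothetical enforcement sketch for action_on_violation: redirect.
BLOCKED = {"competitor_products", "investment_advice", "medical_guidance"}
REDIRECT_MESSAGE = "I can help with billing, technical support, or account questions."

def classify_topic(text: str) -> str:
    """Naive keyword stand-in for a real topic classifier."""
    keywords = {
        "refund": "billing",
        "invoice": "billing",
        "error": "technical_support",
        "password": "account_management",
        "stock": "investment_advice",
    }
    for word, topic in keywords.items():
        if word in text.lower():
            return topic
    return "unknown"

def check_topic(user_message: str) -> tuple[bool, str]:
    """Return (allowed, redirect_message_or_topic)."""
    topic = classify_topic(user_message)
    if topic in BLOCKED:
        return False, REDIRECT_MESSAGE  # redirect instead of answering
    return True, topic
```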
## Output Integrity

Ensure AI outputs meet quality and safety standards:

```python
from duragraph.governance import OutputIntegrity, HallucinationDetector

hallucination_guard = HallucinationDetector(
    name="fact_checker",
    strategies=[
        "source_verification",   # Check claims against sources
        "self_consistency",      # Multiple generations agree
        "confidence_threshold",  # Require high certainty
    ],
    threshold=0.8,
    action="flag_for_review",
)

# Apply to workflow
@llm_node(guardrails=[hallucination_guard])
async def respond(self, state):
    response = await self.llm.complete(state.messages)
    # Guardrail automatically checks output
    return state

consistency_guard = OutputIntegrity(
    name="logical_consistency",
    checks=[
        "no_contradictions",   # Output doesn't contradict itself
        "matches_context",     # Aligns with conversation history
        "factual_alignment",   # Facts match known data
    ],
)

attribution_guard = OutputIntegrity(
    name="require_citations",
    require_sources=True,
    source_format="inline",  # or "footnote", "appendix"
    minimum_sources=1,
)
```
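The `self_consistency` strategy amounts to sampling the model several times and checking that the answers agree. A rough self-contained sketch of that idea, with a hypothetical async `generate` callable standing in for your LLM call (exact-match agreement is deliberately naive; production systems compare answers semantically):

```python
import asyncio
from collections import Counter

async def self_consistency_check(generate, prompt: str, n: int = 3,
                                 threshold: float = 0.8) -> bool:
    """Sample the model n times and require the most common answer
    to account for at least `threshold` of the samples.
    `generate` is a hypothetical async callable: prompt -> str."""
    answers = await asyncio.gather(*(generate(prompt) for _ in range(n)))
    normalized = [a.strip().lower() for a in answers]
    most_common, count = Counter(normalized).most_common(1)[0]
    return count / n >= threshold
```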
## Behavioral Drift

Monitor for deviations from expected AI behavior:

```python
from duragraph.governance import DriftDetector

drift_guard = DriftDetector(
    name="persona_enforcement",
    baseline_behavior={
        "tone": "professional",
        "response_length": {"min": 50, "max": 500},
        "topics": ["customer_support"],
    },
    sensitivity=0.7,    # How strictly to enforce
    alert_threshold=3,  # Consecutive violations before alert
)
```

```yaml
guardrail:
  type: anomaly_detection
  config:
    metrics:
      - response_length_variance
      - sentiment_deviation
      - topic_drift_score
    baseline_window: 100  # Compare against last 100 responses
    alert_on: 2_sigma_deviation
```
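The `2_sigma_deviation` alert in the YAML above is ordinary rolling statistics: keep the last `baseline_window` observations and flag any new value more than two standard deviations from their mean. A small self-contained sketch (class and variable names are illustrative, not duragraph internals):

```python
from collections import deque
from statistics import mean, stdev

class SigmaDriftMonitor:
    """Flags values more than `sigmas` standard deviations from a
    rolling baseline, mirroring the baseline_window/2-sigma config."""

    def __init__(self, baseline_window: int = 100, sigmas: float = 2.0):
        self.history = deque(maxlen=baseline_window)
        self.sigmas = sigmas

    def observe(self, value: float) -> bool:
        """Record a value; return True if it deviates from the baseline."""
        drifted = False
        if len(self.history) >= 2:
            mu, sd = mean(self.history), stdev(self.history)
            drifted = sd > 0 and abs(value - mu) > self.sigmas * sd
        self.history.append(value)
        return drifted

monitor = SigmaDriftMonitor(baseline_window=100)
for length in [120, 130, 125, 118, 122, 900]:  # response lengths
    if monitor.observe(length):
        print(f"Drift detected: response length {length}")
```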
## Adaptive Guardrails

Guardrails that adjust based on context:

```python
from duragraph.governance import AdaptiveGuardrail

adaptive_guard = AdaptiveGuardrail(
    name="context_sensitive",
    profiles={
        "low_risk": {
            "guardrails": ["basic_safety"],
            "audit_level": "minimal",
        },
        "medium_risk": {
            "guardrails": ["safety", "attribution", "consistency"],
            "audit_level": "standard",
        },
        "high_risk": {
            "guardrails": ["all"],
            "audit_level": "full",
            "require_human_review": True,
        },
    },
    context_evaluator=lambda ctx: calculate_risk_level(ctx),
)
```

**Low Risk Context** (e.g., internal FAQ):

- Minimal guardrails
- Allow creative responses
- Basic logging only
**Medium Risk Context** (e.g., customer support):

- Standard guardrails
- Require source attribution
- Full audit logging
**High Risk Context** (e.g., financial advice):

- Maximum guardrails
- Human review required
- Real-time compliance checks
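The `context_evaluator` above references a `calculate_risk_level` function that you supply. One plausible shape for it, keyed to the three profiles just described; the context keys checked here (`domain`, `audience`) are assumptions for illustration, not a fixed duragraph schema:

```python
def calculate_risk_level(ctx: dict) -> str:
    """Map workflow context to one of the profile names above.
    The keys inspected here are illustrative, not a fixed schema."""
    if ctx.get("domain") in {"financial_advice", "medical", "legal"}:
        return "high_risk"
    if ctx.get("audience") == "external":  # e.g., customer support
        return "medium_risk"
    return "low_risk"  # e.g., internal FAQ

assert calculate_risk_level({"audience": "internal"}) == "low_risk"
assert calculate_risk_level({"domain": "financial_advice"}) == "high_risk"
```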
## Configuration

Guardrails can be declared in YAML and loaded at startup:

```yaml
guardrails:
  - name: pii_protection
    type: output_filter
    enabled: true
    config:
      detect_patterns:
        - ssn: '\d{3}-\d{2}-\d{4}'
        - credit_card: '\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}'
        - email: '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
      action: redact
      replacement: '[REDACTED]'

  - name: safety_filter
    type: content_safety
    config:
      categories:
        - hate_speech
        - violence
        - self_harm
      threshold: 0.8
      action: block

  - name: response_quality
    type: output_integrity
    config:
      min_length: 20
      max_length: 2000
      require_complete_sentences: true
```
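The `detect_patterns`/`redact` combination is standard regex substitution. The patterns from the `pii_protection` entry above work directly with Python's `re` module; the sketch below is a generic illustration, not duragraph's filter implementation:

```python
import re

# Same patterns as the pii_protection guardrail above.
PATTERNS = {
    "ssn": r"\d{3}-\d{2}-\d{4}",
    "credit_card": r"\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}",
    "email": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
}

def redact(text: str, replacement: str = "[REDACTED]") -> str:
    """Apply every PII pattern, mirroring `action: redact`."""
    for pattern in PATTERNS.values():
        text = re.sub(pattern, replacement, text)
    return text

print(redact("Reach me at jane@example.com, SSN 123-45-6789."))
# Reach me at [REDACTED], SSN [REDACTED].
```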
Load and manage guardrails with the `GuardrailEngine`:

```python
from duragraph.governance import GuardrailEngine

engine = GuardrailEngine()

# Load from YAML
engine.load_config("guardrails.yml")

# Or configure programmatically
engine.add_guardrail(
    name="custom_filter",
    type="output_filter",
    config={"block_patterns": ["forbidden_term"]},
)
```
## Monitoring

Monitor guardrail effectiveness:

```python
# Get guardrail metrics
metrics = await governance.get_guardrail_metrics()

# Returns:
{
    "total_evaluations": 10000,
    "guardrail_triggers": {
        "pii_protection": 45,
        "hallucination_detector": 12,
        "topic_restriction": 8,
    },
    "trigger_rate": 0.0065,
    "false_positive_rate": 0.02,
    "response_latency_ms": 15,
}
```

Metrics are also exposed over the REST API:

```
GET /api/v1/governance/guardrails/metrics
```

```json
{
  "summary": {
    "total_requests": 50000,
    "blocked": 125,
    "flagged": 340,
    "passed": 49535
  },
  "by_guardrail": [
    {
      "name": "pii_protection",
      "triggers": 89,
      "action_taken": "redact",
      "avg_latency_ms": 12
    }
  ],
  "trends": {
    "trigger_rate_7d": [0.005, 0.006, 0.004, 0.007, 0.005, 0.006, 0.005]
  }
}
```
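For alerting, the REST endpoint above can be polled and thresholded. A sketch using only the standard library; the base URL and the 1% threshold are assumptions to substitute with your deployment's values:

```python
import json
from urllib.request import urlopen

# Base URL is an assumption; substitute your deployment's host.
URL = "http://localhost:8080/api/v1/governance/guardrails/metrics"

def check_trigger_rate(max_rate: float = 0.01) -> None:
    """Fetch the summary and warn if the combined block/flag rate spikes."""
    with urlopen(URL) as resp:
        data = json.load(resp)
    summary = data["summary"]
    rate = (summary["blocked"] + summary["flagged"]) / summary["total_requests"]
    if rate > max_rate:
        print(f"Guardrail trigger rate {rate:.4f} exceeds {max_rate}")
```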
## Policy Integration

Guardrails work with policies for comprehensive governance:

```python
from duragraph.governance import Policy, Guardrail

policy = Policy(
    name="financial_advisory",
    # Guardrails specific to this policy
    guardrails=[
        Guardrail(type="topic_restriction", config={"allowed": ["investments", "planning"]}),
        Guardrail(type="disclaimer_required", config={"text": "This is not financial advice."}),
        Guardrail(type="human_review", config={"for_amounts_over": 100000}),
    ],
    # Audit requirements
    audit_level="comprehensive",
    retention_days=2555,  # 7 years for financial records
)
```
- **Trust Framework**: build strategic trust with transparent decision trails
- **Governance Overview**: return to the governance overview for architecture details