Something very interesting happens when you combine AI agents, the Model Context Protocol (MCP), and serverless computing. We're not just talking about smarter chatbots that can hit a few APIs; we're building AI systems that understand who you are and what you're allowed to do, and that can work across different company systems without breaking security rules or stepping on other users' toes.
Traditional AI applications face a fundamental problem: how do you maintain user context and permissions when an AI agent needs to access multiple services on behalf of different users? Most implementations either sacrifice security (by using shared credentials) or user experience (by requiring constant re-authentication).
The solution lies in a JWT propagation pattern that maintains user identity throughout the entire request chain.
This creates a secure chain of trust where user identity is never inferred from AI responses but always cryptographically verified.
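A minimal sketch of what that verification step looks like, using only the standard library (the HS256 key and function names are illustrative, not from any real deployment; a production system would use an established JWT library and an identity provider's public keys):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical shared HS256 key, for illustration only

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _signature(signing_input: str) -> str:
    return _b64url(hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest())

def make_token(claims: dict) -> str:
    # Build header.payload.signature, the standard JWT wire format.
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_signature(signing_input)}"

def verify_and_forward(token: str) -> dict:
    # Cryptographically verify the caller's identity; never trust claims
    # echoed back by the model itself.
    signing_input, sig = token.rsplit(".", 1)
    if not hmac.compare_digest(sig, _signature(signing_input)):
        raise PermissionError("invalid signature")
    payload_b64 = signing_input.split(".", 1)[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    # Forward the *same* verified token so the next hop re-verifies it.
    return {"user": claims["sub"], "headers": {"Authorization": f"Bearer {token}"}}
```

Each hop in the chain repeats the verification rather than trusting the hop before it, which is what makes the chain of trust hold end to end.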
Think of MCP as breaking AI out of its cage. Instead of building one massive AI app that tries to do everything, you can now create smaller, specialized AI services that talk to each other. Rather than hardcoding every possible tool an AI might need, MCP lets your AI discover and use new tools on the fly, even if those tools live on completely different servers.
The key insight is treating tools as microservices rather than embedded functions. Each MCP server becomes a domain-specific intelligence hub that can serve multiple agents while maintaining its own security and business logic.
```javascript
// MCP tools become user-aware automatically
export async function getTravelPolicies(userId, userRole) {
  // Policy enforcement happens at the tool level
  return policies.filter(p => p.appliesToRole(userRole));
}
```
Serverless computing solves three critical challenges for AI agents:
1. Stateless by Design: Each invocation starts fresh, eliminating state pollution between users and requests.
2. Automatic Scaling: Handle concurrent users without capacity planning—essential when AI agents might trigger complex tool chains.
3. Cost Efficiency: Pay only for actual inference and tool execution time, not idle capacity.
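The stateless-by-design point can be sketched in a few lines. This is a generic illustration (not AWS Lambda's actual runtime API): `EXTERNAL_STORE` stands in for S3 or DynamoDB, and the handler rebuilds user context from the event on every invocation rather than keeping it in process memory.

```python
EXTERNAL_STORE = {}  # stand-in for S3/DynamoDB; the function itself holds no user state

def load_session(user_id):
    # Rebuild context from external storage on every invocation.
    return EXTERNAL_STORE.get(user_id, {"history": []})

def handler(event, _context=None):
    user_id = event["user_id"]           # identity arrives with the request
    session = load_session(user_id)
    session["history"].append(event["message"])
    EXTERNAL_STORE[user_id] = session    # persist externally, never in RAM
    return {"user": user_id, "turns": len(session["history"])}
```

Because nothing survives in the process between calls, concurrent users can never see each other's state, no matter how the platform scales or recycles instances.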
The architecture becomes elegantly simple.
Traditional web applications maintain session state in memory or databases. AI agents require a different approach because their "state" includes conversation history, tool results, and learned context—potentially gigabytes of data.
Externalizing this to S3 with the Strands SDK creates fascinating possibilities:
```python
# Agent state becomes portable and analyzable
session_manager = S3SessionManager(
    bucket="agent-sessions",
    key_prefix=f"user/{user_id}/conversations/",
)

# State can be shared, analyzed, or migrated
agent = StrandsAgent.from_session(session_manager)
```
This enables features like conversation handoffs between agents, audit trails, and even AI-to-AI collaboration patterns.
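The handoff idea is worth making concrete. This is a generic illustration, not the Strands API: once session state lives in object storage as a plain document, any agent that can read the document can resume the conversation.

```python
import json

def export_session(agent_name, history):
    # Serialize the conversation to a document an external store can hold.
    return json.dumps({"owner": agent_name, "history": history})

def resume_session(blob, new_owner):
    # Handoff: a different agent loads the document and takes over.
    state = json.loads(blob)
    state["owner"] = new_owner
    return state

blob = export_session("flight-agent", [["user", "Book a flight to Lisbon"]])
state = resume_session(blob, "hotel-agent")
```

The same document doubles as an audit trail: every turn is persisted in a form that can be replayed or analyzed offline.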
Building the travel agent example revealed several non-obvious patterns:
Tool Composition: MCP servers can call other MCP servers, creating tool hierarchies. A booking tool might call policy tools, pricing tools, and availability tools in sequence.
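A sketch of that hierarchy, with local stubs standing in for what would be calls to separate MCP servers (all tool names and policy limits here are hypothetical):

```python
def policy_tool(role, trip):
    # Illustrative per-role spending limits.
    limits = {"employee": 1500, "manager": 5000}
    return trip["cost"] <= limits.get(role, 0)

def availability_tool(trip):
    return True  # stub: assume seats are available

def pricing_tool(trip):
    return trip["cost"]

def booking_tool(role, trip):
    # The booking tool composes the other tools in sequence.
    if not policy_tool(role, trip):
        return {"status": "denied", "reason": "exceeds policy limit"}
    if not availability_tool(trip):
        return {"status": "unavailable"}
    return {"status": "booked", "total": pricing_tool(trip)}
```

Each sub-tool keeps its own business logic, so the booking tool never needs to know *why* a trip violates policy, only that it does.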
Failure Isolation: When one MCP server fails, others continue working. The agent gracefully degrades functionality rather than failing.
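One way to sketch that isolation, assuming each tool call can be wrapped independently (tool names are illustrative):

```python
def call_tool(tool, *args):
    # Wrap every MCP server call so one failing server degrades the
    # answer instead of crashing the whole agent.
    try:
        return {"ok": True, "result": tool(*args)}
    except Exception as exc:
        return {"ok": False, "error": str(exc)}

def policy_tool(role):
    return ["economy class only"]

def pricing_tool(city):
    raise ConnectionError("pricing server unreachable")

results = {
    "policies": call_tool(policy_tool, "employee"),
    "pricing": call_tool(pricing_tool, "Lisbon"),
}
# The agent can still answer policy questions even though pricing is down.
```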
Dynamic Authorization: User permissions can change mid-conversation. The JWT refresh pattern ensures tools always operate with current permissions.
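The refresh pattern can be sketched as a check on the token's `exp` claim before each tool call; `refresh_fn` is a hypothetical callback to the identity provider, and the 30-second skew is an assumed safety margin:

```python
import time

def needs_refresh(claims, now=None, skew=30):
    # Treat a token within `skew` seconds of expiry as already stale.
    now = time.time() if now is None else now
    return claims["exp"] - now < skew

def claims_for_tool_call(claims, refresh_fn, now=None):
    # Refresh before the call so the tool always sees current permissions.
    return refresh_fn() if needs_refresh(claims, now) else claims
```

Because the check runs per tool call rather than per conversation, a permission revoked mid-conversation takes effect on the very next tool invocation.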
This architecture pattern extends far beyond travel booking.
The combination of MCP and serverless is enabling a new class of AI applications. We're moving from "AI that can use tools" to "AI that can orchestrate distributed business processes while maintaining end-to-end security and user context."
The future isn't just smarter chatbots; it's intelligent systems that can safely operate across the full spectrum of enterprise applications, with each user getting their own personalized, secure, and contextually aware AI assistant.