Immersive Blogs
Publications about innovation and new functionality.
6 Skills, 46 Agents: What Happens When Your AI Assistant Actually Specializes
Ask most observability AI assistants a question and you're talking to one model, one prompt, one context window trying to be everything at once. Diagnosis, code review, risk assessment, architecture decisions, security analysis: all crammed into a single conversation thread.
That's not how expertise works. A security researcher thinks differently than a systems architect. A risk manager asks different questions than a code reviewer. Cramming all of those perspectives into one prompt produces mediocre versions of each.
Tessa doesn't work that way. She's 46 specialized agents organized into 6 skills, all ported from the Jerry framework, an open-source multi-agent orchestration system by Adam Nowak. Each agent has its own domain expertise, its own constraints, its own tools, and its own workflow patterns. When you ask a question, the routing system matches your intent to the right specialist. You don't need to know agent names or skill categories. You just ask.
Here's what's inside.
Skill 1: Problem Solving (9 Agents)
The generalist skill. When you need to understand something complex, this is what activates.
Nine agents cover every phase of structured thinking. A researcher gathers information with web search and codebase exploration, producing cited reports. An analyst does root cause analysis, trade-off evaluation, and risk assessment. An architect produces formal Architecture Decision Records. A critic runs creator-critic-revision loops, scoring deliverables on six quality dimensions. A validator verifies constraints with evidence. A synthesizer extracts patterns across multiple documents. A reviewer does code, design, and security reviews. An investigator applies 5 Whys and Ishikawa diagrams to trace causal chains. A reporter synthesizes progress across work streams.
These agents don't just answer questions. They follow structured methodologies. The investigator doesn't just guess at root causes. It builds an Ishikawa diagram, distinguishes symptoms from causes, and traces the causal chain to the origin. The architect doesn't just suggest a design. It produces a Nygard-format ADR with alternatives evaluated, trade-offs documented, and risks identified.
The problem solving skill is what activates when you ask Tessa "why is checkout slow?" or "compare these two approaches" or "what's the root cause of this failure?"
Skill 2: Red Team (11 Agents)
Full MITRE ATT&CK kill chain coverage. This is the skill that most people don't expect to find inside an observability assistant.
Eleven agents cover reconnaissance, vulnerability analysis, exploitation methodology, privilege escalation, lateral movement, persistence, data exfiltration, social engineering, C2 infrastructure, and comprehensive reporting. The entire offensive security workflow, from scoping to final report.
Here's what makes this responsible rather than reckless: the skill is scope-gated. An Engagement Lead agent must establish Rules of Engagement before any offensive agent can operate. Three agents (persistence, exfiltration, and social engineering) require explicit RoE authorization on top of that. You can't accidentally run an exfiltration assessment. The authorization chain is mandatory.
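The authorization chain can be pictured as a small guard function. This is only a sketch of the idea, not Tessa's implementation: the `RulesOfEngagement` structure, the `may_run` function, and the agent identifiers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the three agents that need explicit authorization.
EXPLICIT_AUTH_REQUIRED = {"persistence", "exfiltration", "social_engineering"}


@dataclass
class RulesOfEngagement:
    """Scope set by the Engagement Lead (illustrative structure only)."""
    established: bool = False
    explicitly_authorized: set = field(default_factory=set)


def may_run(agent: str, roe: RulesOfEngagement) -> bool:
    """Return True only if the authorization chain permits this agent."""
    if not roe.established:
        return False  # no offensive agent operates before RoE exist
    if agent in EXPLICIT_AUTH_REQUIRED:
        return agent in roe.explicitly_authorized  # second, explicit check
    return True
```

With this shape, an exfiltration request is refused by default even after scoping, and only succeeds once the Rules of Engagement name it explicitly.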
Why does an observability tool need red team capabilities? Because the teams using APM tools are increasingly responsible for security posture, not just uptime. SREs and platform engineers are expected to understand their attack surface. Tessa can help them think about it systematically, using the same methodologies that professional penetration testers use.
Skill 3: Engineering Team (10 Agents)
Security-first software development lifecycle, from architecture through post-deployment incident response.
Ten agents cover the full engineering workflow: a solution architect for threat modeling (STRIDE, DREAD, PASTA), a lead for standards enforcement and implementation planning, backend and frontend specialists for server-side and client-side security, an infrastructure specialist for IaC hardening and supply chain security (SLSA, SBOM), a DevSecOps engineer for SAST/DAST pipeline configuration, a QA engineer for security testing strategy, a security reviewer for manual code review with CWE classification, a final reviewer as the quality gate before release, and an incident responder for post-deployment monitoring and runbooks.
This isn't a generic "review my code" assistant. The backend agent knows OWASP Top 10 and ASVS 5.0. The infrastructure agent knows CIS Benchmarks. The DevSecOps agent knows how to configure Semgrep, Gitleaks, and Stryker pipelines. Each agent brings domain-specific knowledge that a general-purpose LLM simply doesn't have in its default prompt.
Skill 4: NASA Systems Engineering (10 Agents)
This is the one that surprises people. Tessa implements NPR 7123.1D, the actual NASA systems engineering process standard. Not a simplified version. Not "inspired by." The real processes.
Ten agents cover the full SE lifecycle: requirements engineering (stakeholder needs, requirements definition, requirements management), technical architecture (logical decomposition, design solutions, decision analysis), verification and validation (test, analysis, inspection, demonstration), system integration (interface management, ICD compliance), risk management (NPR 8000.4C, 5x5 likelihood/consequence matrices), configuration management (baselines, change tracking), technical review gates (SRR, PDR, CDR, FRR with entrance/exit criteria), exploration (divergent thinking, trade space analysis), quality assurance (work product validation against NPR standards), and SE status reporting across all processes.
Why NASA SE in an observability tool? Because observability platforms monitor complex systems, and complex systems benefit from rigorous engineering processes. Teams building mission-critical infrastructure (healthcare, finance, aerospace, energy) need more than "move fast and break things." They need traceability matrices, verification evidence, and formal review gates. Tessa can facilitate those processes.
Skill 5: Adversary (3 Agents)
Tessa's self-critique mechanism. This is the skill that reviews Tessa's own output.
Three agents form the adversarial quality pipeline. A selector maps criticality levels (C1 advisory through C4 mission-critical) to the appropriate adversarial strategy. An executor runs those strategies against deliverables, producing structured finding reports with severity classification. A scorer implements LLM-as-Judge rubric scoring across six dimensions (completeness, correctness, clarity, consistency, depth, actionability), producing a weighted composite score with a verdict: PASS (0.90+), REVISE (0.70-0.89), or ESCALATE (below 0.70).
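To make the scoring step concrete, here is a minimal sketch of a weighted composite and the verdict bands. The six dimension names and the thresholds come from the description above; the equal weights are an assumption, since the actual weights are not stated here, and the function names are hypothetical.

```python
DIMENSIONS = ("completeness", "correctness", "clarity",
              "consistency", "depth", "actionability")

# Assumption: equal weights. The real weighting is not given in this post.
WEIGHTS = {name: 1 / len(DIMENSIONS) for name in DIMENSIONS}


def composite(scores: dict) -> float:
    """Weighted average of per-dimension scores, each expected in [0, 1]."""
    total_weight = sum(WEIGHTS[d] for d in DIMENSIONS)
    return sum(WEIGHTS[d] * scores[d] for d in DIMENSIONS) / total_weight


def verdict(score: float) -> str:
    """Map a composite score to PASS, REVISE, or ESCALATE."""
    if score >= 0.90:
        return "PASS"
    if score >= 0.70:
        return "REVISE"
    return "ESCALATE"
```

Dividing by the total weight keeps the composite in the same 0 to 1 range even if the weights are later changed and no longer sum to one.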
This is the quality gate. When Tessa produces an architecture decision, a risk assessment, or a security review, the adversary skill can review that output through adversarial lenses before presenting it. The AI assistant critiques itself, identifies weaknesses in its own reasoning, and either approves, requests revision, or escalates for human review.
Most AI assistants have no internal quality mechanism. They generate output and present it with equal confidence regardless of quality. Tessa can score her own work and tell you when she's not confident.
Skill 6: Prompt Engineering (3 Agents)
The meta-skill. This one helps you build better prompts for any AI system, not just Tessa.
A builder guides you through the 5-element prompt anatomy (identity, task, context, constraints, output format) with interactive assembly. A constraint generator selects patterns and formats constraint blocks with required/forbidden/boundary/quality specifications. A scorer evaluates prompts against seven criteria (clarity, specificity, format definition, context completeness, edge case coverage, persona consistency, testability) with improvement suggestions.
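As a rough illustration of the 5-element anatomy, the sketch below assembles the five parts into one prompt string. The `Prompt` class, its `render` method, and the output layout are hypothetical; the builder agent's real output format is not described here.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    """The five elements named above (illustrative container)."""
    identity: str
    task: str
    context: str
    constraints: list
    output_format: str

    def render(self) -> str:
        """Join the five elements into a single prompt, one section each."""
        constraint_lines = "\n".join(f"- {c}" for c in self.constraints)
        return (
            f"Identity: {self.identity}\n"
            f"Task: {self.task}\n"
            f"Context: {self.context}\n"
            f"Constraints:\n{constraint_lines}\n"
            f"Output format: {self.output_format}"
        )
```

Keeping each element as a separate field makes it easy to spot a missing one before the prompt is ever sent.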
This exists because prompt quality is the single biggest lever on AI output quality, and most users have never been taught how to write effective prompts.
Built on the Jerry Framework
Tessa's skill system is ported from the Jerry framework, an open-source multi-agent orchestration system created by Adam Nowak. Jerry defines the skill architectures, agent specifications, routing patterns, and quality scoring methodologies that power all 46 agents. We ported Jerry's skill definitions into our .NET assistant library, preserving the original domain expertise, constraint models, and workflow patterns while adapting them to run natively in IAPM's infrastructure.
Jerry is Apache-2.0 licensed. If you're building multi-agent AI systems, it's worth studying. The skill/agent decomposition, the adversarial quality patterns, and the structured problem-solving methodology are all defined there. Every ported skill in Tessa carries provenance headers citing the Jerry source path, version, and license.
Beyond Skills: What Else Tessa Can Do
The skill system is the most distinctive part of Tessa's architecture, but there's more under the hood.
Multimodal vision input. Paste a screenshot into Tessa and ask "what's wrong here?" She accepts images alongside text, so you can show her a dashboard, an error screen, or log output and get analysis without describing what you see. This works in the desktop app (Ctrl+V paste, drag-and-drop) and the 3D client (Ctrl+V paste).
DAG workflow orchestration. For complex multi-step tasks, Tessa can chain agents into directed acyclic graphs with conditional routing, gate nodes for human approval, and checkpointing for resume-from-failure. A triage agent produces output, a router node decides "critical" or "routine" based on the result, and the workflow branches accordingly. Gate nodes block execution until a human approves, with timeout policies that default to cancel (never auto-approve).
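The triage, router, and gate pattern described here can be sketched in a few lines. Everything below is illustrative: the function names, the 5% error-rate cut-off, and the result strings are assumptions, and checkpointing is left out for brevity.

```python
from typing import Callable, Optional


def triage(event: dict) -> dict:
    """Hypothetical triage agent: tag the event with a severity."""
    severity = "critical" if event.get("error_rate", 0.0) >= 0.05 else "routine"
    return {**event, "severity": severity}


def route(result: dict) -> str:
    """Router node: pick the next branch from the triage result."""
    return "critical" if result["severity"] == "critical" else "routine"


def gate(wait_for_approval: Callable[[], Optional[bool]]) -> bool:
    """Gate node: proceed only on explicit approval.

    The callback returns True (approved), False (rejected), or None (timed out).
    A timeout is treated as cancel, never as approval.
    """
    return wait_for_approval() is True


def run_workflow(event: dict, wait_for_approval: Callable[[], Optional[bool]]) -> str:
    result = triage(event)
    if route(result) == "routine":
        return "logged"
    if not gate(wait_for_approval):
        return "cancelled"
    return "escalated"
```

The key design choice is in `gate`: anything other than an explicit `True` cancels, which mirrors the "never auto-approve" timeout policy.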
Self-scoring quality gates. When a skill produces output, Tessa can score it across six dimensions (completeness, correctness, clarity, consistency, depth, actionability) using LLM-as-Judge methodology. If the score is below threshold, she feeds the critique back to the agent for revision, up to three iterations. This is the adversary skill in action: the AI critiques its own work before presenting it to you.
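A minimal sketch of that score-and-revise loop follows. It assumes the threshold is the 0.90 PASS cut-off and that "up to three iterations" means at most three revisions; `revise_until_pass` and its two callbacks are hypothetical stand-ins for the scorer and the producing agent.

```python
from typing import Callable

PASS_THRESHOLD = 0.90  # assumed to be the PASS cut-off
MAX_REVISIONS = 3      # "up to three iterations"


def revise_until_pass(
    draft: str,
    score: Callable[[str], tuple],        # draft -> (score, critique)
    revise: Callable[[str, str], str],    # (draft, critique) -> new draft
) -> tuple:
    """Score a draft and feed critiques back until it passes or the budget runs out.

    Returns (final_draft, final_score, revisions_used).
    """
    value, critique = score(draft)
    revisions = 0
    while value < PASS_THRESHOLD and revisions < MAX_REVISIONS:
        draft = revise(draft, critique)
        revisions += 1
        value, critique = score(draft)
    return draft, value, revisions
```

If the final score is still below the threshold after three revisions, the caller can see that from the returned score and escalate rather than present the draft as finished.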
Hat-based role switching. Tessa can wear different "hats" (researcher, security analyst, coder, architect, reviewer) that shift her expertise lens without changing her personality. Wearing the security hat soft-prefers red team and engineering skills in the router. The hat provides behavioral focus. The skills provide the specialist agents. They compose naturally.
Context preservation. Long conversations don't degrade. Instead of naive truncation, Tessa summarizes older messages in the background and keeps recent messages at full fidelity. The summary is regenerated incrementally, with periodic full re-derivation to prevent drift. Secrets are scrubbed from summaries before storage.
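The summarize-and-keep-recent idea can be sketched as below. The window size, the `compact` and `scrub` helpers, and the secret pattern are all assumptions for illustration; real secret detection needs much broader coverage, and the incremental regeneration and periodic re-derivation steps are not shown.

```python
import re
from typing import Callable

KEEP_RECENT = 4  # hypothetical number of messages kept verbatim

# Illustrative pattern only: matches things like "api-key=abc" or "token: xyz".
SECRET_PATTERN = re.compile(r"(api[-_]?key|token|password)\s*[=:]\s*\S+", re.IGNORECASE)


def scrub(text: str) -> str:
    """Replace secret-looking values with a placeholder, keeping the key name."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def compact(messages: list, summarize: Callable[[list], str]) -> tuple:
    """Return (scrubbed summary of older messages, recent messages verbatim)."""
    older, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    summary = scrub(summarize(older)) if older else ""
    return summary, recent
```

Scrubbing happens after summarization and before the summary is returned, so a secret that survives the summarizer still never reaches storage.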
2,326 tests. The assistant library has over 2,300 passing tests across unit, integration, and build verification. The skill system alone has 39 dedicated tests, and the DAG workflow engine has 53.
How Routing Works (Without You Thinking About It)
You don't need to know that Tessa has 46 agents. You don't need to pick a skill. You just ask a question.
The routing system handles intent matching automatically. "Why is the payment service throwing 500 errors?" activates problem solving. "Review this code for security vulnerabilities" activates the engineering team. "What's the attack surface of this API?" activates the red team (after scope verification). "Create an ADR for this migration" activates the architect agent within problem solving.
The keyword router does the first pass. If confidence is high, the query goes directly to the matching skill. If confidence is below threshold, an LLM classifier does a second pass to determine intent. The two-phase approach means common queries route instantly while ambiguous queries still reach the right specialist.
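A two-phase router of this kind might look like the sketch below. The keyword table, the 0.75 threshold, and the confidence formula (the winning skill's share of all keyword hits) are assumptions made for illustration; the actual rules are not described in this post, and `llm_classify` stands in for the second-pass classifier.

```python
from typing import Callable

# Hypothetical keyword table and threshold.
KEYWORDS = {
    "problem-solving": ("why", "root cause", "compare"),
    "engineering": ("review", "code", "vulnerabilities"),
    "red-team": ("attack surface", "exploit"),
}
CONFIDENCE_THRESHOLD = 0.75


def keyword_route(query: str) -> tuple:
    """First pass: return (best skill, its share of all keyword hits)."""
    text = query.lower()
    hits = {skill: sum(word in text for word in words)
            for skill, words in KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return "", 0.0  # nothing matched, so confidence is zero
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def route(query: str, llm_classify: Callable[[str], str]) -> str:
    """Use the keyword match when confident; otherwise fall back to the LLM."""
    skill, confidence = keyword_route(query)
    if confidence >= CONFIDENCE_THRESHOLD:
        return skill
    return llm_classify(query)
```

A query whose keywords all point to one skill routes without any model call, while a query that splits its hits across skills, or matches nothing, goes to the classifier.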
All 46 agents run on the same underlying GPT-5.4 model family (via intelligent tier routing). The skill system provides the domain expertise, the constraints, and the workflow structure. The model provides the reasoning. Together, they produce output that a single generic prompt cannot match.
See It in Action
Watch Tessa diagnose a production error, trace it through distributed services, and generate a complete root cause analysis document in one conversation: From Error to Full RCA in One Conversation.
Or see the full 3D environment where Tessa operates: Introduction to Immersive APM.
To try it yourself, start free and populate your grid with realistic telemetry:
go install github.com/ImmersiveFusion/if-opentelemetry-tracegen/cmd/tracegen@latest
tracegen -endpoint otlp.iapm.app:443 -headers "api-key=YOUR_KEY" -complexity light
Then ask Tessa anything. She'll route to the right specialist.
Dan Kowalski
Father, technology aficionado, gamer, Gridmaster
About Immersive Fusion
Immersive Fusion (immersivefusion.com) is pioneering the next generation of observability by merging spatial computing and AI to make complex systems intuitive, interactive, and intelligent. As the creators of IAPM, we deliver solutions that combine web, 3D/VR, and AI technologies, empowering teams to visualize and troubleshoot their applications in entirely new ways. This approach enables rapid root-cause analysis, reduces downtime, and drives higher productivity, transforming observability from static dashboards into an immersive, intelligent experience. Learn more about or join Immersive Fusion on LinkedIn, Mastodon, X, YouTube, Facebook, Instagram, GitHub, Discord.