6 Skills, 46 Agents: What Happens When Your AI Assistant Actually Specializes

Name: IAPM
Author: Immersive Fusion

Dan Kowalski - 2026-05-07

Updated 2026-05-12: Added a new section, "What's the Point: 46 Agents, One Click," on how Mosey makes the 46 agents reachable in one click from inside the 3D environment, and why the red-team and engineering-team skills in particular are where Mosey earns his keep.

Ask most observability AI assistants a question and you're talking to one model, one prompt, and one context window. Diagnosis, code review, risk assessment, architecture decisions, security analysis: all of it lands in a single conversation thread that tries to be an expert at everything.

That's not how expertise works. A security researcher thinks differently than a systems architect. A risk manager asks different questions than a code reviewer. Cramming all of those perspectives into one prompt produces mediocre versions of each.

Tessa doesn't work that way. She's 46 specialized agents organized into 6 skills, ported from the open-source Jerry framework. Each agent has its own domain expertise, its own constraints, its own tools, and its own workflow patterns. When you ask a question, the routing system matches your intent to the right specialist. You don't need to know agent names or skill categories. You just ask.

Here's what's inside.

6 skills. 46 agents. One assistant.

Skill 1: Problem Solving (9 Agents)

The generalist skill. When you need to understand something complex, this is what activates.

Nine agents cover every phase of structured thinking. A researcher gathers information with web search and codebase exploration, producing cited reports. An analyst does root cause analysis, trade-off evaluation, and risk assessment. An architect produces formal Architecture Decision Records. A critic runs creator-critic-revision loops, scoring deliverables on six quality dimensions. A validator verifies constraints with evidence. A synthesizer extracts patterns across multiple documents. A reviewer does code, design, and security reviews. An investigator applies 5 Whys and Ishikawa diagrams to trace causal chains. A reporter synthesizes progress across work streams.

These agents don't just answer questions. They follow structured methodologies. The investigator doesn't just guess at root causes. It builds an Ishikawa diagram, distinguishes symptoms from causes, and traces the causal chain to the origin. The architect doesn't just suggest a design. It produces a Nygard-format ADR with alternatives evaluated, trade-offs documented, and risks identified.

The problem solving skill is what activates when you ask Tessa "why is checkout slow?" or "compare these two approaches" or "what's the root cause of this failure?"

Skill 2: Red Team (11 Agents)

Full MITRE ATT&CK kill chain coverage. This is the skill that most people don't expect to find inside an observability assistant.

Eleven agents cover reconnaissance, vulnerability analysis, exploitation methodology, privilege escalation, lateral movement, persistence, data exfiltration, social engineering, C2 infrastructure, and comprehensive reporting. The entire offensive security workflow, from scoping to final report.

Here's what makes this responsible rather than reckless: the skill is scope-gated. An Engagement Lead agent must establish Rules of Engagement before any offensive agent can operate. Three agents (persistence, exfiltration, and social engineering) require explicit RoE authorization on top of that. You can't accidentally run an exfiltration assessment. The authorization chain is mandatory.
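To make the authorization chain concrete, here is a minimal sketch of how a scope gate like this could be checked. It is not Tessa's implementation: the `RulesOfEngagement` type, the `may_run` function, and the agent identifiers are hypothetical names chosen for illustration. Only the two rules come from the description above: no offensive agent runs without established Rules of Engagement, and three agent categories need explicit authorization on top of that.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the three categories that need explicit sign-off.
RESTRICTED = {"persistence", "exfiltration", "social_engineering"}


@dataclass
class RulesOfEngagement:
    """Illustrative stand-in for whatever the Engagement Lead agent records."""
    established: bool = False
    explicitly_authorized: set = field(default_factory=set)


def may_run(agent: str, roe: RulesOfEngagement) -> bool:
    """Return True only if the scope gate allows this offensive agent to operate."""
    if not roe.established:
        return False  # nothing offensive runs before RoE exist
    if agent in RESTRICTED:
        return agent in roe.explicitly_authorized  # second, explicit gate
    return True


roe = RulesOfEngagement(established=True)
print(may_run("reconnaissance", roe))  # allowed once RoE exist
print(may_run("exfiltration", roe))    # still blocked without explicit authorization
```

The useful property of this shape is that the default answer is "no": an agent is blocked unless both checks pass, so a missing or half-filled RoE cannot authorize anything by accident.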

Why does an observability tool need red team capabilities? Because the teams using APM tools are increasingly responsible for security posture, not just uptime. SREs and platform engineers are expected to understand their attack surface. Tessa can help them think about it systematically, using the same methodologies that professional penetration testers use.

Skill 3: Engineering Team (10 Agents)

Security-first software development lifecycle, from architecture through post-deployment incident response.

Ten agents cover the full engineering workflow: a solution architect for threat modeling (STRIDE, DREAD, PASTA), a lead for standards enforcement and implementation planning, backend and frontend specialists for server-side and client-side security, an infrastructure specialist for IaC hardening and supply chain security (SLSA, SBOM), a DevSecOps engineer for SAST/DAST pipeline configuration, a QA engineer for security testing strategy, a security reviewer for manual code review with CWE classification, a final reviewer as the quality gate before release, and an incident responder for post-deployment monitoring and runbooks.

This isn't a generic "review my code" assistant. The backend agent knows OWASP Top 10 and ASVS 5.0. The infrastructure agent knows CIS Benchmarks. The DevSecOps agent knows how to configure Semgrep, Gitleaks, and Stryker pipelines. Each agent brings domain-specific knowledge that a general-purpose LLM simply doesn't have in its default prompt.

Not one prompt pretending to be an expert. Forty-six specialists, each with a job.

Skill 4: NASA Systems Engineering (10 Agents)

This is the one that surprises people. Tessa implements NPR 7123.1D, the actual NASA systems engineering process standard. Not a simplified version. Not "inspired by." The real processes.

Ten agents cover the full SE lifecycle: requirements engineering (stakeholder needs, requirements definition, requirements management), technical architecture (logical decomposition, design solutions, decision analysis), verification and validation (test, analysis, inspection, demonstration), system integration (interface management, ICD compliance), risk management (NPR 8000.4C, 5x5 likelihood/consequence matrices), configuration management (baselines, change tracking), technical review gates (SRR, PDR, CDR, FRR with entrance/exit criteria), exploration (divergent thinking, trade space analysis), quality assurance (work product validation against NPR standards), and SE status reporting across all processes.

Why NASA SE in an observability tool? Because observability platforms monitor complex systems, and complex systems benefit from rigorous engineering processes. Teams building mission-critical infrastructure (healthcare, finance, aerospace, energy) need more than "move fast and break things." They need traceability matrices, verification evidence, and formal review gates. Tessa can facilitate those processes.

Skill 5: Adversary (3 Agents)

Tessa's self-critique mechanism. This is the skill that reviews Tessa's own output.

Three agents form the adversarial quality pipeline. A selector maps criticality levels (C1 advisory through C4 mission-critical) to the appropriate adversarial strategy. An executor runs those strategies against deliverables, producing structured finding reports with severity classification. A scorer implements LLM-as-Judge rubric scoring across six dimensions (completeness, correctness, clarity, consistency, depth, actionability), producing a weighted composite score with a verdict: PASS (0.90+), REVISE (0.70-0.89), or ESCALATE (below 0.70).
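The scoring step can be sketched in a few lines. The six dimensions and the verdict cutoffs below are the ones listed above; the weights are an assumption (equal by default), since the actual weighting is not described here, and the function names are hypothetical. The sketch also assumes the bands are contiguous, so anything from 0.70 up to but not including 0.90 is treated as REVISE.

```python
DIMENSIONS = (
    "completeness", "correctness", "clarity",
    "consistency", "depth", "actionability",
)


def composite(scores: dict, weights: dict = None) -> float:
    """Weighted mean of the six dimension scores, each expected in [0, 1]."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}  # assumption: equal weights
    total_weight = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def verdict(score: float) -> str:
    """Map a composite score to the verdict bands described above."""
    if score >= 0.90:
        return "PASS"
    if score >= 0.70:
        return "REVISE"
    return "ESCALATE"


scores = {d: 1.0 for d in DIMENSIONS}
scores["depth"] = 0.5
print(verdict(composite(scores)))
```

With equal weights, one weak dimension pulls the composite down without sinking it. A deployment that cares most about correctness could pass a heavier weight for that dimension instead.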

This is the quality gate. When Tessa produces an architecture decision, a risk assessment, or a security review, the adversary skill can review that output through adversarial lenses before presenting it. The AI assistant critiques itself, identifies weaknesses in its own reasoning, and either approves, requests revision, or escalates for human review.

Most AI assistants have no internal quality mechanism. They generate output and present it with equal confidence regardless of quality. Tessa can score her own work and tell you when she's not confident.

Skill 6: Prompt Engineering (3 Agents)

The meta-skill. This one helps you build better prompts for any AI system, not just Tessa.

A builder guides you through the 5-element prompt anatomy (identity, task, context, constraints, output format) with interactive assembly. A constraint generator selects patterns and formats constraint blocks with required/forbidden/boundary/quality specifications. A scorer evaluates prompts against seven criteria (clarity, specificity, format definition, context completeness, edge case coverage, persona consistency, testability) with improvement suggestions.
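As a rough illustration of the 5-element anatomy, the sketch below models a prompt as a small data structure that refuses to render until every element is filled in. The `PromptDraft` class and its methods are hypothetical and are not the builder agent's actual interface. Only the five element names come from the description above.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptDraft:
    """Hypothetical container for the five prompt elements."""
    identity: str = ""
    task: str = ""
    context: str = ""
    constraints: str = ""
    output_format: str = ""

    def missing(self) -> list:
        """Names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Assemble the prompt, or fail loudly if any element is missing."""
        gaps = self.missing()
        if gaps:
            raise ValueError("missing elements: " + ", ".join(gaps))
        return "\n\n".join(
            f"{f.name.replace('_', ' ').title()}:\n{getattr(self, f.name)}"
            for f in fields(self)
        )


draft = PromptDraft(identity="You are an SRE.", task="Explain this trace.")
print(draft.missing())
```

An interactive builder would walk the user through whatever `missing()` reports. The point of the structure is that an incomplete prompt is visible as incomplete before it is ever sent.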

This exists because prompt quality is the single biggest lever on AI output quality, and most users have never been taught how to write effective prompts.

Built on the Jerry Framework

Tessa's skills are ported from the open-source Jerry framework (Apache-2.0), created by Adam Nowak. We adapted Jerry's skill and agent definitions to run natively in our .NET assistant library, and every ported skill carries a provenance header citing the Jerry source path, version, and license.

Beyond Skills: What Else Tessa Can Do

The skill system is the most distinctive part of Tessa's architecture, but there's more under the hood.

Multimodal vision input. Paste a screenshot into Tessa and ask "what's wrong here?" She accepts images alongside text, so you can show her a dashboard, an error screen, or a log output and get analysis without describing what you see. Works in the desktop app (Ctrl+V paste, drag-drop) and the 3D client (Ctrl+V paste).

DAG workflow orchestration. For complex multi-step tasks, Tessa can chain agents into directed acyclic graphs with conditional routing, gate nodes for human approval, and checkpointing for resume-from-failure. A triage agent produces output, a router node decides "critical" or "routine" based on the result, and the workflow branches accordingly. Gate nodes block execution until a human approves, with timeout policies that default to cancel (never auto-approve).
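The triage, router, and gate example above can be sketched as a tiny table-driven graph. This is an illustration under stated assumptions, not Tessa's workflow engine: the node functions, the `error_rate` field, the 0.05 cutoff, and the `approve` callback are all hypothetical, and checkpointing is left out to keep the sketch short. What it does show is conditional routing and a gate where a timeout means cancel.

```python
def triage(state):
    # Hypothetical rule: an error rate above 5% counts as critical.
    state["severity"] = "critical" if state["error_rate"] > 0.05 else "routine"
    return "router"


def router(state):
    # Conditional routing: branch on the triage result.
    return "gate" if state["severity"] == "critical" else "log"


def gate(state):
    # approve() returns True, False, or None (None models a timeout).
    decision = state["approve"]()
    state["approved"] = decision is True  # a timeout cancels, it never approves
    return "remediate" if state["approved"] else None


def terminal(state):
    return None


NODES = {
    "triage": triage,
    "router": router,
    "gate": gate,
    "remediate": terminal,
    "log": terminal,
}


def run(state, start="triage"):
    """Walk the graph from `start`, returning the nodes visited in order."""
    visited, node = [], start
    while node is not None:
        visited.append(node)
        node = NODES[node](state)
    return visited


print(run({"error_rate": 0.2, "approve": lambda: None}))
```

In the run above the human never answers, so the walk stops at the gate and the remediation node is never reached.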

Self-scoring quality gates. When a skill produces output, Tessa can score it across six dimensions (completeness, correctness, clarity, consistency, depth, actionability) using LLM-as-Judge methodology. If the score is below threshold, she feeds the critique back to the agent for revision, up to three iterations. This is the adversary skill in action: the AI critiques its own work before presenting it to you.
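The revise-until-good-enough loop looks roughly like this. The `score_fn` and `revise_fn` callables are hypothetical stand-ins for the scorer and the producing agent, and the 0.90 threshold is an assumption borrowed from the PASS cutoff. The cap of three revisions comes from the description above.

```python
def refine(draft, score_fn, revise_fn, threshold=0.90, max_revisions=3):
    """Score a draft, feeding the critique back for revision up to max_revisions times.

    score_fn(draft) -> (score, critique)
    revise_fn(draft, critique) -> new draft
    Returns (final_draft, final_score, revisions_used).
    """
    score, critique = score_fn(draft)
    revisions = 0
    while score < threshold and revisions < max_revisions:
        draft = revise_fn(draft, critique)
        revisions += 1
        score, critique = score_fn(draft)  # every revision is re-scored
    return draft, score, revisions


# Toy stand-ins: longer drafts score higher, and each revision adds detail.
def toy_score(draft):
    return min(1.0, len(draft) / 10), "add more detail"


def toy_revise(draft, critique):
    return draft + "..."


print(refine("abcd", toy_score, toy_revise))
```

If the score is still below the threshold after the last allowed revision, the caller gets the final score back and can decide what to do with it, for example by escalating to a human.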

Hat-based role switching. Tessa can wear different "hats" (researcher, security analyst, coder, architect, reviewer) that shift her expertise lens without changing her personality. Wearing the security hat soft-prefers red team and engineering skills in the router. The hat provides behavioral focus. The skills provide the specialist agents. They compose naturally.

Context preservation. Long conversations don't degrade. Instead of naive truncation, Tessa summarizes older messages in the background and keeps recent messages at full fidelity. The summary is regenerated incrementally, with periodic full re-derivation to prevent drift. Secrets are scrubbed from summaries before storage.
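A minimal sketch of this kind of compaction, assuming a `summarize(previous_summary, aged_out_messages)` callable that stands in for the background summarization call. The `compact` and `scrub` functions, the `keep_recent` window, and the secret-matching pattern are all hypothetical and far simpler than a production scrubber. The periodic full re-derivation step is not shown.

```python
import re

# Illustrative pattern only: catches "api-key=...", "token: ...", "password=...".
SECRET = re.compile(r"(?i)\b(api[-_]?key|token|password)\s*[=:]\s*\S+")


def scrub(text: str) -> str:
    """Redact anything that looks like a credential assignment."""
    return SECRET.sub(r"\1=[REDACTED]", text)


def compact(messages, summary, summarize, keep_recent=4):
    """Fold older messages into the running summary; keep recent ones verbatim.

    Returns (new_summary, recent_messages).
    """
    split = max(0, len(messages) - keep_recent)
    older, recent = messages[:split], messages[split:]
    if not older:
        return summary, recent  # nothing has aged out yet
    # Incremental update: only the newly aged-out messages are summarized.
    return scrub(summarize(summary, older)), recent


def join_summary(previous, aged_out):
    # Stand-in for the real summarizer: simply concatenates text.
    return (previous + " " + " ".join(aged_out)).strip()


history = ["api-key=abc123 is set", "b", "c", "d", "e"]
print(compact(history, "", join_summary, keep_recent=3))
```

Scrubbing happens on the summary itself, after summarization, so a secret that the summarizer happens to copy through is still redacted before anything is stored.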

2,326 tests. The assistant library has over 2,300 passing tests across unit, integration, and build verification. The skill system alone has 39 dedicated tests, and the DAG workflow engine has 53.

How Routing Works (Without You Thinking About It)

You don't need to know that Tessa has 46 agents. You don't need to pick a skill. You just ask a question.

The routing system handles intent matching automatically. "Why is the payment service throwing 500 errors?" activates problem solving. "Review this code for security vulnerabilities" activates the engineering team. "What's the attack surface of this API?" activates the red team (after scope verification). "Create an ADR for this migration" activates the architect agent within problem solving.

The keyword router does the first pass. If confidence is high, the query goes directly to the matching skill. If confidence is below threshold, an LLM classifier does a second pass to determine intent. The two-phase approach means common queries route instantly while ambiguous queries still reach the right specialist.
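The two-phase idea can be sketched as follows. The keyword table, the confidence formula, and the 0.6 threshold are assumptions made for illustration, and `classify` is a hypothetical stand-in for the LLM classifier call. Only the structure (keywords first, classifier as fallback when confidence is low) comes from the description above.

```python
import re

# Hypothetical keyword table; the real router's vocabulary is not shown here.
KEYWORDS = {
    "problem-solving": {"why", "root", "cause", "compare"},
    "engineering-team": {"review", "code", "vulnerabilities"},
    "red-team": {"attack", "surface", "exploit"},
}


def keyword_route(query: str):
    """First pass: return (best_skill, confidence) from keyword overlap."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    hits = {skill: len(words & kws) for skill, kws in KEYWORDS.items()}
    best = max(hits, key=hits.get)
    total = sum(hits.values())
    confidence = hits[best] / total if total else 0.0
    return best, confidence


def route(query: str, classify, threshold: float = 0.6):
    """Use the keyword result when confident, otherwise ask the classifier."""
    skill, confidence = keyword_route(query)
    if confidence >= threshold:
        return skill, "keyword"
    return classify(query), "classifier"


def fallback(query):
    # Stand-in for the second-pass LLM call.
    return "problem-solving"


print(route("Review this code for security vulnerabilities", fallback))
print(route("Help me with this thing", fallback))
```

The cheap pass handles queries whose keywords all point at one skill. A query that matches nothing, or that splits its matches across several skills, falls below the threshold and goes to the slower classifier.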

All 46 agents run on the same underlying GPT-5.4 model family (via intelligent tier routing). The skill system provides the domain expertise, the constraints, and the workflow structure. The model provides the reasoning. Together, they produce output that a single generic prompt cannot match.

What's the Point: 46 Agents, One Click

46 agents is a lot. Most of the value is in the ones you didn't know to ask for.

At 2pm on a Tuesday, when a service is misbehaving, you don't know that you should run an adversarial review on your own diagnosis. You don't know that your trace looks like a security incident if you stand at the right angle. You don't know that the architecture decision you're about to commit would benefit from a steelman pass and a pre-mortem before you push it. Most users won't go open a chat console and type "please apply the S-002 devil's advocate strategy and the S-004 pre-mortem strategy to this RCA." They will just ship.

This is what Mosey is for. Mosey is the friendly companion in our 3D environment who lives next to whatever you're looking at. Click him and a small menu opens with three buckets: Help, Quick Read, and Deep Analysis. The Deep Analysis bucket is where the 46 agents become a one-click experience.

Standing in front of a trace? Deep Analysis offers "Investigate with me" (problem solving), "Run a full RCA report" (problem solving + adversary), and "Audit my LLM spend" (engineering team).
Standing in front of a service? Deep Analysis offers "Threat-model this service" (engineering team, STRIDE), "Audit auth" (engineering team + red team), and "Plan a security review" (engineering team).
Standing in front of your code workspace? Deep Analysis offers "Draft an ADR" (problem solving, architect), "Compare approaches" (problem solving, analyst), "Find related ADRs" (problem solving, synthesizer), and "Audit a pattern across the codebase" (engineering team).

The red-team and engineering-team skills, in particular, are why Mosey matters. Most observability assistants don't have a red team mode at all. The ones that do bury it three menus deep behind a security console. Mosey puts it on the same shelf as "explain this trace" and lets you reach for it without changing tools, changing tabs, or remembering anyone's name.

46 specialists, in other words, are only as useful as the moment you can summon them. Mosey is the summon. Tessa is the answer. The skills are the depth behind the answer. You don't have to assemble that stack yourself.

See It in Action

Watch Tessa diagnose a production error, trace it through distributed services, and generate a complete root cause analysis document in one conversation: From Error to Full RCA in One Conversation.

Or see the full 3D environment where Tessa operates: Introduction to Immersive APM.

To try it yourself, start free and populate your grid with realistic telemetry:

go install github.com/ImmersiveFusion/if-opentelemetry-tracegen/cmd/tracegen@latest
tracegen -endpoint otlp.iapm.app:443 -headers "api-key=YOUR_KEY" -complexity light

Then ask Tessa anything. She'll route to the right specialist.

Dan Kowalski

Father, technology aficionado, gamer, Gridmaster

About Immersive Fusion

Immersive Fusion (immersivefusion.com) is pioneering the next generation of observability by merging spatial computing and AI to make complex systems intuitive, interactive, and intelligent. As the creators of IAPM, we deliver solutions that combine web, 3D/VR, and AI technologies, empowering teams to visualize and troubleshoot their applications in entirely new ways. This approach enables rapid root-cause analysis, reduces downtime, and drives higher productivity, transforming observability from static dashboards into an immersive, intelligent experience. Learn more about Immersive Fusion, or join us, on LinkedIn, Mastodon, Bluesky, X, YouTube, Facebook, Instagram, GitHub, Twitch, and Discord.

Press inquiries: press@immersivefusion.com.
