Agent Skills: The Functional Blueprint for AI Agents

TL;DR: A Skill is essentially a function that runs on an LLM — it encapsulates a fixed workflow, defines inputs/outputs, and can be reused across any AI Agent. This post breaks down what Skills really are, their design principles, and practical engineering guidelines for building them.

Background

The Problem with Amnesic Agents

The Agentic AI space keeps minting new concepts, and Skills has been the most discussed — and in my view, the most important — over the past few months. This post documents my journey exploring Skills: what I learned, how I understand them, and what I’ve put into practice.

First Principles of Skills

First Principles: Skills as Functions for LLMs

A Skill is fundamentally a function in programming.

In traditional programming, we abstract repetitive business logic into functions for reuse. Skills do exactly the same thing, except the executor is an LLM instead of a CPU. A function runs on hardware; a Skill runs on a language model. This means core principles of modular programming — single responsibility, clean interfaces, composability — apply directly to Skill design.

What Is a Skill?

Official definition: A Skill is a simple folder that encapsulates a specific task or a set of workflow instructions. It lets an Agent learn once and reuse your preferences, processes, and domain knowledge across all future interactions.

Standard file structure:

your-skill-name/
├── SKILL.md             # Required: core instruction file with YAML metadata + workflow
├── scripts/             # Optional: executable scripts (Python, Bash, etc.)
├── references/          # Optional: supplementary docs loaded on demand
└── assets/              # Optional: templates, icons, static resources

When Do You Need a Skill?

If we treat prompts as natural-language programming, then Skills answer the question: “When should I extract a function?”

When a similar pattern keeps appearing in your workflow, abstract it into a Skill.

Example: I have a workflow for generating full-stack technical documents — it pulls a PRD from Feishu, fetches Figma designs, reads the codebase, and outputs a structured doc. This breaks down into four independent Skills:

Fetch Feishu Doc Skill — downloads the PRD in an AI-readable format
Fetch Figma Design Skill — parses design files for AI consumption
Fetch Codebase Skill — pulls the target repository
Write Tech Doc Skill — generates a professional document from the above inputs

Each Skill is independently reusable, and together they form an end-to-end document generation pipeline. Without Skills, you’d have to teach the Agent every step from scratch each time — like onboarding a new hire for every single task.

The Workflow Vision: Skill Chaining

Skill vs MCP

Defining the Boundaries: Skill vs MCP

Many people confuse Skills with MCP, or assume Skills will replace MCP. They operate on entirely different dimensions.

A Skill is the recipe. MCP is the kitchen.

MCP provides the capability layer — which systems the Agent can connect to and what actions it can perform. Think stoves, knives, and refrigerators.
Skill provides the methodology — what to do when, in what order, how to handle edge cases, and how to validate the result.

Skills frequently depend on one or more MCPs. They collaborate, not compete.

How Skills Work Under the Hood

Progressive Disclosure: Load Context on Demand

Progressive Disclosure

The key difference between a Skill and a long prompt is the loading strategy: expose the minimum necessary information first, then expand layer by layer once relevance is confirmed.

Layer 1 (YAML Frontmatter): Always loaded in the system prompt. Contains only the name, purpose, and trigger conditions. Its job is to tell the model: “When should you think of me?”
Layer 2 (SKILL.md Body): Loaded only when the Agent determines the Skill is relevant. Contains the full instructions and workflow.
Layer 3 (Supplementary Files): Detailed docs, examples, and templates in references/ or scripts/, loaded only when the workflow specifically requires them.

The core value: accumulate extensive experience, but surface only what’s needed right now.

Single Responsibility: Design Like a Microservice

The Microservice Mindset

A Skill should not be an “all-in-one manager.” It should be a clearly bounded workflow unit.

Real-world tasks are often multi-stage. For incident response, you might need a “Log Query” Skill to locate anomalies, a “Change Audit” Skill to compare recent deployments, and a “Conclusion Summary” Skill to write the report. If every Skill tried to do everything end-to-end, you’d end up with overlapping, bloated prompts rather than stronger capabilities.

A good Skill owns only the part it does best and composes well with others.

Plain Text, Zero Lock-In

Skills are plain text files (Markdown + helper scripts), bound to no specific tool, framework, model, or deployment environment. As long as the target environment supports the Skill spec and has the required dependencies, the Skill just works. This makes experience transferable and portable.

How to Design a Good Skill

Engineering Principles

Atomic & Single Responsibility: One Skill, one well-defined job. Avoid “Swiss Army knife” Skills — split them into smaller units.
Stable I/O Contract: Trigger conditions and outputs should be predictable. The description field is the most important part of this contract.
Idempotency: Operations with side effects (create, delete) must be safe to run multiple times. Include “already exists?” checks.
Observability: Define log output formats for key steps to enable quick debugging.

YAML Metadata: The Trigger Brain

The YAML Front Matter in SKILL.md directly determines whether and when the Agent loads your Skill.

Minimum format:

---
name: your-skill-name
description: What it does. Use when user asks to [specific phrases].
---

The description must communicate two things: what this Skill does, and when to use it.

Good examples:

1
2

# Specific with trigger phrases
description: Analyzes Figma design files and generates developer handoff documentation. Use when user uploads .fig files, asks for "design specs" or "design-to-code handoff".

Bad examples:

1 2	# Too vague, no trigger conditions description: Helps with projects.

File Naming Conventions

Folders use kebab-case, e.g., notion-project-setup
The core file must be named SKILL.md (case-sensitive)
No README.md inside Skill folders

Evaluation and Testing

QA for Agents: The Evaluation Matrix

Like traditional software testing, Skill validation needs three layers:

Trigger Tests — Does the Skill activate in the right scenarios?
Functional Tests — Are API calls and outputs correct?
Performance Comparison — Does the Skill actually improve efficiency over manual prompts?

Best Practice: How to Create Skills Quickly

My current workflow: Start a task → complete it with Prompt + MCP → use skill-creator to generate a Skill → reuse on similar tasks → continuously improve.

The key is to run the process manually first, confirm it works, then abstract it into a Skill — rather than designing in a vacuum.

Skills are the function abstraction of the Agentic AI era. Mastering Skill design is like mastering function design — it determines how much of your experience you can crystallize into reusable capabilities, rather than starting from scratch every time.

🌑

☀️

Hi Folks.