Open Knowledge Format in .NET: Building Portable Agent Knowledge

Learn how OKF enables portable, vendor-neutral knowledge systems for AI agents in .NET. Technical architecture, implementation patterns, and enterprise integration.

The Knowledge Fragmentation Crisis

If you’ve built more than one AI agent, you’ve felt this: every new agent needs context reassembled from scattered sources. Your documentation lives in Confluence. Your data schemas are in a catalog platform. Your runbooks are in a wiki. Your API contracts are in OpenAPI specs. Your incident postmortems are in a ticket system. Your team knowledge is fragmented across incompatible silos, and each agent rebuild means starting the context assembly from scratch.

This isn’t a documentation problem. It’s an infrastructure problem. As AI models improve, the bottleneck shifts from model capability to context quality. Your agents are constrained not by their reasoning but by how efficiently you can feed them relevant, structured knowledge.

Open Knowledge Format (OKF) solves this by making knowledge portable. Not through another platform. Through elegant simplicity: markdown, YAML, and files. OKF is a vendor-neutral specification that lets you structure knowledge once, version it in Git, and consume it across any agent or system without translation layers, SDKs, or lock-in.

What OKF Actually Is

OKF formalizes what the AI community calls the ‘LLM-wiki pattern.’ The insight, credited to Andrej Karpathy, is straightforward: LLMs don’t get bored, don’t forget cross-references, and can process 15 files in a single pass. That means your knowledge doesn’t need a database or a query layer. It needs to be readable, linkable, and versionable.

The format is minimal by design. Each knowledge item is a markdown file with YAML frontmatter in a hierarchical directory structure. That’s it. No compression schemes. No new runtime. No proprietary format.

The spec requires only one field: type. Everything else is optional and producer-defined. Core fields include title, description, resource, tags, and timestamp, but you add whatever fields make sense for your domain. This is the first design principle: minimally opinionated.

The second principle is producer and consumer independence. You write OKF bundles for your domain (data catalogs, API documentation, runbooks, schemas). An agent consumes them without needing to know how you produced them. No tight coupling. No version hell.

The third principle is format, not platform. OKF is a specification, not a service. You store bundles in Git, on S3, in a container, on a local filesystem. You version them however you version code. You don’t depend on anyone’s platform staying alive.

The Technical Architecture

Let’s look at what an OKF bundle actually looks like. Here’s a simple example: a data schema concept for a customer entity.

---
type: schema
title: Customer
description: Core customer entity representing individuals and organizations
resource: https://api.example.com/schemas/customer
tags:
  - core
  - crm
  - pii
timestamp: 2024-01-15T10:30:00Z
version: 1.2
owner: data-platform-team
---

## Overview

The Customer schema represents both individuals and organizations in our system.

## Fields

- `id`: Unique identifier (UUID)
- `type`: Either "individual" or "organization"
- `name`: Full name or company name
- `email`: Primary contact email
- `created_at`: ISO 8601 timestamp

## Constraints

- Email must be unique
- Type is immutable after creation
- Organizations require tax ID

## Related Concepts

- See: Account (billing relationship)
- See: ContactPerson (individual within organization)

The structure is straightforward. YAML frontmatter holds metadata. Markdown body holds human-readable and agent-parseable content. The hierarchy of files in directories creates your knowledge graph without needing a graph database.
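
As a rough illustration of how plain files can stand in for a graph, the sketch below pulls the "See:" references out of a concept body like the one above and returns them as edges. The RelatedLinks class and the "- See: Name (note)" line convention are assumptions taken from this example; nothing here is mandated by the OKF spec.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: the "- See: Name (note)" convention comes from the
// example concept above and is an assumption, not part of the OKF spec.
public static class RelatedLinks
{
    // Matches lines such as "- See: Account (billing relationship)".
    // The parenthesized note is optional.
    private static readonly Regex SeeLine = new Regex(
        @"^\s*-\s*See:\s*(?<name>[^(\r\n]+?)\s*(?:\((?<note>[^)]*)\))?\s*$",
        RegexOptions.Multiline);

    // Returns (target concept, note) pairs found in a markdown body.
    // The note is an empty string when the line has no parenthesized part.
    public static List<(string Target, string Note)> Extract(string body)
    {
        var links = new List<(string Target, string Note)>();
        foreach (Match m in SeeLine.Matches(body))
        {
            links.Add((m.Groups["name"].Value.Trim(), m.Groups["note"].Value));
        }
        return links;
    }
}
```

Calling RelatedLinks.Extract(concept.Body) on each file gives a list of edges you could walk in memory, which is often enough for small bundles.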

Here’s how you might organize a bundle for a data platform:

data-catalog/
  schemas/
    customer.md
    account.md
    transaction.md
  datasets/
    ga4-events.md
    user-behavior.md
  runbooks/
    data-refresh.md
    incident-response.md
  joins/
    customer-to-account.md
  metadata.yaml

The metadata.yaml at the bundle root declares the bundle itself: its name, version, producer, and any global tags. This lets agents understand the bundle’s scope before consuming individual files.
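
To make the manifest idea concrete, here is a minimal sketch of reading such a file. The text above only says the manifest declares a name, version, producer, and global tags, so the exact key names ("name", "version", "producer", "tags"), the BundleManifest record, and the reader class are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape for the bundle-level metadata.yaml. The key names
// are assumptions; only the four concepts come from the article text.
public sealed record BundleManifest(
    string Name, string Version, string Producer, IReadOnlyList<string> Tags);

public static class BundleManifestReader
{
    // Parses flat "key: value" lines plus "- item" entries under "tags:".
    // This is deliberately not a general YAML parser.
    public static BundleManifest Parse(string yaml)
    {
        var scalars = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var tags = new List<string>();
        var inTags = false;

        foreach (var raw in yaml.Split('\n'))
        {
            var line = raw.Trim(); // also strips '\r' from CRLF files
            if (line.Length == 0 || line[0] == '#') continue;

            // List items only count while we are directly under "tags:".
            if (inTags && line.StartsWith("- ", StringComparison.Ordinal))
            {
                tags.Add(line.Substring(2).Trim());
                continue;
            }

            var colon = line.IndexOf(':');
            if (colon < 0) continue;

            var key = line.Substring(0, colon).Trim();
            var value = line.Substring(colon + 1).Trim().Trim('"');
            inTags = key.Equals("tags", StringComparison.OrdinalIgnoreCase) && value.Length == 0;
            if (!inTags) scalars[key] = value;
        }

        // Missing keys fall back to empty strings rather than failing.
        return new BundleManifest(
            scalars.GetValueOrDefault("name", string.Empty),
            scalars.GetValueOrDefault("version", string.Empty),
            scalars.GetValueOrDefault("producer", string.Empty),
            tags);
    }
}
```

Quoting the version in the file (for example version: "1.2.0") keeps it a string in stricter YAML parsers, which would otherwise read 1.10 as the number 1.1.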

That’s the entire architecture. No database. No API layer required. Just files, YAML, and markdown that agents can read in a single pass while humans can read and edit in any editor.

Integrating OKF with ASP.NET Core and AI Agents

In practice, you’ll want to serve OKF bundles to agents through an ASP.NET Core API. This keeps your agent code clean and your knowledge sourcing flexible. You can swap backends without changing agent logic.

Here’s a minimal implementation pattern. First, a model to represent an OKF concept:

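// One OKF knowledge item: parsed frontmatter fields plus the markdown body.
public class OkfConcept
{
    // "type" is the only field the spec requires; everything else is optional.
    public string Type { get; set; } = string.Empty;
    public string? Title { get; set; }
    public string? Description { get; set; }
    public string? Resource { get; set; }
    public List<string> Tags { get; set; } = new();
    public DateTime Timestamp { get; set; }
    public string Body { get; set; } = string.Empty;
    // Full frontmatter, including producer-defined fields such as "owner".
    public Dictionary<string, object> Metadata { get; set; } = new();
}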

Next, a service that loads OKF bundles from the filesystem and parses them:

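using System.Globalization; // for culture-invariant timestamp parsing

// Loads OKF concepts from a bundle root on the local filesystem.
public class OkfBundleLoader
{
    private readonly string _bundlePath;

    public OkfBundleLoader(string bundlePath)
    {
        // Normalize once so the containment check below compares like with like.
        _bundlePath = Path.TrimEndingDirectorySeparator(Path.GetFullPath(bundlePath));
    }

    public async Task<OkfConcept> LoadConceptAsync(string conceptPath)
    {
        // Reject paths that escape the bundle root (for example "../secrets").
        var filePath = Path.GetFullPath(Path.Combine(_bundlePath, conceptPath + ".md"));
        if (!filePath.StartsWith(_bundlePath + Path.DirectorySeparatorChar, StringComparison.Ordinal))
            throw new ArgumentException("Concept path escapes the bundle root", nameof(conceptPath));

        var content = await File.ReadAllTextAsync(filePath);

        // Split into at most 3 parts so a "---" rule in the body stays in the body.
        var parts = content.Split(new[] { "---" }, 3, StringSplitOptions.None);
        if (parts.Length < 3)
            throw new InvalidOperationException("Invalid OKF format");

        var metadata = ParseYaml(parts[1]);
        var body = parts[2].TrimStart();

        // "type" is the only required field, so fail early when it is absent.
        var type = metadata.GetValueOrDefault("type")?.ToString();
        if (string.IsNullOrWhiteSpace(type))
            throw new InvalidOperationException("OKF concept is missing the required 'type' field");

        // Parse ISO 8601 timestamps independently of the server culture.
        var hasTimestamp = DateTime.TryParse(
            metadata.GetValueOrDefault("timestamp")?.ToString(),
            CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind, out var timestamp);

        // Optional keys: the dictionary indexer throws on a missing key,
        // so read them with GetValueOrDefault instead.
        return new OkfConcept
        {
            Type = type,
            Title = metadata.GetValueOrDefault("title")?.ToString(),
            Description = metadata.GetValueOrDefault("description")?.ToString(),
            Resource = metadata.GetValueOrDefault("resource")?.ToString(),
            Tags = (metadata.GetValueOrDefault("tags") as List<object>)?.Cast<string>().ToList() ?? new(),
            Timestamp = hasTimestamp ? timestamp : DateTime.UtcNow,
            Body = body,
            Metadata = metadata
        };
    }

    public async Task<List<OkfConcept>> LoadBundleAsync(string bundleName)
    {
        var bundleDir = Path.Combine(_bundlePath, bundleName);
        var concepts = new List<OkfConcept>();

        foreach (var file in Directory.GetFiles(bundleDir, "*.md", SearchOption.AllDirectories))
        {
            var relativePath = Path.GetRelativePath(bundleDir, file);
            // ChangeExtension removes only the trailing ".md", unlike Replace.
            var conceptPath = Path.ChangeExtension(Path.Combine(bundleName, relativePath), null);
            concepts.Add(await LoadConceptAsync(conceptPath));
        }

        return concepts;
    }

    // Minimal parser for flat "key: value" pairs and simple "- item" lists.
    // It is not a full YAML parser; use a library such as YamlDotNet for
    // nested maps, quoting, or multi-line values.
    private Dictionary<string, object> ParseYaml(string yaml)
    {
        var result = new Dictionary<string, object>();
        List<object>? currentList = null;

        foreach (var rawLine in yaml.Split('\n'))
        {
            var line = rawLine.Trim(); // also strips '\r' from CRLF files
            if (line.Length == 0 || line[0] == '#') continue;

            // "- item" lines belong to the most recent key with no inline value.
            if (currentList != null && line.StartsWith("- ", StringComparison.Ordinal))
            {
                currentList.Add(line.Substring(2).Trim());
                continue;
            }

            var colonIndex = line.IndexOf(':');
            if (colonIndex == -1) continue;

            var key = line.Substring(0, colonIndex).Trim();
            var value = line.Substring(colonIndex + 1).Trim();

            if (value.Length == 0)
            {
                // A key without an inline value starts a list (for example "tags:").
                currentList = new List<object>();
                result[key] = currentList;
            }
            else
            {
                currentList = null;
                result[key] = value;
            }
        }

        return result;
    }
}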

Now expose this through a controller:

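[ApiController]
[Route("api/knowledge")]
public class KnowledgeController : ControllerBase
{
    private readonly OkfBundleLoader _loader;

    public KnowledgeController(OkfBundleLoader loader)
    {
        _loader = loader;
    }

    [HttpGet("bundles/{bundleName}")]
    public async Task<ActionResult<List<OkfConcept>>> GetBundle(string bundleName)
    {
        try
        {
            return Ok(await _loader.LoadBundleAsync(bundleName));
        }
        catch (DirectoryNotFoundException)
        {
            // Unknown bundle: report 404 instead of an unhandled 500.
            return NotFound();
        }
    }

    // Catch-all parameter so nested paths such as "schemas/customer" match;
    // a plain {conceptPath} segment would stop at the first slash.
    [HttpGet("concepts/{*conceptPath}")]
    public async Task<ActionResult<OkfConcept>> GetConcept(string conceptPath)
    {
        try
        {
            return Ok(await _loader.LoadConceptAsync(conceptPath));
        }
        catch (Exception ex) when (ex is FileNotFoundException or DirectoryNotFoundException)
        {
            return NotFound();
        }
        catch (ArgumentException)
        {
            // Malformed or out-of-root paths are a client error.
            return BadRequest();
        }
    }
}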

Register the loader as a singleton in your startup code (with minimal hosting, call this on builder.Services in Program.cs):

services.AddSingleton(new OkfBundleLoader("/var/okf-bundles"));

Now your agent can fetch knowledge through a simple HTTP call. The agent doesn’t care that the knowledge is stored as markdown files. The API is the contract.

Using OKF with Microsoft Agent Framework

When you’re building agents with Microsoft Agent Framework, OKF knowledge fits naturally into the context assembly pattern. Here’s a typical flow:

An agent needs to understand your data schemas. Instead of embedding schemas in the system prompt or doing expensive retrieval each time, you fetch the relevant OKF bundle once and inject it into the agent’s context. The agent processes markdown that’s already formatted for readability and parsing.

Your agent’s system prompt might include:

You are a data platform assistant. You help users understand schemas, suggest queries, and debug data issues.

You have access to the following knowledge bundles:
- data-schemas: Core entity definitions
- data-joins: Relationships between entities
- runbooks: Procedures for common tasks

When answering questions, reference specific concepts by their title and type. If a concept has a resource URL, include it in your response.

Then, when a user asks about the customer schema, your agent framework code fetches it:

var concept = await _loader.LoadConceptAsync("schemas/customer");
var contextMessage = $"Reference this schema definition:\n\n{concept.Body}";
// Add to agent context

The agent sees structured, version-controlled knowledge. Because the loader reads from disk on every call, the next agent execution picks up a schema change as soon as the updated files are deployed to the bundle directory. There is no separate cache to invalidate and no platform sync to wait for.
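
When more than one concept is relevant, the same idea extends to assembling a single context message. The sketch below filters concepts by tag and concatenates their bodies under a size budget. The ContextItem record is a simplified stand-in for the concept model, and the ContextAssembler name and the 8000-character default budget are assumptions for illustration, not part of any agent framework API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Simplified stand-in for the concept model; only the fields used here.
public sealed record ContextItem(
    string Type, string Title, IReadOnlyList<string> Tags, string Body);

public static class ContextAssembler
{
    // Builds one context message from the concepts that carry a given tag.
    // Stops at the first section that would exceed the (hypothetical)
    // character budget, so earlier items take priority.
    public static string Build(IEnumerable<ContextItem> items, string tag, int maxChars = 8000)
    {
        var sb = new StringBuilder();
        var matching = items.Where(
            i => i.Tags.Contains(tag, StringComparer.OrdinalIgnoreCase));

        foreach (var item in matching)
        {
            // Title and type go in the heading so the agent can cite them.
            var section = $"### {item.Title} ({item.Type})\n\n{item.Body.Trim()}\n\n";
            if (sb.Length + section.Length > maxChars) break;
            sb.Append(section);
        }

        return sb.ToString().TrimEnd();
    }
}
```

A character budget is a crude proxy for a token limit; if your agent framework exposes a tokenizer, counting tokens there would be more accurate.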

Why OKF Matters for Enterprise AI Infrastructure

The real value emerges at scale. Consider a typical enterprise AI scenario: you’re building multiple agents across different teams. Each team has its own data, APIs, and runbooks. Without OKF, each agent development cycle recreates context from scattered sources.

With OKF, you establish a single source of truth for knowledge. Your data team publishes schemas as OKF. Your platform team publishes APIs as OKF. Your ops team publishes runbooks as OKF. Agents consume these bundles through a standard interface. When knowledge changes, you update one file in Git, and all agents see the new version.

More importantly, OKF survives platform migrations. If you switch from one AI platform to another, your knowledge doesn’t get locked into the old system. It’s just files. Just YAML. Just markdown. You can consume it anywhere.

This addresses a real cost that most organizations don’t track: the redundant AI development that happens because knowledge is trapped. Every time you build a new agent or migrate to a new platform, you’re reassembling context instead of reusing it. OKF eliminates that waste.

Structuring Knowledge for Agent Consumption

Not all markdown is equally agent-friendly. Here are patterns that work well in practice:

Schemas: Use consistent field descriptions. Include type information and constraints. Example: a Customer schema lists fields with types (UUID, STRING, TIMESTAMP), constraints (email must be unique), and relationships (links to Account). Agents parse this to validate data and construct queries.

Runbooks: Structure as sequential steps with clear preconditions and postconditions. Example: a data-refresh runbook lists prerequisites (database credentials, network access), steps (connect, validate, transform, load), and success criteria (row counts match, no errors in logs). Agents can follow or recommend these steps.

Joins: Explicitly state the relationship type and cardinality. Example: customer-to-account is a one-to-many relationship on customer_id. Include join keys and any business rules (only active accounts, only recent transactions). Agents use this to construct queries without guessing.

Incidents: Link to related concepts and include timeline and resolution. Example: a data pipeline incident references the affected schema, the runbook used to resolve it, and the root cause (schema change broke downstream join). Agents learn patterns from incident postmortems.

The key principle: write for both humans and agents. Humans read markdown naturally. Agents parse structure. If you format consistently, both audiences are satisfied.

Practical Patterns

A few patterns emerge from teams using OKF effectively:

Version-controlled data catalogs: Your entire data schema lives in Git. Changes go through pull requests. History is auditable. Agents always see the canonical version.

Agent-designed documentation: Instead of writing documentation then building agents, design documentation for agent consumption first. Humans benefit from the structured format.

Metadata-as-code repositories: Treat your knowledge bundles like code. Use CI/CD to validate schemas. Use linting to ensure consistency. Deploy bundles like you deploy services.

Obsidian vaults as OKF sources: Teams using Obsidian can export vaults as OKF bundles. This bridges personal knowledge management with enterprise AI infrastructure.
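
For the metadata-as-code pattern, a lint step can be very small. The following is a sketch of a check that a CI job might run over a bundle: it reports files with missing or unclosed frontmatter, or without the required type field. The OkfLint class and its error messages are hypothetical, and a real pipeline would likely add domain-specific rules on top.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical CI lint step. Only the "type is required" rule comes from
// the spec description in this article; the rest is an assumption.
public static class OkfLint
{
    // Returns the problems found in one file (empty when the file passes).
    public static IReadOnlyList<string> CheckFile(string path, string content)
    {
        var errors = new List<string>();
        var lines = content.Replace("\r\n", "\n").Split('\n');

        if (lines[0].Trim() != "---")
        {
            errors.Add($"{path}: missing YAML frontmatter");
            return errors;
        }

        // Frontmatter ends at the next line that is exactly "---".
        var end = Array.FindIndex(lines, 1, l => l.Trim() == "---");
        if (end < 0)
        {
            errors.Add($"{path}: frontmatter is not closed");
            return errors;
        }

        // Look for a top-level "type:" key with a non-empty value.
        var hasType = lines.Skip(1).Take(end - 1).Any(l =>
            l.StartsWith("type:", StringComparison.Ordinal) &&
            l.Substring(5).Trim().Length > 0);
        if (!hasType)
            errors.Add($"{path}: required field 'type' is missing");

        return errors;
    }

    // Checks every .md file under a bundle directory.
    public static List<string> CheckBundle(string bundleDir) =>
        Directory.EnumerateFiles(bundleDir, "*.md", SearchOption.AllDirectories)
            .SelectMany(f => CheckFile(Path.GetRelativePath(bundleDir, f), File.ReadAllText(f)))
            .ToList();
}
```

In a CI job, a non-empty result from OkfLint.CheckBundle("data-catalog") would be printed and turned into a non-zero exit code so the pull request fails.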

OKF vs. Traditional Approaches

Knowledge graphs: Knowledge graphs require a database and a query language. OKF needs only a filesystem. For many use cases, OKF is simpler and more portable. Knowledge graphs excel when you need complex traversals at query time. OKF is better when you want to version knowledge and consume it across systems.

Data catalogs: Data catalogs are platforms that add features like governance, lineage, and discovery. OKF is a format that any platform can consume. Use OKF as your canonical format. Feed it into catalogs if you want catalog features.

Wiki systems: Wikis are human-focused. OKF is structured and version-controlled. Wikis are better for exploratory documentation. OKF is better for machine-parseable knowledge that needs to survive migrations.

Documentation generators: Tools like Swagger generate API documentation from code, and libraries like Pydantic derive schemas from it. OKF is the inverse: you write structured knowledge that both humans and machines can read. Use both patterns. Generate OKF from your code where it makes sense. Write OKF for domain knowledge that lives outside code.

Getting Started

Start small. Pick one knowledge domain. Write 5 to 10 OKF concepts. Store them in a Git repository. Build a simple loader like the one above. Integrate it with one agent. See how it feels.

The spec is minimal, so there’s little to learn. The main investment is in thinking about how to structure your knowledge. Once you’ve done that, OKF is just markdown and YAML.

As you scale, you’ll find that version-controlled knowledge becomes infrastructure. Your agents become more consistent. Your knowledge becomes reusable. Your platform migrations become cheaper. Your documentation becomes executable.

That’s the promise of OKF: not a new platform, but a simple format that makes knowledge portable, versionable, and agent-ready. It’s how enterprise AI infrastructure evolves when knowledge becomes the bottleneck.

What is the Open Knowledge Format specification?

OKF is a vendor-neutral specification for structuring knowledge as markdown files with YAML frontmatter in hierarchical directories. It requires only one field (type) and is designed to be minimally opinionated, allowing producers and consumers to work independently without tight coupling or lock-in.

How does OKF differ from a knowledge graph database?

Knowledge graphs require a database and query language. OKF uses only files, YAML, and markdown. OKF is simpler and more portable across systems, making it better for knowledge that needs to survive platform migrations. Knowledge graphs excel when you need complex traversals at query time. Use OKF for versionable, portable knowledge and feed it into graph systems if you need their features.

Can I use OKF with AI agents built on other platforms?

Yes. OKF is a format, not a platform. Because it’s just files and markdown, any system can consume it. You can serve OKF bundles through an API, store them in version control, or embed them in containers. Agents built with any framework can fetch and parse OKF knowledge.

How do I handle knowledge updates across multiple agents?

Store OKF bundles in Git. When you update a concept, commit the change. Agents fetch the latest version from your API or filesystem on each execution. Version control gives you history and rollback capability. If you need multiple versions of the same bundle, use Git branches or semantic versioning in your directory structure.
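
If you go the versioned-directory route, picking the newest bundle takes only a few lines. This sketch assumes a hypothetical layout of one directory per version under the bundle root (for example data-catalog/1.10.0/) with plain numeric names; the BundleVersions class is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical layout: <bundle>/<version>/..., e.g. data-catalog/1.10.0/.
// Directory names are assumed to be plain numeric versions such as "1.2.0".
public static class BundleVersions
{
    // Picks the highest version, comparing numerically so 1.10.0 > 1.9.0
    // (a plain string sort would get this wrong). Names that System.Version
    // cannot parse, such as "latest" or "1.0.0-beta", are ignored.
    // Returns an empty string when no name parses.
    public static string PickLatest(IEnumerable<string> directoryNames)
    {
        return directoryNames
            .Select(name => (Name: name, Parsed: Version.TryParse(name, out var v) ? v : null))
            .Where(x => x.Parsed != null)
            .OrderByDescending(x => x.Parsed)
            .Select(x => x.Name)
            .FirstOrDefault() ?? string.Empty;
    }
}
```

You could feed it the names from Directory.GetDirectories(bundleRoot) after applying Path.GetFileName to each path.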

What happens to my OKF knowledge if I switch AI platforms?

Your knowledge stays intact. OKF is just markdown and YAML files. There’s no proprietary format or vendor lock-in. You can consume the same bundles with a new platform’s agents. This is the core value proposition: knowledge portability without translation layers.
