xml4r
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
Lines changed: 66 additions & 0 deletions
diff --git a/docs/architecture/memory.md b/docs/architecture/memory.md
Lines changed: 88 additions & 0 deletions
diff --git a/docs/architecture/registry.md b/docs/architecture/registry.md
Lines changed: 75 additions & 0 deletions
diff --git a/docs/getting_started.md b/docs/getting_started.md
Lines changed: 88 additions & 0 deletions
diff --git a/docs/index.md b/docs/index.md
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,66 @@
+name: Deploy Docs
+
+on:
+  push:
+    branches: [master]
+    paths:
+      - 'docs/**'
+      - 'lib/**'
+      - 'ext/**'
+      - 'zensical.toml'
+      - '.github/workflows/docs.yml'
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '4.0'
+          bundler-cache: false
+
+      - name: Install zensical
+        run: pip install zensical
+
+      - name: Copy changelog to docs
+        run: cp CHANGELOG.md docs/changelog.md
+
+      - name: Build guide docs
+        run: zensical build --clean
+
+      - name: Build API reference
+        run: rdoc --format aliki --output site/reference --title 'LibXML Ruby API' --line-numbers --charset=utf-8 --exclude lib/xml.rb --exclude lib/xml/libxml.rb --main README.md ext/**/libxml.c ext/**/ruby_xml.c ext/**/*.c lib/**/*.rb README.md
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: site
+
+  deploy:
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
@@ -0,0 +1,88 @@
+# Memory Management
+
+libxml-ruby automatically manages memory for the underlying libxml2 C library. This page explains the ownership model and how the bindings keep Ruby objects and libxml2 C structures in sync.
+
+## Ownership Model
+
+libxml2 has a simple ownership rule: an `xmlDocPtr` owns the tree attached to it, and `xmlFreeDoc` frees the document and the entire attached tree. When code unlinks a node with `xmlUnlinkNode`, that detached subtree is no longer document-owned and must either be reattached or freed with `xmlFreeNode`.
+
+libxml-ruby sits on top of that model. In the normal case, the document is the owner. Ruby node and attr objects do not own the libxml node or attr they point at. They are references into libxml-owned memory, and their mark functions keep the owning Ruby document alive while Ruby still has live references into the tree.
+
+In the diagram below:
+
+- solid lines mean `owns`
+- blue dashed lines mean `references` a libxml C object
+- red dashed lines mean `mark`, which is a Ruby-to-Ruby GC reference
+
+```mermaid
+flowchart TB
+  DocWrap["Ruby XML::Document"]
+  XDoc["xmlDocPtr"]
+  NodeWrap["Ruby XML::Node"]
+  XNode["xmlNodePtr"]
+  AttrWrap["Ruby XML::Attr"]
+  XAttr["xmlAttrPtr"]
+
+  DocWrap -->|owns| XDoc
+  XDoc -->|owns| XNode
+  XNode -->|owns| XAttr
+
+  NodeWrap -.references.-> XNode
+  AttrWrap -.references.-> XAttr
+  NodeWrap -.mark.-> DocWrap
+  AttrWrap -.mark.-> DocWrap
+  DocWrap ~~~ NodeWrap
+  DocWrap ~~~ AttrWrap
+  NodeWrap ~~~ AttrWrap
+
+  classDef ruby fill:#f4a0a0,stroke:#8b1f1b,stroke-width:2px;
+  classDef xml fill:#e8f1ff,stroke:#5b84c4,stroke-width:2px;
+  class DocWrap,NodeWrap,AttrWrap ruby;
+  class XDoc,XNode,XAttr xml;
+  linkStyle 3,4 stroke:#5b84c4,stroke-width:2px,stroke-dasharray: 6 4;
+  linkStyle 5,6 stroke:#cc342d,stroke-width:2px,stroke-dasharray: 6 4;
+```
+
+The solid ownership chain is the important part. `XML::Document` owns the `xmlDocPtr`. The `xmlDocPtr` owns the tree, and the `xmlNodePtr` owns its attrs. The dashed lines are references, not ownership. The blue dashed edges mean Ruby objects reference libxml objects. The red dashed `mark` edges mean a live Ruby node or attr keeps the Ruby document alive during GC so the underlying tree is not freed while Ruby still references it.
+
+## Detached Root Nodes
+
+[Detached nodes](../xml/nodes.md#detached-nodes) are the one exception to the document-owns-everything model. A newly created node is Ruby-owned until it is inserted into a document tree. Removing a node transfers ownership back to Ruby.
+
+Internally, this is managed by `rxml_node_manage` (Ruby takes ownership), `rxml_node_unmanage` (libxml takes ownership), and `rxml_node_free` (frees a detached node on GC).
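
The ownership hand-off described above can be pictured with a small plain-Ruby state machine. This is only a sketch: `OwnedNode`, `attach!`, and `freed_by_wrapper?` are hypothetical names chosen for illustration, and the comments merely point at the C functions that play the corresponding role in the bindings.

```ruby
# Toy model of node ownership. OwnedNode and its methods are hypothetical;
# they mirror the roles described above, not the bindings' C code.
class OwnedNode
  attr_reader :owner

  def initialize
    @owner = :ruby      # newly created node: Ruby-owned (cf. rxml_node_manage)
  end

  def attach!
    @owner = :libxml    # inserted into a tree (cf. rxml_node_unmanage)
    self
  end

  def remove!
    @owner = :ruby      # detached again: ownership returns to Ruby
    self
  end

  # Only a Ruby-owned (detached) node would be freed by its own wrapper
  # (cf. rxml_node_free); an attached node is freed with its document.
  def freed_by_wrapper?
    owner == :ruby
  end
end

node = OwnedNode.new
puts node.owner               # ruby
node.attach!
puts node.freed_by_wrapper?   # false
node.remove!
puts node.freed_by_wrapper?   # true
```

The point of the model is that exactly one side is responsible for freeing a node at any time, which is what prevents double frees and leaks.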
+
+## Object Identity
+
+Because wrappers for document-owned nodes are temporary and created on demand, accessing the same node twice may return different Ruby objects:
+
+```ruby
+child1 = node.children[0]
+child2 = node.children[0]
+
+child1 == child2     # => true  (same underlying node)
+child1.equal?(child2) # => false (different Ruby objects)
+```
+
+Use `==` or `eql?` to compare nodes, not `equal?`.
+
+Documents and detached root nodes do maintain identity through the [registry](registry.md) — retrieving the same document or detached root always returns the same Ruby object.
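
To see why `==` succeeds where `equal?` fails, here is a minimal plain-Ruby model that needs no libxml-ruby. `NodeWrapper` and `ptr` are hypothetical names, and the integer stands in for a C pointer. Whether the real wrappers define `hash` this way is an assumption made for the sketch.

```ruby
# Toy model: each wrapper is a fresh Ruby object around the same "pointer".
# NodeWrapper and ptr are hypothetical names, not libxml-ruby internals.
class NodeWrapper
  attr_reader :ptr

  def initialize(ptr)
    @ptr = ptr
  end

  # Value equality: two wrappers are equal when they wrap the same pointer.
  def ==(other)
    other.is_a?(NodeWrapper) && ptr == other.ptr
  end
  alias eql? ==

  # Keep hash consistent with eql? so wrappers behave in Hash keys and uniq.
  def hash
    ptr.hash
  end
end

child1 = NodeWrapper.new(0x7f00_1000)
child2 = NodeWrapper.new(0x7f00_1000)

puts child1 == child2            # true  (same underlying pointer)
puts child1.equal?(child2)       # false (two distinct Ruby objects)
puts [child1, child2].uniq.size  # 1
```

`equal?` compares Ruby object identity, so it can only be true when both sides are the same wrapper instance.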
+
+## Preventing Premature Collection
+
+Keep a reference to the document (or a managed root node) as long as you use any of its nodes:
+
+```ruby
+# Safe - doc stays in scope
+doc = XML::Parser.file('data.xml').parse
+nodes = doc.find('//item')
+nodes.each { |n| process(n) }
+
+# Risky - doc may be collected
+nodes = XML::Parser.file('data.xml').parse.find('//item')
+GC.start  # doc could be freed here
+nodes.first.name  # potential crash
+```
+
+## GC Sweep Order
+
+During garbage collection (or at program exit), Ruby does not guarantee the order in which objects are freed, so the document object may be freed before its child node wrappers. This is safe because child node wrappers are non-owning: they have no free function. The document's free function calls `xmlFreeDoc`, which recursively frees the entire tree. The child wrappers simply become stale and are collected without action.
@@ -0,0 +1,75 @@
+# Pointer Registry
+
+The bindings need to map libxml2 C pointers back to their Ruby wrapper objects. This mapping serves two purposes:
+
+1. **Object identity** - returning the same Ruby object when the same C pointer is encountered again (documents and detached root nodes)
+2. **GC reachability** - mark functions look up the owning Ruby document to keep it alive while Ruby references exist into the tree
+
+## Design
+
+The registry is a pointer-keyed `st_table` in `ruby_xml_registry.c` with three operations:
+
+```c
+void  rxml_registry_register(void *ptr, VALUE obj);
+void  rxml_registry_unregister(void *ptr);
+VALUE rxml_registry_lookup(void *ptr);   /* Qnil on miss */
+```
+
+The registry is **not** a GC root. It does not keep objects alive. Objects stay alive through the normal mark chains — mark functions look up the registry instead of holding direct references.
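
As a rough Ruby analogue of the three operations, the sketch below keys a table by an integer standing in for a C pointer and stores weak references, so the table alone never keeps a wrapper alive. `PointerRegistry` is a hypothetical class; the real registry is the C `st_table` described above, and its use of `WeakRef` here is an assumption made to mimic the "not a GC root" property.

```ruby
require 'weakref'

# Hypothetical Ruby analogue of the C registry: maps an integer "address"
# to a wrapper object without keeping that wrapper alive.
class PointerRegistry
  def initialize
    @table = {}
  end

  def register(ptr, obj)
    @table[ptr] = WeakRef.new(obj)
  end

  def unregister(ptr)
    @table.delete(ptr)
    nil
  end

  # Returns the wrapper, or nil on a miss or if it was already collected.
  def lookup(ptr)
    ref = @table[ptr]
    return nil unless ref&.weakref_alive?

    ref.__getobj__
  rescue WeakRef::RefError
    nil
  end
end

registry = PointerRegistry.new
doc = Object.new                          # stands in for an XML::Document wrapper
registry.register(0x1000, doc)

puts registry.lookup(0x1000).equal?(doc)  # true: same Ruby object comes back
registry.unregister(0x1000)
puts registry.lookup(0x1000).inspect      # nil
```

Returning `nil` on a miss plays the role of `Qnil` in the C signature.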
+
+## What Gets Registered
+
+Only objects that own their underlying C structure are registered:
+
+| C pointer | Ruby wrapper | Registered when |
+|-----------|-------------|-----------------|
+| `xmlDocPtr` | `XML::Document` | Document is created or parsed |
+| detached root `xmlNodePtr` | `XML::Node` | Node is created or detached via `remove!` |
+
+Document-owned child nodes are **not** registered. They are lightweight, non-owning wrappers that get fresh Ruby objects each time they are accessed.
+
+## How Mark Functions Use It
+
+When Ruby's GC runs the mark phase, node and attr mark functions look up the owning document through the registry:
+
+```mermaid
+flowchart TD
+  Registry["internal registry"]
+  DocWrap["Ruby XML::Document"]
+  XDoc["xmlDocPtr"]
+  DetachedWrap["Detached Ruby XML::Node"]
+  DetachedNode["detached root xmlNodePtr"]
+  ChildWrap["Ruby XML::Node"]
+  ChildNode["document-owned xmlNodePtr"]
+
+  DocWrap -->|owns| XDoc
+  XDoc -->|owns| ChildNode
+  DetachedWrap -->|owns| DetachedNode
+
+  ChildWrap -.references.-> ChildNode
+  ChildWrap -.mark.-> DocWrap
+
+  XDoc -.references.-> Registry
+  DetachedNode -.references.-> Registry
+  Registry -.references.-> DocWrap
+  Registry -.references.-> DetachedWrap
+
+  classDef ruby fill:#f4a0a0,stroke:#8b1f1b,stroke-width:2px;
+  classDef xml fill:#e8f1ff,stroke:#5b84c4,stroke-width:2px;
+  classDef registry fill:#f5ebcf,stroke:#b89632,stroke-width:2px;
+  class DocWrap,DetachedWrap,ChildWrap ruby;
+  class XDoc,DetachedNode,ChildNode xml;
+  class Registry registry;
+  linkStyle 3,5,6 stroke:#5b84c4,stroke-width:2px,stroke-dasharray: 6 4;
+  linkStyle 4,7,8 stroke:#cc342d,stroke-width:2px,stroke-dasharray: 6 4;
+```
+
+For an attached node, the mark function reads `xnode->doc` (maintained by libxml2), looks up the document in the registry, and marks the Ruby document object. For a detached subtree, it walks to the root via parent pointers, looks up the root in the registry, and marks it.
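
The two lookup paths can be sketched in plain Ruby. `ToyNode` and `owner_key` are hypothetical names. The sketch also simplifies by treating a `nil` `doc` as "detached", which is an assumption of the model and not a statement about how libxml2 maintains `xnode->doc`.

```ruby
# Toy stand-ins for libxml2 structures. The fields echo xnode->doc and
# xnode->parent, but this Struct is hypothetical, not the bindings' code.
ToyNode = Struct.new(:name, :doc, :parent)

# Returns the key a mark function would look up in the registry:
# the document for an attached node, else the root of the detached subtree.
def owner_key(node)
  return node.doc if node.doc

  root = node
  root = root.parent while root.parent
  root
end

doc = :doc_ptr
attached = ToyNode.new('item', doc, nil)

detached_root = ToyNode.new('root', nil, nil)
detached_mid  = ToyNode.new('mid', nil, detached_root)
detached_leaf = ToyNode.new('leaf', nil, detached_mid)

# Pointer-keyed table: compare keys by identity, as addresses would be.
registry = {}.compare_by_identity
registry[doc] = 'Ruby XML::Document'
registry[detached_root] = 'Detached Ruby XML::Node'

puts registry[owner_key(attached)]       # Ruby XML::Document
puts registry[owner_key(detached_leaf)]  # Detached Ruby XML::Node
```

Either way the mark function ends up marking exactly one registered owner, which is why only documents and detached roots need registry entries.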
+
+## Lifecycle
+
+Registered pointers must be unregistered before the underlying C structure is freed:
+
+- `rxml_document_free` unregisters the `xmlDocPtr` before calling `xmlFreeDoc`
+- `rxml_node_free` unregisters the detached root before calling `xmlFreeNode`
+- `rxml_node_unmanage` unregisters when a detached node is attached to a document (libxml takes ownership)
@@ -0,0 +1,88 @@
+# Getting Started
+
+## Requiring the Library
+
+There are several ways to load libxml-ruby:
+
+```ruby
+# Recommended - keeps everything under the LibXML namespace
+require 'libxml-ruby'
+document = LibXML::XML::Document.new
+```
+
+```ruby
+# Convenience - mixes LibXML into the global namespace
+require 'xml'
+document = XML::Document.new
+```
+
+```ruby
+# In your own namespace
+require 'libxml-ruby'
+
+module MyApp
+  include LibXML
+
+  class Processor
+    def parse(file)
+      XML::Document.file(file)
+    end
+  end
+end
+```
+
+## Choosing a Parser
+
+libxml-ruby provides four parsers, each suited to different use cases:
+
+| Parser | Best For |
+|--------|----------|
+| `XML::Parser` | General-purpose DOM parsing. Loads the entire document into a tree. |
+| `XML::Reader` | Large documents that don't fit in memory. Pull-based streaming API. |
+| `XML::HTMLParser` | Parsing HTML documents (tolerates malformed markup). |
+| `XML::SaxParser` | Event-driven parsing with callbacks. |
+
+## Data Sources
+
+All parsers support multiple data sources:
+
+```ruby
+# From a file
+doc = XML::Parser.file('data.xml').parse
+
+# From a string
+doc = XML::Parser.string('<root/>').parse
+
+# From an IO object
+File.open('data.xml') do |io|
+  doc = XML::Parser.io(io).parse
+end
+```
+
+## A Complete Example
+
+```ruby
+require 'libxml-ruby'
+
+# Parse
+doc = LibXML::XML::Document.file('books.xml')
+
+# Navigate
+root = doc.root
+puts root.name
+
+# Find nodes with XPath
+doc.find('//book[@year > 2000]').each do |book|
+  title = book.find_first('title').content
+  puts title
+end
+
+# Create new content
+new_book = LibXML::XML::Node.new('book')
+new_book['year'] = '2024'
+new_book << LibXML::XML::Node.new('title', 'New Book')
+root << new_book
+
+# Save
+doc.save('books_updated.xml', indent: true)
+```
@@ -0,0 +1,35 @@
+# libxml-ruby
+
+Ruby language bindings for the [GNOME Libxml2](http://xmlsoft.org/) XML toolkit. It is free software, released under the MIT License.
+
+libxml-ruby stands out because of:
+
+* **Speed** - Much faster than REXML
+* **Features** - Full DOM, SAX, Reader, Writer, XPath, validation (DTD, RelaxNG, XML Schema) and more
+* **Conformance** - Passes all 1800+ tests from the OASIS XML Tests Suite
+
+## Quick Example
+
+```ruby
+require 'libxml-ruby'
+
+# Parse a document
+doc = LibXML::XML::Document.file('books.xml')
+
+# Find nodes with XPath
+doc.find('//book').each do |node|
+  puts node['title']
+end
+
+# Validate against a schema
+schema = LibXML::XML::Schema.new('books.xsd')
+doc.validate_schema(schema)
+```
+
+## Requirements
+
+libxml-ruby requires Ruby 3.2 or higher and depends on [libxml2](http://xmlsoft.org/).
+
+## License
+
+libxml-ruby is released under the [MIT License](https://github.com/xml4r/libxml-ruby/blob/master/LICENSE).