Data sources

Data sources provide items and layouts. A Nanoc site has one or more data sources. By default, a site uses the filesystem data source.

The site configuration contains the configuration for each data source in use. For example, this is the data source configuration for the Nanoc website:

data_sources:
  -
    type: filesystem
  -
    type: cli
    items_root: /doc/reference/commands/

The data source configuration is a list of hashes. Each hash can have the following keys:

type: The identifier of the data source to use
items_root: The root where items should be mounted (optional)
layouts_root: The root where layouts should be mounted (optional)

The items_root and layouts_root values will be prefixed to the identifiers of any items and layouts, respectively, obtained from this data source. For example, a data source might provide an item with identifier /denis.md, which, when the items_root is set to /people, will become /people/denis.md.
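The prefixing rule can be sketched in plain Ruby. The helper name mount_identifier is hypothetical, and dropping trailing slashes from the root is an assumption made so that a root such as /doc/reference/commands/ does not produce a double slash. This is an illustration of the described behavior, not Nanoc's own implementation.

```ruby
# Hypothetical helper illustrating how a root is prefixed to an identifier.
# Assumption: trailing slashes on the root are dropped before joining.
def mount_identifier(root, identifier)
  # Without a root, the identifier is left untouched.
  return identifier if root.nil? || root.empty?

  root.sub(%r{/+\z}, '') + identifier
end

puts mount_identifier('/people', '/denis.md') # /people/denis.md
puts mount_identifier(nil, '/denis.md')       # /denis.md
```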

Note that individual data sources can have more configuration options.

The filesystem data source

Nanoc comes with a filesystem data source implementation, which, as the name suggests, loads items and layouts from the filesystem. More specifically, it loads items from the content/ directory, and layouts from the layouts/ directory.

% tree content
content
├── 404.html
├── about.md
├── assets
│   └── style
│       ├── print.scss
│       └── screen.scss
├── contributing.md
└── index.md

% tree layouts
layouts
├── article.erb
├── default.erb
└── home.erb

The attributes for an item or a layout are typically stored in the metadata section or frontmatter, inside the file itself. The metadata is contained within a pair of triple dashes, like this:

---
full_title: "Nanoc: a static-site generator written in Ruby > home"
short_title: "Home"
has_raw_layout: true
---

Main content goes here…
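To make the file layout concrete, here is a minimal sketch of how such a file could be split into attributes and content using Ruby's standard YAML library. The helper name split_frontmatter is hypothetical, and the sketch assumes the metadata section starts on the first line of the file; Nanoc's own parser may handle more cases.

```ruby
require 'yaml'

# Hypothetical helper: split a file's text into [attributes, content].
# Assumption: the metadata section starts on the first line and is YAML.
def split_frontmatter(text)
  match = text.match(/\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\z/m)
  # No metadata section found: everything is content.
  return [{}, text] unless match

  attributes = YAML.safe_load(match[1]) || {}
  [attributes, match[2]]
end

source = <<~TEXT
  ---
  short_title: "Home"
  has_raw_layout: true
  ---

  Main content goes here
TEXT

attributes, content = split_frontmatter(source)
puts attributes['short_title']    # Home
puts attributes['has_raw_layout'] # true
puts content.strip                # Main content goes here
```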

Attributes can also be stored in a separate file (the “meta file”) with the same base name but with the .yaml extension. This is necessary for binary items. For example, the following two files correspond to a single item; the metadata is stored in the YAML file:

% tree content/assets/images
images
├── dataflow.png
└── dataflow.yaml

The identifiers of items and layouts are obtained by taking the filename and stripping off everything up to and including the content or layouts directory, respectively. For example, the /Users/denis/stoneship/content/about.md file will have the identifier /about.md.
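The derivation of an identifier from a filename can be sketched with Ruby's Pathname. The helper name identifier_for is hypothetical, and the sketch assumes both paths are absolute and use forward slashes.

```ruby
require 'pathname'

# Hypothetical helper: derive an identifier from an absolute filename by
# dropping everything up to and including the content (or layouts) directory.
def identifier_for(filename, root_dir)
  relative = Pathname.new(filename).relative_path_from(Pathname.new(root_dir))
  "/#{relative}"
end

puts identifier_for('/Users/denis/stoneship/content/about.md',
                    '/Users/denis/stoneship/content') # /about.md
```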

The filesystem data source adds the following attributes to all items:

:filename
:content_filename: The content filename. For example, given the files content/about.md and content/about.yaml, the content filename will be the absolute path to content/about.md. Can be nil.
:meta_filename: The filename of the meta file, containing the item’s attributes. For example, given the files content/about.md and content/about.yaml, the meta filename will be the absolute path to content/about.yaml. Can be nil.
:extension: The extension of the content filename, if any. Can be nil.
:mtime: The modification time of the content filename or the meta filename, whichever is the most recent.
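The "whichever is the most recent" rule for :mtime can be illustrated with a short sketch. The helper name newest_mtime is hypothetical; it skips a filename that is nil, since either the content filename or the meta filename may be absent.

```ruby
require 'tmpdir'

# Hypothetical helper: the most recent modification time among the
# content file and the meta file, ignoring whichever one is nil.
def newest_mtime(content_filename, meta_filename)
  [content_filename, meta_filename].compact.map { |f| File.mtime(f) }.max
end

Dir.mktmpdir do |dir|
  content = File.join(dir, 'about.md')
  meta = File.join(dir, 'about.yaml')
  File.write(content, "Hello\n")
  File.write(meta, "title: About\n")

  # Make the meta file one minute newer than the content file.
  older = Time.now - 120
  newer = Time.now - 60
  File.utime(older, older, content)
  File.utime(newer, newer, meta)

  puts newest_mtime(content, meta) == File.mtime(meta)   # true
  puts newest_mtime(content, nil) == File.mtime(content) # true
end
```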

Configuration

content_dir: content
layouts_dir: layouts
encoding: utf-8
extra_files:
  - "**/.htaccess"

The content_dir option contains the path to the directory where the content is stored. By default, it is content.

The layouts_dir option contains the path to the directory where the layouts are stored. By default, it is layouts.

The encoding option sets the encoding used for reading files. It should be a value understood by Ruby’s Encoding. If no encoding is set in the configuration, one will be inferred from the environment.
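One way to check whether a name is understood by Ruby's Encoding is to look it up with Encoding.find, which raises ArgumentError for unknown names. The helper name known_encoding? is hypothetical, and treating this lookup as a proxy for what Nanoc accepts is an assumption.

```ruby
# Hypothetical helper: true if Ruby recognizes the encoding name.
# Encoding.find raises ArgumentError for names it does not know.
def known_encoding?(name)
  Encoding.find(name)
  true
rescue ArgumentError
  false
end

puts known_encoding?('utf-8')               # true
puts known_encoding?('iso-8859-1')          # true
puts known_encoding?('not-a-real-encoding') # false
```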

The extra_files option contains filename glob patterns and tells Nanoc to load files that would by default be ignored. For example, even though the filesystem data source ignores hidden files by default, the sample configuration above will make sure that .htaccess files will be loaded.
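The effect of such a pattern can be illustrated with Ruby's own Dir.glob, where a wildcard does not match hidden files unless the pattern names the leading dot explicitly. This is only an analogy under the assumption that Nanoc's patterns behave like Ruby globs; the helper name files_matching is hypothetical and this is not Nanoc's loader.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: files (not directories) under dir matching a glob.
def files_matching(dir, pattern)
  Dir.glob(pattern, base: dir).select { |f| File.file?(File.join(dir, f)) }
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'assets'))
  File.write(File.join(dir, 'index.md'), "Home\n")
  File.write(File.join(dir, 'assets', '.htaccess'), "Options -Indexes\n")

  # A plain wildcard skips hidden files...
  puts files_matching(dir, '**/*').include?('assets/.htaccess')         # false
  # ...while a pattern that spells out the dot picks them up.
  puts files_matching(dir, '**/.htaccess').include?('assets/.htaccess') # true
end
```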

Writing data sources

Here is an example data source implementation that provides a single item with the identifier /release-notes.md and containing the content of the NEWS.md file of Nanoc:

class ReleaseNotesDataSource < Nanoc::DataSource
  identifier :release_notes

  def items
    gem_path = Bundler.rubygems.find_name('nanoc').first.full_gem_path
    content = File.read("#{gem_path}/NEWS.md")

    item = new_item(
      content,
      { title: 'Release notes' },
      Nanoc::Identifier.new('/release-notes.md'),
    )

    [item]
  end
end

Each data source has an identifier. This identifier is used in the configuration file to specify which data source should be used to fetch data. In the example above, the identifier is release_notes.

The #items and #layouts methods return a collection of items and layouts, respectively. To instantiate items and layouts, use #new_item and #new_layout, respectively. These methods take three arguments:

content_or_filename: The content of the item or layout, or a filename if the item is a binary item.
attributes: A hash with attributes of this item or layout.
identifier: A Nanoc::Identifier instance, or a string that will be converted to a full identifier. See the Identifiers and patterns page for details.

You can pass the following options to #new_item and #new_layout:

:binary (defaults to false): A boolean that indicates whether or not this item is a binary item. (Only applies to items; not applicable to layouts.)
:checksum_data (defaults to nil): Data to be used for generating a checksum. This can be any Ruby object, though it will typically be a String. If unspecified, the checksum will be generated from the content and attributes. Passing in custom checksum data can lead to a speed-up, provided that the calculation of the checksum data is not slow.

The configuration for this data source is available in @config.

If a data source needs to do work before data becomes available, such as establishing a connection, it can do so in the #up method. The #down method can be used to undo the work, such as tearing down the connection. Here is an example data source that reads from an SQLite database:

class HRDataSource < ::Nanoc::DataSource
  identifier :hr

  def up
    @db = Sequel.sqlite('employees.db')
  end

  def down
    @db.disconnect
  end

  def items
    @db[:employees].map do |row|
      new_item(
        row[:bio],
        row,
        "/employees/#{row[:id]}"
      )
    end
  end
end

See the Using external sources page for more information on using external data sources such as databases.