Identifiers and patterns

In Nanoc, every item (page or asset) and every layout has a unique identifier: a string derived from the file’s path. A pattern is an expression that is used to select items or layouts based on their identifier.

Identifiers

Identifiers come in two types: the full type, new in Nanoc 4, and the legacy type, used in Nanoc 3.

full
An identifier with the full type is the filename, with the path to the content directory removed. For example, the file /Users/denis/stoneship/content/about.md will have the full identifier /about.md.
legacy
An identifier with the legacy type is the filename, with the path to the content directory removed, the extension removed, and a slash appended. For example, the file /Users/denis/stoneship/content/about.md will have the legacy identifier /about/. This corresponds closely with paths in clean URLs.
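
The two derivations above can be sketched in plain Ruby. This is a simplified illustration rather than Nanoc's own code: the helper names full_identifier and legacy_identifier are hypothetical, and the sketch only handles a single extension.

```ruby
# Hypothetical helpers for illustration; not part of Nanoc's API.

# Full identifier: the filename with the content directory path removed.
def full_identifier(filename, content_dir)
  filename.delete_prefix(content_dir)
end

# Legacy identifier: additionally drop the last extension and append a slash.
def legacy_identifier(filename, content_dir)
  relative = filename.delete_prefix(content_dir)
  relative.delete_suffix(File.extname(relative)) + '/'
end

content_dir = '/Users/denis/stoneship/content'
filename    = '/Users/denis/stoneship/content/about.md'

puts full_identifier(filename, content_dir)   # prints /about.md
puts legacy_identifier(filename, content_dir) # prints /about/
```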

The following methods are useful for full identifiers:

identifier.ext → String

The last extension of this identifier. For example:

Nanoc::Identifier.new('/about.md').ext
# => "md"

Nanoc::Identifier.new('/about.html.erb').ext
# => "erb"

identifier.exts → Array of Strings

All extensions of this identifier. For example:

Nanoc::Identifier.new('/about.html.erb').exts
# => ["html", "erb"]

identifier.components → Array of Strings

Identifier split by slash. For example:

Nanoc::Identifier.new('/software/nanoc.md').components
# => ["software", "nanoc.md"]

identifier.match?(pattern) → true, false

True if the identifier matches the pattern (either a String or a Regexp), false otherwise. For example:

Nanoc::Identifier.new('/software/nanoc.md').match?('/software/*')
# => true

Nanoc::Identifier.new('/software/nanoc.md').match?('/soft*')
# => false

identifier.without_ext → String

Identifier with the last extension removed. For example:

Nanoc::Identifier.new('/software/nanoc.md').without_ext
# => "/software/nanoc"

Nanoc::Identifier.new('/about.html.erb').without_ext
# => "/about.html"

identifier.without_exts → String

Identifier with all extensions removed. For example:

Nanoc::Identifier.new('/about.html.erb').without_exts
# => "/about"

identifier + string → String

Identifier with the given string appended. For example:

Nanoc::Identifier.new('/software') + '/nanoc'
# => "/software/nanoc"

identifier =~ pat

Truthy if the identifier matches the pattern (either a String or a Regexp), falsy otherwise. For example:

Nanoc::Identifier.new('/software/nanoc.md') =~ '/software/*'
# => 0

The following method is useful for legacy identifiers:

identifier.chopString

Identifier with the last character removed. For example:

identifier = Nanoc::Identifier.new('/about/', type: :legacy)

identifier.to_s
# => "/about/"

identifier.chop
# => "/about"

identifier.chop + '.html'
# => "/about.html"

identifier + 'index.html'
# => "/about/index.html"

Patterns

Patterns are used to find items and layouts based on their identifier. They come in three varieties:

  • glob patterns
  • regular expression patterns
  • legacy patterns

Glob patterns

Glob patterns are strings that contain wildcard characters. A wildcard character stands in for other characters in an identifier. An example of a glob pattern is /projects/*.md, which matches all files with an md extension in the /projects directory.

Globs are commonplace in Unix-like environments. For example, the Unix command for listing all files with the md extension in the current directory is ls *.md. In this example, the argument to the ls command is a glob pattern that contains the * wildcard.

Nanoc supports the following wildcards in glob patterns:

*
Matches any file or directory name. Does not cross directory boundaries. For example, /projects/*.md matches /projects/nanoc.md, but not /projects/cri.adoc nor /projects/nanoc/about.md.
**/
Matches zero or more levels of nested directories. For example, /projects/**/*.md matches both /projects/nanoc.md and /projects/nanoc/history.md.
?
Matches a single character.
[abc]
Matches any single character in the set. For example, /people/[kt]im.md matches only /people/kim.md and /people/tim.md.
{foo,bar}
Matches either string in the comma-separated list. More than two strings are possible. For example, /c{at,ub,ount}s.txt matches /cats.txt, /cubs.txt, and /counts.txt, but not /cabs.txt.

A glob pattern that matches every item or layout is /**/*. A glob pattern that matches every item or layout with the extension md is /**/*.md.
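
These wildcard rules closely resemble those of Ruby's built-in File.fnmatch when it is called with the FNM_PATHNAME and FNM_EXTGLOB flags, so the examples above can be tried without Nanoc installed. The sketch below assumes that resemblance. The helper name glob_match? is hypothetical, and the code is an approximation for experimenting, not a description of Nanoc's internals.

```ruby
# Hypothetical helper; approximates glob matching on an identifier string.
def glob_match?(pattern, identifier)
  # FNM_PATHNAME: * and ? do not match a slash; **/ spans directory levels.
  # FNM_EXTGLOB:  enables {foo,bar} alternatives.
  flags = File::FNM_PATHNAME | File::FNM_EXTGLOB
  File.fnmatch(pattern, identifier, flags)
end

p glob_match?('/projects/*.md', '/projects/nanoc.md')            # true
p glob_match?('/projects/*.md', '/projects/cri.adoc')            # false
p glob_match?('/projects/*.md', '/projects/nanoc/about.md')      # false
p glob_match?('/projects/**/*.md', '/projects/nanoc.md')         # true
p glob_match?('/projects/**/*.md', '/projects/nanoc/history.md') # true
p glob_match?('/people/[kt]im.md', '/people/kim.md')             # true
p glob_match?('/c{at,ub,ount}s.txt', '/cubs.txt')                # true
p glob_match?('/c{at,ub,ount}s.txt', '/cabs.txt')                # false
```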

Regular expression patterns

You can use a regular expression to select items and layouts.

For matching identifiers, the %r{…} syntax is (arguably) nicer than the /…/ syntax. The latter is not a good fit for identifiers (or filenames), because all slashes need to be escaped. The \A and \z anchors are also useful to make sure the entire identifier is matched.

An example of a regular expression pattern is %r{\A/projects/(cri|nanoc)\.md\z}, which matches both /projects/nanoc.md and /projects/cri.md.
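
The effect of the anchors can be seen with plain Ruby strings standing in for identifiers. This is a minimal sketch; the /archive/… identifier is made up for illustration.

```ruby
anchored   = %r{\A/projects/(cri|nanoc)\.md\z}
unanchored = %r{/projects/(cri|nanoc)\.md}

p anchored.match?('/projects/nanoc.md')           # true
p anchored.match?('/projects/cri.md')             # true

# Without \A and \z, the pattern also matches in the middle of an identifier.
p anchored.match?('/archive/projects/nanoc.md')   # false
p unanchored.match?('/archive/projects/nanoc.md') # true

# The same anchored pattern in /…/ syntax needs every slash escaped.
escaped = /\A\/projects\/(cri|nanoc)\.md\z/
p escaped.match?('/projects/nanoc.md')            # true
```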

Legacy patterns

Legacy patterns are strings that contain wildcard characters. The wildcard characters behave differently than the glob wildcard characters.

To enable legacy patterns, set string_pattern_type to "legacy" in the configuration. For example:

string_pattern_type: "legacy"

For legacy patterns, Nanoc supports the following wildcards:

*
Matches zero or more characters, including a slash. For example, /projects/*/ matches /projects/nanoc/ and /projects/nanoc/about/, but not /projects/.
+
Matches one or more characters, including a slash. For example, /projects/+ matches /projects/nanoc/ and /projects/nanoc/about/, but not /projects/.
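
One way to reason about these two wildcards is to translate a legacy pattern into an anchored regular expression. The sketch below does that in plain Ruby. The helper name legacy_pattern_to_regexp is hypothetical, and the translation illustrates the rules above; it is not necessarily how Nanoc implements them.

```ruby
# Hypothetical helper; not part of Nanoc's API.
def legacy_pattern_to_regexp(pattern)
  # Escape everything, then turn the escaped wildcards into regex quantifiers.
  body = Regexp.escape(pattern)
               .gsub('\*', '.*') # * : zero or more characters, slashes included
               .gsub('\+', '.+') # + : one or more characters, slashes included
  Regexp.new('\A' + body + '\z')
end

star = legacy_pattern_to_regexp('/projects/*/')
plus = legacy_pattern_to_regexp('/projects/+')

p star.match?('/projects/nanoc/')       # true
p star.match?('/projects/nanoc/about/') # true
p star.match?('/projects/')             # false
p plus.match?('/projects/nanoc/')       # true
p plus.match?('/projects/')             # false
```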