rnv — RELAX NG Compact Syntax Validator in C
rnv { -q | -n NUM | -p | -c | -s | -v | -h } grammar.rnc document.xml
The options are:
-q
names of files being processed are not printed; in error messages, expected elements and attributes are not listed;
-n NUM
sets the maximum number of reported expected elements and attributes; -q sets this to 0 and can be overridden;
-p
copies the input to the output;
-c
if the only argument is a grammar, checks the grammar and exits;
-s
uses less memory and runs slower;
-v
prints version number;
-h
displays usage summary and exits.
This tool has the following limitations:
RNV assumes that the encoding of the syntax file is UTF-8.
Support for XML Schema Part 2: Datatypes is partial.
The schema parser does not check that all restrictions are obeyed; in particular, restrictions 7.3 and 7.4 are not checked.
RNV for Win32 platforms is a Unix program compiled on Win32. It expects file paths to be written with forward slashes; if a schema is in a different directory and includes or refers to external files, then the schema's path must be written in the Unix way for the relative paths to work. For example, under Windows, rnv that uses ..\schema\docbook.rnc to validate userguide.dbx should be invoked as
rnv.exe ../schema/docbook.rnc userguide.dbx
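When rnv is launched from a script on Windows, the path conversion described above can be automated. The following is a minimal Python sketch; the helper name rnv_argv and the default executable name are assumptions made for illustration and are not part of RNV.

```python
from pathlib import PureWindowsPath


def rnv_argv(grammar, document, exe="rnv.exe"):
    """Build an rnv command line with the grammar path written the Unix way."""
    # PureWindowsPath accepts both kinds of slashes; as_posix() emits "/" only.
    unix_grammar = PureWindowsPath(grammar).as_posix()
    return [exe, unix_grammar, document]


print(rnv_argv(r"..\schema\docbook.rnc", "userguide.dbx"))
# -> ['rnv.exe', '../schema/docbook.rnc', 'userguide.dbx']
```

The resulting list can be passed to subprocess.run as is.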
arx — Automatically determine the type of a document from its name and contents
arx { -n | -v | -h } document.xml arx.conf {arx.conf}
ARX either prints a string corresponding to the document's type or nothing if the type cannot be determined. The options are:
-n
turns off prepending the base path of the configuration file to the result, even if the result looks like a relative path (useful when the configuration file and the grammars are in separate directories, or for association with something that is not a file);
-v
prints version number;
-h
displays usage summary and exits.
The configuration file must conform to the following grammar:
arx = grammars route*
grammars = "grammars" "{" type2string+ "}"
type2string = type "=" literal
type = nmtoken
route = match|nomatch|valid|invalid
match = "=~" regexp "=>" type
nomatch = "!~" regexp "=>" type
valid = "valid" "{" rng "}" "=>" type
invalid = "!valid" "{" rng "}" "=>" type
literal=string in '"', '"' inside must be prepended by '\'
regexp=string in '/', '/' inside must be prepended by '\'
rng=Relax NG Compact Syntax
Comments start with # and continue till the end of line.
Rules are processed sequentially; the first matching rule determines the file's type. RELAX NG templates are matched against file contents; regular expressions are applied to file names. The sample below associates documents with grammars for XSLT, DocBook or XSL FO.
grammars {
  docbook="docbook.rnc"
  xslt="xslt.rnc"
  xslfo="fo.rnc"
}
valid {
  start = element (book|article|chapter|reference) {any}
  any = (element * {any}|attribute * {text}|text)*
} => docbook
!valid {
  default namespace xsl = "http://www.w3.org/1999/XSL/Transform"
  start = element *-xsl:* {not-xsl}
  not-xsl = (element *-xsl:* {not-xsl}|attribute * {text}|text)*
} => xslt
=~/.*\.xsl/ => xslt
=~/.*\.fo/ => xslfo
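The first-match rule can be illustrated with a small Python sketch that covers only the regular-expression routes of the sample; valid and !valid routes are left out because they need a RELAX NG validator. The names GRAMMARS, ROUTES and resolve are hypothetical, and the use of an unanchored search is an assumption, since the text does not say whether ARX anchors its patterns.

```python
import re

# Mirrors the grammars table and the two =~ routes of the sample configuration.
GRAMMARS = {"docbook": "docbook.rnc", "xslt": "xslt.rnc", "xslfo": "fo.rnc"}
ROUTES = [
    ("=~", r".*\.xsl", "xslt"),
    ("=~", r".*\.fo", "xslfo"),
]


def resolve(file_name, routes=ROUTES, grammars=GRAMMARS):
    """Return the string of the first route that fires, or None."""
    for op, pattern, doc_type in routes:
        # Assumption: unanchored search; ARX itself may match differently.
        matched = re.search(pattern, file_name) is not None
        if (op == "=~" and matched) or (op == "!~" and not matched):
            return grammars[doc_type]
    return None  # ARX prints nothing when the type cannot be determined


print(resolve("style.xsl"))
print(resolve("notes.txt"))
```

Because routes are tried in order, a more specific pattern has to be listed before a more general one.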
ARX can also be used to link documents to any type of information or processing.
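A typical use is to let ARX pick the grammar and then hand it to RNV. The Python sketch below assumes that both programs are on the search path under the names arx and rnv; the function names are hypothetical, and the meaning of rnv's exit status is not described in this text, so it is returned uninterpreted.

```python
import subprocess


def rnv_argv_from_arx(arx_output, document, rnv="rnv"):
    """Turn the string printed by arx into an rnv command line, or None."""
    grammar = arx_output.strip()
    if not grammar:
        return None  # arx printed nothing: the type could not be determined
    return [rnv, grammar, document]


def validate(document, conf="arx.conf"):
    """Run arx, then rnv with the grammar it selected (not executed here)."""
    found = subprocess.run(["arx", document, conf],
                           capture_output=True, text=True)
    argv = rnv_argv_from_arx(found.stdout, document)
    if argv is None:
        return None
    return subprocess.run(argv).returncode
```

Keeping the command construction separate from the process calls makes the logic easy to check without either tool installed.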
rvp — RELAX NG Validation Pipe
rvp { -q | -s | -v | -h } grammar.rnc
The options are:
-q
returns only error numbers, suppresses messages;
-s
uses less memory and runs slower;
-v
prints version number;
-h
displays usage summary and exits.
RVP is an abbreviation for RELAX NG Validation Pipe. It reads validation primitives from the standard input and reports results to the standard output; its main purpose is to ease embedding of a RELAX NG validator into various languages and environments. An application would launch RVP as a parallel process and use a simple protocol to perform validation. The protocol, in BNF, is:
query ::= (
    quit
  | start
  | start-tag-open
  | attribute
  | start-tag-close
  | text
  | end-tag) z.
quit ::= "quit".
start ::= "start" [gramno].
start-tag-open ::= "start-tag-open" patno name.
text ::= ("text"|"mixed") patno text.
end-tag ::= "end-tag" patno name.
response ::= (ok | er | error) z.
ok ::= "ok" patno.
er ::= "er" patno erno.
error ::= "error" patno erno error.
z ::= "\0" .
RVP assumes that the last colon in a name separates the local part from the namespace URI (this is what one gets by specifying : as the namespace separator to Expat).
Error codes can be grabbed from rvp sources by grep _ER_ *.h and OR-ing them with corresponding masks from erbit.h. Additionally, error 0 is the protocol format error.
Either er or error responses are returned, not both; -q selects the concise er form instead of the verbose error form.
start passes the index of a grammar (the first grammar in the list of command-line arguments has number 0); if the number is omitted, 0 is assumed.
quit is not the opposite of start; instead, it quits RVP.
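The notes above can be turned into a few helper functions. This Python sketch is not taken from the RVP sources: the function names are hypothetical, and it assumes that the fields of a query or response are separated by single spaces, which the BNF does not state explicitly. Pattern and error numbers are kept as strings.

```python
def qname(namespace_uri, local):
    """Join a namespace URI and a local part into a name for RVP."""
    return f"{namespace_uri}:{local}"


def split_qname(name):
    """Split at the last colon; the part before it is the namespace URI."""
    uri, _, local = name.rpartition(":")
    return uri, local


def encode_query(*fields):
    """Serialize one query; the terminator z is a single NUL byte."""
    return " ".join(str(f) for f in fields).encode("utf-8") + b"\0"


def parse_response(raw):
    """Parse one response (bytes, with or without the trailing NUL)."""
    parts = raw.rstrip(b"\0").decode("utf-8").split(" ", 3)
    kind = parts[0]
    if kind == "ok" and len(parts) == 2:
        return {"kind": "ok", "patno": parts[1]}
    if kind == "er" and len(parts) == 3:
        # concise form, produced when rvp runs with -q
        return {"kind": "er", "patno": parts[1], "erno": parts[2]}
    if kind == "error" and len(parts) >= 3:
        message = parts[3] if len(parts) == 4 else ""
        return {"kind": "error", "patno": parts[1],
                "erno": parts[2], "message": message}
    raise ValueError(f"unexpected response: {raw!r}")


print(encode_query("start"))
print(parse_response(b"ok 0\0"))
```

Splitting with rpartition keeps colons inside the namespace URI intact, which matters for URIs such as http://www.w3.org/1999/XSL/Transform.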
To assist embedding RVP, samples in Perl (tools/rvp.pl) and Python (tools/rvp.py) are provided. The scripts use Expat wrappers for each of the languages to parse documents; they take a RELAX NG grammar (in the compact syntax) as the command line argument and read the XML from the standard input. For example, the following commands validate rnv.dbx against docbook.rnc:
perl rvp.pl docbook.rnc < rnv.dbx
python rvp.py docbook.rnc < rnv.dbx
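For readers who want to see the embedding technique without reading the full scripts, here is an independent, minimal Python sketch of the process handling. It is not the code of tools/rvp.py; the class RvpSession, the function launch and the executable name rvp are assumptions, and fields are assumed to be separated by single spaces.

```python
import subprocess


class RvpSession:
    """Exchange NUL-terminated queries and responses over two byte streams."""

    def __init__(self, to_rvp, from_rvp):
        self.to_rvp = to_rvp      # writable stream (stdin of the rvp process)
        self.from_rvp = from_rvp  # readable stream (stdout of the rvp process)

    def query(self, *fields):
        """Send one query and return the fields of the response as strings."""
        data = " ".join(str(f) for f in fields).encode("utf-8") + b"\0"
        self.to_rvp.write(data)
        self.to_rvp.flush()
        reply = bytearray()
        while True:
            byte = self.from_rvp.read(1)
            if byte in (b"", b"\0"):  # end of stream or response terminator
                break
            reply += byte
        return reply.decode("utf-8").split(" ", 3)


def launch(grammar, exe="rvp"):
    """Start rvp as a parallel process (assumes the executable is on PATH)."""
    proc = subprocess.Popen([exe, grammar],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    return proc, RvpSession(proc.stdin, proc.stdout)
```

With rvp installed, a session would begin with proc, session = launch("docbook.rnc") followed by session.query("start"); the pattern number in each ok response is passed to the next query. Because the session takes plain streams, it can also be exercised against in-memory buffers.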
The scripts are kept simple and unobscured to illustrate the technique, rather than being designed as general-purpose modules. Programmers using Perl, Python, Ruby and other languages are encouraged to implement and share reusable RVP-based components for their languages of choice.