&ActiveSchema; is a powerful XML schema technology built on &ActiveTags; technologies.
&ActiveSchema; has the ability to select its content models contextually, and to refactor them dynamically. That's why &ActiveSchema;ta are active and much more efficient than other schema technologies.
Moreover, &ActiveSchema; can be used to define reusable active data type libraries that can also be used in &ActiveTags; applications.
The following specifications are part of the
An XML Schema is the expression of assertions expected to hold for a class of XML documents. Assertions on XML documents ensure that applications can process them without causing faults. Expressing assertions with schemata ensures that application developers spend most of their time designing data processing and little of it checking data.
Well known schema technologies are :
Name | Syntax style | Type | Editor | Specification location | Elem nb |
---|---|---|---|---|---|
Document Type Definition (DTD) | non XML syntax | model based | W3C | | 8 |
W3C XML Schema (WXS) | XML syntax | model based | W3C | | 42 |
Schematron | XML syntax | rule based | ISO | | 19 |
Relax NG (RNG) | XML syntax, XML compact syntax | pattern based | OASIS, ISO | | 28 |
Newcomer | | | | | |
&ActiveSchema; (ASL) | XML syntax | active | INRIA | | 20 (*) |
(*) 20 elements used in schema instances + 4 elements used in
The general purposes of a schema technology are :
Schematron, mentioned above, was designed for validation. Unlike with other schema technologies, using it for structured editing is not straightforward.
Other applications that use schemata are emerging, such as data binding.
Any schema technology is designed to cover a certain range of assertions. However, the existing schema technologies can't express many constraints, such as those listed below. Some technologies cover a given feature, others don't ; sometimes none does.
As shown in the picture below, &ASL; covers the constraint types listed above and many others that existing schema technologies can't.
&ActiveSchema; is an &ActiveTags; module, and ruled by the relevant concepts described in the &ActiveTags; specifications.
&ActiveSchema; is a schema technology based on very simple concepts. Enhanced with &ActiveTags;, &ActiveSchema; deals with schema problems with greater efficiency than other schema technologies.
An &ActiveSchema; may be used both for validation and for structured edition, and many other purposes.
With its simple concepts and small number of elements, &ActiveSchema; is easy to learn and easy to use, because the schema follows the structure of the document. An &ActiveSchema; is also human-readable : the content model of an element is easy to understand at first glance.
Finally, the capabilities of the &ActiveSchema; technology cover the following purposes :
&ActiveSchema; has been designed to keep XML document classes as they are, without adjusting their structure on the pretext that a content model can't be expressed with the chosen schema technology. The design of an XML structure must not be led by any schema technology.
&ActiveSchema; deals with XML documents representing both &ActiveSchema;ta and instances through an abstract data model. XML documents representing &ActiveSchema;ta and instances must be well-formed in conformance with XML 1.0 and must conform to the constraints of XML Namespaces.
An &ActiveSchema; is a flat set of definitions. The materials defined inside an &ActiveSchema; must endorse the same namespace URI, but several storage units (files) may be part of the same schema if they share the same target namespace URI.
A material definition is composed of elementary steps that are processed independently. Steps may be primitive models or step containers for other steps. Each primitive model is processed in three phases; for example, when validating:
Steps that are applied on element definitions are called active content models.
&ActiveSchema; can't constrain comments, processing instructions, and namespace declarations directly ; however, specific assertions may restrict their usage anyway.
The term "material" is used to represent :
The term "content material" is used to represent :
A candidate material is the material or content material, according to the context, to check against the schema. It may be :
Additionally, a candidate material may hold the place before the first material of a list (the child nodes of an element) or after its last material ("cap candidate").
A schema client handler is a component of an application that uses &ActiveSchema;ta ; it processes lists of allowed material provided by the schema at runtime.
For example, a validator handler checks if the material found in the source document matches a list computed in a given step.
A schema client handler uses callbacks to process lists because it doesn't select the step to apply ; the schema engine does. The entry point of an application that processes an &ActiveSchema; is an element or a document ; such an application should define what to do with the callbacks :
Additionally, when an element has been processed, the schema client handler may process its subelements at user option.
This use case illustrates that &ActiveSchema;ta are context-dependent.
In this scenario, two companies ACME and EMCA are exchanging XML documents. They are sharing the same base set of schemata, but both are extending it for special purpose usage:
The schema soup consists of a legacy DTD (without namespaces), a Relax NG schema, a brand new &ActiveSchema;, and other well-known schemata for XHTML and SVG.
&ActiveSchema; in conjunction with &ActiveTags; offers all means to process such a case very efficiently:
Content models are element content definitions that define which content material is allowed, when it is allowed, and how many times.
An &ActiveSchema; is a model based schema ; however, unlike other schema technologies, the models defined are active, that is to say that :
For this purpose, content models are divided into elementary checking steps, each of which may produce at most one of the following primitive model types :
Steps set the scopes of the model types, but can also be used as step containers
(with the
Thus, a content model is processed step by step ; each step may be repeated or discarded in favour of the following step, or in favour of an interim step. When a step is repeated, its content model may be kept or refactored.
Finally, additional constraints may be computed to check the validity of an element or to check if an element can be inserted
(with the
A step is an elementary unit of processing that consists of drawing up lists of available materials and assertions. A step may be a container step, which may contain substeps, or a primitive content model step, which can't contain substeps.
When the content of an element must be checked, the steps defined in the element definition are evaluated as a global sequence. During this process, after a material (text or element) found within the element has been checked against the current step, the next material to check is selected. According to its settings, the step used may be reused as is, refactored, paused, or terminated ; the next step is then used.
In addition to content models,
a step may also be used to draw up lists of attributes,
lists of assertions, and lists of data type matchers.
Attributes lists can't be mixed with content materials lists ;
assertions lists can be drawn up in any step ;
data type matchers lists are only found within
attribute definitions (
Steps and the material to check progress together in a synchronized reading process.
According to a given step and a given material to check, the following process is applied :
When a content material must be checked, for example to test whether an element can be inserted, the host element definition must be processed step by step up to the insertion position.
Moreover :
A primitive model type is a special step used to establish a list of available material. Once selected, the model type establishes lists of allowed materials and assertions that are transmitted to the host application for candidate material checking. For example, a validator would apply these lists on the material to check (candidate material).
Once a candidate material is selected by the host application, it is used to check if it matches the material of the list :
primitive model type | application | repeating |
---|---|---|
sequence ( | the first material of the list may match the candidate material | the list is updated |
selection ( | any material of the list may match the candidate material | the list is updated |
choice ( | any material of the list may match the candidate material | the list remains the same |
To check if a candidate material matches a choice or a selection, the list is browsed sequentially ; the first item that matches the candidate material is retained. To check if a candidate material matches a sequence, the items are tested sequentially according to the occurrence boundaries.
Once a step matches a candidate material, it may be refactored on user request if it is reused, or kept as is under the repeating conditions mentioned in the table above. When a list is updated while repeating, the use counter of the material is incremented ; the material used is discarded from the list if it is no longer usable, according to the occurrence boundaries set.
Once a step ends, the step container process goes on.
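The behaviour of the three primitive model types can be illustrated outside of &ASL; with a short Python sketch. It is only an approximation under stated assumptions : the `Item` class and the `match` function are hypothetical names, a large integer stands in for "unbounded", and the rule that a sequence may skip an item whose minimum is already satisfied is one interpretation of "tested sequentially according to the occurrence boundaries".

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One entry of a step's list: a material name with occurrence bounds."""
    name: str
    min_occurs: int = 1
    max_occurs: int = 1   # a large number approximates "unbounded"
    used: int = 0         # use counter, incremented when the list is updated


def match(kind, items, candidate):
    """Return the matching Item or None, updating `items` in place.

    kind is "sequence", "selection" or "choice" (hypothetical labels).
    """
    if kind == "sequence":
        # Items are tested in order; an item whose minimum is satisfied
        # may be skipped, a mandatory one may not (assumption).
        for position, item in enumerate(items):
            if item.name == candidate:
                break
            if item.used < item.min_occurs:
                return None   # a mandatory item would be skipped
        else:
            return None       # list exhausted without a match
        hit = item
        del items[:position]  # skipped items can no longer match
    else:
        # selection and choice: the first item that matches is retained.
        hit = next((i for i in items if i.name == candidate), None)
        if hit is None:
            return None
    if kind != "choice":
        # sequence and selection: the list is updated while repeating.
        hit.used += 1
        if hit.used >= hit.max_occurs:
            items[:] = [i for i in items if i is not hit]
    return hit


# Usage: Title, then optional Content, then any number of Chapter.
chapter = [Item("Title"), Item("Content", 0, 1), Item("Chapter", 0, 10**9)]
for name in ["Title", "Chapter", "Chapter"]:
    assert match("sequence", chapter, name) is not None
```

With a choice, the list is left untouched after a match, so the same item can match again on the next repetition ; with a selection, an item that has reached its maximum disappears from the list.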
Occurrences can be set on steps with the attributes :
Elements that allow these attributes always use 1 as the default value for both attributes.
The value "unbounded" for the
Two predefined properties allow the min occurs value to be based on the max occurs value, or the reverse :
For example, min-occurs="{count(//foo)}" max-occurs="{$asl:min-occurs}" is correct.
Occurrences can be set only on steps. Sequences can't have occurrence boundaries (occurrences are carried by the material referenced inside). Additionally, when sequences are defined one after another, they can be merged. A sequence is always a stable list with no occurrence boundaries.
Instead of :
...use the short form :
Occurrences may be used in material references inside select models,
but grouping adjacent select models doesn't express the same model.
In fact, sequence models are also slightly different when the subactions involve
the
Occurrences
When a candidate element has matched a material that specifies occurrences, the numbers of occurrences are decremented for the next usage.
The &ASL; element definition below mimics the following familiar DTD declaration :
<!ELEMENT Chapter (Title, ((Content, Chapter*) | Chapter+))>
...where
When involved in a stable step, occurs values are kept unchanged ; when involved in an unstable step, occurs values are updated.
If a step must be repeated, according to its occurrence boundaries,
its content may be kept as is or refactored, according to the value of the
Once a primitive content model has occurred the minimum number of times expected, it must exit as soon as the candidate material doesn't match the material, or as soon as the maximum number of times expected is reached.
Once a container step has occurred the minimum number of times expected, it must exit as soon as the maximum number of times expected is reached, or as soon as its substeps are no longer in use.
Steps must inform that they were used with a bubble message. A primitive content model was used if a matching occurs. A container step was used if it received a bubble message that indicates that a substep was used.
A list is an ordered set of material ;
as each list item may represent a group of material when a class or type reference is used,
or when a namespace URI reference is used,
a sublist that disables (
Of course, an exception must build a list compliant with its target list (in the example above, only elements are concerned).
An element definition may refer to attributes with the
Thus, attribute definitions may occur among the top level elements of the schema (and be shared with all schemata), or directly within an element definition ; the latter may have no namespace URI because unprefixed attributes "belong" to their host element.
Hereafter, the
Now, it refers to a sharable and global attribute :
Within an element definition, more than one attribute reference or inline definition may occur ; attribute lists are separate lists which must be computed in a step separate from content models.
As attributes are unordered inside their host element, attributes references and local definitions
are allowed directly under the
Only the
When validating, once an element definition ends, all its attributes must have been matched, except namespace declarations, which are not checked. The same attribute can't be matched by several lists.
As with other materials, each list item may have a sublist, and items may be arbitrarily enabled or disabled in a top list.
Additionally, items are data typed.
Global attributes are defined with the
Attribute references or inline definition may be specified with occurrences boundaries ; static and runtime values are allowed.
As usual, the default value for both attributes is 1, which denotes that the attribute is mandatory. Hereafter, an element definition references an optional attribute :
Element content models may contain element references or text items ; &ActiveSchema; allows defining which text content is enabled, and where it is enabled, even in mixed content.
Text content list items are very close to attribute values, except that they are unnamed items (however, a convenient way to "name" text content is to use data types) and appear exclusively in primitive content models, exactly like element references. Attribute values and text contents are text values that data types may constrain.
When processing text, comments and processing instructions are ignored, and adjacent texts are merged.
Within primitive content models, the
A whitespace, in the sense of XML, is a text that contains only spaces, tabs, and returns (carriage return, linefeed, or both).
Whitespace candidates are discarded in the following conditions : when a content model is defined with elements and texts that can contain whitespace, if the candidate material is a whitespace followed by an element that matches an item of the content model, then the whitespace candidate is ignored.
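The whitespace rule above can be sketched in Python. This is a simplified illustration, not the engine's algorithm : the candidate list is modelled as hypothetical `("text", value)` and `("element", name)` pairs, and `allowed_elements` stands for the element names that match an item of the current content model.

```python
XML_WHITESPACE = set(" \t\r\n")


def is_whitespace(text):
    """True when text is non-empty and made only of spaces, tabs, and returns."""
    return bool(text) and all(ch in XML_WHITESPACE for ch in text)


def discard_whitespace(candidates, allowed_elements):
    """Drop whitespace text candidates that are followed by an allowed element."""
    kept = []
    for index, (kind, value) in enumerate(candidates):
        following = candidates[index + 1] if index + 1 < len(candidates) else None
        if (kind == "text" and is_whitespace(value)
                and following is not None
                and following[0] == "element"
                and following[1] in allowed_elements):
            continue  # ignored: the element after it matches the content model
        kept.append((kind, value))
    return kept


content = [("text", "\n  "), ("element", "p"), ("text", "hello")]
print(discard_whitespace(content, {"p"}))
```

A whitespace that is not followed by a matching element stays in the list and is submitted to the text matchers like any other text.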
A text item uses the same matchers as those used to define data types and those used in attribute values.
A text definition involves the
However, as they may be mixed with element references, only a single matcher can be used at a time, that is to say that 2 text matchers can't be found side by side. When a choice of text matcher is needed, it must be enclosed within an inline type definition, or defined with a type reference. Schemata designers must take care that a step that ends with a text matcher can't be followed by a step that begins with a mandatory text matcher, because the last text matched has been totally consumed by its matcher.
Here is an element that must contain one string among a predefined set :
...and another that may contain any string :
A mixed content may also be defined :
A content model may indicate precisely where text content is allowed, and which :
...that could match :
...but can't match :
When used directly, the
...could be defined by the following schema that refers to a custom data type :
The definition of this type is shown in the chapter about data types. Notice that a type may be used indifferently in a text content or in an attribute value :
A type may also be defined anonymously (and can be used also for attributes definitions) :
Please refer to the chapter about data types.
Assertion lists are separate lists that can be computed at each step. Once a list of assertions is established, it is applied to the element to check, or transmitted to the host application.
Assertions are additional checks for constraints that can't be expressed by content models.
Assertions are defined with the
For example, the following assertion limits the depth of an element :
The
The
An interim step is an unstable step launched only when its host model matched.
Each time the |
This structure is somewhat unusual in other schema technologies : when a content model is defined within an element referenced, it means that this content model is applied on the children of the candidate element.
The
There is no structure that defines groups of attributes in &ActiveSchema;, but it is possible anyway to select one set or another with an interim step : once an attribute matched, an additional attribute list may be provided.
The following schema snippet expresses that either the a, b, and c attributes must be present together, or the d and e attributes must be present together.
An interim step may be advantageously used to describe complex combinations. It is possible to define an interim step that occurs when an element has been matched but that draws up an attribute list, or the opposite. It is also possible to define an interim step inside an element or attribute that has been involved in another interim step.
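The "a, b, c together or d, e together" constraint can be mimicked in Python to show the logic of an interim step. This is a sketch under assumptions : the `INTERIM` table and `check_attributes` are hypothetical names, the attributes a and d are taken as the ones that trigger the interim step, and no attribute other than those listed is allowed.

```python
# Matching the key attribute launches an interim step that provides
# the additional attribute list given as the value (assumed layout).
INTERIM = {"a": {"b", "c"}, "d": {"e"}}


def check_attributes(names):
    """Return True when exactly one attribute group is fully present."""
    present = set(names)
    triggers = present & set(INTERIM)
    if len(triggers) != 1:
        return False  # no group started, or both groups mixed
    trigger = next(iter(triggers))
    # Everything besides the trigger must be exactly the additional list.
    return present - {trigger} == INTERIM[trigger]


print(check_attributes(["a", "b", "c"]))   # True
print(check_attributes(["d", "e"]))        # True
print(check_attributes(["a", "b", "e"]))   # False
```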
As the
An
The
The &ASL; element definition below mimics the following familiar DTD declaration :
<!ELEMENT Chapter (Title, ((Content, Chapter*) | Chapter+))>
If the
If the
When an interim step definitively replaces an upper model, this model is discarded without further occurrence boundary checking.
Numbers of &ASL; elements have an
Any identified attribute may be reused thanks to the
Additionally, when only a part of a definition would be convenient to reuse,
the
It is strongly recommended that identifiers be qualified names ; the namespace URI of identifiers should be the same as the target namespace URI of the host schema.
An ID bound to a namespace URI is looked up within the set of schemata bound to the same namespace URI.
For example, the ASL schema for OASIS XML Catalog uses these elements.
Types differ according to whether they relate to textual data or to elements.
This specification talks about data types (
Data types apply both to attribute values and to text content, designated as textual data. A textual data is a string that can be parsed into a typed data. Parsing a textual data is the operation that consists of sequentially converting the characters into the typed data according to a data type. A typed data consists of :
Data types may be composite, that is to say composed of sequences of data types. Once the first data type of the sequence has finished parsing the textual data, the second tries to parse the remainder, and so on.
From the point of view of an attribute or a text content, the parsing succeeds if and only if a typed data has been parsed successfully with no remainder. That is to say that if the data type is a composite data type, the last data type of the sequence must consume all the remainder, otherwise the entire parsing fails.
&ActiveSchema; provides means to define new data types, for example by adding constraints on an existing type, like &Wxs; does. It is possible for example to restrict the values of an integer to be between 1 and 365.
When defining a data type, it is possible to apply constraints during or after parsing. Constraints may be applied on the lexical value and/or the logical value and its components (see internal data model representation).
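The parsing rules described above (composite types that pass a remainder along, success only when nothing remains, and a constraint such as "between 1 and 365") can be sketched with small Python parser functions. None of these names come from &ActiveSchema; : `integer`, `literal`, `restricted`, `composite`, and `parse_fully` are hypothetical helpers, and each parser returns a `(value, remainder)` pair or `None` on failure.

```python
import re


def integer(text):
    """Parse a leading integer; return (value, remainder) or None."""
    m = re.match(r"[+-]?\d+", text)
    if m is None:
        return None
    return int(m.group()), text[m.end():]


def literal(expected):
    """Build a parser that consumes exactly the given string."""
    def parse(text):
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None
    return parse


def restricted(base, predicate):
    """Derive a type by constraining the value produced by a base type."""
    def parse(text):
        result = base(text)
        if result is None or not predicate(result[0]):
            return None
        return result
    return parse


def composite(*parts):
    """Chain data types: each one parses the remainder left by the previous."""
    def parse(text):
        values = []
        for part in parts:
            result = part(text)
            if result is None:
                return None
            value, text = result
            values.append(value)
        return values, text
    return parse


def parse_fully(data_type, text):
    """Succeed only if a typed data is parsed with no remainder."""
    result = data_type(text)
    if result is None or result[1] != "":
        return None
    return result[0]


day_of_year = restricted(integer, lambda n: 1 <= n <= 365)
day_range = composite(day_of_year, literal("-"), day_of_year)

print(parse_fully(day_range, "10-20"))   # [10, '-', 20]
print(parse_fully(day_of_year, "400"))   # None: outside 1..365
print(parse_fully(day_of_year, "200x"))  # None: a remainder is left
```

Here the constraint is applied after parsing, on the logical value ; a constraint on the lexical value would instead test the matched string before converting it.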
A data type can be defined with a name with the
Named types are defined at the top level with the
Data types are defined on behalf of :
The same type definition may be referred both in an attribute value and in a text content.
For example, the following type definition is reusable :
The above definition is explicitly a choice step ; the first type that matches the text value is kept.
If an attribute is defined with a single type, its definition uses the
The attribute definition acts as a choice step ; the first type that matches the attribute value is kept.
The last way to define a type is to extend an existing type,
by using the
The definition consists of steps that use matchers.
As explained hereafter, a type may be :
A composite data is a typed data produced by a composite type, that is to say,
a typed data that may contain other typed data.
A non-composite data such as an
The formal type of a composite type is
A typed data is a
When parsing a text value, the engine tries to build an internal data model ;
the parsing fails when the target object fails to construct,
or if some additional assertions -introduced with the
Runtime data types are parsed as
The parsing result may be constructed with the help of other types ;
in this way, the data model obtained may be an arbitrarily complex structure.
&ActiveSchema; provides the
For example, the fr:date type could be defined to parse a value such as 10 juin 1969, and return an object that could be accessed thanks to XPath ; in the context of its value :
Facets are attributes exposed in addition to the data model. They have a name and a value that is not necessarily a string, and can be constrained.
For example, an
&WXS; datatypes are exposed in &ActiveTags; in a slightly different manner than in the &Wxs; specification, because the base concepts are somewhat different, specifically on the hierarchy model. However, as the same features are covered and as they share the same semantics, they are compatible. &ActiveTags; just provides a different view of the &WXS; datatypes.
The &Adt; specification describes how &WXS; datatypes can be used in &ActiveTags;. In particular, it names the &WXS; facets to use as attributes in typed data.
The core facets are :
The facets are bound to the &adt; namespace URI for convenience : typed data may have their own attributes (user defined) that can't be in conflict with the facets.
The value of the object itself may be used to express constraints. For example, to constrain an integer to be less than or equal to 31 :
Text parsing is very close to content model parsing :
many &ActiveSchema; elements (
A type definition uses text matchers that are text values, regular expressions or other type definitions that define which character sequences are allowed in the type definition. When all the matchers expected in a type definition have been applied and a character sequence remains, the type returns the result data model with a remainder. If the host material that was using this type definition is itself a type definition, the host type goes on matching with the remainder, and so on until the host is an attribute definition or a text content model. At this stage, if a further type or matcher applies to the remainder, the process is repeated. When the host attribute or text content model definition definitively ends, the remainder must have been entirely consumed. Otherwise, the matching fails.
A text matcher is involved with the
Finally, if the
Text matchers may be optional, and may be repeated.
The repetition may be specified with the
Repetitions may be impossible to process without the help of separators that are not involved in the matching process.
For example, 123456 can't match two
However, my:twoDigits could work as explained above only if it doesn't rely on an
The following sequence definition is used to match x=12,y=34 but not x=,y=34 :
The
To build the data model with a named item, or to compute a value other than those matched,
the
As shown above, the current object is set to the matched value before item creation.
Additionally, these attributes (
In short, a matched content may be :
Finally, the result data model may be constructed with arbitrary additional items with the
The
For this purpose, the
In any case, the current object is set to the typed data produced by the base type. Additionally, the
The
When the
For example, the following type is based on an integer:
A typed data created by this type is of the
The following type will remove undesirable spaces from the input text value before choosing which text has been selected :
In this example, the remainder -if any- is also cleaned of trailing spaces.
When parsing, each time a matcher has matched, the typed data matched is set as the current object,
that the matcher can refer to build the data model if the
When the item of the data model has been built, it is set to the
The
Each time the
Each time an item name and an item value are encountered, they complete the typed data with a single named item.
In this example, one defines an attribute value that both matches the following kind of content :
The schema below parses such attribute values with an anonymous type :
The schema reports that, in the following XML snippet, the first two polygon definitions are valid and the last two are invalid.
Notice that the last polygon definition is invalid, but it is still possible to design a schema that allows heterogeneous value pairs.
The same internal data model is built for the first two polygon definitions ; it is represented below :
The content model may also be slightly modified thanks to an interim step and &XCL; :
...will produce a named item accessible with ./*/x, whereas :
...will produce an XML attribute accessible with ./*/@x.
Any artifact other than a matcher put in the context will be used to build the typed data.
As matchers are tested sequentially, order is significant when lexical values overlap.
For example, the lexical values of
"1" will return a
This is particularly important when the
The
When a type is defined, it may be based on another type. In this case, the value of the upper type becomes the value of the newly defined type. Additional items may be produced with the remainder of the upper type. Assertions can be used to restrict the scope of the values of the upper type.
For example, the polygon type defined previously may be used to define a triangle type and a square type :
&ActiveSchema; may cover the data value semantics, and comparison between values expressed in different lexical spaces is possible ; for example, which temperature is colder than the other ? "31°F" or "0°C" ?
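The temperature question can be answered once both lexical values are mapped to one logical value. The Python sketch below is only an illustration of that idea : the `Temperature` class is hypothetical, it accepts only the °C and °F forms shown above, and it uses degrees Celsius as the common logical space.

```python
import re
from functools import total_ordering


@total_ordering
class Temperature:
    """Typed data built from a lexical value such as "31°F" or "0°C"."""

    def __init__(self, text):
        m = re.fullmatch(r"(-?\d+(?:\.\d+)?)°([CF])", text)
        if m is None:
            raise ValueError(f"not a temperature: {text!r}")
        number, unit = float(m.group(1)), m.group(2)
        self.lexical = text
        # Logical value: degrees Celsius, whatever the lexical space.
        self.celsius = number if unit == "C" else (number - 32) * 5 / 9

    def __eq__(self, other):
        return self.celsius == other.celsius

    def __lt__(self, other):
        return self.celsius < other.celsius


colder = min(Temperature("31°F"), Temperature("0°C"))
print(colder.lexical)  # 31°F, which is about -0.56°C
```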
As shown above, a typed data is initialized with an
Functions may be bound to a type for the following purposes:
Functions are bound to a type by naming them in the type definition with attributes:
&EXP; may be used advantageously to define such functions, specifically when functions are defined as macro functions. See integration with &EXP;.
The
When this attribute is missing, the arguments are compared as indicated in the XPath specification.
&EXP; may be used advantageously to define such functions. Moreover, several comparison functions may be defined, and each application is free to choose which comparison function to use. For example, what should be compared when two polygons are compared ? The number of points ? The area ? The perimeter ? See integration with &EXP;.
This function is the one used for casting operations.
When typed data are parsed while validating an XML document, it is often
useful to bind the typed data to the nodes on which a type is defined.
For example, if an attribute is defined as a
&ActiveSchema; allows the information of an XML document to be augmented while validating, on user request. In this case, comparison operations made with XPath, or functions that imply an order relation between items such as a sorting function, must apply to the bound typed data, not to the raw textual data.
As the schemata in use are generally defined by the processor instance
involved when validating, several different typed data might be bound
to the same node ; the one to consider is the one set by the same
processor instance that performs the validation. In other words,
typed data are bound to nodes in the scope of a processor
instance. As the
Augmented data must be handled in &ActiveSheet;s ; non &ActiveTags; applications such as XSLT are encouraged to do so.
The
In this example, a weather report indicates town temperatures
expressed in °C as well as in °F. The type of the
The following code snippet simply parses the XML file
and validates it against the schema within which the expected
type is defined ; then the towns
are displayed in temperature order thanks to the
Output :
32°F Vladivostok 2005/09/09
20°C Paris 2005/09/07
21°C Paris 2005/09/09
22°C Paris 2005/09/08
23°C London 2005/09/08
As expected, 32°F is placed before 20°C.
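The ordering shown in this output can be reproduced in Python by sorting on the logical value rather than on the raw text. This is a sketch, not the &ActiveTags; code : `to_celsius` and `sorted_report` are hypothetical helpers, and the input order of the `reports` rows is an assumption, since only the sorted output is given.

```python
def to_celsius(reading):
    """Convert a reading such as "20°C" or "32°F" to degrees Celsius."""
    number, unit = float(reading[:-2]), reading[-1]
    return number if unit == "C" else (number - 32) * 5 / 9


# (town, date, temperature) rows; the input order is assumed.
reports = [
    ("Paris", "2005/09/07", "20°C"),
    ("Paris", "2005/09/08", "22°C"),
    ("London", "2005/09/08", "23°C"),
    ("Paris", "2005/09/09", "21°C"),
    ("Vladivostok", "2005/09/09", "32°F"),
]


def sorted_report(rows):
    """Sort on the typed value bound to each reading, not on its text."""
    ordered = sorted(rows, key=lambda row: to_celsius(row[2]))
    return [f"{temp} {town} {date}" for town, date, temp in ordered]


for line in sorted_report(reports):
    print(line)
```

Sorting the same rows on the raw strings would place "32°F" last, because the comparison would then be made character by character.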
If the
All materials defined within an &ActiveSchema; must be bound to the target
namespace URI declared by the
Many elements are used both for defining a material and for referring to a defined material (<asl:element name="..."> and <asl:element ref-elem="...">). Sometimes the reference to a material may be inline, sometimes it can't. That's the case for elements, which always refer to definitions located on the top level. Conversely, attribute, type, and text definitions need not be located on the top level, and can therefore be used as inline references.
Much of the information inside an &ActiveSchema; deals with namespace URIs ; instead of pointing directly to namespace URIs, which are generally long strings, &ActiveSchema; always uses a prefix bound to a namespace URI as a more convenient means.
For example, the
As usual with XML namespaces, only the bound namespace URI matters. Schema designers must define the appropriate namespace declarations when they use prefixes in attribute values.
The
The ##targetNamespace used in &Wxs; has no equivalent in &ActiveSchema;ta ;
users just have to use the same prefix as the one specified in the
The xml prefix may also be specified without any particular precaution (the appropriate namespace declaration is always auto-declared). On the contrary, the xmlns prefix must not be specified ; &ActiveSchema; can't constrain namespace declarations because they have a particular meaning in XML.
&ActiveSchema; may be mixed with other schema technologies to add constraint types that those technologies do not support. The schemata supported are implementation-dependent.
Moreover, legacy schemata don't necessarily deal with the inclusion of foreign material in XML instances ; elements and attributes that belong to other namespaces and that were not planned to be present will normally be forbidden.
&ActiveSchema; allows existing schemata (including, of course, &ActiveSchema;ta) to be "patched", in order to :
This is particularly useful when users are dealing with several third-party schemata that have not been written to accept materials in foreign namespaces.
A company uses multiple schema instances at different levels :
Each level registers its schemata in a catalog.
In this example, an element is defined at the top level with a legacy public DTD that contains :
<!ELEMENT acme:order (acme:ship-to, acme:item*)>
<!ATTLIST acme:order xmlns:acme CDATA #FIXED "http://www.acme.com/order" id CDATA #REQUIRED>
The company needs to patch the DTD to allow XHTML content to be inserted inside
As this schema is registered in a catalog close to the application, it will be used first.
The
If the XHTML content had to be inserted inside
An application of that company has to deal with a new attribute (
Schemata are organized in an ordered list ; each item of the list is given by a catalog (a single catalog may deliver several items). Schemata are ordered in the order they are delivered by catalogs.
When an element refers to an attribute that is already referred to in a schema that has a lower priority, the attribute must be checked only once : the schemata that have a lower priority must not check it.
When a specific schema "overrides" a definition (attribute, element, type, etc.), the definition used must be the overriding one, even if it is referred to from a schema that has a lower priority. For example, if an attribute definition uses a named type defined in the same schema instance, but another schema instance that has a higher priority redefines this type and preserves the attribute, the attribute will be checked with the redefinition of the type.
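The override rule can be sketched as a lookup over the ordered list of schemata. The Python below is an illustration under assumptions : each schema is modelled as a plain dictionary keyed by `(kind, name)`, the names `acme:id` and `id` are hypothetical, and a schema delivered earlier in the list is assumed to have the higher priority.

```python
def resolve(schemata, kind, name):
    """Return the definition from the highest-priority schema that has it."""
    for schema in schemata:  # ordered: highest priority first (assumption)
        definition = schema.get((kind, name))
        if definition is not None:
            return definition
    return None


# A base schema defines an attribute and the named type it uses.
base = {
    ("type", "acme:id"): "string",
    ("attribute", "id"): {"type": "acme:id"},
}
# A patch schema, closer to the application, redefines only the type.
patch = {("type", "acme:id"): "integer"}

schemata = [patch, base]
attribute = resolve(schemata, "attribute", "id")
print(resolve(schemata, "type", attribute["type"]))  # integer
```

The attribute is still found in the base schema, but the type it refers to is looked up again through the whole list, so the redefinition wins.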
&ActiveSchema; is part of the &ActiveTags; technology and is therefore fully integrated with &ActiveTags; features. In particular, any other module may be used in a schema.
One of the most useful modules may be &XCL;, because it offers the ability to go further with a procedural approach
where the &ASL; declarative model reaches its limits.
For example, an interim step could be made optional by putting it inside an
Modules that provide access to remote data sources may also be very useful.
In this example, the &RDBMS; module allows the list of values available in an attribute to be drawn up dynamically.
As explained in "managing &ActiveSchema;", many storage units can be used to build a schema. This feature is particularly useful when a schema is intended to be shared with third parties. Access to an RDBMS is not necessarily public, and the schema snippet above would then fail. When designing schemata that have to be shared, it is convenient to make them neutral ; a private additional schema should then cover the non-exportable part that accesses the RDBMS.
As explained before, &EXP; can be advantageously used
to provide complex functions used when initializing typed data,
to specify a counterpart function for a type,
or to specify a comparison function.
The &EXP; module defined must be bound to the same namespace URI as the schema.
The &EXP; module where these functions have been defined must be known
by the
In these cases and others, it may be convenient to define macro-functions in &EXP;.
Additionally, several of these functions could be defined in a module, and a schema could use one or another of them.
For example, when comparing a polygon, one could:
Depending on the application, one of these methods or the other could be used. Close to the application, a schema could specify which one to use.
In this example, assume that the type defined previously is labelled geom:polygon-definition, with the variant where x and y are stored in attributes of a point, and with the appropriate namespace declarations for the geom and math prefixes (assuming that a math module is also provided).
Within a
The perimeter function could be improved : for example, it could first test whether the attribute "perimeter" exists and return it ; otherwise, it would perform the computation and set the attribute for later use.
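The caching idea can be illustrated outside &EXP; with a short Python sketch. It is not the macro-function of this specification : the un-namespaced element names `polygon` and `point` are assumptions made for the illustration, and only the "perimeter" attribute and the x and y attributes come from the text above.

```python
import math
import xml.etree.ElementTree as ET


def perimeter(polygon):
    """Return the perimeter of a polygon element, caching it in an attribute."""
    cached = polygon.get("perimeter")
    if cached is not None:
        # The attribute already exists: return it without recomputing.
        return float(cached)
    points = [(float(p.get("x")), float(p.get("y")))
              for p in polygon.findall("point")]
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        # Join each point to the next one, and the last point to the first.
        x2, y2 = points[(i + 1) % len(points)]
        total += math.hypot(x2 - x1, y2 - y1)
    # Store the result so that later calls can reuse it.
    polygon.set("perimeter", str(total))
    return total


square = ET.fromstring(
    '<polygon><point x="0" y="0"/><point x="1" y="0"/>'
    '<point x="1" y="1"/><point x="0" y="1"/></polygon>'
)
print(perimeter(square))        # computed on the first call
print(square.get("perimeter"))  # now stored on the element
```

A cached value is only safe as long as the points are not modified afterwards ; a real function would have to decide when the attribute must be invalidated.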
The first application could use this schema:
The second one could use:
A preceding example showed a type that converts °F to °C ; this example shows an alternative type definition whose comparison relies on a macro function.
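As a rough illustration of what such a comparison function has to do, here is a Python sketch. The function names and the `(value, unit)` representation are assumptions made for the illustration ; they are not the &EXP; definitions of this specification.

```python
def to_celsius(value, unit):
    """Convert a temperature to °C; only the units "C" and "F" are handled."""
    if unit == "C":
        return float(value)
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown unit: {unit!r}")


def compare_temperatures(a, b):
    """Compare two (value, unit) pairs; return -1, 0 or 1."""
    ca, cb = to_celsius(*a), to_celsius(*b)
    # Exact float comparison: a real implementation may prefer a tolerance.
    return (ca > cb) - (ca < cb)


print(compare_temperatures((212, "F"), (100, "C")))
print(compare_temperatures((32, "F"), (1, "C")))
```

Both values are brought to the same unit before being compared, which is the role the text gives to the comparison function.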
There is neither inclusion nor import mechanism in &ActiveSchema;.
&ActiveSchema; doesn't itself define how to retrieve the different schema components :
it delegates the task to a
Moreover, &ActiveCatalog; may be used for purposes other than schema instance retrieval : when defining a schema, it may be useful to define &EXP; resources (such as functions, as shown before) that will be used in schema instances.
Here is a snippet of an &ActiveCatalog; instance that binds a namespace URI to several resources :
See the &ActiveCatalog; specification for further details.
&ActiveSchema; allows data types to be defined with tags, as explained in a previous chapter. However, &ActiveSchema; also uses built-in data types ; the &ActiveDatatype; specification provides several built-in libraries of data types that may be used in &ActiveSchema;, including the well-known &Wxs; data type library (the &ActiveDatatype; specification adapts this library for use in &ActiveTags; technologies). Built-in data type libraries are pre-compiled schema instances ; a pre-compiled schema instance may contain any material definition : types, attributes, elements... Pre-compiled schema instances
The &ActiveDatatype; specification defines another kind of data type, called marker types, that can't be used in &ActiveSchema; instances but are part of the &ActiveTags; technologies. As this specification is also an &ActiveTags; application, such types may be referred to in this document ; just note that they can't be used as is in &ActiveSchema; instances : an &ActiveSchema; instance can use only XML-unstructured raw data as specified in the &ActiveDatatype; specification.
The
Schema client handlers may expose a description, if one is available, instead of an error message when unexpected content is encountered.
The
When a schema client handler needs a description, it may hold an ordered list of preferred languages ; if a message or description exists for one of these languages, it will be chosen ; otherwise, the description in the default language will be used.
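This selection can be sketched as follows in Python. The function name, the dictionary of descriptions and the sample texts are hypothetical ; the sketch assumes that preferred languages are tried in order and that the default language is used only as a last resort.

```python
def pick_description(descriptions, preferred_languages, default_language):
    """Return the description for the first preferred language available.

    descriptions maps a language code to a text. Returns None when neither
    a preferred language nor the default language has a description.
    """
    for language in preferred_languages:
        if language in descriptions:
            return descriptions[language]
    return descriptions.get(default_language)


descriptions = {
    "en": "An order identifier.",
    "fr": "Un identifiant de commande.",
}
print(pick_description(descriptions, ["de", "fr"], "en"))
print(pick_description(descriptions, ["de"], "en"))
```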
Usually, a default language is set on the schema root element with
Schema designers may find it convenient to insert descriptions in a single master language only. Translations could be added in separate &ActiveSchema; documents (one per language).
Additionally, the
Users are responsible for their models, and should not deploy them before testing them seriously.
For example, the following model will always raise an error when involved,
because the first sequence element will consume all
Of course, some schemata may not present such an obvious inconsistency.
A non-deterministic content model is a grammar-based content model where the schema processor may have more than one possible choice for a given candidate.
There is no non-deterministic content model in an &ActiveSchema;, because the basic processes don't allow such a case to happen. The major rule in &ActiveSchema;ta is that a candidate material matches or doesn't match a primitive model, where its material is read sequentially.
Thus, there is always a way of writing an arbitrarily complex content model in &ActiveSchema;, without causing a schema inconsistency.
For example, the following familiar pattern is unambiguous but not deterministic, and can't be rewritten in a deterministic form :
(odd, even)*, odd?
A DTD containing this declaration would be rejected. By contrast, a valid &ActiveSchema; may be written to express the same content model :
This step is refactored as long as there is a candidate element that is alternatively
The following valid &ActiveSchema; may also be written to express the same content model :
As long as an element is matched, it is followed by an optional element,
alternatively
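The principle of reading candidates sequentially, each one either matching or not matching what is expected at that point, can be illustrated with a short Python sketch of the (odd, even)*, odd? pattern. It shows the idea only, not the &ActiveSchema; processing model ; the function name is hypothetical.

```python
def matches_odd_even(names):
    """Check a list of element names against (odd, even)*, odd?.

    Candidates are read one at a time; each one either matches the name
    expected at that position or makes the whole content fail.
    """
    expected = "odd"
    for name in names:
        if name != expected:
            return False
        # After a match, the expected name alternates.
        expected = "even" if expected == "odd" else "odd"
    return True


print(matches_odd_even(["odd", "even", "odd"]))
print(matches_odd_even(["odd", "odd"]))
```

No look-ahead is needed : the decision for each candidate depends only on what has been matched before it.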
A short set of instructions is defined to invoke a schema.
These tags have to be used inside an
As explained in "integration with &ActiveCatalog;", the best way to invoke schemata is to register them in an &ActiveCatalog; and let the engine do the job. However, although this approach is efficient with schemata targeting a namespace URI, it can't be used as is with a schema that has no target namespace URI ; there are 3 ways to deal with such a schema :
Several storage units (files) may be used to build a schema. Furthermore, a single XML document may be validated by several schemata, for example when several namespaces are used in the instance.
A schema client handler that expects a specific schema, for example to perform a validation on an element, launches a schema request.
Once a schema request is launched, the schema client handler must resolve and hold all the storage units (files) that compose the schema ; the schema client handler must then use the schemata it holds for subsequent schema requests.
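The "resolve once, then reuse" behaviour can be sketched in Python as follows. Everything here is hypothetical : the class, the `resolve` callable standing in for a catalog lookup, the file names, and the choice of keying the held schemata by namespace URI.

```python
class SchemaClientHandler:
    """Hypothetical handler that holds resolved storage units for reuse."""

    def __init__(self, resolve):
        self._resolve = resolve  # callable: namespace URI -> storage units
        self._held = {}          # namespace URI -> ordered list of units

    def request(self, namespace_uri):
        """Resolve the schema on the first request, then reuse what is held."""
        if namespace_uri not in self._held:
            # Keep the order of retrieval, since it is the processing order.
            self._held[namespace_uri] = list(self._resolve(namespace_uri))
        return self._held[namespace_uri]


calls = []


def fake_resolve(namespace_uri):
    """Stand-in for a catalog lookup; records each call for the demonstration."""
    calls.append(namespace_uri)
    return ["order-patch.asl", "order.dtd"]


handler = SchemaClientHandler(fake_resolve)
first = handler.request("http://www.acme.com/order")
second = handler.request("http://www.acme.com/order")
print(first is second, len(calls))
```

The second request does not trigger a new resolution : the handler answers with the storage units it already holds.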
The schemata are processed in the order in which they have been retrieved ;
it is particularly important when using the
&ActiveSheet;s may be validated by the &ActiveTags; engine while unmarshalling or before unmarshalling. While unmarshalling, only active tags are checked.
Errors are categorized in the following types :
Schema client handlers are free to process errors as they want. Errors are just reported information denoting that the engine has noticed unexpected content inside an XML document, with regard to the schemata that the engine has used.
When validating, a report that holds the errors found is created.
An application that performs validations may use such reports to produce an XML output for specific processing, for example by transforming it into HTML for an end user, or into text for logging. For this purpose, the report provided is a cross operable object that contains information about the errors :
Implementations should provide a high-level structured error report and an XSLT stylesheet to display it in a user-friendly fashion.
&ASL; : &Asl;
&ASL; namespace URI : &asl;
Usual prefix : asl
Some features listed here are not used inside a schema,
but may appear in other XML documents or
Root element for an &ActiveSchema;.
The schema handles all definitions found at the top level.
An object of the type
The
The
The
A global attribute definition can be referenced from any other schema. A local attribute definition can be referenced only from the schema that defines it.
Defines an item when building a typed data. An item may have a name or may be unnamed.
Opens a context and runs its subactions; builds an item with the data found in the context; feeds the upper context with an item.
Sets the boundaries of a partial content model. A step may contain substeps. It is a convenient container for primitive content models and other steps. A step is always unstable.
Opens a context and runs its subactions; invokes the schema client handler with the list of matchers found in the context.
A container step that denotes that the current content model must be left temporarily. The inner models are applied to the next candidates only if the host step has matched. When it ends, the host content model goes on (default behaviour). An interim step is always unstable.
Opens a context and runs its subactions; invokes the schema client handler with the list of matchers found in the context.
Defines a sequence of elements and/or text content. A sequence is always stable.
Opens a context and runs its subactions; invokes the schema client handler with a sequence of matchers found in the context.
Defines a choice of elements and/or text content.
Opens a context and runs its subactions; invokes the schema client handler with a choice of matchers found in the context.
Defines a list of elements and/or text content to select.
Opens a context and runs its subactions; invokes the schema client handler with a selection of matchers found in the context.
Draws up a list of exceptions.
Opens a context and runs its subactions; the matchers found in the context are the exceptions for a host matcher.
Defines a block. This element is very useful when the same definitions have to be reused several times.
Simply runs its subactions.
Uses an identifiable element or its content.
Simply runs the action or the subactions referenced.
Checks an assertion. An assertion evaluated to false denotes that the model that uses it fails.
If the assertion is not expressed with the
Defines a text matcher. A text matcher is used to match attribute values and text content. A simple text matcher is defined with one of the following attributes:
A complex text matcher refers to a type either with the
Opens a context and runs its subactions.
Feeds the upper context with a text matcher.
Defines a data type.
The
When the type to define is based on another type (
A type is defined with steps that draw up lists of text matchers and other type matchers.
A type definition may be registered to the schema either by name or by ID:
A type matcher is invoked when a text candidate must be checked. For this purpose, it opens a context and runs its subactions. The context is used to build the typed data. If the text candidate matched the type, the typed data feeds the upper context.
Errors raised by a type may be ignored by the schema client handler : when a type matcher fails to match text data, another candidate type matcher may be used ; an error is raised only when a candidate matcher is expected and none has matched.
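The behaviour described here, where the failure of one type matcher is not an error as long as another candidate matches, can be sketched in Python. The function, the `(name, parser)` pairs and the use of `int` and `float` as stand-in type matchers are assumptions made for the illustration.

```python
def match_text(text, matchers):
    """Try each candidate type matcher in turn; fail only if none matched.

    matchers is a list of (name, parser) pairs, where a parser raises
    ValueError when the text does not match its type.
    """
    failures = []
    for name, parse in matchers:
        try:
            return name, parse(text)
        except ValueError as exc:
            # Not an error yet: another candidate matcher may still match.
            failures.append(f"{name}: {exc}")
    details = "; ".join(failures)
    raise ValueError(f"no type matched {text!r} ({details})")


matchers = [("integer", int), ("decimal", float)]
print(match_text("42", matchers))
print(match_text("4.2", matchers))
```

The individual failures are collected rather than reported, and surface only in the final error raised when every candidate has been tried.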
Indicates to the schema client handler to apply the definitions of the next schema. The schemata list is built from catalogs and maintained by the schema client handler.
A fallback matcher is used when unexpected material (element or attribute) has been encountered when applying definitions (
The pattern to match.
If an unexpected material doesn't match any
A description.
An alternate message definition.
This element is designed for multilingual support.
When a
The version of the &ASL; module to use. This attribute should be encountered before any &ASL; element, but it takes precedence on the element inside which it is hosted.
This function returns the candidate material to check or to insert, according to the mode in which the schema is used. However, if it is an element, it has no name because, in insert mode, the host application may use the schema to guess the list of elements available in the insert context.
In fact, the candidate material can be used only for positional testing.
This function returns the element designated by the
This function returns the document hosting the element or attribute that is being processed by the schema.
Returns the compacted form of a string, that is to say a string with no trailing spaces and for which contiguous spaces are replaced with a single space.
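For illustration, a Python equivalent could look like the sketch below. It goes slightly beyond the wording above, and these points are assumptions : leading spaces are removed as well as trailing ones, and any whitespace character (tab, line feed) is treated like a space.

```python
def compact(text):
    """Return text with surrounding whitespace removed and runs collapsed."""
    # str.split() with no argument drops leading and trailing whitespace
    # and splits on any run of whitespace characters.
    return " ".join(text.split())


print(repr(compact("  ship   to \n Paris  ")))
```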
The following items are not designed to be used inside an &ActiveSchema;,
but inside an
Parses a schema.
Instruction used to validate a node against a schema.
If no schema is specified (with the
The node to validate.
The name of the report property to produce.
A property with this name is added to the data set.
A report is a
Instruction used to identify the material available on the given context.
This element does nothing.
It is an element container for
When created, this element contains all namespace declarations expected in the path,
hosted either in the
See "path element" and
Builds a structured message with the object given. The message may be built with the locale language or with a specific language.
If no object is specifically provided, the object given is the current object when building the structured message.
Errors are created when performing a validation on an XML instance.
The
This element contains the namespace declarations involved in the path.
The path is exposed as an expression to make it easier for an application to use.
This list is not exhaustive. Additional implementations are welcome.