CML/Golem dictionary syntax
===========================

.. highlight:: xml

Here, we'll go through one entry from a very small CML/Golem dictionary, which you can find in your Golem distribution in ``GOLEM/docs/AnnotatedDictionaryEntries.xml``.

::

You specify the three attributes above here - a URL for the dictionary (which you can think of as a machine-readable name; pick something unique at a domain you control), the dictionaryPrefix (a short name for the dictionary), and the dictionary title (the title that people using the dictionary will see). As mentioned earlier, the dictionary generator will add the namespaces for you, but if you're writing your own dictionary for whatever reason, just copy them across as they are here.

::

The ``id`` must be unique within the dictionary. The term is designed for documentation, so it should be as pithy as you can make it.

```` and ````
**************************************

Next, we deal with documenting what this dictionary entry means.

::

    The exchange-correlation functional used.

The ``definition`` is a one-sentence description of the concept this dictionary entry defines.

Next, ```` - note that ``h:`` here was defined, above, to be bound to the XHTML namespace:

::

    The exchange-correlation functional used in a given simulation.
    Available values for this are:
    LDA, the Local Density Approximation
    PW91, Perdew and Wang's 1991 formulation
    PBE, Perdew, Burke and Ernzerhof's original GGA functional
    RPBE, Hammer et al's revised PBE functional

```` contains a longer description - ideally, documentation for the term. As mentioned above, this takes the form of XHTML.

::

Optionally, ```` - which contains CML ```` - can be used to document the provenance of a dictionary entry (for instance, who wrote it).

````
*****************

Next, we need to tell our programs how to find the data this entry is describing; we do that by giving an XPath expression pointing to where it can be found.

::

    ./cml:parameter[@dictRef="castep:xcFunctional"]

````
********************

Once we have found the data, we need to know how to read it. Here, the data that we're trying to read looks something like

::

    PBE

Here, we're trying to read a scalar (a number or string). Golem templates use XSLT to convert pieces of CML, like this one, into JSON objects of the form ``[value, "u:units"]``, where ``u`` is the namespace in which the units are declared.

The Golem dictionary generation tools "know about" (i.e., have templates for) the following tags, assuming they are used in the same way as FoX uses them:

* ````
* ````
* ````
* ````
* ````
* ````
* ````

So if the data you are reading is in one of these tags, then the following will let you read it:

::

Here, the value of ``call`` will, typically, correspond to the name of the tag which contains the actual data: so here it's "scalar". If additional information (say, extra properties on each ``atom`` in an ``atomArray``) is added, the read will still succeed, but the extra information *will* be ignored; either you will need to modify the template for that type, or arrange to read out that extra information in another way (such as using ``etree`` methods, as in the examples later in this documentation).

If your data is contained in some other tag, and you wish to read it directly using Golem, then you need to:

.. highlight:: xslt

* Write a dictionary entry with the tagname as the id (i.e. ````);
* Write an XSLT stylesheet which produces a JSON document of the form ``[ newEntry_value, "units:newEntry_units" ]`` when run over the data. For example, lattice vectors are represented by markup of the form::

      a b c d e f g h i

* which we associate with a stylesheet ::

      [[ [ , ], ] ], "A**-1"]

* Put this in your dictionary entry.

Here's how you do that:

::

* Add ```` to the dictionary entries which will use this new template to read their data.

.. highlight:: python

::

The ``role`` of templates determines how they are used: all templates used for reading data with Golem should have ``role="getvalue"`` and ``binding="pygolem_serialization"``. This is the only special case for ``role``, but you can add other templates with different roles, too. These get mapped onto functions if you're using the Golem library: for instance, if you've got a dictionary ``d`` with namespace ``n``, then (in a Python interactive shell)::

    >>> xcFunctional_entry = d["{%s}xcFunctional" % n]
    >>> print(str(xcFunctional_entry.arb_to_input("RPBE")))

.. highlight:: xml

::

will print out the value of this template when passed "RPBE" as an argument.

::

    XC_FUNCTIONAL

```` et al
*************************

In this section, we describe the terms which can be used to draw relations between concepts in dictionaries. Those relationships can then be used by the Golem library to enable equivalent concepts to be looked up at once, rather than having to check them all separately.

First, ````; if entry A ``implements`` entry B, then if a piece of CML satisfies the definition of term A, it also satisfies the definition of term B. In other words, term A is an *implementation* (analogous to a subclass) of term B.

::

    convertibleToInput
    value
    absolute

In the final case here, the term ``absolute`` resides in a different dictionary with the given namespace; to be able to make use of this in your code, your program will need to load both dictionaries.

If two concepts are synonymous, then any instance of one concept is equivalent to an instance of the other, although they may be serialized differently.
To implement that in a dictionary, add the following to ``concept1``'s dictionary entry:

::

    concept2

Or, if ``concept2`` resides in a different dictionary with a different namespace::

    concept2

although, as above, you will not be able to make use of this relationship without explicitly loading the second dictionary.

Synonyms are *symmetric*, unlike ``implements``; stating that ``a`` implements ``b`` does not imply that ``b`` implements ``a``, whereas stating that ``a`` is synonymous with ``b`` *does* imply that ``b`` is synonymous with ``a``. It is sufficient to specify the synonym on either one of these concepts; it doesn't need to be given on both.

In both cases, when searching for a concept using the ``findin`` method (described in the following section on how to use the Golem libraries), all synonyms and implementations of the current concept are found.

```` is used to denote any other relationship between concepts; it does not imply any particular relationship, but alerts the user that it may be worth looking at the other entry. This is mostly of use in dictionary-browsing applications and similar tools, where it can be used (for instance) to implement a thesaurus.

We may also want to represent certain aspects of document structure in the dictionary. This is particularly useful when you want to evaluate "everything in a section of the document"; for example, "every input parameter". We can denote these relationships using ````. A ```` is found as a childNode (in the XML sense) of the CML representation of another node in the dictionary:

::

    input

So, here, ``xcFunctional`` is found in the child nodes of the CML representation of the dictionary term ``input``.

````
***************************

The range, and type, of data one expects for a given concept can be given with ````.

::

    LDA
    PW91
    PBE
    RPBE

The ``type`` of data may be ``int``, ``float``, ``string``, or ``matrix``.
``int`` values (and, analogously, ``float`` values) are specified as follows:

::

    2

Matrices are a little more complex: you can specify both the dimension of the matrix and the type of the data therein. For example:

::

    0 10
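
To make the idea of possible values concrete, here is a minimal Python sketch of the kind of check such a declaration allows. It is an illustration only, not part of the Golem library: the names ``ALLOWED_XC_FUNCTIONALS``, ``PYTHON_TYPES`` and ``is_possible_value`` are hypothetical, and matrices are left out for brevity.

```python
# Hypothetical sketch: none of these names come from the Golem library.

# The enumerated values listed for xcFunctional in this section.
ALLOWED_XC_FUNCTIONALS = ("LDA", "PW91", "PBE", "RPBE")

# Map the scalar type names used in this section to Python types.
# The "matrix" type is deliberately not handled here.
PYTHON_TYPES = {"int": int, "float": float, "string": str}


def is_possible_value(value, type_name, allowed=None):
    """Return True if value has the named type and, when an enumeration
    of allowed values is given, is one of them."""
    expected = PYTHON_TYPES[type_name]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return False
    return allowed is None or value in allowed


print(is_possible_value("RPBE", "string", ALLOWED_XC_FUNCTIONALS))
print(is_possible_value("B3LYP", "string", ALLOWED_XC_FUNCTIONALS))
print(is_possible_value(2, "int"))
```

The first and third calls print ``True``; the second prints ``False``, because ``B3LYP`` is not among the values enumerated for ``xcFunctional``.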