Using the Golem library
=======================

In this chapter, we move from considering the Golem markup language to the
Golem library. If you aren't familiar with the Python programming language,
it'll be helpful to have a look at the Python tutorial
(http://www.python.org/doc/tut). Any code samples starting with: ::

    >>>

are example interactive sessions; you can start a Python interactive shell by
simply running ``python``.

Importing Golem and loading a dictionary
----------------------------------------

If you have installed Golem using setup.py/setuptools/``easy_install``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Golem will be on your PYTHONPATH, so you can simply use ``import golem`` in
your script.

If you have installed Golem using ``make`` (Unix/MacOS X only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. highlight:: sh

If you have installed Golem using ``make``, you will need to set your
PYTHONPATH. Either set the PYTHONPATH environment variable: ::

    $ export PYTHONPATH=/usr/local/share/pygolem:$PYTHONPATH

or pass it on the command line when you invoke ``python``: ::

    $ PYTHONPATH=/usr/local/share/pygolem:$PYTHONPATH python

.. highlight:: python

Loading a dictionary and looking up terms
-----------------------------------------

Loading a dictionary
^^^^^^^^^^^^^^^^^^^^

Running Python interactively: ::

    >>> import golem
    >>> d = golem.Dictionary("/PATH/TO/DICTIONARY")

``d`` is now an instance of ``golem.Dictionary``, which inherits from (the
Python type) ``dict``. Thus, ``d.keys()``, for instance, will list all the
terms in the dictionary.

Looking up terms
^^^^^^^^^^^^^^^^

Terms in the dictionary are indexed by keys in *Clark form*:
``{namespace}id``, where ``namespace`` is the dictionary's namespace URI and
``id`` is the ID of the dictionary entry you're looking up.
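
As a minimal sketch of how Clark-form keys are put together and taken apart,
the helpers below may be useful. They are purely illustrative:
``clark_key`` and ``split_clark`` are hypothetical names, not part of Golem,
and a plain ``dict`` stands in for a loaded ``golem.Dictionary`` so that the
example runs without Golem installed.

```python
# Illustrative helpers only: these names are hypothetical, not part of Golem.

def clark_key(namespace, entry_id):
    """Build a Clark-form key, "{namespace}id", from its two parts."""
    return "{%s}%s" % (namespace, entry_id)


def split_clark(key):
    """Split a Clark-form key back into a (namespace, id) pair."""
    if not key.startswith("{") or "}" not in key:
        raise ValueError("not a Clark-form key: %r" % (key,))
    namespace, entry_id = key[1:].split("}", 1)
    return namespace, entry_id


CASTEP_NS = "http://www.castep.org/cml/dictionary/"

# A plain dict standing in for a loaded dictionary (an assumption of this
# sketch); the value is a placeholder, not a real dictionary entry.
stand_in = {clark_key(CASTEP_NS, "cutoff"): "placeholder entry"}

key = clark_key(CASTEP_NS, "cutoff")
print(key)
print(split_clark(key))
print(key in stand_in)
```

Building keys from a namespace constant in this way avoids retyping the full
namespace URI for every lookup.
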
Thus, an entry ``cutoff`` in a dictionary with namespace
``http://www.castep.org/cml/dictionary/`` can be looked up as follows: ::

    >>> cutoff = d["{http://www.castep.org/cml/dictionary/}cutoff"]
    >>> cutoff

If there were no ``cutoff`` entry in this dictionary, a ``KeyError`` exception
would be raised.

We can then use this to find instances of this concept in a CML file: ::

    >>> cutoffs = cutoff.findin("LiH-geomopt1.cml")
    >>> cutoffs
    []

``findin`` always returns a list: if the concept is present, it returns all
the instances of the concept (or its implementations or synonyms) in document
order; if the concept is not found, it returns an empty list.

One can then extract the value of these instances, as follows: ::

    >>> val = cutoffs[0].getvalue()
    >>> val
    330.0

and once you've obtained a value, check what its units are, and which term it
is an example of: ::

    >>> val.entry
    >>> val.unit
    u'castepunits:eV'
    >>> val.entry.term
    'Basis set cutoff energy'

In the next section, we show some examples of how to use Golem for real-world
data extraction from CML files, taken from the reporting component of
MaterialsGrid.