.. Command line tool documentation

The Command Interpreter Tool
****************************

This chapter describes the part of ``VDAT`` responsible for the reduction of
the data.

Introduction
============

The main scope of the ``VDAT`` GUI is to allow users to select, visualize and
reduce ``VIRUS`` data.

``VDAT`` relies mostly on ``cure`` to execute the reduction steps. ``cure`` is
a C++ library that provides a number of executables that operate on single fits
files or on groups of them. For each of the reduction steps, ``VDAT`` must
collect the input files and command line options according to the directories
and IFUs selected by the user and run the appropriate ``cure`` tool.

Although ``cure`` is the main tool to use, some of the steps of the reduction
are not implemented there. We also want to allow users to execute generic
commands without any prior knowledge of the signature and name of the files.

We have met these requirements by designing a command line tool based on two
building blocks:

1) an interpreter that parses the command string, containing placeholders, and
   executes the command in a loop, replacing the placeholders with the correct
   values; we use the standard `python string Template `_ to define
   placeholders;
2) one or more `yaml `_ configuration files that instruct the interpreter on
   how to expand the placeholders for any provided command

.. _interpreter:

The interpreter
===============

The public interface of the interpreter is defined by the constructor of the
class :class:`~vdat.command_interpreter.CommandInterpreter` and its method
:meth:`~vdat.command_interpreter.CommandInterpreter.run`.

Constructor
-----------

The constructor has the following signature

.. autoclass:: vdat.command_interpreter.CommandInterpreter
    :noindex:

1) ``command`` is a string with the command to execute.
   E.g.::

        subtractfits $args -o $biassec $fits

2) ``command_config``: the relevant part of the parsed ``yaml`` configuration
   file containing the instructions on how to expand placeholders like
   ``args``, ``biassec`` and ``fits`` while running the command
   ``subtractfits``. The part of the configuration file necessary to run the
   above command is

   .. code-block:: yaml

        subtractfits:
            # mandatory keys
            mandatory: [fits, ]
            # primary key: the interpreter collects files according to
            # the instructions in the `fits` key, loops over them,
            # replaces all the placeholders and executes the command
            primary: fits
            # looks for all the files matching the pattern in the `target_dir`
            fits: '[0-9]*.fits'
            # Get the `BIASSEC` value from the header of every file
            # and from it extract the part within square brackets
            biassec:
                type: header
                keyword: BIASSEC
                extract:
                    - \[(.*)\]
                    - \1
            args: '-s -a -k 2.8 -t -z'

   Both the GUI and the command line interface inject the following keys into
   the ``command_config``:

   * ``target_dir``: the directory selected by the user; in the above
     example, the ``fits`` files are searched for in this directory
   * ``cal_dir``: the reference calibration directory
   * ``zro_dir``: the reference bias directory

   If no ``cal`` or ``zro`` directory has been explicitly selected in the GUI,
   the default ones are added.

   .. warning::
        If any of these entries is already in the configuration file, it will
        be overwritten

3) ``selected``: list of selected items or ``None``, for selecting all. It
   tells the interpreter which of the ``primary`` elements must be run. E.g.
   the ``VDAT`` GUI passes as ``selected`` the list of IFUs selected by the
   user. The instructions on how to extract from the files the information to
   match against ``selected`` while running the command are defined in the
   :ref:`command configuration file <command_conf>`.

   .. note::
        ``VDAT`` passes the IFU head mount plate IDs (ihmpid) to the command
        interpreter. This ID is a three-digit number stored in the file
        headers under the IFUSLOT key.
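
As an illustration of how the placeholders in ``command`` relate to the entries
of ``command_config``, the following sketch resolves the example command by
hand with :class:`string.Template` and :func:`re.sub`. It is not the
interpreter's implementation: the file name and the header value are made up
for the example, whereas the interpreter collects them from ``target_dir`` and
from the fits headers.

.. code-block:: python

    import re
    from string import Template

    command = "subtractfits $args -o $biassec $fits"

    # Hypothetical inputs: a file matching the ``fits`` pattern and the
    # value of its ``BIASSEC`` header keyword.
    fits_file = "20130101T000000_001LL_zro.fits"
    biassec_header = "[1:32,1:1032]"

    # The ``extract`` entry of the ``biassec`` keyword: [pattern, replacement]
    extract = [r"\[(.*)\]", r"\1"]
    biassec = re.sub(extract[0], extract[1], biassec_header)

    # Replace every placeholder of the command with its expanded value
    resolved = Template(command).substitute(
        args="-s -a -k 2.8 -t -z", biassec=biassec, fits=fits_file)
    print(resolved)

With these inputs, ``biassec`` is the header value stripped of the square
brackets and ``resolved`` is the command line that would be executed for this
one file.
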

----

In the constructor the following steps are performed:

1) the configuration object is copied and saved in local variables: this
   allows multiple commands to be enqueued;
2) validations:

   a) the command executable, e.g. ``subtractfits``, is searched for in the
      path
   b) check that all the mandatory fields are present in the command
   c) check that all the required keywords are present in the configuration
   d) check that all the required keywords are of known type
   e) map all the types to the functions implementing them

The ``run`` method
------------------

Invoking

.. automethod:: vdat.command_interpreter.CommandInterpreter.run
    :noindex:

will:

1) collect all the ``primary`` files
2) filter them according to the list of selected items
3) loop over the ``primary`` files
4) check whether the step must be executed or not
5) for each step in the loop replace the placeholders in the input command
   according to the instructions from the configuration
6) execute the command
7) report execution progress
8) collect and send out execution results

.. _command_conf:

The configuration file
======================

To allow for flexibility and extensibility, the instructions on how to expand
keywords come from one or more configuration files, written using the ``yaml``
standard.

When validating the ``command``, the keywords are extracted and searched for in
the configuration. The value of a keyword can be either a string or a
dictionary. If it is a string, like ``'-a -b'``, it is converted into a keyword
of type ``plain``: ``{'type': 'plain', 'value': '-a -b'}``. If it is a
dictionary, it must contain a key ``type``, whose value defines the type of the
keyword.

.. _special_keys:

Special keywords
----------------

These keywords are understood and used by the interpreter, but should not be
used as variables to expand.

``is_alias_of``
^^^^^^^^^^^^^^^

If it exists, its value is the real name of the executable. This makes it
possible to create various commands using the same underlying executable. If
e.g.
the command is::

    do_something $args -o $ofile $ifiles

and the configuration file contains

.. code-block:: yaml

    is_alias_of: an_executable
    args: "-a -b"
    ofile: outfile
    ifiles: file[1-9].txt
    primary: ifiles

then the interpreter will loop through all the files matching the ``ifiles``
pattern in ``target_dir``. For the first file, it will execute::

    an_executable -a -b -o outfile file1.txt

``mandatory``
^^^^^^^^^^^^^

List of mandatory fields; field names defined under ``mandatory`` must exist in
the provided command; if the key is not found, or is empty, no check is done

.. code-block:: yaml

    mandatory: [field1, field2]
    # or equivalently
    mandatory:
        - field1
        - field2

``primary``
^^^^^^^^^^^

Name of the keyword to use as primary. A primary keyword has a special status:
files are collected from the ``target_dir`` according to the type of the
underlying keyword, then they are looped over and for each step the command
string is created and executed. If the value of any other keyword needs to be
built at run time, it will use the ``primary`` files to do it.

``VDAT`` is shipped with a few :ref:`primary types <primary_types>`.

``filter_selected``
^^^^^^^^^^^^^^^^^^^

Tells the interpreter how to filter the list of primary files. If this option
is not found in the configuration or the ``selected`` argument of
:class:`~vdat.command_interpreter.core.CommandInterpreter` is ``None``, no
filtering is performed. Otherwise, for each element in the primary list, the
interpreter:

* uses the instructions from the value of ``filter_selected`` to extract a
  string
* checks whether the string is in ``selected``.

The value of ``filter_selected`` can be any of the
:ref:`keyword types <keyword_types>` described below. With the following
settings:

.. code-block:: yaml

    # Use the value of the header keyword ``IFUSLOT`` to decide whether to
    # keep the primary field or not
    filter_selected:
        type: header
        keyword: IFUSLOT

the content of the fits header keyword ``IFUSLOT`` is extracted and compared
with the list provided with the ``selected`` option in
:class:`~vdat.command_interpreter.core.CommandInterpreter`

``execute``
^^^^^^^^^^^

For each iteration of the ``primary``, tells the interpreter whether to run the
command or not. If the option is not found, no filtering will be performed.
``VDAT`` is shipped with a few :ref:`execute types <execute_types>`.

The following configuration:

.. code-block:: yaml

    execute:
        type: new_file
        sub_type: format
        value: masterbias_{ica}.fits
        keys:
            ica:
                type: regex
                match: .*\d*?T\d*?_(\d{3}[LR][LU])_.*\.fits
                replace: \1

creates the ``value`` by extracting the ``ica`` key from the primary file name
and returns ``False`` if the file already exists. If the handling of the
keyword raises an exception, it is logged and the command is executed.

.. _primary_types:

Built-in primary keyword types
------------------------------

``plain``
^^^^^^^^^

It looks for all the files matching the given pattern in the target directory.
If the value of a keyword is a string, it is interpreted as of ``plain`` type.
These three definitions are equivalent:

.. code-block:: yaml

    keyword: 20*.fits
    ---
    keyword:
        type: plain
        value: 20*.fits
    ---
    keyword: {type: plain, value: 20*.fits}

``loop``
^^^^^^^^

1) collects the ``keys``
2) cycles through all the possible combinations of the keys
3) for each combination replaces the corresponding entries in ``value`` using
   the standard python `format string syntax `_
4) looks for all the files matching the resulting strings
5) if any file is found, constructs a string with space separated file names
   and yields it.

The value of ``keys`` is a map between the names of the keys, e.g. ``ifu``,
and the values that they can have.
Their value can be either a list or three comma separated numbers:
``start, stop, step``. The latter case is converted into a list of numbers from
``start`` to ``stop`` (excluded) in increments of ``step``.

The following configuration:

.. code-block:: yaml

    keyword:
        type: loop
        value: 's[0-9]*{ifu:03d}{channel}{amp}_*.fits'
        keys:  # dictionary of keys to expand in ``value``
            ifu: 1, 100, 1  # start, stop, step values of a slice
            channel: [L, R]  # a list of possible values
            amp:  # alternative syntax for the list
                - L
                - U

cycles through all the possible combinations of the three lists:
``[1, 2, .., 99]``, ``['L', 'R']`` and ``['L', 'U']``. For the first
combination we get: ``ifu``: 1, ``channel``: L, ``amp``: L and ``value``
becomes ``s[0-9]*001LL_*.fits``. Then all the files matching this pattern are
collected.

``groupby``
^^^^^^^^^^^

1) collects all the files matching ``value`` and loops through them
2) for each of the files replaces ``match`` with all the values in ``replace``
   using the `python regex syntax `_

The following configuration:

.. code-block:: yaml

    keyword:
        type: groupby
        value: 'p[0-9][LR]L_*.fits'
        match: (.*p\d[LR])L(_.*\.fits)
        replace:
            - \1U\2

cycles through all the files matching ``value`` in the ``target_dir``, e.g.
"p2LL_sci.fits", and for each of them creates a new file name by replacing the
last "L" with "U", e.g. "p2LU_sci.fits". The two files are then returned.

To create multiple file names out of the first one, it is enough to provide
other entries to ``replace``. E.g.:

.. code-block:: yaml

    replace: [\1U\2, \1A\2, \2_\1]

will create three new file names: "p2LU_sci.fits", "p2LA_sci.fits" and
"_sci.fits_p2L"

.. _keyword_types:

Built-in keyword types
----------------------

``plain``
^^^^^^^^^

A static string. These three definitions are equivalent:

.. code-block:: yaml

    keyword: '-a -b --long option'
    ---
    keyword:
        type: plain
        value: '-a -b --long option'
    ---
    keyword: {type: plain, value: '-a -b --long option'}

``header``
^^^^^^^^^^

Extracts and manipulates a fits header keyword from the primary files. If the
primary is a space-separated list of file names, it uses the first one. If
``extract`` is present, it uses :func:`re.sub` to replace ``extract[0]`` in the
header value with ``extract[1]``.

Assuming that the primary files have a header keyword
``BIASSEC = [1:32,1:1032]``

.. code-block:: yaml

    keyword:
        type: header
        value: BIASSEC

will extract ``[1:32,1:1032]``, while

.. code-block:: yaml

    keyword:
        type: header
        value: BIASSEC
        extract:
            - \[(.*)\]
            - \1

will extract ``1:32,1:1032``.

``format``
^^^^^^^^^^

Creates a new string `formatting `_ ``value`` using the ``keys``. The keys can
be of any type defined in this section, except ``format``, to avoid circular
recursion.

Given a fits file called ``file_001_LL.fits``, with a header keyword
``DATE-OBS = 2013-01-01``, the following configuration instructs the
interpreter to extract the ``id`` key, a three-digit number, from the file name
and the ``date`` key from the ``DATE-OBS`` fits header value. The resulting
value is the string ``file_001_2013-01-01.fits``. If the types for the keys do
not exist, a ``CIKeywordTypeError`` will be raised at run time. If one of the
keys has a string as value, it will be interpreted as of type ``plain``.

.. code-block:: yaml

    keyword:
        type: format
        value: file_{id}_{date}.fits
        keys:
            id:
                type: regex
                match: .*_(\d{3}).*\.fits
                replace: \1
            date:
                type: header
                value: DATE-OBS

``regex``
^^^^^^^^^

Returns a string obtained from the primary by replacing ``match`` with
``replace``. It uses :func:`re.sub` to do the substitution. If e.g. the primary
is called ``file_001_LL.fits``, the following entry returns ``L001``

.. code-block:: yaml

    keyword:
        type: regex
        match: .*_(\d{3})_([LR]).*\.fits
        replace: \2\1

.. _execute_types:

Built-in execute types
----------------------

``new_file``
^^^^^^^^^^^^

For each of the primary entries, it constructs a string using the keyword type
defined by ``subtype``. If that string corresponds to something existing in the
file system, it returns ``False``. Besides ``type``, ``subtype`` is the only
mandatory keyword and its value must be one of the available keyword types. All
the relevant keywords for that type must of course exist.

.. _plugin_types:

Add new types
=============

For every type, be it primary or not, there is a corresponding function that
implements how to handle it. All the types are implemented as plugins,
`discovered `_ and `dynamically loaded `_ at run time. The command interpreter
looks for three entry points:

* ``vdat.cit.primary``: for the definition of primary types
* ``vdat.cit.keyword``: for the definition of other types
* ``vdat.cit.execute``: for the definition of types to decide whether to
  execute the command or not

Each entry point is defined as a string like::

    type = package.module:func

where ``type`` is the name of the type and ``func`` is the function handling
the keyword of ``type``; ``func`` is implemented in the ``module`` module of
the package ``package``.

The functions implementing primary, secondary and execute keywords have the
following signatures:

.. autofunction:: vdat.command_interpreter.types.primary_template
    :noindex:

.. autofunction:: vdat.command_interpreter.types.keyword_template
    :noindex:

.. autofunction:: vdat.command_interpreter.types.execute_template
    :noindex:

Communication
=============

The command interpreter communicates with the rest of the world through
different channels.

* Upon errors directly handled by the interpreter, one of the exceptions
  defined in :mod:`vdat.command_interpreter.exceptions` is raised. Please check
  the documentation of :class:`~vdat.command_interpreter.CommandInterpreter`
  for more details.
* During normal execution of the command, the resolved command string, the
  standard output, the standard error and any exception raised while executing
  the code are logged to a logger with the name of the executable. In ``VDAT``,
  these loggers are set to write to files located in the directory defined in
  the ``VDAT`` configuration file; the names of those files are the executable
  name with a ``.log`` extension. These loggers are set in the main ``VDAT``
  code, not in the command interpreter sub-package.

* .. automodule:: vdat.command_interpreter.relay
      :noindex:

  The ``emit`` method of the relays can be replaced by other applications using

  .. autofunction:: vdat.command_interpreter.override_emit
      :noindex:
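
The per-executable logging described in the second item above can be
reproduced with the standard :mod:`logging` module. The sketch below is only an
assumption about how such a logger could be configured: the function name
``make_executable_logger`` and its ``log_dir`` argument are hypothetical and
are not part of ``VDAT``.

.. code-block:: python

    import logging
    import os


    def make_executable_logger(executable, log_dir):
        """Return a logger named ``executable`` that writes to
        ``<log_dir>/<executable>.log`` (hypothetical helper)."""
        logger = logging.getLogger(executable)
        logger.setLevel(logging.INFO)
        log_file = os.path.abspath(os.path.join(log_dir, executable + ".log"))
        # Do not attach a second handler for the same file on repeated calls
        already_set = any(isinstance(h, logging.FileHandler)
                          and h.baseFilename == log_file
                          for h in logger.handlers)
        if not already_set:
            handler = logging.FileHandler(log_file)
            handler.setFormatter(
                logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
            logger.addHandler(handler)
        return logger

Calling ``make_executable_logger("subtractfits", "/path/to/logs")`` once before
running the interpreter would then send everything logged under the
``subtractfits`` name to ``/path/to/logs/subtractfits.log``.
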