ARC Information System Technical Details

New in version 7.0.

General configuration items for the Information System are described under the [infosys] block in the configuration items section.

This page contains a description of the architecture of the ARC Information System.

Fig. 24 Components and workflow of the ARC Information System

Overview: Purpose of the Information System

The ARC Information System (short name: “infosys”) is a collection of scripts that generates information documents about the status of the A-REX service, the LRMS and the submitted jobs.

These documents can be retrieved, formatted as XML or JSON, using the ARC CE REST interface.

A group of legacy components provides the same information as LDIF documents via an LDAP server.

The information is presented according to two schemas, the default GLUE2 schema and the legacy NorduGRID schema.

Some of the content of the documents is static but there is also a dynamic component to provide minimal statistics and job progress.

It is important to understand the asynchronous nature of the content of these documents: this is NOT a real-time system. Every rendered item contains a validity field so that information consumers (clients) can determine how fresh the presented information is. See section Static and dynamic info for more details.
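As a minimal sketch of how a client might use the validity field, the snippet below assumes each record carries a creation timestamp and a validity period in seconds (GLUE2 entities have CreationTime and Validity attributes of this kind). The function name is_fresh and the sample values are hypothetical and not part of ARC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(creation_time: datetime, validity_seconds: int,
             now: Optional[datetime] = None) -> bool:
    """Return True while `now` falls inside the record's validity window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now <= creation_time + timedelta(seconds=validity_seconds)


# Hypothetical record: created at noon UTC, valid for 600 seconds.
created = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(is_fresh(created, 600, now=created + timedelta(seconds=300)))  # True
print(is_fresh(created, 600, now=created + timedelta(seconds=900)))  # False
```

A client that finds a record past its validity window would typically fetch the document again rather than act on stale data.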

Inputs and Outputs

The main inputs to the information system are:

  • the runtime arc.conf for most of the static data;

  • the content of the control directory where both A-REX and the LRMS scan scripts write information about the status of the jobs;

  • direct queries to the LRMS to try to get the most up-to-date information from it;

  • the output of commands used to collect information about load, storage and other system-related matters.
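To illustrate the last kind of input, here is a small sketch of collecting load and storage figures for one mount point. It uses only the Python standard library and is not how the ARC Perl modules do it; the function name collect_host_info and the returned keys are assumptions made for this example.

```python
import os
import shutil


def collect_host_info(mount_point: str = "/") -> dict:
    """Collect a small snapshot of system load and storage for one mount point."""
    try:
        load1, load5, load15 = os.getloadavg()
    except (AttributeError, OSError):
        # Load average is not available on every platform.
        load1 = load5 = load15 = None
    usage = shutil.disk_usage(mount_point)
    return {
        "load": (load1, load5, load15),
        "mount_point": mount_point,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
    }


info = collect_host_info()
print(info)
```

Each call returns a point-in-time snapshot, which is why such values belong to the dynamic part of the published information.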

Each “run” or “scan” mainly produces three types of files:

  • info.xml, which contains static data about the Computing Element and aggregated dynamic statistics about the status of the LRMS and the jobs. It is generated in the control directory and formatted according to the NorduGRID implementation of the GLUE2 schema.

  • For each job, a file containing information about that job, generated at <controldir>/jobs/jobpath/xml. The output is formatted according to the NorduGRID implementation of the GLUE2 schema for ComputingActivity objects.

  • Optionally, when LDAP information is required, a Bash script ldap_infosys.sh, generated inside /run/arc/infosys. The script contains an LDIF representation of the NorduGRID and GLUE2 schemas.

Description of the components

This section describes the functionality and purpose of the Information System components. It is not meant to describe the code in detail but to give an understanding of the architecture.

The Infoproviders

The infoproviders are scripts or executables that output a formatted document containing information about the status of the ARC computing element. Currently there are two information providers, one of which is generated by the other for historical reasons:

  • CEinfo.pl, a Perl script responsible for parsing the configuration, collecting all information from the different data sources, aggregating it into proper data structures and writing it out in different formats. Its main outputs are the info.xml document and an XML document for each job. It also generates the legacy information provider ldif-provider.sh, described below, when the [infosys/ldap] and [infosys/nordugrid] blocks are present in the configuration.

  • ldif-provider.sh, a Bash script generated by CEinfo.pl. It is a legacy feature kept for backward compatibility with previous ARC versions. It outputs a complete LDIF document representing the cluster information for both the NorduGRID and GLUE2 schemas. GLUE2 information is only rendered if the [infosys/glue2/ldap] block is present in the configuration. For a more detailed description of the LDAP backend see The legacy LDAP subsystem.

The information collectors

The information collectors are a set of Perl modules responsible for collecting and aggregating information of various kinds.

Their purposes are described in the table below.

Table 7 Information collectors

| Module name | Purpose |
|---|---|
| <LRMS>mod.pm | Modules that query the named <LRMS> directly to obtain fresher information about queues/partitions and jobs. The mod suffix identifies the latest version of these modules, which can automatically collect information about GLUE2 ExecutionEnvironments, that is, the hardware of the nodes in each partition of the cluster. |
| <LRMS>.pm | Legacy LRMS modules that cannot return proper hardware information about the nodes. Some LRMS backends are maintained by communities that have no interest in collecting detailed hardware information, so these modules were never developed in that direction. |
| HostInfo.pm and Sysinfo.pm | Collect information about the status of the frontend and the mount points, as well as other details such as hostname, load and available storage. |
| RTEinfo.pm | Collects information about RunTime Environments (see also RunTime Environments in ARC). |
| ARC1ClusterInfo.pm | Aggregates information from all sources into a data structure that can be used to generate GLUE2 documents. |
| ARC0ClusterInfo.pm | Aggregates information from all sources into a data structure that can be used to generate LDIF documents. Used for legacy LDAP information rendering. |

The information renderers

Information renderers are responsible for transforming the data structures generated by the information collectors into documents in various formats.

The modules and their purposes are described in the table below.

Table 8 Information renderers

| Module name | Purpose |
|---|---|
| XmlPrinter.pm | Generic library to render XML documents. |
| GLUE2xmlPrinter.pm | Specialized library to render XML documents according to the GLUE2 schema. |
| LdifPrinter.pm | Generic library to render LDIF documents. |
| NGldifPrinter.pm | Specialized library to render LDIF documents according to the NorduGRID schema. |
| GLUE2ldifPrinter.pm | Specialized library to render LDIF documents according to the GLUE2 schema. |
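The separation between one aggregated data structure and several format-specific renderers can be sketched as follows. This is only an illustration in Python, not the Perl code of the renderer modules; the record, the element and attribute names, and the base DN are all hypothetical and much simpler than real GLUE2 or NorduGRID output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified cluster record.
cluster = {"id": "ce.example.org", "name": "Example CE", "total_jobs": 42}


def render_xml(record: dict) -> str:
    """Render the record as a small XML element (element names are illustrative)."""
    root = ET.Element("ComputingService")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def render_ldif(record: dict, base_dn: str = "o=example") -> str:
    """Render the same record as one LDIF entry (attribute names are illustrative)."""
    lines = [f"dn: cn={record['id']},{base_dn}"]
    lines += [f"{key}: {value}" for key, value in record.items()]
    return "\n".join(lines) + "\n"


print(render_xml(cluster))
print(render_ldif(cluster))
```

Because both renderers read the same record, the XML and LDIF outputs describe the same state; only the serialization differs.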

Helper libraries

The infoproviders make use of intermediate libraries to perform various tasks required for collection, rendering and other support features. They are described in the following table.

Table 9 Libraries of the information system

| Module name | Purpose |
|---|---|
| LogUtils.pm | Logging facility that keeps logging consistent across the various infoprovider modules. |
| IniParser.pm | Library to parse ARC configuration files in INI format. |
| ConfigCentral.pm | Main library for parsing configuration files and translating them into data structures usable by the information system scripts. |
| InfoChecker.pm | Lint library to clean up configuration entries that are not used by the information system. It also performs some format correction and consistency checks. |
| LRMSInfo.pm | Interface to any kind of LRMS information collector. It takes care of selecting the proper LRMS information collector and adding compatibility fixes for legacy collectors. |
| ARC0mod.pm | Compatibility layer for LRMS information collectors that adds hardware data structures to the legacy modules, for consistency. |
| InfosysHelper.pm | Helper library that bridges A-REX information collection and LDAP updates. It synchronizes the collection with the database update to avoid inconsistencies between the XML and LDIF documents. |
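The roles of the collector interface and the compatibility layer can be sketched as a simple dispatch: prefer a new-style collector, otherwise wrap a legacy one so that its result has the same shape. The Python code below is a hedged illustration only; the LRMS names batch_a and batch_b, the registries and the result layout are hypothetical and do not reflect the actual Perl interfaces.

```python
def batch_a_mod_collect() -> dict:
    # Hypothetical new-style collector: already reports node hardware.
    return {"queues": {"main": {"running": 3}}, "nodes": {"node1": {"cores": 16}}}


def batch_b_legacy_collect() -> dict:
    # Hypothetical legacy collector: no hardware information at all.
    return {"queues": {"default": {"running": 1}}}


MOD_COLLECTORS = {"batch_a": batch_a_mod_collect}
LEGACY_COLLECTORS = {"batch_b": batch_b_legacy_collect}


def with_hardware_stub(collect):
    """Wrap a legacy collector so its result has the new-style shape."""
    def wrapped() -> dict:
        result = dict(collect())
        result.setdefault("nodes", {})  # empty placeholder for missing hardware data
        return result
    return wrapped


def select_collector(lrms: str):
    """Prefer the new-style collector; fall back to a wrapped legacy one."""
    if lrms in MOD_COLLECTORS:
        return MOD_COLLECTORS[lrms]
    if lrms in LEGACY_COLLECTORS:
        return with_hardware_stub(LEGACY_COLLECTORS[lrms])
    raise ValueError(f"no information collector for LRMS {lrms!r}")


print(select_collector("batch_a")())
print(select_collector("batch_b")())
```

With this arrangement the aggregation code can always expect a nodes entry, whichever kind of collector produced the data.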

The legacy LDAP subsystem

The legacy LDAP subsystem provides information about the cluster over the LDAP protocol, in the form of LDIF documents.

It supports two main custom GRID schemas: NorduGRID and GLUE2.

It consists of four main elements:

  • An LDAP server, OpenLDAP (slapd), that serves information on port 2135 to anonymous users.

  • A third-party Python script, bdii-update, that periodically executes ldapadd/modify/delete against the LDAP database to keep the information up to date.

  • A generated ldif-provider.sh script that outputs an LDIF document containing all the information about the cluster, used by bdii-update.

  • A set of startup scripts that configure the LDAP database so that it can work with legacy GRID information from ARC.
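The idea of keeping an LDAP database up to date from a freshly generated LDIF document can be illustrated by comparing two snapshots and classifying entries into add, modify and delete operations. This sketch is not the bdii-update implementation; the parser handles only a simplified LDIF (single-valued attributes, no line folding or base64 values), and the DNs and attributes are hypothetical.

```python
def parse_entries(ldif_text: str) -> dict:
    """Parse simplified LDIF text into {dn: {attribute: value}}."""
    entries = {}
    for block in ldif_text.strip().split("\n\n"):
        if not block.strip():
            continue  # tolerate empty input
        attrs = dict(line.split(": ", 1) for line in block.splitlines() if line)
        dn = attrs.pop("dn")
        entries[dn] = attrs
    return entries


def diff_entries(old: dict, new: dict) -> dict:
    """Classify DNs by the operation needed to turn `old` into `new`."""
    return {
        "add": sorted(dn for dn in new if dn not in old),
        "modify": sorted(dn for dn in new if dn in old and new[dn] != old[dn]),
        "delete": sorted(dn for dn in old if dn not in new),
    }


old_ldif = "dn: cn=job1,o=example\nstatus: running\n\ndn: cn=job2,o=example\nstatus: queued\n"
new_ldif = "dn: cn=job1,o=example\nstatus: finished\n\ndn: cn=job3,o=example\nstatus: queued\n"

print(diff_entries(parse_entries(old_ldif), parse_entries(new_ldif)))
```

Entries that are unchanged between the two snapshots produce no operation, which keeps the number of writes to the database small.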

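This subsystem is disabled by default in ARC 7. It can be enabled by adding the corresponding blocks to the configuration: as described in The Infoproviders, ldif-provider.sh is generated when the [infosys/ldap] and [infosys/nordugrid] blocks are present, and GLUE2 information is rendered over LDAP only if the [infosys/glue2/ldap] block is also present.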
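The snippet below is a rough sketch of checking which LDAP-related blocks appear in a configuration text. It follows one reading of the conditions stated on this page ([infosys/ldap] and [infosys/nordugrid] for the LDIF provider, plus [infosys/glue2/ldap] for GLUE2 over LDAP) and is not the logic used by CEinfo.pl. The sample fragment, the function name ldap_outputs and the use of Python's configparser are assumptions for illustration; real arc.conf files contain many more blocks and options.

```python
from configparser import ConfigParser

# Hypothetical configuration fragment containing only block headers.
SAMPLE_CONF = """
[infosys]
[infosys/ldap]
[infosys/nordugrid]
[infosys/glue2/ldap]
"""


def ldap_outputs(conf_text: str) -> dict:
    """Report which legacy LDAP outputs the listed blocks would switch on."""
    parser = ConfigParser(allow_no_value=True, strict=False, interpolation=None)
    parser.read_string(conf_text)
    blocks = set(parser.sections())
    # Assumption: the LDIF provider needs both the ldap and nordugrid blocks.
    ldif_provider = "infosys/ldap" in blocks and "infosys/nordugrid" in blocks
    return {
        "ldif_provider": ldif_provider,
        "glue2_ldif": ldif_provider and "infosys/glue2/ldap" in blocks,
    }


print(ldap_outputs(SAMPLE_CONF))
print(ldap_outputs("[infosys]\n"))
```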

Static and dynamic info

It is important to understand the nature of the contents of the documents in the information system. It is not a real-time system: the information is collected asynchronously and serially from various sources and then published with a validity timestamp so that the clients can decide what to do with such info.

Most of the static information is taken from the configuration files of various subsystems. The dynamic information, which is updated regularly, concerns storage availability, system load, the status of jobs and statistics about the jobs currently managed by the system.

The information system keeps no historical records of any object, nor does it keep a log of what happened before the latest scan.

Although the generated data is used by A-REX in some REST responses, A-REX has some in-memory information that is fresher than what the information system presents. See more details below.

A-REX itself has no means of checking the state of a job inside the LRMS. The LRMS backends and the information system are therefore the only sources of information about what is happening with jobs inside the cluster batch system.

A-REX in-memory information

A-REX keeps coarse-grained in-memory information about important changes in the job states. The job status retrievable through the REST API is more reliable than the job state generated by the infosys, so for tracking a job it is better to retrieve this information via REST than via LDAP.