Log Database Use Cases
CEDPS Log Database Use Cases
Version 1 of the CEDPS Log Database assumes a single database per site. Future versions will provide a distributed overlay for all site Log Databases.
Here are some sample queries that should be supported.
Note: all queries below are restricted to a given time window.
From a GOC admin:
- find log messages for jobs from VO=Atlas running at site=FNAL
- find log messages related to service=condor, user=Joe, site=Indiana
- find log messages for user=Joe
- find log messages with status=error
- find log messages where event=*authn* and status=error
- find log messages where the time between start/end events is more than 3X the baseline
- find start events with no matching end event
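As a rough illustration of the last query in the list above, the following sketch finds start events that have no matching end event. It uses Python's sqlite3 module as a stand-in for the site database. The table name log_events, its columns, the use of a GUID to pair events, and the ".start"/".end" event-name suffixes are all assumptions made for this example, not part of any defined schema.

```python
import sqlite3

# Hypothetical schema: one row per log message (all names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log_events ("
    " ts REAL, event TEXT, status TEXT, guid TEXT, dn TEXT, vo TEXT, site TEXT)"
)
conn.executemany(
    "INSERT INTO log_events VALUES (?,?,?,?,?,?,?)",
    [
        (100.0, "job.start", "ok", "g1", "/DC=org/CN=Joe", "Atlas", "FNAL"),
        (160.0, "job.end", "ok", "g1", "/DC=org/CN=Joe", "Atlas", "FNAL"),
        (200.0, "job.start", "ok", "g2", "/DC=org/CN=Ann", "Atlas", "FNAL"),
    ],
)

def unmatched_starts(conn, t0, t1):
    """Start events in [t0, t1) with no end event sharing the same GUID."""
    sql = """
        SELECT s.guid, s.event, s.ts
        FROM log_events AS s
        LEFT JOIN log_events AS e
          ON e.guid = s.guid
         -- 'job.start' -> 'job.end': drop the 6-character '.start' suffix
         AND e.event = substr(s.event, 1, length(s.event) - 6) || '.end'
        WHERE s.event LIKE '%.start'
          AND s.ts >= ? AND s.ts < ?
          AND e.guid IS NULL
        ORDER BY s.ts
    """
    return conn.execute(sql, (t0, t1)).fetchall()

orphans = unmatched_starts(conn, 0.0, 1000.0)
```

The left join keeps every start event and leaves the end-event columns NULL where no partner exists, so the IS NULL test selects exactly the unmatched starts.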
From a user (i.e., all of these are restricted to logs for the user's DN):
- find log messages for all my jobs
- find log messages with status=error
- find log messages where event=*authn* and status=error
- find log messages where time intervals are more than 3X the baseline, where the baseline is computed from historical data in the log database
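The baseline query above could be sketched as follows. This is only one possible reading: the baseline is taken to be the mean duration of all start/end pairs of the same type that began before the query window, and durations are computed by joining start and end events on a GUID. The table layout and event names are assumptions made for illustration. The sketch uses sqlite3 and a common table expression, which older MySQL releases do not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal table: timestamp, event name, GUID linking start/end.
conn.execute("CREATE TABLE log_events (ts REAL, event TEXT, guid TEXT)")

events = []
for i in range(5):  # historical transfers, each taking 10 seconds
    events.append((i * 100.0, "transfer.start", f"h{i}"))
    events.append((i * 100.0 + 10.0, "transfer.end", f"h{i}"))
events += [
    (1000.0, "transfer.start", "slow"), (1050.0, "transfer.end", "slow"),
    (1100.0, "transfer.start", "fast"), (1112.0, "transfer.end", "fast"),
]
conn.executemany("INSERT INTO log_events VALUES (?,?,?)", events)

def slow_intervals(conn, start_event, end_event, t0, t1, factor=3.0):
    """Pairs starting in [t0, t1) that took more than factor x the baseline."""
    sql = """
        WITH durations AS (
            SELECT s.guid AS guid, s.ts AS start_ts, e.ts - s.ts AS duration
            FROM log_events AS s
            JOIN log_events AS e ON e.guid = s.guid
            WHERE s.event = :start AND e.event = :end
        ),
        baseline AS (
            -- historical data: every pair that started before the window
            SELECT AVG(duration) AS mean FROM durations WHERE start_ts < :t0
        )
        SELECT d.guid, d.duration, b.mean
        FROM durations AS d, baseline AS b
        WHERE d.start_ts >= :t0 AND d.start_ts < :t1
          AND d.duration > :factor * b.mean
        ORDER BY d.start_ts
    """
    params = {"start": start_event, "end": end_event,
              "t0": t0, "t1": t1, "factor": factor}
    return conn.execute(sql, params).fetchall()

slow = slow_intervals(conn, "transfer.start", "transfer.end", 1000.0, 2000.0)
```

A median or a per-site baseline may be more robust than a global mean; the mean is used here only to keep the sketch short.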
From a VO:
- what sites had connection attempts for a given user DN
- what data files were accessed most
- which user moved the largest amount of data in my VO
- find all logs where job manager status=killed (i.e., jobs that were killed for running too long)
- which user submitted the most jobs (Gratia is a better choice for this, but maybe it should be supported?)
From a site admin:
- what was the average GridFTP transfer speed on server=gridftp.lbl.gov
- what are the top 10 fastest/slowest sites receiving GridFTP transfers from my site
- what is the distribution of job run times on CE=myComputeElement
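The "top 10 fastest/slowest sites" query above amounts to a grouped aggregate with a sort and a limit. The sketch below assumes a hypothetical transfers table with one row per completed GridFTP transfer (columns ts, server, dest_site, nbytes, duration), and defines speed as total bytes divided by total transfer time per destination site. Averaging the per-transfer speeds instead would be an equally valid definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per completed transfer (names are assumptions).
conn.execute(
    "CREATE TABLE transfers ("
    " ts REAL, server TEXT, dest_site TEXT, nbytes INTEGER, duration REAL)"
)
conn.executemany(
    "INSERT INTO transfers VALUES (?,?,?,?,?)",
    [
        (10.0, "gridftp.lbl.gov", "FNAL", 1000, 10.0),
        (20.0, "gridftp.lbl.gov", "FNAL", 3000, 10.0),
        (30.0, "gridftp.lbl.gov", "Indiana", 500, 10.0),
    ],
)

def site_speeds(conn, server, t0, t1, fastest=True, limit=10):
    """Destination sites ranked by aggregate bytes/second from one server."""
    # The sort direction is chosen from a fixed pair, never from user input.
    direction = "DESC" if fastest else "ASC"
    sql = f"""
        SELECT dest_site, 1.0 * SUM(nbytes) / SUM(duration) AS bytes_per_sec
        FROM transfers
        WHERE server = ? AND ts >= ? AND ts < ?
        GROUP BY dest_site
        ORDER BY bytes_per_sec {direction}
        LIMIT ?
    """
    return conn.execute(sql, (server, t0, t1, limit)).fetchall()

fastest_sites = site_speeds(conn, "gridftp.lbl.gov", 0.0, 100.0)
slowest_sites = site_speeds(conn, "gridftp.lbl.gov", 0.0, 100.0, fastest=False)
```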
Query output format
Initially, we assume a command line tool that uses the grid proxy certificate to connect to a web service wrapper for MySQL. The tool should be able to output the following:
- CEDPS "Best Practice" format (i.e., name=value pairs)
- CSV format
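A minimal sketch of the two output formats, assuming each result row reaches the tool as a Python dict. The function names are hypothetical, and the quoting rule for name=value output (double quotes around values that contain spaces) is an assumption about the Best Practice format rather than a statement of it.

```python
import csv
import io

def to_name_value(record):
    """Render one record as space-separated name=value pairs."""
    parts = []
    for name, value in record.items():
        text = str(value)
        # Assumed rule: quote values containing spaces or double quotes.
        if " " in text or '"' in text:
            text = '"' + text.replace('"', '\\"') + '"'
        parts.append(f"{name}={text}")
    return " ".join(parts)

def to_csv(records, fieldnames):
    """Render records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

record = {
    "ts": "2007-01-01T00:00:00Z",
    "event": "job.end",
    "status": "error",
    "dn": "/DC=org/CN=Joe Smith",
}
nv_line = to_name_value(record)
csv_text = to_csv([record], ["ts", "event", "status", "dn"])
```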
Required functionality
Supporting the above queries efficiently requires the following functionality:
- support for wildcard queries on DNs and event names
- performance baselines for many types (all?) of start/end pairs
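One way the wildcard requirement could be met is to translate the "*" wildcard used in the sample queries into an SQL LIKE pattern, escaping any characters that LIKE would otherwise treat as wildcards. The helper name glob_to_like and the table layout are assumptions for this sketch, which uses sqlite3 in place of MySQL.

```python
import sqlite3

def glob_to_like(pattern, escape="\\"):
    """Translate '*' wildcards into a LIKE pattern, escaping %, _ and the escape char."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch in ("%", "_", escape):
            out.append(escape + ch)
        else:
            out.append(ch)
    return "".join(out)

conn = sqlite3.connect(":memory:")
# Hypothetical table and event names, for illustration only.
conn.execute("CREATE TABLE log_events (ts REAL, event TEXT, status TEXT, dn TEXT)")
conn.executemany(
    "INSERT INTO log_events VALUES (?,?,?,?)",
    [
        (1.0, "gridftp.authn.end", "error", "/DC=org/CN=Joe"),
        (2.0, "gram.authn.end", "ok", "/DC=org/CN=Joe"),
        (3.0, "job.end", "error", "/DC=org/CN=Ann"),
    ],
)

def find_events(conn, event_glob, status):
    """Log messages whose event name matches the glob and whose status matches."""
    sql = (
        "SELECT event, dn FROM log_events "
        "WHERE event LIKE ? ESCAPE '\\' AND status = ? ORDER BY ts"
    )
    return conn.execute(sql, (glob_to_like(event_glob), status)).fetchall()

authn_errors = find_events(conn, "*authn*", "error")
```

Passing the pattern as a bound parameter rather than building it into the SQL string keeps user-supplied DNs and event names from being interpreted as SQL.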
The following functionality is assumed to be provided directly by SQL:
- restriction of the result based on boolean combinations of attribute/value pairs (where clause)
- equijoin, and other types of joins (join)
- elimination of duplicate results in a given result set (distinct operator)
The following may need to be done at the implementation level:
- data indexed by time, DN, VO, event
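The indexing requirement above might look like this in practice: one index per frequently filtered column. The table and index names are assumptions, and sqlite3 is used here only so the sketch runs without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical log table (column names are assumptions).
conn.execute(
    "CREATE TABLE log_events ("
    " ts REAL, dn TEXT, vo TEXT, event TEXT, status TEXT, guid TEXT)"
)

# One single-column index for each attribute named in the requirement.
INDEXED_COLUMNS = ("ts", "dn", "vo", "event")
for col in INDEXED_COLUMNS:
    conn.execute(f"CREATE INDEX idx_log_events_{col} ON log_events ({col})")

index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
```

Since every query is restricted to a time window, composite indexes such as (dn, ts) may serve better than single-column ones. Note also that a pattern with a leading wildcard, such as *authn*, generally cannot use an ordinary B-tree index.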
Other issues
How to handle Distributed Queries?
Assume the central log database holds ONLY the following:
- start/end events for jobs and gridftp transfers that include a DN and a GUID
- URL of site archive for this event
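Under that assumption, a distributed query could proceed in two steps: ask the central database which site archives hold events for a DN, then send the detailed query to each of those archives. The sketch below covers only the first step. The table name central_index, its columns, and the example.org URLs are hypothetical, and how a site archive is actually contacted is left open.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Hypothetical central index: start/end events with DN, GUID and archive URL.
conn.execute(
    "CREATE TABLE central_index ("
    " ts REAL, event TEXT, dn TEXT, guid TEXT, archive_url TEXT)"
)
conn.executemany(
    "INSERT INTO central_index VALUES (?,?,?,?,?)",
    [
        (100.0, "job.start", "/DC=org/CN=Joe", "g1",
         "https://logs.site-a.example.org/archive"),
        (160.0, "job.end", "/DC=org/CN=Joe", "g1",
         "https://logs.site-a.example.org/archive"),
        (120.0, "gridftp.transfer.start", "/DC=org/CN=Joe", "g2",
         "https://logs.site-b.example.org/archive"),
        (300.0, "job.start", "/DC=org/CN=Ann", "g3",
         "https://logs.site-a.example.org/archive"),
    ],
)

def archives_for_dn(conn, dn, t0, t1):
    """Map each site archive URL to the GUIDs it holds for this DN in [t0, t1)."""
    sql = """
        SELECT DISTINCT archive_url, guid
        FROM central_index
        WHERE dn = ? AND ts >= ? AND ts < ?
        ORDER BY archive_url, guid
    """
    plan = defaultdict(list)
    for url, guid in conn.execute(sql, (dn, t0, t1)):
        plan[url].append(guid)
    return dict(plan)

plan = archives_for_dn(conn, "/DC=org/CN=Joe", 0.0, 1000.0)
```

The resulting mapping tells the client which site archives to contact and which GUIDs to request from each, so sites holding no relevant events are never queried.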
