At line 0 added 66 lines. |
+ !!! SICS Log Files and Tracing |
+ |
+ !! SICS Log Files |
+ |
+ In the instrument account, there is a log directory. The SICS server automatically writes a log file |
+ to this directory. These log files contain all commands executed with either user or manager |
+ privilege, so they let you figure out what is going on at the instrument. |
+ It is always a good idea to look into the log files when a problem is reported: the statements |
+ of the scientists are often misleading or wrong. |
+ |
+ The log files follow a naming convention: autoYYYY-MM-DD@HH-MM.log, where YYYY is the year, |
+ the first MM the month, DD the day, HH the hour and the second MM the minute. Normally, there is only one |
+ log file per day, but when SICS is restarted, a new log file is generated. To use those log files, |
+ do this: |
+ |
+ # Find out from the scientists the date and time the problem happened |
+ # cd into the log directory |
+ # With ls autoYYYY-MM-DD*.log figure out which log files were written that day. Replace the placeholders as |
+ required: for example, auto2014-07-09*.log shows all log files for July 9, 2014. |
+ # Open the relevant log file |
+ # The log file contains time stamps; use these and the times at which the files were written to navigate |
+ to the interesting section of the log file |
+ # Look for interesting information and error messages |
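+ |
+ The file selection step above can also be scripted. The following is only a minimal Python sketch, assuming |
+ that all fields in the file name are zero-padded; the function names parse_log_name and logs_for_day are |
+ made up for this illustration and are not part of SICS. |
+ |
```python
import re
from datetime import date, datetime
from pathlib import Path

# Assumption: file names look like autoYYYY-MM-DD@HH-MM.log with zero-padded fields.
LOG_NAME = re.compile(r"^auto(\d{4}-\d{2}-\d{2})@(\d{2}-\d{2})\.log$")


def parse_log_name(name):
    """Return the start time encoded in a log file name, or None if it does not fit."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    text = match.group(1) + " " + match.group(2)
    return datetime.strptime(text, "%Y-%m-%d %H-%M")


def logs_for_day(log_dir, day):
    """Return (start time, path) pairs for all log files started on `day`, oldest first."""
    found = []
    for path in Path(log_dir).glob(f"auto{day:%Y-%m-%d}@*.log"):
        started = parse_log_name(path.name)
        if started is not None:
            found.append((started, path))
    return sorted(found)
```
+ |
+ Calling logs_for_day("log", date(2014, 7, 9)) would then list the log files started on July 9, 2014, |
+ where "log" stands for the path of the log directory. |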
+ |
+ |
+ !! Tracing |
+ |
+ For the really hard problems, log files are not enough. This is why there is a tracing facility in SICS. |
+ Trace files contain everything interesting SICS does: communication with hardware, device start/stop messages, |
+ performance measurements, I/O, you name it. Not surprisingly, trace files can become quite large. This is why |
+ tracing is off by default. From a SICS command line, tracing can be controlled via the following commands: |
+ |
+ ;trace log: shows whether tracing is on or off |
+ ;trace log filename: starts writing a trace file to the file given as a parameter |
+ ;trace log close: switches tracing off again |
+ |
+ ! Analysing Trace Files |
+ |
+ Obviously, analysing trace files depends on the problem at hand. Thus only some general information about |
+ trace files can be given. A trace file starts with a dump of all parameters known to SICS. This is followed |
+ by the real trace entries. A trace entry looks like this: |
+ |
+ {{{ |
+ io:countersct:1404857127.006489:send:RS |
+ }}} |
+ |
+ The format is subsystem specifier:name of component:time stamp:component specific data. For the example this means: |
+ io is the subsystem, countersct is the name, 1404857127.006489 is the time stamp and send:RS is the component |
+ specific data. The time stamp is Unix time in seconds; the digits after the decimal point give the |
+ sub-second part in base 10. |
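+ |
+ To make the field layout concrete, here is a minimal Python sketch that splits a trace entry and converts |
+ the time stamp into a readable date. The names TraceEntry and parse_trace_line are made up for this |
+ illustration. The sketch assumes that only the component specific data may contain further colons. |
+ |
```python
from datetime import datetime, timezone
from typing import NamedTuple


class TraceEntry(NamedTuple):
    subsystem: str
    component: str
    timestamp: float  # Unix time in seconds, with sub-second fraction
    data: str


def parse_trace_line(line):
    """Split 'subsystem:component:timestamp:data' into a TraceEntry.

    Returns None for lines that do not fit this layout, for example the
    parameter dump at the start of a trace file.
    """
    # Split at most three times: the data field may itself contain ':'.
    parts = line.rstrip("\n").split(":", 3)
    if len(parts) != 4:
        return None
    try:
        timestamp = float(parts[2])
    except ValueError:
        return None
    return TraceEntry(parts[0], parts[1], timestamp, parts[3])


entry = parse_trace_line("io:countersct:1404857127.006489:send:RS")
print(entry.subsystem, entry.component, entry.data)
print(datetime.fromtimestamp(entry.timestamp, tz=timezone.utc).isoformat())
```
+ |
+ For the example entry, the time stamp corresponds to 22:05:27 UTC on July 8, 2014. |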
+ |
+ Known subsystems include: |
+ |
+ ;io: for input/output |
+ ;par: for parameter changes |
+ ;com: for client communication |
+ ;sys: system messages |
+ ;dev: device messages, mostly start and stop of devices |
+ |
+ Every 10 minutes there is a sys:TIMESTAMP message with the time in an easily human-readable format. Search for |
+ these when you know roughly when something interesting happened. |
+ |
+ For everything else, grep is your friend, for example for extracting the I/O to a certain device. |
+ The detailed time stamps give information about response times, which can be interesting when hardware |
+ is suspected of being slow. |
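+ |
+ As an alternative to grep, the time gaps between consecutive entries of one component can be computed with a |
+ short script. This is only a sketch in Python: the function names component_gaps, slowest and report are made up |
+ for this illustration, and the script makes no assumption about what the component specific data means. |
+ |
```python
def component_gaps(lines, subsystem, component):
    """Yield (gap in seconds, previous data, current data) for consecutive
    trace entries of one component, in file order."""
    previous = None
    for line in lines:
        # Layout: subsystem:component:timestamp:data (data may contain ':').
        parts = line.rstrip("\n").split(":", 3)
        if len(parts) != 4 or parts[0] != subsystem or parts[1] != component:
            continue
        try:
            stamp = float(parts[2])
        except ValueError:
            continue  # not a regular trace entry
        if previous is not None:
            yield stamp - previous[0], previous[1], parts[3]
        previous = (stamp, parts[3])


def slowest(lines, subsystem, component, count=5):
    """Return the `count` largest gaps, largest first."""
    gaps = sorted(component_gaps(lines, subsystem, component),
                  key=lambda gap: gap[0], reverse=True)
    return gaps[:count]


def report(path, subsystem, component):
    """Print the largest gaps found in the trace file at `path`."""
    with open(path) as handle:
        for gap, before, after in slowest(handle, subsystem, component):
            print(f"{gap:10.6f} s between '{before}' and '{after}'")
```
+ |
+ For example, report("trace.log", "io", "countersct") would print the five largest gaps in the I/O of the |
+ countersct component, where trace.log is a placeholder for the trace file name. A large gap is only a hint: |
+ whether it is a slow reply or just idle time has to be judged from the data on both sides of the gap. |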
+ |