HECToR Monthly Report, March 2011
Information on the utilisation, disk allocations, slowdowns and helpdesk statistics can be found in the associated SAFE monthly report.
This report relates to the Phase 2b service only. Performance statistics for the non-contractual Phase 2a system can be found here.
Dates covered: 08:00 1 March 2011 to 08:00 1 April 2011
Number of hours: 744
1: Availability
Scheduled down time: 11 hours 17 minutes.
Incidents
The following incidents were recorded:
Severity | Number |
1 | 0 |
2 | 2 |
3 | 18 |
4 | 0 |
Of the four severity levels, level 1 corresponds to a contractual failure.
All 18 SEV-3 incidents were attributed to single node failure events.
Details of severity level 1 incidents
None this month.
MTBF and Serviceability
Attribution | Failures | MTBF | UDT | Serviceability |
Cray | 0 | ∞ | 00:00 | 100% |
Site | 0 | ∞ | 00:00 | 100% |
External | 0 | ∞ | 00:00 | 100% |
Other | 0 | ∞ | 00:00 | 100% |
Overall | 0 | ∞ | 00:00 | 100% |
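- Note 1: Serviceability% = 100*(WCT-SDT-UDT)/(WCT-SDT), where WCT is the wall-clock time for the period, SDT is the scheduled down time and UDT is the unscheduled down time.
- Note 2: MTBF (Mean Time Between Failures) is defined as 732/Number of failures, where 732 is the annual average number of hours in a month.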
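The two notes above can be checked with a short sketch. This is an illustration only, not the tooling used to produce the report: the function names are hypothetical, the inputs are the 744 hours and 11 hours 17 minutes of scheduled down time stated earlier, and the unscheduled down time of zero is taken from the Overall row of the table.

```python
import math

def hours(h, m=0):
    """Convert hours and minutes to decimal hours."""
    return h + m / 60.0

def serviceability_pct(wct, sdt, udt):
    """Note 1: 100*(WCT-SDT-UDT)/(WCT-SDT), all arguments in hours."""
    return 100.0 * (wct - sdt - udt) / (wct - sdt)

def mtbf_hours(failures, month_hours=732.0):
    """Note 2: 732/number of failures; infinite when there are no failures."""
    return math.inf if failures == 0 else month_hours / failures

wct = 744.0          # hours in the reporting period
sdt = hours(11, 17)  # scheduled down time this month
udt = 0.0            # unscheduled down time (Overall row)

print(serviceability_pct(wct, sdt, udt))  # 100.0
print(mtbf_hours(0))                      # inf
```

With no failures and no unscheduled down time, the sketch reproduces the 100% serviceability and infinite MTBF shown in the table.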
Details of single node failures
Error Type | Number |
Software error - kernel panic/LBUG | 12 |
HT lockup | 3 |
MCA bank 4 error (DIMMs) | 2 |
Nodes admin down | 1 |
2: Courses
This information is supplied by NAG Ltd.
Title of Course | Dates | Available Places | Ordinary Attendees | Paying Attendees | CSE Staff | Total Attending |
Advanced Computational Methods II (MSc), University of Southampton | Every Tuesday in March 2011 (1, 8, 15, 22, 29) | 20 | 20 (18 MSc, 2 PhD) | 0 | 0 | 20 |
3: Quality Tokens
None set this month.
4: Hours Worked
Group | Days worked | FTEs |
USL | 90.2 | 5.1 |
OSG | 76.7 | 4.3 |
5: Performance Metrics
Technology Provision
Description | TSL | FSL | Value |
Technology reliability | 85% | 98.5% | 100% |
Technology throughput | 7000 hours | 8367 hours | 8648 hours |
Capability job completion rate | 70% | 90% | 100% |
Technology MTBF | 100 hours | 126.4 hours | ∞ |
Note: Technology throughput is calculated as 12*(732-UDT-SDT), where 732 is the annual average number of hours in a month and UDT and SDT are the unscheduled and scheduled down time in hours.
Note: MTBF is calculated as 732/number of failures.
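As a rough check of the throughput note, the following sketch applies the formula to this month's figures. The function name is hypothetical, and the inputs are assumptions drawn from elsewhere in this report: scheduled down time of 11 hours 17 minutes and no unscheduled down time.

```python
def technology_throughput(udt_hours, sdt_hours, month_hours=732.0):
    """Annualised throughput: 12*(732-UDT-SDT), in hours."""
    return 12 * (month_hours - udt_hours - sdt_hours)

sdt = 11 + 17 / 60   # scheduled down time in decimal hours
value = technology_throughput(0.0, sdt)

print(round(value, 1))  # 8648.6
```

The result is consistent with the 8648 hours given in the Technology Provision table.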
Service Provision
Description | TSL | FSL | USL | Value |
Percentage of non-in-depth queries resolved within one day | 85% | 97% | 99% | 98.7% |
Number of SP FTEs | 7.3 | 8.0 | 8.7 | 9.4 |
SP Serviceability | 80% | 99% | 99.5% | 100% |