The HECToR Service is now closed and has been superseded by ARCHER.

HECToR Monthly Report, September 2011

Information on the utilisation, disk allocations, slowdowns and helpdesk statistics can be found in the associated SAFE monthly report.

Dates covered: 08:00 1 September 2011 to 08:00 1 October 2011
Number of hours: 720

1: Availability

Scheduled down time: 10 hours 44 minutes.

Incidents

The following incidents were recorded:

Severity  Number
1         0
2         0
3         18
4         0

Of the four severity levels, level 1 corresponds to a contractual failure.

All 18 severity level 3 incidents were attributed to node failure events.

Details of severity level 1 incidents

None this month.

MTBF and Serviceability

Attribution    Failures  MTBF  UDT    Serviceability
Cray           0         -     00:00  100.0%
Site           0         -     00:00  100.0%
External       0         -     00:00  100.0%
Other/Unknown  0         -     00:00  100.0%
Overall        0         -     00:00  100.0%
  • Note 1: Serviceability % = 100*(WCT-SDT-UDT)/(WCT-SDT)
  • Note 2: MTBF (Mean Time Between Failures) is defined as 732/Number of failures.
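The two notes above can be checked against this month's figures with a short sketch. It assumes that WCT is the wall clock time (the 720 hours covered by the report), SDT is the scheduled down time (10 hours 44 minutes) and UDT is the unscheduled down time (zero this month); the report does not expand these abbreviations. The function names are illustrative only, and returning infinity when there are no failures is a choice made here, not something the report specifies.

```python
import math


def serviceability_pct(wct_hours, sdt_hours, udt_hours):
    """Note 1: 100*(WCT-SDT-UDT)/(WCT-SDT), with all times in hours."""
    available = wct_hours - sdt_hours
    if available <= 0:
        raise ValueError("scheduled down time must be less than wall clock time")
    return 100.0 * (available - udt_hours) / available


def mtbf_hours(failures, hours_per_month=732.0):
    """Note 2: 732/number of failures.

    Assumption: with zero failures the ratio is undefined, so this
    sketch returns math.inf rather than raising an error.
    """
    if failures == 0:
        return math.inf
    return hours_per_month / failures


# Figures taken from this report; UDT = 0 is read from the table above.
wct = 720.0            # number of hours covered
sdt = 10 + 44 / 60     # scheduled down time: 10 hours 44 minutes
udt = 0.0              # unscheduled down time

print(f"Serviceability: {serviceability_pct(wct, sdt, udt):.1f}%")
print(f"MTBF: {mtbf_hours(0)} hours")
```

With no unscheduled down time the numerator and denominator of Note 1 are equal, which is why every row of the table shows 100.0% serviceability regardless of the scheduled down time.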

Details of single node failures

Error Type Number
Software problem (related to a user application, bug #775153) 12
MCA bank 1/4 error 4
Voltage fault 1
Thermal trip on processor 1

2: Courses

This information is supplied by NAG Ltd
Title of Course Dates Available Places Ordinary Attendees Paying Attendees CSE Staff Total Attending
Fortran 95, University of Southampton 19-21 September 2011 40 17 0 0 17
Core Algorithms for High Performance Scientific Computing, University of Warwick 26-30 September 2011 36 24 0 0 24
Introduction to HECToR, University of Oxford 30 September 2011 12 1 0 0 1

3: Quality Tokens

Date Tokens Awarded Comment Consortium
22-Sep-2011 * * * * * Positive tokens - no user comments received x01
29-Sep-2011 * * * * * Positive tokens - no user comments received e139

4: Hours Worked

Group Days worked FTEs
USL 82.6 4.6
OSG 79.0 4.5

5: Performance Metrics

Technology Provision

Description TSL FSL Value
Technology reliability 85% 98.5% 100%
Technology throughput 7000 hours 8367 hours 8655 hours
Capability job completion rate 70% 90% 100%
Technology MTBF 100 hours 126.4 hours

Note: Technology throughput is calculated: 12*(732-UDT-SDT), where 732 is the annual average number of hours in a month.

Note: MTBF is calculated as 732/number of failures
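As a worked check of the throughput note, the sketch below reproduces the reported value of 8655 hours from the scheduled down time given in section 1 (10 hours 44 minutes), assuming the unscheduled down time (UDT) was zero this month. The function name is illustrative and not part of the report.

```python
def technology_throughput_hours(udt_hours, sdt_hours, hours_per_month=732.0):
    """Annualised throughput: 12*(732-UDT-SDT), with all times in hours."""
    return 12 * (hours_per_month - udt_hours - sdt_hours)


sdt = 10 + 44 / 60   # scheduled down time from section 1, in hours
udt = 0.0            # assumption: no unscheduled down time this month

value = technology_throughput_hours(udt, sdt)
print(round(value))  # 8655, matching the Value column
```

The unrounded result is 12 * (732 - 10.733...) = 8655.2 hours, which rounds to the 8655 hours shown in the table.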

Service Provision

Description TSL FSL USL Value
Percentage of non-in-depth queries resolved within one day 85% 97% 99% 98.7%
Number of SP FTEs 7.3 8.0 8.7 9.1
SP Serviceability 80% 99% 99.5% 100%