The HECToR Service is now closed and has been superseded by ARCHER.

HECToR Monthly Report, June 2008

Information on the utilisation, disk allocations, slowdowns and helpdesk statistics can be found in the associated SAFE monthly report.

Dates covered: 08:00 1 June 2008 to 08:00 1 July 2008
Number of hours: 720

1: Availability

Scheduled down time: 64 hours 12 minutes. This includes a 48-hour slot for the X2 integration.

Incidents

The following incidents were recorded:

Severity Number
1 5
2 3
3 20
4 0

Of the four severity levels, level 1 corresponds to a contractual failure.

Details of severity level 1 incidents

ID Date Description Length Attribution
Incident-260 06/06/2008 HSN collapsed after compute node failure 01:32 Cray
Incident-268 17/06/2008 HSN collapse after compute node failure 03:13 Cray
Incident-270 19/06/2008 Voltage fault on c18-1c2s4n 00:59 Cray
Incident-278 23/06/2008 Close for security problem 15:52 Cray
Incident-282 27/06/2008 PDU fail in cab 9 delayed service return 00:22 Cray
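
As a cross-check, the incident lengths above can be summed and compared with the unscheduled down time (UDT) reported in the MTBF and Serviceability table. The sketch below assumes the Length column is in hours:minutes; the variable and function names are illustrative and not part of the report.

```python
# Lengths of the five severity level 1 incidents, assumed to be hours:minutes.
incident_lengths = {
    "Incident-260": "01:32",
    "Incident-268": "03:13",
    "Incident-270": "00:59",
    "Incident-278": "15:52",
    "Incident-282": "00:22",
}


def to_minutes(hhmm: str) -> int:
    """Convert an 'hh:mm' string to a whole number of minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


total_minutes = sum(to_minutes(length) for length in incident_lengths.values())
total_hhmm = f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"
print(total_hhmm)  # 21:58
```

Under that assumption the total agrees with the UDT of 21:58:00 attributed to Cray.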

MTBF and Serviceability

Attribution Failures MTBF UDT Serviceability
Cray 5 146 21:58:00 96.7%
Site 0 ~ 00:00:00 100%
External 0 ~ 00:00:00 100%
Other 0 ~ 00:00:00 100%
Overall 5 146 21:58:00 96.7%
  • Note 1: Serviceability % = 100*(WCT-SDT-UDT)/(WCT-SDT)
  • Note 2: MTBF (Mean Time Between Failures) is defined as 732/Number of failures.
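
The two notes can be checked against this month's figures with a short calculation. This is a minimal sketch that assumes WCT is the 720 hours covered by the report, SDT is the scheduled down time of 64 hours 12 minutes, and UDT is the unscheduled down time of 21:58:00; the function names are illustrative only.

```python
WCT = 720.0          # hours covered by the report (assumed meaning of WCT)
SDT = 64 + 12 / 60   # scheduled down time: 64 hours 12 minutes
UDT = 21 + 58 / 60   # unscheduled down time: 21:58:00
FAILURES = 5         # severity level 1 incidents


def serviceability(wct: float, sdt: float, udt: float) -> float:
    """Note 1: percentage of scheduled-up time that was actually available."""
    return 100 * (wct - sdt - udt) / (wct - sdt)


def mtbf(failures: int, hours_per_month: float = 732) -> float:
    """Note 2: mean time between failures in hours."""
    return hours_per_month / failures


print(round(serviceability(WCT, SDT, UDT), 1))  # 96.7
print(mtbf(FAILURES))                           # 146.4
```

The serviceability matches the 96.7% in the table, and the MTBF of 146.4 hours appears in the table as 146.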

2: Courses

This information is supplied by NAG Ltd

Dates Title of Course Available places Total attending HECToR Users HECToR Staff
10-11 June 2008 Programming Tricks for HECToR 12 4 4 4
12-13 June 2008 Techniques for Achieving Scalability 12 1 1 0

3: Quality tokens

None set this month

4: Hours worked

Group Days worked FTEs
USL 73.6 4.1
OSG 74.33 4.2

5: Performance metrics

Technology Provision

Description TSL FSL Value
Technology reliability 85% 98.5% 96.7%
Technology throughput 7000 hours 8367 hours 7750 hours
Capability job completion rate 70% 90% 93.3%
Technology MTBF 100 hours 126.4 hours 146 hours

Note: Technology throughput is calculated as 12*(732-UDT-SDT), where 732 is the annual average number of hours in a month.

Note: MTBF is calculated as 732/number of failures
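
The throughput value in the table can be reproduced from the formula in the note. The sketch below assumes UDT and SDT are this month's unscheduled and scheduled down times (21:58:00 and 64 hours 12 minutes); the constant names are illustrative. The remark that 732 equals 366*24/12 is an arithmetic observation for a leap year such as 2008, not a statement taken from the report.

```python
HOURS_PER_MONTH = 732  # annual average number of hours in a month (from the note)
SDT = 64 + 12 / 60     # scheduled down time: 64 hours 12 minutes
UDT = 21 + 58 / 60     # unscheduled down time: 21:58:00

# Observation only: a 366-day year has 366 * 24 = 8784 hours, or 732 per month.
assert 366 * 24 / 12 == HOURS_PER_MONTH

# Technology throughput as defined in the note above.
throughput = 12 * (HOURS_PER_MONTH - UDT - SDT)
print(round(throughput))  # 7750
```

The result agrees with the 7750 hours reported in the Technology Provision table.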

Service Provision

Description TSL FSL USL Value
Percentage of non-in-depth queries resolved within one day 85% 97% 99% 100%
Number of SP FTEs 7.3 8.0 8.7 8.3
SP serviceability 80% 99% 99.5% 100%