SUPERCOMPUTING


19-23 June 2000


Introduction


Computer Architecture and Technology Area


Francisco José Suárez Alonso

fran@atc.uniovi.es



Table of Contents


ARCH: Flynn's Classification


ARCH: SISD Computers


ARCH: Array Computers (SIMD)


ARCH: Vector Computers


ARCH: MIMD Computers


ARCH: Shared-Memory Multiprocessors


ARCH: Distributed-Memory Multiprocessors


ARCH: Basic Message Passing


ARCH: Fine-Grained Parallelism


ARCH: Coarse-Grained Parallelism


SUP: IBM SP





Standard configuration

Node type | 375 MHz POWER3 SMP Thin Node | 375 MHz POWER3 SMP Wide Node | POWER3 SMP High Node
Processor | 375 MHz 2, 4-way POWER3-II | 375 MHz 2, 4-way POWER3-II | 222 MHz 2, 4, 6, 8-way POWER3
L1 cache | 64KB data(a)/32KB instruction(a) | 64KB data(a)/32KB instruction(a) | 64KB data(a)/32KB instruction(a)
L2 cache | 8MB(a) | 8MB(a) | 8MB(a)
RAM memory | 256MB | 256MB | 1GB
Mem. bus width | 128-bit | 128-bit | 1280-bit
Internal storage | None required(b,c) | None required(b,c) | None required(b,c)
Disk/media bays | Two | Four | Two/Twenty-six(h)
PCI expansion slots | Two | Ten | Five(f)/Fifty-three(h)
Nodes per frame (tall/short) | Sixteen/eight | Eight/four | Four/N/A
Bus speed for I/O adapters | 132MB/second | 132 and 264MB/second (triple bus) | 264MB/second
Bus speed for Switch adapter | 480MB/second | 480MB/second | 400MB/second
Adapters | Integrated Ultra2 SCSI and Ethernet (10/100 Mbps) | Integrated Ultra2 SCSI and Ethernet (10/100 Mbps) | Integrated Ultra2 SCSI and Ethernet (10/100 Mbps)

System expansion

Maximum RAM | 8GB | 8GB | 16GB
Maximum internal storage | 36.4GB(c) | 109.2GB(c) | 36.4GB(c)/946.2GB(h)
SP expansion I/O units | N/A | N/A | Zero through six
SP Switch and adapter | 300MB/second | 300MB/second | 300MB/second



Node type | 332 MHz SMP Thin Node | 332 MHz SMP Wide Node
Processor | 332 MHz 2, 4-way PowerPC 604e | 332 MHz 2, 4-way PowerPC 604e
L1 cache | 32KB data(a)/32KB instruction(a) | 32KB data(a)/32KB instruction(a)
L2 cache | 256KB(a) | 256KB(a)
RAM memory | 256MB | 256MB
Mem. bus width | 128-bit | 128-bit
Internal storage | None required(b,c) | None required(b,c)
Disk/media bays | Two | Four
PCI expansion slots | Two | Ten
Nodes per frame (tall/short) | Sixteen/eight | Eight/four
Bus speed for I/O adapters | 132MB/second | 132 and 264MB/second (triple bus)
Bus speed for Switch adapter | 400MB/second | 400MB/second
Adapters | Integrated SCSI-2 F/W and Ethernet (10 Mbps) | Integrated SCSI-2 F/W and Ethernet (10 Mbps)

System expansion

Maximum RAM | 3GB | 3GB
Maximum internal storage | 36.4GB(c) | 72.8GB(c)
SP expansion I/O units | N/A | N/A
SP Switch and adapter | 300MB/second | 300MB/second



 
RS/6000 SP Comparison Chart

Machine type | 9076 | 9076 | 9076 | 9076 | 9076
Node type | 332 MHz | 332 MHz | 375 MHz POWER3 SMP | 375 MHz POWER3 SMP | POWER3 SMP
Packaging | Thin | Wide | Thin | Wide | High

Microprocessor
Type | PowerPC 604e | PowerPC 604e | POWER3-II | POWER3-II | POWER3
Processors per node | 2/4 | 2/4 | 2/4 | 2/4 | 2/4/6/8
Clock rate | 332 MHz | 332 MHz | 375 MHz | 375 MHz | 222 MHz

Memory
System (min/max) | 256MB/3GB* | 256MB/3GB* | 256MB/8GB* | 256MB/8GB* | 1GB/16GB*
L2 cache | 256KB** | 256KB** | 8MB** | 8MB** | 4MB**

PCI I/O Capacity
Slots available (base node) | 2 PCI (32-bit) | 10 PCI (7 32-bit/3 64-bit) | 2 PCI (32-bit) | 10 PCI (2 32-bit/8 64-bit) | 5 PCI (1 32-bit/4 64-bit)
Slots available (per SP Expansion I/O Unit) | N/A | N/A | N/A | N/A | 8 PCI (64-bit)
Slots available (base node and six SP Expansion I/O Units) | N/A | N/A | N/A | N/A | 53 PCI (1 32-bit/52 64-bit)

Internal Disk Capacity
min/max (base node) | 0/36.4GB | 0/72.8GB | 0/36.4GB | 0/109.2GB | 0/36.4GB
min/max (base node and six SP Expansion I/O Units) | N/A | N/A | N/A | N/A | 0/946.4GB

Benchmarks
SPECint_base_rate95 | 245/485 | 245/485 | 407/812 | 407/812 | 229/450/661/908
SPECfp_base_rate95 | 206/364 | 206/364 | 804/1359 | 804/1359 | 461/910/1329/1760
Relative ROLTP performance | 17.9/32.8 | 17.9/32.8 | 44.0/80.0 | 44.0/80.0 | 23/43.3/64.0/81.3

9076-500: short frame, available with up to 8 thin nodes or 4 wide nodes
9076-550: tall frame, available with up to 16 thin nodes, 8 wide nodes, 4 high nodes or 16 SP Expansion I/O Units

SUP: SGI Origin2000




PROCESSOR DATA
Microprocessor: MIPS RISC R10000 64-bit CPU or R12000™ 64-bit CPU
Primary caches: 32KB two-way set-associative on-chip instruction cache; 32KB two-way set-associative on-chip data cache
Secondary cache: 4MB or 8MB cache per CPU


NODE CARD
CPU capacity: 2 R10000 or R12000 CPUs
Memory capacity: Up to 4GB ECC SDRAM
HW cache coherency: yes
Interleaving: 4-way per node card
Memory bandwidth: 680MB/sec sustained, 780MB/sec peak


DESKSIDE SYSTEM OR RACK MODULE
Processors: 1 to 4 node cards, 2 to 8 CPUs
I/O bandwidth: 5.0GB/sec sustained, 6.24GB/sec peak
I/O boards: 12 XIO, or 11 XIO and 3 PCI 32- or 64-bit
Internal peripherals: 5 3.5-inch Ultra SCSI devices, 1 5.25-inch SCSI device
Independent power: yes
Redundant power: optional
Redundant cooling: yes


MAXIMUM RACK SYSTEM
Processors: 1 to 256 node cards, 2 to 512 CPUs
I/O bandwidth: 80GB/sec sustained, 100GB/sec peak
I/O boards: 192 XIO, or 184 XIO and 24 PCI 32- or 64-bit
Internal peripherals: 512 3.5-inch Ultra SCSI devices, 64 5.25-inch SCSI devices
Independent power: yes
Redundant power: optional
Redundant cooling: yes


SOFTWARE
System software: IRIX® 6.5 ASE, X/OPEN XPG4 BASE 95, IEEE POSIX 1003.2, and 1003.1b, 1003.1c FIPS 151-2, UNIX® System V.4, 4.3 BSD extensions, MIPS ABI, SVID issue 3, X11R6, Motif™ Window Manager 1.2, IRIS GL™, OpenGL®
Networking: TCP/IP, NFS™ V2/V3, RSVP, DHCP, Bulk Data Service (BDSpro), NetVisualyzer™, SNMP management, SNMP MIB, NIS/ONC+
Server software: XFS™ 64-bit journaled filesystem with guaranteed rate I/O, IRIS NetWorker, Performance Co-Pilot system and network performance monitoring software, System MIB (Provision), Software Distribution (Propel)
Compilers: ANSI C, C++, Fortran 77, Ada, Pascal, Power C Accelerator (PCA), Power Fortran 77, Fortran 90, Power Fortran 90
PC/Macintosh® integration: Syntax TotalNet Advance server, supports Windows® 95 and Windows NT® (SMB), NetWare, AppleShare®, Samba environments for PC and Macintosh
Security: Trusted IRIX™ B1 security, Commercial Security Pack (CSP)
Web server: Netscape® Enterprise server

SUP: CRAY T90





Model | Cray T94 | Cray T916 | Cray T932
No. of CPUs | 1-4 | 8-16 | 16-32
Peak performance | 1.8-7.2 GFLOPS | 14-28 GFLOPS | 28-56 GFLOPS
Memory size | 0.5-2GB | 2-8GB | 4-16GB
Peak memory bandwidth | 100GB/sec | 450GB/sec | 900GB/sec
Max. no. of GigaRing™ channels | 8 | 16 | 32
Peak I/O bandwidth | 8GB/sec | 16GB/sec | 32GB/sec
Cooling technology | Air or liquid | Liquid | Liquid

SUP: CRAY T3E





Cray T3E-1200E
No. of CPUs (air-cooled): 6-128
No. of CPUs (liquid-cooled): 32-2048
Processor MHz: 600
Peak performance: More than 2.4 TFLOPS
Memory size per processor: 256MB-2GB
Interconnect topology: 3D bidirectional torus
Max. bisection bandwidth: 122GB/sec
Max. no. of GigaRing channels: 128
Peak I/O bandwidth: 128GB/sec

SUP: CRAY SV1





Model | Cray SV1-1A | Cray SV1-1 | Cray SV1-4 | Cray SV1-8 | Cray SV1-32
Number of SMP nodes | 1 | 1 | 4 | 8 | 32
Peak performance (GFLOPS) | 9.6-19 | 9.6-38 | 38-154 | 77-308 | 308-1229
Memory size | 2-16GB | 4-32GB | 16-128GB | 32-256GB | 128-1024GB
Number of CPUs: 4.8 GFLOPS CPUs | Up to 3 | Up to 6 | Up to 24 | Up to 48 | Up to 192
Number of CPUs: plus 1.2 GFLOPS CPUs | 4+ | 8+ | 32+ | 64+ | 256+
Number of CPUs: or if all 1 GFLOPS | 9.6-19 | 9.6-38 | 38-154 | 77-308 | 308-1229
CPU clock (MHz) | 300 | 300 | 300 | 300 | 300
Memory technology | DRAM | DRAM | DRAM | DRAM | DRAM
Cooling options | Air | Air | Air | Air | Air or water assisted

SUP: FUJITSU AP3000




Node Specifications

Type of node | U170 | U200 | U300
Number of processors | 1 | 1 or 2 | 1 or 2
Processor | UltraSPARC | UltraSPARC | UltraSPARC
Clock rate | 167MHz | 200MHz | 300MHz
SPECint95 | 5.56 | 7.72 (1 CPU) / 7.88 (2 CPUs) | 12.1 (1 CPU) / 12.3 (2 CPUs)
SPECfp95 | 9.06 | 11.40 (1 CPU) / 14.70 (2 CPUs) | 15.5 (1 CPU) / 20.2 (2 CPUs)
Cache size, internal | 32KB | 32KB/CPU | 32KB/CPU
Cache size, external | 512KB | 1MB/CPU | 2MB/CPU
Memory size | 64MB to 1GB | 128MB to 2GB | 128MB to 2GB
Disk capacity | 2.1 to 4.2GB | 4.2 to 8.4GB | 4.2 to 8.4GB
Available S-BUS slots | 2 | 3 | 3


System Specifications

Number of nodes: 4 to 1024
Types of nodes: U170, U200, U300
Memory size: 256MB to 2TB
Disk capacity: 8.4GB to 8.4TB
Inter-node network: AP-Net (200MB/sec bi-directional)
External networks: Ethernet, Fast Ethernet, FDDI, ATM
Storage devices: Disk arrays, tape libraries, etc.
OS: Solaris 2.5.1 and later

SUP: FUJITSU VPP





Technical Specifications

System configurations:
Series | VPP700E Series | VPP300E Series | VX-E Series
No. of PEs | 8 to 256 | 1 to 16 | 1 to 4
Peak performance | 19.2 to 614.4 GFLOPS | 2.4 to 38.4 GFLOPS | 2.4 to 9.6 GFLOPS
Memory capacity | 4 to 512GB (512MB or 2GB per PE) | 512MB to 32GB (512MB or 2GB per PE) | 512MB to 8GB (512MB or 2GB per PE)
Memory throughput | 156.8 to 5017.6 GB/s | 19.6 to 313.6 GB/s | 19.6 to 78.4 GB/s
Disk capacity | Starting with 4GB | Starting with 4GB | Starting with 4GB
Crossbar network | 615MB/s x 2/PE | 615MB/s x 2/PE | 615MB/s x 2/PE



Model | VPP5000 | VPP5000U
Number of PEs | 4 to 128 (512*) | 1
Peak performance | 38.4 to 1,229 GFLOPS (4,915 GFLOPS*) | 9.6 GFLOPS
Memory size | 16 to 2,048 GB (8,192 GB*) | 4 to 16 GB
Crossbar | 3.2 GB/s/PE max. (send and receive) | -
* by special order
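
The ranges in the VPP tables scale linearly with the number of processing elements (PEs). The sketch below checks that, assuming per-PE figures read off the single-PE entries (2.4 GFLOPS and 19.6 GB/s for the E series, 9.6 GFLOPS for the VPP5000). The names `scale_range` and `SERIES` are illustrative and not part of any Fujitsu material.

```python
def scale_range(per_pe, pe_range):
    """Scale a per-PE figure to the (min, max) PE counts of a series."""
    low, high = pe_range
    return (per_pe * low, per_pe * high)

# Per-PE values are assumptions inferred from the single-PE table entries.
SERIES = {
    "VPP700E": {"pes": (8, 256), "gflops": 2.4, "mem_gbs": 19.6},
    "VPP300E": {"pes": (1, 16), "gflops": 2.4, "mem_gbs": 19.6},
    "VX-E": {"pes": (1, 4), "gflops": 2.4, "mem_gbs": 19.6},
    "VPP5000": {"pes": (4, 128), "gflops": 9.6, "mem_gbs": None},
}

for name, spec in SERIES.items():
    lo, hi = scale_range(spec["gflops"], spec["pes"])
    line = f"{name}: {lo:.1f} to {hi:.1f} GFLOPS"
    if spec["mem_gbs"] is not None:
        mlo, mhi = scale_range(spec["mem_gbs"], spec["pes"])
        line += f", {mlo:.1f} to {mhi:.1f} GB/s memory throughput"
    print(line)
```

Under these assumptions the computed ranges agree with the tabulated ones, for example 8 x 2.4 = 19.2 and 256 x 2.4 = 614.4 GFLOPS for the VPP700E.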

SUP: COMPAQ SC





The AlphaServer SC Roadmap

Year | CPU Clock (MHz) | CPUs per Node | Nodes | TFLOPS
1999 | 667 | 4 | 128 | 0.7
2000 | >800 | 32 | 128 | ~7
2001 | >1000 | 64 | 256 | ~30
2002 | >1200 | 64 | 256 | ~40
2004 | ~1500 | 64 | 256 | ~100


Specifications

Form Factor

Standard RETMA Racks, H9A15-MD, 230/240V

Processor/ES40

Choice of 1 to 4 667 MHz Alpha 21264a processors; each with 64 KB I-cache, 64 KB D-cache on chip, and 8 MB per processor of Level 2 cache

Memory/ES40

Choice of 1 – 16 GB, ECC, 4-way interleaved industry-standard DIMM memory, expandable to 8 GB

ES40 Architecture

Advanced dual 256-bit wide memory data paths and crossbar switch technology providing 5.2 GB/sec peak memory bandwidth; dual 64-bit PCI busses providing over 500 MB/sec peak I/O throughput

Performance

For the latest performance numbers visit www.compaq.com/hpc

ES40 PCI

10 64-bit PCI slots; 2-4 available with initial configurations

Storage Controllers

Integrated single-channel Ultra2-SCSI, Ultra SCSI RAID, HiPPI, and Fibre Channel

Network Controllers

Includes dual 10/100 Fast Ethernet, and asynch. Communications, optional Gigabit Ethernet and ATM configurations

Drive Bays

Includes 1 internal hot-swap StorageWorks Ultra2 SCSI drive bay in each ES40, with 2 18.2 GB 1.6" drives included. Four removable media bays: one 3.5" bay for diskette drive; one 5.25" for CD-ROM; and two open half-height 5.25" bays for tape or hard disk drives

Power Supply

Includes three hot-swap 750-watt (N+1) power supplies per ES40.

Cooling

Air, Six hot-swap redundant variable speed fans per ES40

High Availability

Hot-swap redundant power and cooling, auto reboot, thermal management software, remote system management, RAID, hot-swap drives, memory fail over, ECC memory, ECC cache, SMP CPU fail over, error logging, optional Uninterruptible Power Supply (UPS), and UPS Power Management Software

Service and Support

Compaq provides a 3-year on-site, 5 day x 9-hour warranty with next day response. Optional service options for up to 4-hour same-day response time are available, as well as a complete portfolio of worldwide service offerings to maximize your critical system environment.

Operating System

Tru64 UNIX 5.0 with Java.

Required Software

AlphaServer SC System Software is required with all AlphaServer SC Series systems. Also highly recommended is AlphaServer SC Development Software which includes FORTRAN, C++, and Developers Toolkit for each node. For the Development Software, all nodes must be licensed if ordered.

SC Interconnect

DMA driven; get and put; bandwidth 200 MB/sec/rail bi-directional.

SC Interconnect Switch

8-way x-bar chips; 16 or 128 port packages; up to 20 m cables; 0.035 usec switch Latency; 3 usec DMA/Shmem and 5.7 usec MPI Latency

Single System Management

AlphaServer SC System Software supports system wide gang scheduling, resource management and performance monitoring. Parallel File System, and Cluster File System software enable the management of up to four 32-node CFS domains.

Value-added Implementation Services (VIS)

AlphaServer SC standard package includes Staging and Integration of the AlphaServer SC System, storage devices, QSW interconnect switch and other peripheral devices, software load of Tru64 UNIX operating systems and associated layered products.

Optional services include custom site planning, custom freight arrangements, VIS engineering team travel and installation of AlphaServer SC, and tailored Customer Configuration Documentation.

Support and Training

90 days of Consultant-level onsite support is included to help address the planning and introduction of the AlphaServer SC into your environment. A full range of Training Courses is available at additional charge

Installation

Installation is included with each of the System Building Blocks and is available for quotation on additional and optional equipment


SUP: SUN HPC 10000





HPC 10000 Specifications

Number of processors: 4 to 64
Architecture: 400 MHz UltraSPARC-II
Cache per processor: Primary: 16-KB instruction and 16-KB data on chip. Secondary: 4-MB external cache
CPU interface: 64-bit Ultra Port Architecture (UPA) slots

System Boards

Number of boards: Maximum of 16 boards per system; minimum configuration contains four system boards. Each holds up to four processors, up to four SBus cards, and a memory module with four banks of eight SIMMs each

Main Memory

2-GB to 64-GB memory capacity per system

512-MB and 2-GB memory expansion options (each a group of 16 SIMMs)

Up to two memory expansion options per system board

Internal Mass Storage

Disk array: SPARCstorage Array Model 100 Series (maximum thirty 4.2-GB fast/wide SCSI-2 disks)
Disk tray: SPARCstorage RSM (max. 7 x 9.1-GB disks). SPARCstorage RSM 2000 with up to 318 GB of redundant storage.

Software

Operating system: Solaris™ 2.6
Parallel programming environment: MPI, PVM, Performance Workshop, C, C++, F77, F90, Java™
Networking: ONC™, NFS™, TCP/IP, SunNet™ OSI, MHS, X.25, DCE, Netware
Windowing system: OpenWindows™ Version 3 optional
Resource management software: Load Sharing Facility (LSF)

SUP: HP VCLASS





Technical Specifications

Model | V2250 | V2500 (SMP) | V2600 (SMP) | V2500/V2600 (SCA)

System Processing Unit
Central processor | 64-bit PA-RISC PA-8200 | 64-bit PA-RISC PA-8500 | 64-bit PA-RISC PA-8600 | 64-bit PA-RISC PA-8600
Clock frequency | 240 MHz | 440 MHz | 552 MHz | 440 or 552 MHz
Number of processors | 1-16 | 2-32 | 2-32 | 16-128
Cache size (per processor) | Direct map; 2-MB L2 Data Cache; 2-MB L2 Instruction Cache | Four-way set-associative; 1-MB L1 Data Cache (on-chip); 0.5-MB L1 Instruction Cache (on-chip) | Four-way set-associative; 1-MB L1 Data Cache (on-chip); 0.5-MB L1 Instruction Cache (on-chip) | Four-way set-associative; 1-MB L1 Data Cache (on-chip); 0.5-MB L1 Instruction Cache (on-chip)
Operating environment | HP-UX 11 | HP-UX 11 | HP-UX 11 | HP-UX 11


Model | V2250 | V2500 (1 Cabinet) | V2500 (4 Cabinets), SCA configuration | V2600

HyperPlane Crossbar, Memory Subsystem (SMP, SCA)
Memory architecture | Crossbar-based symmetric multiprocessor (SMP) | Crossbar-based symmetric multiprocessor (SMP) | Crossbar-based symmetric multiprocessor (local cabinet); cache-coherent, non-uniform memory access (remote cabinet) | Crossbar-based symmetric multiprocessor (SMP)
Type | 8x8 non-blocking multiported crossbar | 8x8 non-blocking multiported crossbar | 8x8 non-blocking multiported crossbar (local cabinet); 8-way interleaved, split-transaction SCA HyperLinks (remote cabinet) | 8x8 non-blocking multiported crossbar
Bandwidth (max.) | 15.36 GB/s | 15.36 GB/s | 61.44 GB/s | 15.36 GB/s
Memory capacity | 1 GB to 16 GB ECC protected | 1 GB to 32 GB ECC protected | 4 GB to 128 GB ECC protected | 1 GB to 32 GB ECC protected
Memory interleaving | Up to 32-way | Up to 256-way | 256-way per cabinet | Up to 256-way

I/O Subsystem
Number of channels | 8 x PCI 32-bit | 8 x PCI 64-bit | 32 x PCI 64-bit | 8 x PCI 64-bit
Channel bandwidth | 240 MB/s (bidirectional) | 240 MB/s (bidirectional) | 240 MB/s (bidirectional) | 240 MB/s (bidirectional)
Peak aggregate I/O channel bandwidth | 1.9 GB/s | 1.9 GB/s | 7.6 GB/s | 1.9 GB/s
Number of PCI I/O cards | 1 to 24 | 1 to 28 | 4 to 112 | 1 to 28
Controllers supported | FWD SCSI-2, ATM, Fibre Channel, Token Ring, 10/100Base-T, 1000Base-SX, FDDI, HyperFabric, X.25 | FWD SCSI-2, Ultra2 SCSI, ATM, Fibre Channel, Token Ring, 10/100Base-T, 100Base-FX, 1000Base-SX, FDDI, HyperFabric, X.25 | FWD SCSI-2, Ultra2 SCSI, ATM, Fibre Channel, Token Ring, 10/100Base-T, 100Base-FX, 1000Base-SX, FDDI, HyperFabric, X.25 | FWD SCSI-2, Ultra2 SCSI, ATM, Fibre Channel, Token Ring, 10/100Base-T, 100Base-FX, 1000Base-SX, FDDI, HyperFabric, X.25
User-accessible media drives | 650-MB 12X CD-ROM drive; 12-GB DDS-3 DAT drive | DVD drive; 12-GB DDS-3 DAT drive | DVD drive; 12-GB DDS-3 DAT drive | DVD drive; 12-GB DDS-3 DAT drive

Internal Storage (not supported for HA configurations)
Number of drives | 16 | 16 | 64 | 16
Capacity (max.) | 288 GB | 288 GB | 1.1 TB | 288 GB
External storage | Supported in 19" RETMA standard rack; up to 40 TB of Fibre Channel information storage | Supported in 19" RETMA standard rack; up to 50 TB of Fibre Channel information storage | Supported in 19" RETMA standard rack; up to 200 TB of Fibre Channel information storage | Supported in 19" RETMA standard rack; up to 50 TB of Fibre Channel information storage

Physical Specifications
Cabinet dimensions | 39" (99.06 cm) + 31.5" (80.01 cm) + 37" (93.98 cm) per cabinet | 39" (99.06 cm) + 31.5" (80.01 cm) + 37" (93.98 cm) per cabinet | 39" (99.06 cm) + 31.5" (80.01 cm) + 37" (93.98 cm) per cabinet | 39" (99.06 cm) + 31.5" (80.01 cm) + 37" (93.98 cm) per cabinet
Maximum cabinet weight | 490 lb. (222.73 kg) | 490 lb. (222.73 kg) | 490 lb. (222.73 kg) | 490 lb. (222.73 kg)

Environmental Specifications
Operating temperature | 20 to 30° C | 20 to 30° C | 20 to 30° C | 20 to 30° C
Humidity | 40 to 60% | 40 to 60% | 40 to 60% | 40 to 60%
Thermal dissipation | 5500 W (4351 kcal/hour) maximum | 7500 W (5934 kcal/hour) maximum | 7500 W (5934 kcal/hour) maximum | 7500 W (5934 kcal/hour) maximum
Power requirements (all models) | 208/220 VAC single-phase (U.S.); 200 VAC single-phase (Far East); 230 VAC single-phase (Europe)

CEN: Supercomputing Centers


CEN: CEPBA

CEPBA High Performance Computing Facilities
                                  

HPC Facilities

The following machines installed at CEPBA are offered in conjunction with those of CESCA under a common service policy managed by the C^4.


SGI Origin 2000 Server


from Silicon Graphics, with 64 MIPS R10000 processors (each with 4 MB of cache) and 8 GB of main memory.

Its theoretical peak performance is 32 Gflop/s.


DEC Alpha Server 8400

from Digital Equipment Corporation, with 12 Alpha 21264 processors running at 525 MHz and 2 GB of main memory.

Its theoretical peak performance is 12.5 Gflop/s.


Parsytec CCi

from Parsytec GmbH, with 16 Pentium II processors running at 266 MHz on 8 dual-processor mainboards, each with 128 MB of main memory (total: 1 GB).

The machine uses three internal communication technologies: Fast Ethernet (100 Mbps), High Speed Link (300 Mbps) and Myrinet (1.28 Gbps for the Myrinet-SAM switch and 800 Mbps for each HIPPI Myrinet adapter).

Its theoretical peak performance is 4.256 Gflop/s.


Convex C3480

from CONVEX Computer Corporation, with 8 vector processors and 1 Gigabyte of main memory.

Its theoretical peak performance is 400 Mflop/s for double precision operations and 800 Mflop/s for single precision.


CM-200

from Thinking Machines, with 2K processors and 256 Megabytes of memory.

Its theoretical peak performance is 640 Mflop/s for double precision operations and 1.28 Gflop/s for single precision.


Supernode SN-1000

from Parsys, with 32 T-800 transputers, each with 4 Megabytes of memory.

Its theoretical peak performance is 64 Mflop/s.





Last Update: 2nd Dec 1999



CEN: CESGA

Intensive Computing
CESGA

Centro de Supercomputación de Galicia

The main equipment at CESGA consists of a Fujitsu VPP300E vector supercomputer with 6 processors, a Fujitsu AP3000 massively parallel scalar supercomputer with 20 processors, and a Sun HPC4500 scalar server with 12 processors and shared memory.

COMPUTER | PROCESSOR | CPUs | MEMORY | PEAK PERFORMANCE
Fujitsu VPP300 | Vector | 6 | 12 GB | 14.4 GFLOPS
Fujitsu AP3000 | Scalar | 20 | 2.5 GB | 12 GFLOPS
Sun HPC 4500 | Scalar | 12 | 4 GB | 9.6 GFLOPS

This equipment was co-funded by the Xunta de Galicia, CSIC, CICYT and FEDER.

 

[Figure: evolution of installed GFLOPS]

sistemas@cesga.es


CEN: CESCA


The hardware

CESCA has three high-performance computers:

IBM SP2: 12 + 32 processors (42 thin160 and 2 wide), 12 GB of main memory, 494 GB of disk and a peak performance of 27.41 Gflop/s.

Hewlett-Packard Exemplar V2500: 16 PA8500 processors (440 MHz), 8 GB of main memory, 216 GB of disk and a peak performance of 28.16 Gflop/s.

Hewlett-Packard N4000: 8 PA8500 processors (also at 440 MHz), 4 GB of main memory, 227 GB of disk and a peak performance of 14.08 Gflop/s.

All three machines have superscalar processors, but they differ in how memory is accessed: the SP2 has distributed memory, while the other two are shared-memory machines.
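
The difference between the two memory organizations can be illustrated with a small sketch. It is only a simulation, assuming Python threads stand in for processors: in the first function every worker reads the same array directly, while in the second each worker receives a private copy of its slice and reports back only through a message queue. The function names are illustrative and do not correspond to any software on these machines.

```python
import queue
import threading

def shared_memory_sum(data, workers=2):
    """Shared memory: all workers read `data` and write into one shared list."""
    partial = [0] * workers  # visible to every worker

    def work(rank):
        partial[rank] = sum(data[rank::workers])

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

def message_passing_sum(data, workers=2):
    """Distributed memory (simulated): private slices, results sent as messages."""
    inbox = queue.Queue()  # stands in for the interconnect

    def work(local_slice):
        inbox.put(sum(local_slice))  # "send" the partial result

    threads = [
        threading.Thread(target=work, args=(list(data[r::workers]),))
        for r in range(workers)
    ]
    for t in threads:
        t.start()
    results = [inbox.get() for _ in range(workers)]  # "receive" one message per worker
    for t in threads:
        t.join()
    return sum(results)

numbers = list(range(100))
print(shared_memory_sum(numbers), message_passing_sum(numbers))
```

Both functions return the same total; what changes is who may touch which data, which is the property that separates the SP2 from the two HP machines.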

The V2500 connects processors to memory through an 8x8 crossbar that delivers 15.3 GB/s, whereas the N4000 uses two buses with a total aggregate bandwidth of 3.8 GB/s. The N4000's bus interconnect gives a much lower memory latency than the V2500 (130 ns versus 550 ns).

The maximum performance achieved when solving a system of linear equations (Rmax) is 16.17, 17.47 and 10.22 Gflop/s, respectively.
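
A common way to read these numbers is as a fraction of the theoretical peak. The sketch below computes that ratio from the Rmax and peak values quoted for the three CESCA machines; the name `efficiency` is an illustrative choice, not a term used by the source.

```python
# (machine, Rmax in Gflop/s, theoretical peak in Gflop/s), values from the text
machines = [
    ("IBM SP2", 16.17, 27.41),
    ("HP V2500", 17.47, 28.16),
    ("HP N4000", 10.22, 14.08),
]

def efficiency(rmax, rpeak):
    """Fraction of the theoretical peak reached on the linear-system benchmark."""
    return rmax / rpeak

for name, rmax, rpeak in machines:
    print(f"{name}: {100 * efficiency(rmax, rpeak):.1f}% of peak")
```

The N4000 reaches the largest share of its peak (roughly 73%), against roughly 59% for the SP2 and 62% for the V2500.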

Thanks to the agreement that created the Centre de Computació i Comunicacions de Catalunya, the CEPBA hardware is also available to our users: the Origin2000, the AlphaServer 8400 and the Parsytec CCi.

Technical characteristics and performance of the processors

Values in parentheses are ratios relative to the IBM SP2 wide node.

Processor | IBM SP2 wide | IBM SP2 thin160 | HP V2500 PA8500 | N4000 PA8500
Frequency (MHz) | 66 | 160 | 440 | 440
Bus width | 256 | 256 | 64 | 64
Data cache (KB) | 256 | 128 | 1024 | 1024
Peak performance (Mflop/s) | 266 | 640 (2.41) | 1760 (6.62) | 1760 (6.62)
LINPACK TPP | 236 | 532 (2.25) | 1047 (4.44) | 1290 (5.47)
LINPACK 100x100 | 130 | 315 (2.42) | 375 (2.88) | 375 (2.88)
SPECint95 | 3.8 | 8.61 (2.26) | n/a | 34.0 (8.95)
SPECfp95 | 12.4 | 25.8 (2.08) | n/a | 51.4 (4.14)

Glossary

  • Superscalar processors can start executing several scalar instructions simultaneously, so that several elements of a vector can be processed within a single iteration. In our case, the PA8500 can start four instructions and the SP2 processors, six.

  • If memory is shared among all processors, that is, there is a single address space for all of them, programming is very simple: data can be placed in any memory module and access is uniform for all processors.

  • If memory is distributed among the processors, that is, each processor has access only to its own memory, programming is more complex: when the data a processor needs lie in the address space of another processor, they must be requested and transferred through messages. Data locality must therefore be encouraged in order to minimize communication between processors and obtain good performance. The advantage of this organization is scalability: the system can grow to a larger number of processors than shared-memory systems and is therefore better suited to parallel machines.

  • There is a third type of organization, distributed shared memory, which combines the advantages of both: memory is physically distributed, so the system is scalable, but it is accessed through a single address space and is therefore easy to program.

  • When optimizing the performance of a supercomputer, one of the factors to consider is the amount of cache memory available per processor.

  • Supercomputer performance is measured in Gflop/s: 1 Gflop/s means that the processor performs 10^9 arithmetic operations (such as additions or multiplications) per second on real numbers encoded in 64-bit floating-point format.
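
The peak figures quoted for the CESCA machines can be reproduced by multiplying clock rate, floating-point operations per cycle and processor count. This is a minimal sketch, assuming four floating-point operations per cycle for the PA8500 (inferred from its 1,760 Mflop/s per-processor peak at 440 MHz, not stated directly in the source); the function name `peak_gflops` is illustrative.

```python
def peak_gflops(clock_mhz, flops_per_cycle, processors):
    """Theoretical peak in Gflop/s: clock (MHz) x flops per cycle x processors."""
    return clock_mhz * flops_per_cycle * processors / 1000.0

# HP V2500: 16 PA8500 processors at 440 MHz (assumed 4 flops per cycle)
v2500 = peak_gflops(440, 4, 16)
# HP N4000: 8 PA8500 processors at 440 MHz (same assumption)
n4000 = peak_gflops(440, 4, 8)
# IBM SP2: per-node peaks taken from the table (640 and 266 Mflop/s)
sp2 = (42 * 640 + 2 * 266) / 1000.0

print(f"V2500: {v2500:.2f} Gflop/s")
print(f"N4000: {n4000:.2f} Gflop/s")
print(f"SP2:   {sp2:.2f} Gflop/s")
```

With these inputs the results match the 28.16, 14.08 and 27.41 Gflop/s given for the three machines.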


    A historical overview of the available hardware



    © CESCA, MH/220496/041199
    CEN: INM


    INSTITUTO NACIONAL DE METEOROLOGIA: Technical Data Processing

    Technical Data Processing at the INM


    © INM.