How Do We Achieve High-Performance Computing?

Good Algorithms

Adaptive algorithms
Load balancing
Available parallelism (Amdahl's law)
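
To make the Amdahl's law item concrete, here is a minimal sketch of the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The function name and the 90% figure are illustrative assumptions, not taken from these notes.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work scales perfectly."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be >= 1")
    # The serial part of the work is not sped up by adding processors.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the run time is parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With 90% of the work parallelizable, the speedup stays below 1 / (1 - 0.9) = 10 no matter how many processors are added, which is why the available parallelism matters as much as the processor count.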

Moore's law:

The number of transistors on a chip doubles roughly every 18 to 24 months
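
As a rough illustration of the doubling rule, the sketch below computes the growth factor after a given time, assuming idealized exponential growth. The function name and the sample periods are assumptions for illustration; real chips do not follow the curve exactly.

```python
def transistor_growth(elapsed_months: float, doubling_months: float = 18.0) -> float:
    """Growth factor in transistor count under an idealized doubling rule."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    # One doubling per `doubling_months`, so the factor is 2^(elapsed / period).
    return 2.0 ** (elapsed_months / doubling_months)


if __name__ == "__main__":
    # Compare an 18-month and a 24-month doubling period over ten years.
    for period in (18.0, 24.0):
        print(period, round(transistor_growth(120.0, period), 1))
```

The result is very sensitive to the assumed doubling period: over the same ten years, the two periods give growth factors that differ by about a factor of three.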

Limit of latency:

Speed of light ≈ 3 × 10⁸ m/s (an upper bound on how fast any signal can travel)
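
To see why the speed of light limits latency, the following sketch computes how far a signal can travel in one clock cycle and the minimum round-trip time over a given distance. The 3 GHz clock and the 30 m distance are hypothetical values chosen for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 3.0e8  # approximate value, as used in the notes


def distance_per_cycle_m(clock_hz: float) -> float:
    """Maximum distance a signal can cover in one clock cycle."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return SPEED_OF_LIGHT_M_PER_S / clock_hz


def min_round_trip_s(distance_m: float) -> float:
    """Lower bound on the round-trip time over `distance_m` meters."""
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    clock_hz = 3.0e9   # hypothetical 3 GHz processor
    distance_m = 30.0  # hypothetical distance between two machines
    print(distance_per_cycle_m(clock_hz))             # meters per cycle
    print(min_round_trip_s(distance_m) * clock_hz)    # cycles per round trip
```

At 3 GHz a signal covers at most about 10 cm per cycle, so a round trip over 30 m costs at least several hundred cycles. Signals in copper or optical fiber travel slower than light in vacuum, so real latencies are higher than these bounds.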

Locality
Independent Basic Blocks
Latency Tolerant
Efficient Compiler

Pipeline
Superscalar
Explicit/Implicit