Performance of scientific processing in networks of workstations
The growing processing power of standard workstations, together with the relative ease with which they can be made available for parallel processing, has contributed to their increasing use in computation-intensive application areas. These areas are usually referred to as scientific processing. One of them is linear algebra, where great effort has been devoted to optimizing solution methods for both serial and parallel computing.

Since the appearance of software libraries for parallel environments, such as PVM (Parallel Virtual Machine) and implementations of MPI (Message Passing Interface), the distributed processing power of networks of workstations has been available for parallel processing as well.

Strong emphasis has also been placed on the heterogeneous computing capability that these libraries provide over networks of workstations. However, there is a lack of published results on the performance obtained on this kind of parallel (more specifically, distributed) processing architecture.

Within the whole area of linear algebra applications, the most challenging operations in terms of performance are the so-called Level 3 BLAS (Basic Linear Algebra Subprograms). In Level 3 BLAS, all of the processing can be expressed (and solved) in terms of matrix-matrix operations. More specifically, the most studied operation has been matrix multiplication, which is in fact a benchmark in this application area.
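To make the benchmark operation concrete, the following is a minimal serial sketch of matrix multiplication together with the usual way of turning its running time into a performance figure. It is not taken from the work described here: the function names `matmul`, `matmul_flops`, and `measure_mflops` are hypothetical, the matrices are plain Python lists, and the code is an untuned illustration rather than a BLAS implementation. The operation count assumes one multiplication and one addition per innermost step, which gives 2·n·k·m floating-point operations for an n×k by k×m product.

```python
import time


def matmul(a, b):
    """Multiply an n x k matrix a by a k x m matrix b (lists of lists)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]  # reuse a[i][p] across the whole row of b
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c


def matmul_flops(n, k, m):
    """Operation count: one multiply and one add per innermost step."""
    return 2 * n * k * m


def measure_mflops(n):
    """Time one n x n product and report millions of operations per second."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    # Guard against a zero reading from the timer on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return matmul_flops(n, n, n) / elapsed / 1e6


if __name__ == "__main__":
    print(f"n=100: {measure_mflops(100):.1f} Mflop/s")
```

Because the operation count grows as n³ while the data grows only as n², the same measurement can be repeated for increasing n to see how the achieved rate changes with problem size.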