5.2.6. Lessons Learned
Based on his experience with various distributed file systems, Satyanarayanan (1990b) has stated some general principles that he believes distributed file system designers should follow. We have summarized these in Fig. 5-15. The first principle says that workstations have enough CPU power that it is wise to use them wherever possible. In particular, given a choice of doing something on a workstation or on a server, choose the workstation because server cycles are precious and workstation cycles are not.
The second principle says to use caches. They can frequently save a large amount of computing time and network bandwidth.
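To make the caching principle concrete, here is a minimal sketch of a client-side whole-file cache with LRU eviction. The `fetch_fn` callback, the capacity, and the file paths are all illustrative assumptions, not part of any particular system; the point is only that repeated reads of the same file never leave the client.

```python
from collections import OrderedDict

class FileCache:
    """Minimal whole-file LRU cache, a sketch of the 'cache whenever
    possible' principle. fetch_fn stands in for a hypothetical RPC
    that reads a file from the server."""

    def __init__(self, fetch_fn, capacity=64):
        self.fetch_fn = fetch_fn
        self.capacity = capacity
        self.entries = OrderedDict()          # path -> contents, in LRU order
        self.hits = self.misses = 0

    def read(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)    # mark as most recently used
            self.hits += 1
            return self.entries[path]
        self.misses += 1
        data = self.fetch_fn(path)            # only misses reach the server
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data

server_reads = []
def fetch_from_server(path):                  # stand-in for the real server RPC
    server_reads.append(path)
    return b"contents of " + path.encode()

cache = FileCache(fetch_from_server)
cache.read("/etc/motd")
cache.read("/etc/motd")                       # second read is served locally
```

A real cache would also need a validity check (e.g., comparing timestamps or version numbers with the server), which this sketch omits.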
1. Workstations have cycles to burn
2. Cache whenever possible
3. Exploit the usage properties
4. Minimize systemwide knowledge and change
5. Trust the fewest possible entities
6. Batch work where possible
Fig. 5-15. Distributed file system design principles.
The third principle says to exploit usage properties. For example, in a typical UNIX system, about a third of all file references are to temporary files, which have short lifetimes and are never shared. By treating these specially, considerable performance gains are possible. In all fairness, there is another school of thought that says: "Pick a single mechanism and stick to it. Do not have five ways of doing the same thing." Which view one takes depends on whether one prefers efficiency or simplicity.
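One way to act on the usage properties is a placement policy that keeps short-lived, unshared temporary files on the local disk and sends everything else through the shared server. The sketch below is purely illustrative; the path prefixes and the policy names are assumptions, not a feature of any system discussed here.

```python
def placement_policy(path):
    """Sketch of 'exploit the usage properties': temporary files are
    short-lived and never shared, so keep them local and skip the
    server entirely; all other files take the normal shared path.
    The /tmp and /var/tmp prefixes are illustrative assumptions."""
    if path.startswith(("/tmp/", "/var/tmp/")):
        return "local-only"       # no network traffic, no coherence overhead
    return "server-backed"        # normal shared-file behavior

policy_for_temp = placement_policy("/tmp/sort01234")
policy_for_home = placement_policy("/home/ast/paper.tex")
```

This is exactly the kind of special case the opposing school of thought objects to: a second mechanism for a subset of files, bought at the price of a simpler uniform design.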
Minimizing systemwide knowledge and change is important for making the system scale. Hierarchical designs help in this respect.
Trusting the fewest possible entities is a long-established principle in the security world. If the correct functioning of the system depends on 10,000 workstations all doing what they are supposed to, the system has a big problem.
Finally, batching can lead to major performance gains. Transmitting a 50K file in one blast is much more efficient than sending it as fifty 1K blocks, because each message incurs a fixed per-message overhead (protocol processing and round-trip latency) on top of the time needed to move the data itself.
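The gain from batching can be estimated with a simple cost model: each message pays a fixed overhead, and the payload moves at link bandwidth. The overhead and bandwidth figures below are illustrative assumptions chosen only to show the shape of the trade-off, not measurements of any real network.

```python
def transfer_time(total_bytes, chunk_bytes,
                  per_msg_overhead_s=0.005,      # assumed fixed cost per message
                  bandwidth_bps=10_000_000):     # assumed 10 Mbps link
    """Estimated time to move total_bytes in messages of chunk_bytes each.
    Every message pays a fixed overhead; the payload itself is limited
    only by bandwidth, so chopping a file into many small messages
    multiplies the overhead term without reducing the payload term."""
    n_messages = -(-total_bytes // chunk_bytes)  # ceiling division
    return n_messages * per_msg_overhead_s + (total_bytes * 8) / bandwidth_bps

one_blast = transfer_time(50 * 1024, 50 * 1024)  # one 50K message
blocks    = transfer_time(50 * 1024, 1024)       # fifty 1K messages
```

Under these assumptions the payload term is identical in both cases, so the entire difference is the 49 extra message overheads paid by the block-at-a-time transfer.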