Difference between revisions of "Why do I need a cluster file system"

From Linuxintro
imported>ThorstenStaerk

Revision as of 16:47, 2 April 2009

Why do you need a special file system if you have more than one computer accessing a partition?

First, if two computers each keep their own file system cache, one can write a block while the other still fetches an obsolete copy of that block from its cache, so the two see inconsistent data. But if you switch off the file system cache, this problem is solved.
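The stale-cache effect can be reproduced with a toy model. This is only a sketch: `SharedDisk`, `Node` and `stale_read_demo` are made-up names for illustration, and a dict stands in for the shared partition.

```python
class SharedDisk:
    """Stands in for the partition that both computers are attached to."""
    def __init__(self):
        self.blocks = {}


class Node:
    """A computer with its own private block cache (hypothetical model)."""
    def __init__(self, disk, use_cache=True):
        self.disk = disk
        self.use_cache = use_cache
        self.cache = {}

    def read(self, block_no):
        if self.use_cache and block_no in self.cache:
            return self.cache[block_no]   # served locally, disk is not consulted
        data = self.disk.blocks.get(block_no)
        if self.use_cache:
            self.cache[block_no] = data
        return data

    def write(self, block_no, data):
        self.disk.blocks[block_no] = data  # write through to the shared disk
        if self.use_cache:
            self.cache[block_no] = data


def stale_read_demo(use_cache):
    disk = SharedDisk()
    disk.blocks[7] = "old"
    a, b = Node(disk, use_cache), Node(disk, use_cache)
    b.read(7)           # B reads the block (and caches it if caching is on)
    a.write(7, "new")   # A updates the block on disk
    return b.read(7)    # what does B see now?
```

With `use_cache=True` computer B still gets the obsolete block, and with `use_cache=False` it gets the block A wrote.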

OK, let's look at the next problem: Computers A and B both read a counter file and increment it:

Computer A reads 20 
Computer B reads 20
Computer A writes 21
Computer B writes 21

And we have an inconsistency: two increments happened, but the counter only went from 20 to 21. But wait, this is not a cluster-specific problem; you will have the same problem in any multitasking environment.
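To see that only the order of the steps matters, here is a minimal sketch that replays a schedule of reads and writes against one counter. The function `run` and the two schedules are invented for this illustration; no real files or computers are involved.

```python
def run(schedule, start=20):
    """Replay read/write steps by computers 'A' and 'B' against one counter."""
    counter = start
    seen = {}  # the value each computer read last
    for computer, action in schedule:
        if action == "read":
            seen[computer] = counter
        else:  # "write": store the value read earlier, plus one
            counter = seen[computer] + 1
    return counter


# The order from the example above: both read before either writes.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

# The order you would get if A finished before B started.
serialized = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
```

`run(interleaved)` ends at 21 and loses one increment, while `run(serialized)` ends at 22. Nothing in the sketch depends on A and B being separate machines, which is the point: two processes on one computer can interleave the same way.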

The real problem is the file system metadata. A file system lives on disk, but changes to it are computed in the memory of a computer, and an ordinary file system assumes that only one computer does this. The file system metadata is always cached, and even if it were not, you would run into race conditions. Take this example:

Computer A looks for a free inode, finds number 20
Computer B looks for a free inode, finds number 20
Computer A writes data into inode number 20
Computer B writes data into inode number 20

BOING, data is lost.
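The inode race can be sketched the same way. This is a toy model under stated assumptions: the free-inode bitmap is a dict, each computer works from its own copy of it, and `find_free_inode` and `allocate_without_coordination` are hypothetical names, not functions of any real file system.

```python
def find_free_inode(bitmap):
    """Return the lowest inode number marked free in this copy of the bitmap."""
    return min(n for n, free in bitmap.items() if free)


def allocate_without_coordination():
    on_disk_bitmap = {19: False, 20: True, 21: True}
    inodes = {}  # inode number -> contents on the shared disk

    # Each computer works from its own cached copy of the metadata.
    bitmap_a = dict(on_disk_bitmap)
    bitmap_b = dict(on_disk_bitmap)

    inode_a = find_free_inode(bitmap_a)  # A finds number 20
    inode_b = find_free_inode(bitmap_b)  # B finds number 20 as well

    bitmap_a[inode_a] = False  # each marks it used, but only in its own copy
    bitmap_b[inode_b] = False

    inodes[inode_a] = "file created by A"
    inodes[inode_b] = "file created by B"  # overwrites what A just wrote
    return inode_a, inode_b, inodes
```

Both computers pick inode 20, and afterwards the disk holds only B's data. A cluster file system has to make "find a free inode and mark it used" a single step that all nodes agree on.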

Finally, it is possible to [http://man-wiki.net/index.php/9:sys_flock lock a file], and such locks must of course work across nodes, not just between processes on one computer.
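For reference, this is roughly what file locking looks like on a single Unix computer, using Python's `fcntl.flock`. It is a sketch, not a cluster solution: `locked_increment` and `demo` are illustrative names, and the lock taken here is only guaranteed to be seen by processes on the same machine. Whether another node honors it depends on the file system, which is exactly why a cluster file system is needed.

```python
import fcntl
import os
import tempfile


def locked_increment(path):
    """Increment the integer stored in `path` while holding an exclusive lock."""
    with open(path, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks until no other process holds the lock
        try:
            value = int(f.read().strip() or "0")
            f.seek(0)
            f.truncate()
            f.write(str(value + 1))
            f.flush()
            os.fsync(f.fileno())        # push the new value to disk before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return value + 1


def demo():
    """Create a temporary counter file holding 20 and increment it twice."""
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write("20")
    try:
        locked_increment(tmp.name)
        return locked_increment(tmp.name)
    finally:
        os.unlink(tmp.name)
```

Because the read and the write happen while the lock is held, two local processes calling `locked_increment` cannot both read 20 and both write 21.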