File System Design part 1: XFS

XFS

SGI’s XFS is a much more complicated file system. Why is that complication needed? To understand that, we need to know the problems with FFS and its ilk (Ext2). These problems are stability under power loss/system crash and scalability. You may notice that FFS has no way of dealing with system crashes. This file system must be mounted sync, meaning programs have to wait for writes to finish before moving on, for it to be stable during power loss. This is a big problem. Mounting file systems synchronously is just not acceptable for today’s systems, its simply too slow. The second problem is scalability and speed. FFS uses lists to store inodes and cylinder groups. This means the file system scales O(n), period. For those who have not taken discrete math (for shame...) scaling O(n) means that as the size of the file system increases, the search time increases at the same rate. This means bad mojo for big file systems. Another note, FFS directories are also stored as lists of file names, so try to stay away from big folders.

So, FFS and its like are not good enough. What’s next? XFS is one of the options. The first thing XFS does is to replace FFS’s lists with something called a B+Tree. B+Trees are complex. Let me rephrase that. B+Trees are very, very complex. First, lets look at what a tree is. Basically, it is a data structure that takes the form of an actual tree, with each node of the tree holding some sort of data and having two or more "branches" to other nodes. For the sake of this discussion, lets assume we are dealing with a "Binary Tree," where each "node" of the tree has at most 2 branches. The following picture shows this.

[7]

Next, lets look at a Binary Search Tree. This is a binary tree where the left branch is "less" than the node and the right branch is "greater" than the node. This cuts down search time significantly, because you can cut out whole parts of the tree when searching. Lets say its a tree of numbers. You go to the root node, its 5. However, your looking for 3, so you take the left branch. The next node is 2, so you take the right branch and boom, 3. Now imagine if that was a list. What if the 3 was at the end of the list? It is easy to see how much better this is.

There is a large problem though, these trees can become unbalanced. For example, lets take numbers 1 through 5. So, lets make our root node, 5, the node to the left 4, and so on. This is just a sorted list and we are suddenly right back where we started. There are types of trees that try to deal with that by introducing balancing routines. One of these is called a B+Tree. I am not going to define what balancing routine it uses, because it is obscenely complicated and not important for our discussion. Suffice to say that it does work.

B+Trees have some other interesting attributes.

They are NOT binary trees. They can and many times do have more than 2 branches.
All nodes don’t have data stored in them. In fact, only one type of node has data, the "leaves." These are nodes at the "bottom" of the tree that are only connected to a node above them. They don’t have any "child" nodes, only data.
本新闻共3页,当前在第1页 1 2 3

上一篇：在win下访问ext、reiserfs、xfs、ufs的分区

下一篇：Recovering Deleted/Lost/Missing Data From Novell Servers

_abh