About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Documentation / prio_tree.txt




Custom Search

Based on kernel version 3.6.1. Page generated on 2012-10-11 09:36 EST.

1	The prio_tree.c code indexes vmas using 3 different indexes:
2		* heap_index  = vm_pgoff + vm_size_in_pages : end_vm_pgoff
3		* radix_index = vm_pgoff : start_vm_pgoff
4		* size_index = vm_size_in_pages
5	
6	A regular radix-priority-search-tree indexes vmas using only heap_index and
7	radix_index. The conditions for indexing are:
8		* ->heap_index >= ->left->heap_index &&
9			->heap_index >= ->right->heap_index
10		* if (->heap_index == ->left->heap_index)
11			then ->radix_index < ->left->radix_index;
12		* if (->heap_index == ->right->heap_index)
13			then ->radix_index < ->right->radix_index;
14		* nodes are hashed to left or right subtree using radix_index
15		  similar to a pure binary radix tree.
16	
17	A regular radix-priority-search-tree helps to store and query
18	intervals (vmas). However, a regular radix-priority-search-tree is only
19	suitable for storing vmas with different radix indices (vm_pgoff).
20	
21	Therefore, the prio_tree.c extends the regular radix-priority-search-tree
22	to handle many vmas with the same vm_pgoff. Such vmas are handled in
23	2 different ways: 1) All vmas with the same radix _and_ heap indices are
24	linked using vm_set.list, 2) if there are many vmas with the same radix
25	index, but different heap indices and if the regular radix-priority-search
26	tree cannot index them all, we build an overflow-sub-tree that indexes such
27	vmas using heap and size indices instead of heap and radix indices. For
28	example, in the figure below some vmas with vm_pgoff = 0 (zero) are
29	indexed by regular radix-priority-search-tree whereas others are pushed
30	into an overflow-subtree. Note that all vmas in an overflow-sub-tree have
31	the same vm_pgoff (radix_index) and if necessary we build different
32	overflow-sub-trees to handle each possible radix_index. For example,
33	in figure we have 3 overflow-sub-trees corresponding to radix indices
34	0, 2, and 4.
35	
36	In the final tree the first few (prio_tree_root->index_bits) levels
37	are indexed using heap and radix indices whereas the overflow-sub-trees below
38	those levels (i.e. levels prio_tree_root->index_bits + 1 and higher) are
39	indexed using heap and size indices. In overflow-sub-trees the size_index
40	is used for hashing the nodes to appropriate places.
41	
42	Now, an example prio_tree:
43	
44	  vmas are represented [radix_index, size_index, heap_index]
45	                 i.e., [start_vm_pgoff, vm_size_in_pages, end_vm_pgoff]
46	
47	level  prio_tree_root->index_bits = 3
48	-----
49													_
50	  0			 				[0,7,7]					 |
51	  							/     \					 |
52					      ------------------       ------------			 |     Regular
53	  				     /					   \			 |  radix priority
54	  1		 		[1,6,7]					  [4,3,7]		 |   search tree
55	  				/     \					  /     \		 |
56				 -------       -----			    ------       -----		 |  heap-and-radix
57				/		    \			   /		      \		 |      indexed
58	  2		    [0,6,6]	 	   [2,5,7]		[5,2,7]		    [6,1,7]	 |
59			    /     \		   /     \		/     \		    /     \	 |
60	  3		[0,5,5]	[1,5,6]		[2,4,6]	[3,4,7]	    [4,2,6] [5,1,6]	[6,0,6]	[7,0,7]	 |
61			   /			   /		       /		   		_
62	                  /		          /		      /					_
63	  4	      [0,4,4]		      [2,3,5]		   [4,1,5]				 |
64	  		 /			 /		      /					 |
65	  5	     [0,3,3]		     [2,2,4]		  [4,0,4]				 |  Overflow-sub-trees
66	  		/			/							 |
67	  6	    [0,2,2]		    [2,1,3]							 |    heap-and-size
68	  	       /		       /							 |       indexed
69	  7	   [0,1,1]		   [2,0,2]							 |
70	  	      /											 |
71	  8	  [0,0,0]										 |
72	  												_
73	
74	Note that we use prio_tree_root->index_bits to optimize the height
75	of the heap-and-radix indexed tree. Since prio_tree_root->index_bits is
76	set according to the maximum end_vm_pgoff mapped, we are sure that all
77	bits (in vm_pgoff) above prio_tree_root->index_bits are 0 (zero). Therefore,
78	we only use the first prio_tree_root->index_bits as radix_index.
79	Whenever index_bits is increased in prio_tree_expand, we shuffle the tree
80	to make sure that the first prio_tree_root->index_bits levels of the tree
81	is indexed properly using heap and radix indices.
82	
83	We do not optimize the height of overflow-sub-trees using index_bits.
84	The reason is: there can be many such overflow-sub-trees and all of
85	them have to be suffled whenever the index_bits increases. This may involve
86	walking the whole prio_tree in prio_tree_insert->prio_tree_expand code
87	path which is not desirable. Hence, we do not optimize the height of the
88	heap-and-size indexed overflow-sub-trees using prio_tree->index_bits.
89	Instead the overflow sub-trees are indexed using full BITS_PER_LONG bits
90	of size_index. This may lead to skewed sub-trees because most of the
91	higher significant bits of the size_index are likely to be 0 (zero). In
92	the example above, all 3 overflow-sub-trees are skewed. This may marginally
93	affect the performance. However, processes rarely map many vmas with the
94	same start_vm_pgoff but different end_vm_pgoffs. Therefore, we normally
95	do not require overflow-sub-trees to index all vmas.
96	
97	From the above discussion it is clear that the maximum height of
98	a prio_tree can be prio_tree_root->index_bits + BITS_PER_LONG.
99	However, in most of the common cases we do not need overflow-sub-trees,
100	so the tree height in the common cases will be prio_tree_root->index_bits.
101	
102	It is fair to mention here that the prio_tree_root->index_bits
103	is increased on demand, however, the index_bits is not decreased when
104	vmas are removed from the prio_tree. That's tricky to do. Hence, it's
105	left as a home work problem.
106	
Hide Line Numbers
About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.