Greedy Page Replacement Algorithm For Flash-Aware Swap System
Abstract — Because of its attractive features, flash memory is replacing magnetic disk as swap storage. All page write operations to flash-memory-based swap storage are issued by the page replacement algorithm when it swaps out dirty pages to obtain free page frames. Due to the out-of-place update scheme, intensive write operations can quickly use up the flash-memory-based swap storage and incur frequent garbage collection operations with high energy consumption. Moreover, the cost of a flash page write operation is much higher than that of a flash page read operation. Therefore, in this paper, we propose a greedy page replacement algorithm, called GDLRU, for flash-aware swap systems. In order to reduce the number of flash page write operations, GDLRU introduces a clean-aware victim page selection method called CPS, which evicts clean pages preferentially. If there is no clean page, CPS evicts the dirty page with the least dirty data. To further reduce the number of flash page write operations, GDLRU also introduces a clean-aware victim page update scheme called CPU, which writes back only the dirty flash pages within the victim dirty page. The simulation results indicate that our proposed algorithm outperforms other existing page replacement algorithms in terms of replacement cost¹.

Index Terms — Flash memory, Page replacement algorithm, Swap storage

¹ This work was supported in part by the Research Fund of the Doctoral Program in China under Grant No. 20110191110038.
Mingwei Lin is with the College of Computer Science, Chongqing University, Chongqing, China (e-mail: [email protected]).
Shuyu Chen is with the School of Software Engineering, Chongqing University, Chongqing, China (e-mail: [email protected]).
Guiping Wang is with the College of Computer Science, Chongqing University, Chongqing, China (e-mail: [email protected]).
Contributed Paper
Original manuscript received 03/15/12
Revised manuscript received 04/01/12
Current version published 06/22/12
Electronic version published 06/22/12. 0098 3063/12/$20.00 © 2012 IEEE

I. INTRODUCTION
Recently, flash memory [6] has become one of the best storage media for portable consumer electronics such as MP4 players, digital cameras, smart phones and laptop computers, to name a few. Compared with the traditional magnetic hard disk, flash memory is advantageous in various aspects: faster data access speed, lighter weight, smaller dimensions, better shock resistance, lower power consumption, and less noise [14]-[15]. Because of these attractive features, as well as decreasing prices and increasing capacity, portable consumer electronics have been widely equipped with flash memory as their secondary storage media [10].
Because their memory resources are limited, portable consumer electronics currently exploit efficient swap systems [1] [6]-[8] that use flash memory as swap space, a cost-effective way to extend the limited memory space. All write operations to flash-memory-based swap storage are issued by the page replacement algorithm, which frees page frames for the pages being swapped in. Due to the out-of-place update scheme adopted to solve the erase-before-write constraint, intensive write operations can quickly use up the flash-memory-based swap storage and incur frequent garbage collection operations with high energy consumption. Moreover, the cost of a flash page write operation is much higher than that of a flash page read operation. Therefore, to reduce the energy consumption of battery-powered portable consumer electronics, the design principle for an efficient page replacement algorithm for flash-aware swap systems is to reduce the number of flash page write operations.
To follow this design principle, we propose a greedy page replacement algorithm, called GDLRU, for flash-aware swap systems. Our contributions can be summarized as follows:
(1) We introduce a greedy page replacement algorithm, called GDLRU, which enhances the normal LRU algorithm and maintains two page lists in LRU order, namely the clean page list and the dirty page list.
(2) In order to reduce the number of flash page write operations, a clean-aware victim page selection method called CPS is presented, which preferentially evicts a clean page from the clean page list. If the clean page list is empty, CPS evicts the dirty page with the least dirty data within the dirty page list.
(3) In order to further reduce the number of flash page write operations, a clean-aware victim page update scheme called CPU is proposed, which writes back only the dirty data within the selected victim dirty page.
To evaluate the effectiveness of our proposed GDLRU algorithm, we conducted various simulation experiments with traces captured while executing the xmms and gedit applications. The simulation results show that our proposed GDLRU algorithm outperforms existing page replacement algorithms for flash-aware swap systems in terms of replacement cost.
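The three contributions summarized in the Introduction can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the authors' kernel implementation: the class name `GDLRUSketch`, the `write_flash_page` callback, and the byte-offset interface of `access` are hypothetical; the split of a 4KB page into eight 512B flash pages is the assumption the paper states in Section IV; and ties between dirty pages holding the same amount of dirty data are broken here in LRU order, which the text does not specify.

```python
from collections import OrderedDict

FLASH_PAGE_SIZE = 512      # bytes per flash page (size assumed in Section IV)
FLASH_PAGES_PER_PAGE = 8   # a 4KB memory page spans eight 512B flash pages


class GDLRUSketch:
    """Illustrative model of GDLRU: two LRU-ordered lists plus CPS and CPU."""

    def __init__(self, capacity, write_flash_page):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Hypothetical callback(page_id, flash_page_index) standing in for a flash write.
        self.write_flash_page = write_flash_page
        self.clean = OrderedDict()  # page_id -> None, least recently used first
        self.dirty = OrderedDict()  # page_id -> list of 8 dirty flags, LRU first

    def access(self, page_id, write_offset=None):
        """Reference a page; write_offset is the byte written, or None for a read."""
        flags = self.dirty.pop(page_id, None)
        resident = flags is not None or page_id in self.clean
        self.clean.pop(page_id, None)
        if not resident and len(self.clean) + len(self.dirty) >= self.capacity:
            self.evict()
        if write_offset is not None:
            if flags is None:
                flags = [False] * FLASH_PAGES_PER_PAGE
            flags[write_offset // FLASH_PAGE_SIZE] = True
        if flags is None:
            self.clean[page_id] = None   # re-insert at the MRU end of the clean list
        else:
            self.dirty[page_id] = flags  # re-insert at the MRU end of the dirty list

    def evict(self):
        """Select a victim (CPS), write back its dirty flash pages (CPU)."""
        if self.clean:
            # CPS: prefer the least recently used clean page; nothing to write.
            victim, _ = self.clean.popitem(last=False)
            return victim, 0
        # CPS fallback: dirty page with the fewest dirty flash pages.
        # min() keeps the first minimum, so ties fall to the LRU-most page.
        victim = min(self.dirty, key=lambda p: sum(self.dirty[p]))
        flags = self.dirty.pop(victim)
        written = 0
        for index, is_dirty in enumerate(flags):
            if is_dirty:  # CPU: only dirty flash pages are written back
                self.write_flash_page(victim, index)
                written += 1
        return victim, written
```

With a capacity of two frames, writing twice to P1 and once to P2 and then reading P3 makes the sketch evict P2 (one dirty flash page) rather than P1 (two dirty flash pages), issuing a single flash page write.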
436 IEEE Transactions on Consumer Electronics, Vol. 58, No. 2, May 2012
[Figure omitted; the only recoverable label reads "One block".]
Fig. 1. The architecture of flash memory
TABLE I
THE PERFORMANCE OF NAND FLASH MEMORY
Metric                    Page read (512 bytes)   Page write (512 bytes)   Block erase (16K bytes)
Latency (μs)              348                     909                      1,881
Energy consumption (μJ)   99                      237.6                    422.4
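As a quick check on the asymmetry reported in Table I, the following Python sketch computes how much more a page write costs than a page read. The numbers are copied from the table; the dictionary and function names are illustrative choices of ours.

```python
# Figures copied from Table I (512-byte page read/write, 16K-byte block erase).
LATENCY_US = {"read": 348.0, "write": 909.0, "erase": 1881.0}
ENERGY_UJ = {"read": 99.0, "write": 237.6, "erase": 422.4}


def ratio(table, op, base="read"):
    """Cost of operation `op` relative to operation `base` in the given table."""
    return table[op] / table[base]


write_latency_ratio = ratio(LATENCY_US, "write")
write_energy_ratio = ratio(ENERGY_UJ, "write")
print(f"write/read latency: {write_latency_ratio:.2f}")
print(f"write/read energy:  {write_energy_ratio:.2f}")
```

By these figures a page write takes roughly 2.6 times as long as a page read and consumes 2.4 times the energy, before any erase operations triggered by garbage collection are counted.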
The remainder of this paper is organized as follows. Section II provides an overview of NAND flash memory. Section III briefly reviews existing work on page replacement algorithms for flash memory. Section IV describes the design and implementation of GDLRU. Section V shows the experimental results. Finally, in Section VI, we conclude our work.

II. BACKGROUND
In this section, we briefly give an overview of flash memory. By showing the distinct characteristics of flash memory, we identify the potential issues of traditional page replacement algorithms customized for magnetic disk, which motivate our work.

A. NAND Flash Memory
As illustrated in Fig. 1, a NAND flash memory is organized into many blocks, and each block consists of a fixed number of pages. Each page in turn consists of 512 bytes in the main area and 16 bytes in the spare area. The main area is usually used for storing data, while the spare area is often used to store management information and the error correction code, which corrects errors when reading and writing.
A NAND flash memory provides three basic operations: read, write, and erase. The read operation fetches data from a target page, while the write operation writes data to a page. The erase operation resets all the values of a target block to 1. Namely, the granularity of the erase operation is a block, while the granularity of the read and write operations is a page. However, flash memory exhibits a number of unique characteristics which can significantly affect energy consumption when traditional page replacement algorithms customized for magnetic disk are implemented directly in a flash-aware swap system. Firstly, flash memory adopts an out-of-place update scheme to solve its erase-before-write constraint. Namely, once a page has been written, it must be erased before a subsequent write operation can be performed on the same page. Secondly, as shown in Table I, the cost of a flash page write operation is much higher than that of a flash page read operation in terms of latency and energy consumption.

B. Motivation
Traditional page replacement algorithms [11]-[13] [16] have been tuned for decades under the assumption that magnetic disk is used as swap storage. These algorithms maximize the page hit ratio on the premise that magnetic disk can be overwritten in place and that its I/O operation costs are equal. Because the out-of-place update scheme is adopted to solve the erase-before-write constraint of flash memory, intensive write operations can quickly use up the flash-memory-based swap storage and incur frequent garbage collection operations with high energy consumption. Moreover, the cost of a flash page write operation is much higher than that of a flash page read operation. Therefore, these traditional page replacement algorithms are not suitable for a flash-aware Linux swap system that uses flash memory as swap storage. An efficient page replacement algorithm for a flash-aware swap system running on battery-powered portable consumer electronics must focus on reducing the number of flash page write operations.

III. RELATED WORK
Existing operating systems mainly use the Least Recently Used page replacement algorithm called LRU, whose primary
goal is to minimize the page fault ratio. However, as mentioned above, intensive write operations can increase energy consumption. Therefore, a number of page replacement algorithms that take the unique characteristics of flash memory into account have been studied to reduce the number of flash page write operations.
Park et al. proposed the first such page replacement algorithm, called CFLRU [2], which is modified from the LRU algorithm and delays the eviction of dirty pages. As illustrated in Fig. 2, CFLRU divides the LRU list into two regions, namely the working region and the clean-first region. The working region contains recently referenced pages, and most page hits are generated in this region. The pages in the clean-first region are victim page candidates, and the number of pages belonging to the clean-first region is decided by a window size, w. In order to reduce the number of page write operations, CFLRU preferentially evicts clean pages in the clean-first region in LRU order. If there is no clean page in the clean-first region, CFLRU evicts the remaining pages in LRU order, as the normal LRU algorithm does. Fig. 2 depicts an example of the CFLRU algorithm. Suppose the pages were accessed in the order P6, P5, P4, P3, P2, P1, so that P6 is at the LRU position, and the window size is 3. Although the page at the LRU position is the dirty page P6, CFLRU selects the clean page P5 as victim to reduce the number of flash page write operations, even though P5 was accessed more recently than P6.
CFLRU can significantly reduce the number of page write operations by delaying the eviction of dirty pages in the clean-first region. However, many clean pages remain in the working region, and a dirty page selected as victim often contains clean data, which can result in redundant flash page write operations to the flash-memory-based swap storage.
Jung et al. proposed another buffer replacement algorithm called LRU-WSR [3], which enhances the normal LRU algorithm with an add-on page replacement strategy, namely Write Sequence Reordering (WSR). LRU-WSR delays flushing dirty pages that are not regarded as cold in order to reduce the number of page write operations to flash memory. To identify cold dirty pages, a cold-detection algorithm is introduced and each page in the LRU list has an additional flag called the cold flag. A dirty page whose cold flag is set is regarded as cold, and the cold flag of a dirty page is cleared when the page is referenced again. LRU-WSR scans the LRU list and checks the least recently referenced page. If the checked page is clean, LRU-WSR evicts it regardless of the status of its cold flag. If it is dirty, LRU-WSR checks its cold flag. If the flag is not set, the dirty page is moved to the MRU position of the LRU list with its cold flag set, and another candidate page is selected from the LRU position of the LRU list. If the candidate page is dirty and its cold flag is set, the dirty page is evicted and flushed to flash memory. Although LRU-WSR can greatly reduce the number of page write operations, it does not evict all clean pages before dirty pages, and it does not take into account the clean data within a dirty page when that page is selected as victim and written back to the flash-memory-based swap storage.

IV. GREEDY PAGE REPLACEMENT ALGORITHM
The traditional page replacement algorithms designed for disk write back the full victim dirty page to the disk-based swap storage [5]. However, a full victim dirty page often contains clean data. Therefore, if the traditional page replacement algorithms are used for flash-memory-based swap storage, writing back a full victim dirty page of 4KB results in eight flash page write operations, some of them redundant, under the assumption that the flash page size is 512B. These redundant flash page write operations can quickly use up the flash-memory-based swap storage and incur frequent garbage collection operations with high energy consumption. Moreover, the cost of a flash page write operation is much higher than that of a flash page read operation. In order to reduce the number of flash page write operations, we propose a greedy page replacement algorithm called GDLRU for the flash-aware Linux swap system.
As illustrated in Fig. 3, GDLRU maintains two page lists, namely the clean page list and the dirty page list. Each page within the dirty page list is divided into eight flash pages under the assumption that the sizes of a page and a flash page are 4KB and 512B, respectively. Each flash page has a dirty flag. A flash page with its dirty flag set is called a dirty flash page.
When the number of free page frames in main memory falls below a threshold value, the greedy page replacement algorithm invokes the clean-aware victim page selection method to select a suitable page as victim. After a page is selected as victim, to further reduce the number of flash page write operations, the algorithm invokes the clean-aware victim page update scheme.

A. Clean-aware Victim Page Selection Method
The clean-aware victim page selection method, called CPS, works as follows. Firstly, CPS scans the clean page list and selects the least recently referenced page as victim. If there is no page in the clean page list, CPS scans the dirty page list instead and evicts the dirty page that minimizes the formula:

$$\frac{m}{n} \qquad (1)$$

where m is the number of dirty flash pages in the candidate dirty page and n is the total number of flash pages that the candidate dirty page contains. Fig. 3 shows an example of GDLRU. CPS evicts the least recently referenced page in the clean page list, P6. If the clean page list is empty, CPS scans the dirty page list instead and evicts the page P8, which minimizes (1).

B. Clean-aware Victim Page Update Scheme
After a page is selected as victim, the clean-aware victim page update scheme, called CPU, checks whether the victim page is clean. If the victim page is clean, CPU just removes it from the physical memory and frees the corresponding page frame. If the victim page is dirty, CPU writes back only the dirty flash pages within the victim page to the flash-memory-based swap storage. Fig. 4 shows an example of CPU. CPU writes back the dirty flash page f8 within the victim page P8 to the allocated free flash page F11. In this case, seven flash page write operations are eliminated.

TABLE II
THE CHARACTERISTICS OF XMMS AND GEDIT TRACES
                                             Memory references
Workload                  Memory footprint   Total references   Instruction read   Data read   Data write   Read : write
Xmms (mp3 player)         11.08MB            1,321,604          80,059             142,984     1,098,561    1 : 4.92
Gedit (document editor)   12.84MB            1,599,743          598,245            872,346     129,152      11.38 : 1

[Figure omitted: xmms. Weighted replacement cost (read:write = 1:8), vertical axis from 0 to 50000, for LRU, CFLRU, LRU-WSR, and GDLRU.]
[Figure omitted: gedit. Weighted replacement cost (read:write = 1:8), vertical axis from 0 to 35000, for LRU, CFLRU, LRU-WSR, and GDLRU.]

V. PERFORMANCE EVALUATION
In order to investigate the effectiveness of our proposed GDLRU algorithm, we equipped the flash-aware Linux swap system called SGBI, based on Linux 2.6, with different page replacement algorithms, namely LRU, CFLRU, LRU-WSR, and our proposed GDLRU algorithm. For the flash-aware Linux swap system, we gathered virtual memory reference traces and conducted trace-driven simulations. The traces that we used
were collected using the Valgrind toolset [9], and were captured while executing the xmms and gedit applications on a Linux/x86 machine. Table II shows the characteristics of the traces, which are also illustrated in [4].
The performance metric we used is replacement cost. To evaluate the replacement cost, we use the weighted count of read and write operations on flash memory. The write count is weighted eight times higher than the read count. This weighting is based on the access latency and energy consumption characteristics of NAND flash memory, taking into account the cost of potential erase operations.
Fig. 5 and Fig. 6 show the replacement cost of the four page replacement algorithms. It can be seen from Fig. 5 that GDLRU incurs the fewest flash page read operations and the lowest replacement cost. The reason is that xmms is a write-intensive application and GDLRU aggressively delays the eviction of dirty pages to reduce the number of flash page write operations. Because gedit is a read-intensive application and GDLRU evicts clean pages preferentially, GDLRU degrades the page hit ratio, which shows up as an increased number of flash page read operations. However, GDLRU introduces the clean-aware victim page update scheme to reduce the number of flash page write operations significantly, so the replacement cost is still reduced, as illustrated in Fig. 6.

VI. CONCLUSION
In this paper, we propose a greedy page replacement algorithm for the flash-aware Linux swap system, called GDLRU. GDLRU maintains two page lists, namely the clean page list and the dirty page list. In order to reduce the number of flash page write operations, GDLRU introduces a clean-aware victim page selection method that preferentially evicts the least recently referenced page within the clean page list. If the clean page list is empty, GDLRU selects the dirty page with the fewest dirty flash pages as victim. After a victim is selected, to further reduce the number of flash page write operations, GDLRU adopts a clean-aware victim page update scheme that checks whether the victim page is clean. If it is, GDLRU just removes it from the physical memory and swaps in the requested pages. Otherwise, GDLRU writes back only the dirty flash pages within the victim page to flash memory. We conducted a series of experiments and obtained encouraging results.

REFERENCES
[1] M. Lin, S. Chen, G. Lv and Z. Zhou, “Optimised Linux swap system for flash memory,” Electronics Letters, Vol. 47, No. 11, pp. 641-642, 2011.
[2] S. Y. Park and D. Jung, “CFLRU: A replacement algorithm for flash memory,” Proc. of the 2006 International Conference on Compilers, Architecture and Synthesis for embedded systems, pp. 234-241, 2006.
[3] H. Jung, H. Shim, S. Park, S. Kang and J. Cha, “LRU-WSR: Integration of LRU and writes sequence reordering for flash memory,” IEEE Transactions on Consumer Electronics, Vol. 54, No. 3, pp. 1215-1223, 2008.
[4] O. Kwon, B. Hyokyung and K. Kern, “FARS: A page replacement algorithm for NAND flash memory based embedded systems,” Proc. of the IEEE 8th International Conference on Computer and Information Technology, pp. 218-223, 2008.
[5] D. P. Bovet and M. Cesati, “Understanding the Linux kernel,” O’Reilly, third edition, 2006.
[6] D. Jung, J. S. Kim, S. Y. Park, J. U. Kang and J. Lee, “A flash-aware swap system,” Proc. of International Workshop on Software Support for Portable Storage, San Francisco, CA, USA, 2005.
[7] O. Kwon and K. Koh, “Swap space management technique for portable consumer electronics with NAND flash memory,” IEEE Transactions on Consumer Electronics, Vol. 56, No. 3, pp. 1524-1531, 2010.
[8] S. Ko, S. Jun, Y. Ryu, O. Kwon and K. Koh, “A new Linux swap system for flash memory storage devices,” Proc. of the International Conference on Computational Sciences and its Applications, pp. 151-156, 2008.
[9] N. Nethercote and J. Seward, “Valgrind: A program supervision framework,” Electronic Notes in Theoretical Computer Science, Vol. 89, No. 2, 2003.
[10] S. Park and S. Y. Ohm, “New techniques for real-time FAT file system in mobile multimedia devices,” IEEE Transactions on Consumer Electronics, Vol. 52, No. 1, pp. 1-9, 2006.
[11] J. T. Robinson and M. V. Devarakonda, “Data cache management using frequency-based replacement,” Proc. of ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pp. 134-142, 1990.
[12] E. J. O’Neil, P. E. O’Neil and G. Weikum, “The LRU-K page replacement algorithm for database disk buffering,” Proc. of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 297-306, 1993.
[13] S. Jiang and X. D. Zhang, “LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance,” Proc. of ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pp. 31-42, 2002.
[14] H. Yang, L. Han, Y. Yoo, D. Lim and Y. S. Ryu, “Design of multimedia file system on flash memory storage,” Proc. of Korea Multimedia Society Fall Conference, 2005.
[15] L. Han and Y. Ryu, “Performance comparison of file systems on flash disk and hard disk,” Proc. of Korea Multimedia Society Fall Conference, 2004.
[16] T. Johnson and D. Shasha, “2Q: A low overhead high performance buffer management replacement algorithm,” Proc. of the 20th International Conference on Very Large Databases, 1994.

BIOGRAPHIES
Mingwei Lin received his B. S. degree in 2009 from Chongqing University, P. R. China. He is currently a PhD candidate at Chongqing University. He has been invited to review for the Journal of Systems and Software as well as Computers and Electrical Engineering. His current interests include flash memory and the Linux kernel.

Shuyu Chen received his PhD degree from Chongqing University, P. R. China. Currently, he is a professor in the School of Software Engineering at Chongqing University. His research interests include embedded Linux systems, cloud computing and the Linux kernel.

Guiping Wang is a PhD student in the College of Computer Science at Chongqing University. His research interests include dependability analysis and design of electronic systems. As first author, he has published about 10 journal and conference papers in related research areas in recent years.