Linux memory management and the concept of page faults

http://blog.scoutapp.com/articles/2015/04/10/understanding-page-faults-and-memory-swap-in-outs-when-should-you-worry

Linux allocates memory to processes by dividing the physical memory into pages, and then mapping those physical pages to the virtual memory needed by a process. It does this in conjunction with the Memory Management Unit (MMU) in the CPU. Typically a page will represent 4KB of physical memory. Statistics and flags are kept about each page to tell Linux the status of that chunk of memory.

These pages can be in different states. Some will be free (unused), some will be used to hold executable code, and some will be allocated as data for a program. There are lots of clever algorithms that manage this list of pages and control how they are cached, freed and loaded.

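In short: the kernel, working with the MMU, divides physical memory into many pages, typically 4 KB each, and maps those pages into each process's virtual address space. When the CPU executes a process's instructions it works with virtual addresses, and the MMU translates each one through this mapping to find the physical address where the instruction is actually stored.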
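The mapping described above can be sketched with a toy page table. This is only an illustration, assuming 4 KB pages; the names `split_address`, `translate` and the dictionary-based page table are made up for this example and are not how the kernel or the MMU store their tables.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages, as in the text


def split_address(vaddr, page_size=PAGE_SIZE):
    """Split a virtual address into (virtual page number, offset within the page)."""
    return vaddr // page_size, vaddr % page_size


def translate(vaddr, page_table, page_size=PAGE_SIZE):
    """Translate a virtual address using a toy page table.

    page_table is a dict mapping virtual page number -> physical frame number.
    Returns None when the page is not mapped, which is the situation where
    real hardware would raise a page fault.
    """
    vpn, offset = split_address(vaddr, page_size)
    frame = page_table.get(vpn)
    if frame is None:
        return None
    # The offset inside the page is unchanged by translation.
    return frame * page_size + offset
```

For example, with virtual page 0x12 mapped to physical frame 0x7, `translate(0x12345, {0x12: 0x7})` gives 0x7345: the page number changes and the offset 0x345 stays the same.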

Imagine a large running program on a Linux system. The program executable size could be measured in megabytes, but not all that code will run at once. Some of the code will only be run during initialization or when a special condition occurs. Over time Linux can discard the pages of memory which hold executable code, if it thinks that they are no longer needed or will be used rarely. As a result not all of the machine code will be held in memory even when the program is running.

A program is executed by the CPU as it steps its way through the machine code. Each instruction is stored in physical memory at a certain address. The MMU handles the mapping from the virtual address space to the physical address space. At some point in the program's execution the CPU may need to address code which isn't in memory. The MMU knows that the page for that code isn't available (because Linux told it) and so the CPU will raise a page fault.

The name sounds more serious than it really is. It isn't an error, but rather a known event where the CPU is telling the operating system that it needs physical access to some more of the code.

Linux will respond by allocating more pages to the process, filling those pages with the code from the binary file, configuring the MMU, and telling the CPU to continue.

A page fault (strictly speaking, a major page fault here) sounds rather serious, but it is not actually an "error" at all.

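Roughly, it works like this. A program may occupy several MB, but not all of its instructions need to run at the same time: some run only during initialization and some only under particular conditions. Linux therefore does not load every instruction from disk into memory pages. When the CPU finds that the next instruction it needs is not in a physical memory page, it raises a page fault, and the kernel responds by loading that code from disk into a physical page. Strictly speaking, this is a major fault. There is another kind, the minor fault.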

There is also a special case scenario called a minor page fault which occurs when the code (or data) needed is actually already in memory, but it isn't allocated to that process. For example, if a user is running a web browser then the memory pages with the browser executable code can be shared across multiple users (since the binary is read-only and can't change). If a second user starts the same web browser then Linux won't load all the binary again from disk, it will map the shareable pages from the first user and give the second process access to them. In other words, a minor page fault occurs only when the page list is updated (and the MMU configured) without actually needing to access the disk.

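A minor page fault means that the instructions the CPU needs are already in a physical memory page, but that page has not been mapped into the current process. The CPU raises a minor page fault and the kernel maps the page into the current process, so a minor page fault does not need to access the disk.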
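Both kinds of fault are counted per process, and one way to look at the counters is `getrusage`. The sketch below reads `ru_minflt` and `ru_majflt` through Python's standard `resource` module before and after touching some freshly mapped memory. The helper names and the choice of 256 pages are assumptions made for this example, and the numbers you see will vary by system.

```python
import mmap
import resource


def fault_counts():
    """Return (minor, major) page fault counts for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def faults_during(action):
    """Run action() and return how many (minor, major) faults occurred meanwhile."""
    minor_before, major_before = fault_counts()
    action()
    minor_after, major_after = fault_counts()
    return minor_after - minor_before, major_after - major_before


def touch_fresh_pages(n_pages=256):
    """Map anonymous memory and write one byte into each page."""
    page = mmap.PAGESIZE
    buf = mmap.mmap(-1, n_pages * page)  # anonymous mapping, no file behind it
    for i in range(n_pages):
        buf[i * page] = 1  # first write to each page
    buf.close()


if __name__ == "__main__":
    minor, major = faults_during(touch_fresh_pages)
    print(f"minor faults: {minor}, major faults: {major}")
```

Touching new anonymous memory normally needs no disk access, so any faults it causes would be expected to show up in the minor count rather than the major one.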

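When physical memory runs short, the contents of some physical pages are written to disk to free up pages for processes to use. This is a swap out. (The process of writing pages out to disk to free memory is called swapping-out.)

Conversely, when a page the CPU needs turns out to have been swapped out to disk, it has to be swapped back in from disk to physical memory before the CPU can continue.

Swap in and swap out operations are both relatively slow, and frequent swapping in and out badly hurts system performance.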
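To see whether a system is actually swapping, one option is the `pswpin` and `pswpout` counters in `/proc/vmstat`, which count pages swapped in and out since boot. A minimal sketch follows; the function names are made up for this example, and the parsing assumes the usual "name value" line format of that file.

```python
def parse_swap_counters(vmstat_text):
    """Extract the cumulative pswpin/pswpout counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def read_swap_counters(path="/proc/vmstat"):
    """Read the counters from the running system (Linux only)."""
    with open(path) as f:
        return parse_swap_counters(f.read())
```

The values are totals since boot, so a single reading says little. Reading twice a few seconds apart and subtracting shows whether swapping is happening right now.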

-------DONE.-----------

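The flags field of the page struct contains the id of the node the page belongs to. Each node manages several zones, and the page's zone id within that node is also stored in the flags field. A page can therefore find its zone struct within its node, and so knows which zone it is in. See the page_zone function in mm.h.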
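The idea of packing the node id and zone id into the top bits of flags can be sketched as below. The bit widths and shifts are hypothetical, chosen only for illustration; in the real kernel they are fixed at build time and depend on the configuration. The function names loosely mirror the lookup described above but are not the kernel's code.

```python
# Hypothetical layout for illustration only: the real widths and shifts are
# chosen at kernel build time and differ between configurations.
FLAGS_BITS = 64
NODE_BITS = 6  # assumed width of the node id field
ZONE_BITS = 2  # assumed width of the zone id field
NODE_SHIFT = FLAGS_BITS - NODE_BITS
ZONE_SHIFT = NODE_SHIFT - ZONE_BITS


def pack_flags(node_id, zone_id, low_flags=0):
    """Build a flags word with node and zone ids in the top bits."""
    return (node_id << NODE_SHIFT) | (zone_id << ZONE_SHIFT) | low_flags


def page_node_id(flags):
    """Extract the node id field from a flags word."""
    return (flags >> NODE_SHIFT) & ((1 << NODE_BITS) - 1)


def page_zone_id(flags):
    """Extract the zone id field from a flags word."""
    return (flags >> ZONE_SHIFT) & ((1 << ZONE_BITS) - 1)


def page_zone(flags, nodes):
    """Look up the zone: nodes[node_id] is that node's list of zones."""
    return nodes[page_node_id(flags)][page_zone_id(flags)]
```

The point is that no extra pointer is needed in each page struct: two shifts and masks on flags are enough to index into the node's array of zones.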

Check the filesystem block size

[root@dg1 ~]# tune2fs -l /dev/sda1 |grep 'Block size'

Block size: 4096

[root@dg1 ~]#

Check the OS page size

[root@dg1 ~]# getconf PAGESIZE

4096

[root@dg1 ~]#
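The same two values can also be read from a program. A small sketch using Python's standard library follows; note that `f_bsize` from `statvfs` is the block size the filesystem reports, which normally matches the tune2fs value on ext filesystems but is not guaranteed to on every filesystem type.

```python
import os


def page_size():
    """Memory page size in bytes, the value that getconf PAGESIZE prints."""
    return os.sysconf("SC_PAGESIZE")


def fs_block_size(path="/"):
    """Block size reported by the filesystem that holds path."""
    return os.statvfs(path).f_bsize


if __name__ == "__main__":
    print("page size:", page_size())
    print("block size of /:", fs_block_size("/"))
```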

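Changing the block size:

The block size can be specified when the filesystem is created. If the filesystem is going to hold mostly large files, a larger block size gives better performance. Setting the block size of an ext2 filesystem to 4096 bytes instead of the default 1024 bytes can reduce file fragmentation and speed up fsck scans as well as file deletion and read operations. In addition, ext2 reserves 5% of the space for the root user; on a large filesystem, unless it is used for log files, 5% is rather a lot. The command

# mke2fs -b 4096 -m 1 /dev/hda6

changes the reserved share to 1% and creates the filesystem with a block size of 4096 bytes.

Which block size to use depends on your system as a whole. If the system is a mail or news server, a larger block size improves performance somewhat but wastes considerably more disk space. For example, if the average file size is 2145 bytes, a 4096-byte block size wastes 1951 bytes per file on average, while a 1024-byte block size wastes 927 bytes per file on average. How to balance performance against disk cost depends on what the application needs.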
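The waste figures above come from rounding each file up to a whole number of blocks. A short sketch of that arithmetic (the function name is made up for this example):

```python
def wasted_bytes(file_size, block_size):
    """Bytes left unused in the last block when a file occupies whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size - file_size


if __name__ == "__main__":
    for block_size in (4096, 1024):
        print(block_size, wasted_bytes(2145, block_size))
```

A 2145-byte file needs one 4096-byte block (4096 - 2145 = 1951 bytes wasted) or three 1024-byte blocks (3072 - 2145 = 927 bytes wasted), which matches the numbers in the text.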

Original address: http://outofmemory.cn/tougao/6070687.html