C programmers have often taken volatile to mean that a variable can be changed outside of the current thread of execution. As a result, they are sometimes tempted to use it in kernel code when shared data structures are involved. In other words, they treat volatile types as a sort of easy atomic variable, which they are not. The use of volatile in kernel code is almost never correct; this article describes why.
The key point to understand about volatile is that its purpose is to suppress optimization, which is almost never what one really wants to do. In the kernel, shared data must be protected against unwanted concurrent access, which is a very different task. The process of protecting against unwanted concurrency also happens to avoid almost all optimization-related problems, and in a more efficient way.
Like volatile, the kernel primitives that make concurrent access to shared data safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent unwanted optimization. If these synchronization primitives are used properly, there is no need to use volatile as well. If volatile still seems necessary, there is almost certainly a bug somewhere in the code. In properly written kernel code, volatile can only slow things down.
Consider a typical block of kernel code:
    spin_lock(&the_lock);
    do_something_on(&shared_data);
    do_something_else_with(&shared_data);
    spin_unlock(&the_lock);
If all the code follows the locking rules, the value of shared_data cannot change unexpectedly while the_lock is held; any other code that wants to touch the shared data will be waiting on the lock. The spinlock primitives also act as memory barriers - they are explicitly written to do so - which means that data accesses will not be optimized across them. So the compiler might think it knows what is stored in shared_data, but the spin_lock() call, since it acts as a memory barrier, forces it to forget anything it knows (for example, the value of shared_data). There will be no optimization problems with accesses to that data.
    static inline void __raw_spin_lock(raw_spinlock_t *lock)
    {
        preempt_disable();
        spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
    }

    #ifdef CONFIG_PREEMPT_COUNT
    #define preempt_disable() \
    do { \
        preempt_count_inc(); \
        barrier(); \
    } while (0)
    #else
    #define preempt_disable()  barrier()
    #endif

    #define barrier() __asm__ __volatile__("": : :"memory")
- When the synchronization primitives are used, the memory barrier comes along for free; there is no need to suppress optimization separately with volatile (see the sketch below).
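A minimal sketch of what that compiler barrier means in practice, reusing the hypothetical shared_data and the_lock names from the fragment above (illustration only, not kernel source):

    #include <linux/spinlock.h>

    extern spinlock_t the_lock;
    extern int shared_data;

    int example(void)
    {
        int cached, fresh;

        cached = shared_data;   /* the compiler may keep this value in a register */
        spin_lock(&the_lock);   /* compiler + memory barrier */
        fresh = shared_data;    /* forced to re-read shared_data from memory here */
        spin_unlock(&the_lock);

        return cached + fresh;
    }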
Even if shared_data were declared volatile, the locking would still be necessary. But the compiler would also be prevented from optimizing access to shared_data within the critical section, where we know that nobody else can be working with it. While the lock is held, shared_data is not volatile. When dealing with shared data, proper locking makes volatile unnecessary, and potentially harmful.
- Even with volatile, the lock is still needed; using volatile on top of proper locking can only hurt performance (a sketch follows this note).
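To illustrate the cost, here is a minimal, hypothetical sketch (the counter structure and field names are invented for this example, not taken from the kernel source). If the field were volatile, the compiler would have to reload and store it on every access inside the critical section, even though the lock already guarantees exclusive access:

    #include <linux/spinlock.h>

    struct counter {
        spinlock_t lock;
        /* If this were "volatile long value;", each use below would become a
         * separate load/store, even though the lock makes the accesses safe. */
        long value;
    };

    static void add_three_times(struct counter *c, long n)
    {
        spin_lock(&c->lock);
        /* Without volatile, the compiler may keep c->value in a register
         * across these three statements and write it back once. */
        c->value += n;
        c->value += n;
        c->value += n;
        spin_unlock(&c->lock);
    }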
The volatile storage class was originally meant for memory-mapped I/O registers. Within the kernel, register accesses, too, should be protected by locks, but one also does not want the compiler "optimizing" register accesses within a critical section. However, within the kernel, I/O memory accesses are always done through accessor functions; accessing I/O memory directly through pointers is frowned upon and does not work on all architectures. Those accessor functions are written to prevent unwanted optimization, so, once again, volatile is unnecessary.
- Memory-mapped I/O register accesses should also be protected by locks; direct pointer access to I/O memory is discouraged, and the accessor functions already make volatile unnecessary (a sketch follows this note).
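A rough sketch of what this looks like in a driver, assuming a hypothetical device with made-up register offsets and names; only the readl()/writel() accessors and the spinlock are real kernel interfaces:

    #include <linux/io.h>
    #include <linux/spinlock.h>

    #define MYDEV_CTRL   0x00   /* assumed control register offset */
    #define MYDEV_STATUS 0x04   /* assumed status register offset */

    struct mydev {
        void __iomem *regs;     /* mapped with ioremap() at probe time */
        spinlock_t lock;
    };

    static u32 mydev_kick(struct mydev *dev, u32 cmd)
    {
        u32 status;

        spin_lock(&dev->lock);
        /* readl()/writel() are the accessor functions; they already keep the
         * compiler from caching or reordering these register accesses, so no
         * volatile is needed here. */
        writel(cmd, dev->regs + MYDEV_CTRL);
        status = readl(dev->regs + MYDEV_STATUS);
        spin_unlock(&dev->lock);

        return status;
    }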
Another situation where one might be tempted to use volatile is when the processor is busy-waiting on the value of a variable. The right way to perform a busy wait is:
    while (my_variable != what_i_want)
        cpu_relax();

    /* arm64 */
    static inline void cpu_relax(void)
    {
        asm volatile("yield" ::: "memory");
    }

    /* arm32 */
    #define cpu_relax() barrier()
The cpu_relax() call can lower CPU power consumption or, on a hyperthreaded processor, yield resources to its twin; it also happens to serve as a compiler barrier, so, once again, volatile is unnecessary. Of course, busy-waiting is generally an anti-social act to begin with.
- A better alternative to volatile for busy-waiting: cpu_relax(), as illustrated in the sketch below.
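A minimal sketch of the difference, with a made-up flag name (stop_requested) that some other thread or interrupt handler is assumed to set; this is an illustration, not kernel code:

    #include <linux/processor.h>    /* cpu_relax() */

    static int stop_requested;

    static void wait_broken(void)
    {
        /* No barrier here: the compiler may read stop_requested once, keep it
         * in a register, and compile this into an endless loop. */
        while (!stop_requested)
            ;
    }

    static void wait_correct(void)
    {
        /* cpu_relax() is a compiler barrier (and may lower power or yield an
         * SMT sibling), so stop_requested is re-read on every iteration
         * without declaring it volatile. */
        while (!stop_requested)
            cpu_relax();
    }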
There are still a few rare situations where volatile makes sense in the kernel:
- The above-mentioned I/O accessor functions may use volatile on architectures where direct I/O memory access does work. Essentially, each accessor call becomes a little critical section of its own and ensures that the access happens as the programmer expects.
- Inline assembly code that changes memory, but has no other visible side effects, risks being deleted by GCC. Adding the volatile keyword to asm statements prevents this removal, as in the barrier() macro:
#define barrier() __asm__ __volatile__("": : :"memory")
- The jiffies variable is special in that it can have a different value every time it is referenced, yet it can be read without any special locking. So jiffies can be declared volatile, but adding other variables of this type is strongly discouraged.
In this regard, jiffies is considered a "stupid legacy" issue (Linus's words); fixing it would be more trouble than it is worth.
- Pointers to data structures in coherent memory that might be modified by I/O devices can, sometimes, legitimately be declared volatile. A ring buffer used by a network adapter, where the adapter changes pointers to indicate which descriptors have been processed, is a typical example (a rough sketch follows this list).
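As a rough illustration only: the descriptor layout, field names, and the DONE bit below are invented, and many real drivers prefer explicit DMA barriers over volatile. The point is simply that when a device updates coherent memory behind the CPU's back, the compiler must not cache the value:

    #include <linux/types.h>

    struct rx_desc {
        __le64 buf_addr;    /* buffer address handed to the device */
        __le32 len;
        u32    status;      /* written by the adapter via DMA */
    };

    #define RX_DESC_DONE 0x1    /* assumed "descriptor processed" bit */

    /* The descriptor lives in coherent DMA memory, so the device can change
     * it at any time; the volatile qualifier on the pointer tells the
     * compiler to re-read the status field on every call. */
    static bool rx_desc_done(volatile struct rx_desc *desc)
    {
        return desc->status & RX_DESC_DONE;
    }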
For most code, none of the above justifications for volatile apply. As a result, the use of volatile is likely to be seen as a bug and will bring additional scrutiny to the code. Developers who are tempted to use volatile should take a step back and think about what they are truly trying to accomplish. Patches that remove volatile variables are generally welcome, as long as they come with a justification showing that the concurrency issues have been properly thought through.
The original text follows:
C programmers have often taken volatile to mean that the variable could be changed outside of the current thread of execution; as a result, they are sometimes tempted to use it in kernel code when shared data structures are being used. In other words, they have been known to treat volatile types as a sort of easy atomic variable, which they are not. The use of volatile in kernel code is almost never correct; this document describes why.

The key point to understand with regard to volatile is that its purpose is to suppress optimization, which is almost never what one really wants to do. In the kernel, one must protect shared data structures against unwanted concurrent access, which is very much a different task. The process of protecting against unwanted concurrency will also avoid almost all optimization-related problems in a more efficient way.

Like volatile, the kernel primitives which make concurrent access to data safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent unwanted optimization. If they are being used properly, there will be no need to use volatile as well. If volatile is still necessary, there is almost certainly a bug in the code somewhere. In properly-written kernel code, volatile can only serve to slow things down.

Consider a typical block of kernel code:

    spin_lock(&the_lock);
    do_something_on(&shared_data);
    do_something_else_with(&shared_data);
    spin_unlock(&the_lock);

If all the code follows the locking rules, the value of shared_data cannot change unexpectedly while the_lock is held. Any other code which might want to play with that data will be waiting on the lock. The spinlock primitives act as memory barriers - they are explicitly written to do so - meaning that data accesses will not be optimized across them. So the compiler might think it knows what will be in shared_data, but the spin_lock() call, since it acts as a memory barrier, will force it to forget anything it knows. There will be no optimization problems with accesses to that data.

If shared_data were declared volatile, the locking would still be necessary. But the compiler would also be prevented from optimizing access to shared_data _within_ the critical section, when we know that nobody else can be working with it. While the lock is held, shared_data is not volatile. When dealing with shared data, proper locking makes volatile unnecessary - and potentially harmful.

The volatile storage class was originally meant for memory-mapped I/O registers. Within the kernel, register accesses, too, should be protected by locks, but one also does not want the compiler "optimizing" register accesses within a critical section. But, within the kernel, I/O memory accesses are always done through accessor functions; accessing I/O memory directly through pointers is frowned upon and does not work on all architectures. Those accessors are written to prevent unwanted optimization, so, once again, volatile is unnecessary.

Another situation where one might be tempted to use volatile is when the processor is busy-waiting on the value of a variable. The right way to perform a busy wait is:

    while (my_variable != what_i_want)
        cpu_relax();

The cpu_relax() call can lower CPU power consumption or yield to a hyperthreaded twin processor; it also happens to serve as a compiler barrier, so, once again, volatile is unnecessary. Of course, busy-waiting is generally an anti-social act to begin with.
There are still a few rare situations where volatile makes sense in the kernel:

- The above-mentioned accessor functions might use volatile on architectures where direct I/O memory access does work. Essentially, each accessor call becomes a little critical section on its own and ensures that the access happens as expected by the programmer.

- Inline assembly code which changes memory, but which has no other visible side effects, risks being deleted by GCC. Adding the volatile keyword to asm statements will prevent this removal.

- The jiffies variable is special in that it can have a different value every time it is referenced, but it can be read without any special locking. So jiffies can be volatile, but the addition of other variables of this type is strongly frowned upon. Jiffies is considered to be a "stupid legacy" issue (Linus's words) in this regard; fixing it would be more trouble than it is worth.

- Pointers to data structures in coherent memory which might be modified by I/O devices can, sometimes, legitimately be volatile. A ring buffer used by a network adapter, where that adapter changes pointers to indicate which descriptors have been processed, is an example of this type of situation.

For most code, none of the above justifications for volatile apply. As a result, the use of volatile is likely to be seen as a bug and will bring additional scrutiny to the code. Developers who are tempted to use volatile should take a step back and think about what they are truly trying to accomplish. Patches to remove volatile variables are generally welcome - as long as they come with a justification which shows that the concurrency issues have been properly thought through.

NOTES
-----

[1] http://lwn.net/Articles/233481/
[2] http://lwn.net/Articles/233482/

CREDITS
-------

Original impetus and research by Randy Dunlap
Written by Jonathan Corbet
Improvements via comments from Satyam Sharma, Johannes Stezenbach, Jesper Juhl, Heikki Orsila, H. Peter Anvin, Philipp Hahn, and Stefan Richter.