Copy-on-write
Copy-on-write (COW), sometimes referred to as implicit sharing[1] or shadowing,[2] is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources[3] (most commonly memory pages, storage sectors, files, and data structures).
In virtual memory management
Copy-on-write finds its main use in operating systems, where it lets multiple processes share physical memory, most notably in the implementation of the fork() system call. Typically, the child process created by fork() modifies little or no memory before it executes a new program, which replaces its address space entirely. Copying all of the parent process's memory during the fork, only to discard the copy immediately, would waste processor time and memory.[citation needed]
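The isolation that fork() must preserve can be observed from user space. The sketch below assumes a POSIX system; the function name `parent_value_after_child_write` is hypothetical. It shows only the visible behaviour (a write in the child is not seen by the parent); whether the kernel achieves this through copy-on-write is an implementation detail that the program cannot observe directly.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that overwrites its own copy of `value`, then returns the
// value the parent still sees. Returns -1 if fork() itself fails.
int parent_value_after_child_write() {
    int value = 1;
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: on a kernel that implements fork() with copy-on-write, a
        // write like this is what makes the kernel give the child a private
        // copy of the affected page.
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    // Parent: wait for the child, then inspect our own copy.
    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);
    assert(reaped == pid && WIFEXITED(status));
    (void)reaped;  // silence unused-variable warnings when NDEBUG is defined
    return value;  // unchanged by the child's write
}
```

Calling `parent_value_after_child_write()` returns 1 when the fork succeeds, because the child's assignment affects only the child's address space.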
Copy-on-write can be implemented efficiently using the page table by marking certain pages of memory as read-only and keeping a count of the number of references to the page. When data is written to these pages, the operating-system kernel intercepts the write attempt and allocates a new physical page, initialized with the copy-on-write data, although the allocation can be skipped if there is only one reference. The kernel then updates the page table with the new (writable) page, decrements the number of references, and performs the write. The new allocation ensures that a change in the memory of one process is not visible in another's.[citation needed]
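The steps above can be modelled in a few lines. The following is a toy user-space simulation, not kernel code: the names `AddressSpace`, `Entry`, and `kPageSize` are hypothetical, the 16-byte page size is chosen only for illustration, and `std::shared_ptr::use_count()` stands in for the per-page reference count. It is meant for single-threaded use.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 16;  // tiny pages, for illustration only
using Frame = std::array<std::uint8_t, kPageSize>;

// One page-table entry: a physical frame plus a "writable" permission bit.
struct Entry {
    std::shared_ptr<Frame> frame;  // use_count() plays the role of the reference count
    bool writable;
};

class AddressSpace {
public:
    explicit AddressSpace(std::size_t pages) {
        for (std::size_t i = 0; i < pages; ++i)
            table_.push_back({std::make_shared<Frame>(), true});  // zero-filled frame
    }

    // Analogue of fork(): share every frame and mark both mappings read-only.
    AddressSpace fork() {
        AddressSpace child(0);
        for (Entry& e : table_) {
            e.writable = false;
            child.table_.push_back({e.frame, false});
        }
        return child;
    }

    std::uint8_t read(std::size_t page, std::size_t offset) const {
        return table_.at(page).frame->at(offset);
    }

    void write(std::size_t page, std::size_t offset, std::uint8_t value) {
        assert(offset < kPageSize);
        Entry& e = table_.at(page);
        if (!e.writable) handle_write_fault(e);  // the "intercepted" write
        e.frame->at(offset) = value;
    }

    long sharers(std::size_t page) const { return table_.at(page).frame.use_count(); }

private:
    // Copy the frame only if someone else still references it; either way,
    // the mapping becomes writable again.
    static void handle_write_fault(Entry& e) {
        if (e.frame.use_count() > 1)
            e.frame = std::make_shared<Frame>(*e.frame);  // old frame's count drops by one
        e.writable = true;
    }

    std::vector<Entry> table_;
};
```

After `child = parent.fork()`, a write through `child` copies only the page that was touched; a later write through `parent` to the same page finds a reference count of one and skips the copy, matching the shortcut described above.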
The copy-on-write technique can be extended to support efficient memory allocation by keeping one page of physical memory filled with zeros. When the memory is allocated, all the pages returned refer to the page of zeros and are all marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than physical memory and use memory sparsely, at the risk of running out of virtual address space. The combined algorithm is similar to demand paging.[3]
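A minimal sketch of the shared zero page, again as a user-space model rather than an actual allocator: the class name `LazyZeroMemory` and its members are hypothetical. Every page initially refers to one shared zero-filled page, and a private page is created only on the first write.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

class LazyZeroMemory {
public:
    static constexpr std::size_t kSize = 4096;  // assumed page size
    using Page = std::array<std::uint8_t, kSize>;

    // "Allocating" memory only points every page at the shared zero page.
    explicit LazyZeroMemory(std::size_t pages) : pages_(pages, zero_page()) {}

    std::uint8_t read(std::size_t addr) const {
        return (*pages_.at(addr / kSize))[addr % kSize];
    }

    void write(std::size_t addr, std::uint8_t value) {
        std::shared_ptr<Page>& p = pages_.at(addr / kSize);
        if (p == zero_page())
            p = std::make_shared<Page>(*p);  // first write: allocate a private copy
        (*p)[addr % kSize] = value;
    }

    // Number of pages that have actually been given their own storage.
    std::size_t resident_pages() const {
        std::size_t n = 0;
        for (const auto& p : pages_)
            if (p != zero_page()) ++n;
        return n;
    }

private:
    // The single zero-filled page; this sketch never writes through it.
    static const std::shared_ptr<Page>& zero_page() {
        static const std::shared_ptr<Page> zero = std::make_shared<Page>();
        return zero;
    }

    std::vector<std::shared_ptr<Page>> pages_;
};
```

Constructing `LazyZeroMemory m(1024)` reserves 1024 pages of address range but `m.resident_pages()` stays at zero until the first `m.write(...)`, which is the sparse-use behaviour the paragraph describes.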
Copy-on-write pages are also used in the Linux kernel's same-page merging feature.[4]
In software
COW is also used in library, application, and system code.
Examples
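The string class provided by the C++ standard library was specifically designed to allow copy-on-write implementations in the initial C++98 standard,[5] but the newer C++11 standard no longer permits them.[6] Under a copy-on-write implementation, the following code behaves as its comments describe: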
std::string x("Hello");
std::string y = x; // x and y use the same buffer.
y += ", World!"; // Now y uses a different buffer; x still uses the same old buffer.
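Because a conforming C++11 std::string does not share buffers, the behaviour in the comments above has to be built by hand if it is wanted. The class below is a minimal sketch of the idea, not how any standard library implemented it: `CowString` and its members are hypothetical names, the buffer is shared through `std::shared_ptr`, and the `use_count()` check makes it suitable for single-threaded use only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class CowString {
public:
    explicit CowString(std::string text)
        : buf_(std::make_shared<std::string>(std::move(text))) {}

    // The implicitly generated copy constructor and copy assignment copy only
    // the shared_ptr, so copies share one buffer until one of them is modified.

    const std::string& str() const { return *buf_; }

    bool shares_buffer_with(const CowString& other) const {
        return buf_ == other.buf_;
    }

    CowString& operator+=(const std::string& suffix) {
        detach();  // make sure we own the buffer before writing to it
        *buf_ += suffix;
        return *this;
    }

private:
    // Copy the buffer if, and only if, another CowString still refers to it.
    void detach() {
        assert(buf_ && "CowString used after being moved from");
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
    }

    std::shared_ptr<std::string> buf_;
};
```

With `CowString x("Hello"); CowString y = x;`, the two objects share a buffer; after `y += ", World!";`, `y` owns a new buffer and `x.str()` is still "Hello".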
In the PHP programming language, all types except references are implemented as copy-on-write. For example, strings and arrays are passed by reference internally, but when modified, they are duplicated if more than one reference to them exists. This allows them to act as value types without the performance problems of copying on assignment or making them immutable.[7]
In the Qt framework, many types are copy-on-write ("implicitly shared" in Qt's terms). Qt uses atomic compare-and-swap operations to increment or decrement the internal reference counter. Since the copies are cheap, Qt types can often be safely used by multiple threads without the need for locking mechanisms such as mutexes. The benefits of COW thus apply in both single- and multithreaded systems.[8]
In computer storage
COW may also be used as the underlying mechanism for snapshots, such as those provided by logical volume management, file systems such as Btrfs, ZFS, ReFS and Bcachefs,[9] and database servers such as Microsoft SQL Server. Typically, the snapshots store only the modified data, and are stored close to the original, so they are only a weak form of incremental backup and cannot substitute for a full backup.[10]
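The dependence of a snapshot on the original data can be illustrated with a toy block store. This sketch assumes the copy-before-overwrite variant, in which the old contents of a block are saved the first time the block is overwritten after the snapshot is taken; real volume managers and file systems differ in detail, and the name `Volume` and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class Volume {
public:
    explicit Volume(std::size_t blocks) : blocks_(blocks) {}

    // Taking a snapshot copies nothing; it only starts tracking overwrites.
    void take_snapshot() {
        saved_.clear();
        has_snapshot_ = true;
    }

    void write(std::size_t index, const std::string& data) {
        assert(index < blocks_.size());
        // First overwrite since the snapshot: preserve the old contents.
        if (has_snapshot_ && saved_.count(index) == 0)
            saved_[index] = blocks_[index];
        blocks_[index] = data;
    }

    const std::string& read(std::size_t index) const { return blocks_.at(index); }

    // Unmodified blocks are read from the live volume itself, which is why
    // the snapshot is useless if the original volume is lost.
    const std::string& read_snapshot(std::size_t index) const {
        auto it = saved_.find(index);
        return it != saved_.end() ? it->second : blocks_.at(index);
    }

    // Number of blocks the snapshot actually stores.
    std::size_t snapshot_blocks() const { return saved_.size(); }

private:
    std::vector<std::string> blocks_;
    std::map<std::size_t, std::string> saved_;
    bool has_snapshot_ = false;
};
```

In this model the snapshot holds data only for blocks that changed after `take_snapshot()`; every other block is served from the live volume, so the snapshot cannot stand in for a full backup.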
See also
- Allocate-on-flush
- Dirty COW – a computer security vulnerability for the Linux kernel
- Flyweight pattern
- Memory management
- Persistent data structure
- Wear leveling
References
1. "Implicit Sharing". Qt Project. Retrieved 10 November 2023.
2. Rodeh, Ohad (1 February 2008). "B-Trees, Shadowing, and Clones" (PDF). ACM Transactions on Storage. 3 (4): 1. CiteSeerX 10.1.1.161.6863. doi:10.1145/1326542.1326544. S2CID 207166167. Archived from the original (PDF) on 2 January 2017. Retrieved 10 November 2023.
3. Bovet, Daniel Pierre; Cesati, Marco (1 January 2002). Understanding the Linux Kernel. O'Reilly Media. p. 295. ISBN 9780596002138. Retrieved 10 November 2023.
4. Abbas, Ali. "The Kernel Samepage Merging Process". alouche.net. Archived from the original on 8 August 2016. Retrieved 10 November 2023.
5. Meyers, Scott (2012). Effective STL. Addison-Wesley. pp. 64–65. ISBN 9780132979184.
6. "Concurrency Modifications to Basic String". Open Standards. Retrieved 10 November 2023.
7. Pauli, Julien; Ferrara, Anthony; Popov, Nikita (2013). "Memory management". PhpInternalsBook.com. Retrieved 10 November 2023.
8. "Threads and Implicitly Shared Classes". Qt Project. Retrieved 10 November 2023.
9. Kasampalis, Sakis (2010). "Copy-on-Write Based File Systems Performance Analysis and Implementation" (PDF). p. 19. Retrieved 10 November 2023.
10. Chien, Tim. "Snapshots Are NOT Backups". Oracle.com. Oracle. Retrieved 10 November 2023.