Copy-On-Write Handles

The classics::cow class provides a reference-counted smart pointer having Copy-On-Write semantics.


Related Documents

Purpose for Copy-On-Write Semantics

A cow handle is intended to appear just like value-based semantics, except that unnecessary copies are avoided.

Most commonly, this is seen with returning values from functions. Returning from a function entails copying the prepared return value out to an area prepared by the caller (using the copy constructor), and then deleting the original as it goes out of scope. If the object is large and expesive to copy, this is wasteful. It would be much more efficient to just give the caller control over the object that is about to vanish, instead. A direct way of doing this is with the trug class. A cow handle is a more general solution.

Listing 1

C foo()
 C retval;
 //blah bah blah
 return retval;

// Caller:
C result= retval;

Annotation shows to the right of the code.(see Listing 1)

When used in this manner, their behavior is simple and it takes very little thought to implement something that does this, and little thought to understand its nuances. However, cow handles in Classics can be used for more general issues, and in advanced or even tricky uses it becomes important to understand the nuances, and quite benificial to have a library that copes with such details in a useful way.

Detailed Semantics of C.O.W.

Old Semantics, prior to PB10

Originally, the meaning of a copy-on-write was that every time such a handle was dereferenced (using operator-> or make_unique), it would ensure that no other cow was sharing the same underlying object. In PB9, improvements were made to make the meaning of data consistent, and a race condition was fixed regarding overlapping execution of make_unique on the same cow on different threads.

SVG file is not displayed. Illustration of threading issue

However, further analysis showed that the issue was not unique to overlapping execution of calls to make_unique, but could still occur, without involving any thread-safety features of any one function!

The drawing to the left illustrates the events taking place on three threads, all accessing the same global cow1.

The first thread calls cow1->foo(). Upon making the call to operator->, we know that cow1 is now uniquely holding its underlying object. The tail extending downwards from this point represents the time still inside the execution of foo.

Where the tail turns red, the program crashes, citing an access violation inside foo! What happened? This would be enough to shake your faith in smart pointers, or at least wonder if global variables1 are such a good idea.

The this pointer of foo, still in progress, now points to freed memory!

This is something not easily prevented automatically by the compiler. The issue involved resembles that of the baro class, so an attempt to prevent this issue by means of compiler tricks like having a temporary intermediate object, would end up doing what a baro does, and would be a lot of complexity for such a rare possibility. So, it is better handled at the conceptual level, to eliminate the problem.

Current Semantics

Instead of spitting off additional clones each time a cow is dereferenced, the object is only cloned the first time it is dereferenced.

This is understood as the natural consequence of making a cow handle behave just like a simple value-based object, only avoiding copying when it doesn’t need to.

The rules are:

  1. When a cow is given a value (construction or assignment), it points to the same object as the right-hand-side, and is put in a pending state.
  2. In the pending state, when (and only if!) the cow is dereferenced with operator-> or data, it makes sure it is the sole owner of the underlying object. Having done so, it enters a performed state.

  3. In the performed state, the handle behaves exactly like a classics::handle, and does not re-check to see if it is still unique.


This will behave just like pure value semantics, except for one subtlety that will be explained later.

Listing 2A

C original (1);
C one= original;
C two= one;

int result= two.get_x();
   // still sees 1

Listing 2B

cow<C> original (new C (1));
cow<C> one= original;
cow<C> two= one;

int result= two->get_x();
   // still sees 1

Working strictly with values, as in the listing to the left, Listing 2A, it is obvious exactly where a copy is made, and if all the operations are on a single thread, it is equally clear when relative to other operations. So it is obvious that result will obtain 1, since that’s what it was when two copied it from one, because that’s what it was when one copied it from original. Setting a new value in original, well after it was used to initialize one, is irrelevent.

Working with cow handles, as on the right, Listing 2B, how does it work? Presumably no copy is made for one because it ends up not being used2. But what about two? The value does not get duplicated until after the original has changed. Yet we still get the same result!

After all three cow handles are created, all three are pointing to the same single C instance, and all three are in the pending state. The handle two is not remembering that it came from one, but rather all three are peers, sharing in the ownership (owned reference count is 3) of a single instance of C.

When a member function is called through two (via operator->), it makes a copy of the original object. The original object drops its owned reference count to 2, and the orignal handle is the sole owner of the fresh clone. It is this newly-created copy that is passed to set_x. So, both two and one are unaffected.

When two->get_x() is called, the same thing happens. It makes a copy, another copy of the original object, and then calls get_x() on that. So it still returns 1, as expected. Even if we went out of our way to avoid making a copy to call a meer accessor, by writing two->const_object()->get_x() instead, it would still return 1 because the still-shared instance of C was never affected by the call to set_x.

To summarize, a cow initialized (or assigned) from another that’s still in the pending state is unaffected by subsequent commitment of the original. It will work just like pure value copies, with the value being “sensed” at the time the assignment is made, not when it is actually performed due to delayed copying.

Here is another example which illustrates the same concept.

Listing 3A

C original (1);
C one= original;
C two= one;

int result1= one.get_x();
   // still sees 1
int result2= two.get_x();
   // still sees 1

Listing 3B

cow<C> original (new C(1));
cow<C> one= original;
cow<C> two= one;

int result1= one->get_x();
   // still sees 1
int result2= two->get_x();
   // still sees 1

Again, the pure value-based code on the left (Listing 3A) is obvious. The variable one is made before the original is changed, and the variable two is made after the original is changed. But, changing orignal has no effect on one, so it doesn’t mean anything when two is initialized.

With the cow handles, the result is again the same. When set_x is called, the two handles are separated. The change to original does not affect the value in one.

Now look at another example, where the right-hand-side is no longer in the pending state.

Listing 4A

C original (1);
C one= original;
C two= original;
int result1= one.get_x(); // gets 2
int result2= two.get_x(); // gets 4
result1= one.get_x();  // still 2

Listing 4B

cow<C> original (new C(1));
original->set_x(2); // no longer pending
cow<C> one= original;
cow<C> two= original;
int result1= one->get_x(); // gets 3
int result2= two->get_x(); // gets 4
result1= one->get_x();  // still 3

Now, the results are different using delayed copies then they were using pure value semantics.

If a cow is made from an original that is in the performed state, then additional changes made to the original are seen by the copy, as long as the copy is still pending. When the copy is performed, it clones the current value from the original.

The semantics of cow handles are not exactly like the semantics of ordinary values. However, it is simple and understandable, and it is guaranteed not to cause deadlocks, and it will never destroy an object that’s in the middle of a member function call, will never destroy an object that it’s given out the raw pointer to, and is very efficient.

There are situations where the subtlety of pending vs. performed states will simply not matter. To guarantee scenareos work as you planned, you can lock a cow handle (see examples for the lock function). Sometimes, this is exactly what you want and will be counting on it. Understanding the state model then becomes essential. If these operations are spread over multiple threads, then exactly when a copy is made, relative to operations being done on other threads, is flexible anyway.

Threading Issues


A cow as a global variable:

Problems only occur when the same cow object is accessed on multiple threads. Having multiple handles (of any flavor) to the same underlying object, and using a different handle on each thread, is never a problem.

No copy is made for one:

Actually, just as many copies are made, and one ends up pointing to its own copy in the end! Indeed, no copy was made for one. Rather, copies were made for everyone else, leaving one as the only remaining owner of the initially created instance.