The classics::cow class provides a reference-counted smart pointer having Copy-On-Write semantics.
A cow handle is intended to appear just like value-based semantics, except that unnecessary copies are
avoided.
Most commonly, this is seen with returning values from functions. Returning from a function entails copying
the prepared return value out to an area prepared by the caller (using the copy constructor), and then deleting the
original as it goes out of scope. If the object is large and expesive to copy, this is wasteful. It would be much
more efficient to just give the caller control over the object that is about to vanish, instead. A direct way of doing
this is with the trug class. A cow handle is a more
general solution.
C foo()
{
C retval;
//blah bah blah
return retval;
}
// Caller:
C result= retval;
Annotation shows to the right of the code.(see Listing 1)
When used in this manner, their behavior is simple and it takes very little thought to implement something that does this, and little thought to understand its nuances. However, cow handles in Classics can be used for more general issues, and in advanced or even tricky uses it becomes important to understand the nuances, and quite benificial to have a library that copes with such details in a useful way.
Originally, the meaning of a copy-on-write was that every time such a handle was dereferenced
(using operator-> or
make_unique), it would ensure that no other
cow was sharing the same underlying object. In PB9, improvements were made to make the meaning
of data consistent, and a race condition was fixed
regarding overlapping execution of make_unique on the same cow on different threads.
However, further analysis showed that the issue was not unique to overlapping execution of calls to
make_unique, but could still occur, without involving any thread-safety features of any
one function!
The drawing to the left illustrates the events taking place on three threads, all accessing the same
global cow1.
The first thread calls cow1->foo(). Upon making the call to operator->,
we know that cow1 is now uniquely holding its underlying object. The tail extending downwards
from this point represents the time still inside the execution of foo.
Where the tail turns red,
the program crashes, citing an access violation inside foo!
What happened? This would be enough to shake your faith in smart pointers, or at least wonder if global
variables1 are such a good idea.
When foo is called, the underlying object (which we'll call Ob1) is referenced only by
cow1, and has an owned reference count of 1. That's what operator-> ensured
before calling foo.
Later, in another thread, a variable cow2 is initialized with cow1. Now they
share Ob1, which has an owned reference count of 2.
Then, cow1 is dereferenced again. Since it is sharing Ob1, it is updated in-place to
point to a fresh clone of Ob1, which we'll call Ob2. When operator-> is
done, cow1 points to Ob2 and cow2 points to Ob1. The
this pointer of foo is pointing at Ob1, and whether or not foo cares that
cow1 has changed is another issue.
Finally, cow2 goes out of scope, or gets assigned a different (unrelated) value. Its underlying object,
Ob1, has its reference count decremented from 1 to zero, triggering object destruction.
The this pointer of foo, still in progress, now points to freed memory!
This is something not easily prevented automatically by the compiler. The issue involved resembles that of the
baro class, so an attempt to prevent this issue by
means of compiler tricks like having a temporary intermediate object, would end up doing what a baro
does, and would be a lot of complexity for such a rare possibility. So, it is better handled at the conceptual level,
to eliminate the problem.
Instead of spitting off additional clones each time a cow is dereferenced, the object is only
cloned the first time it is dereferenced.
This is understood as the natural consequence of making a cow handle behave just like a simple value-based object, only avoiding copying when it doesn’t need to.
The rules are:
cow is given a value (construction or assignment), it points to the same object as the
right-hand-side, and is put in a pending state.
In the pending state, when (and only if!) the cow is dereferenced with operator-> or data,
it makes sure it is the sole owner of the underlying object. Having done so, it enters a performed
state.
classics::handle, and
does not re-check to see if it is still unique.
This will behave just like pure value semantics, except for one subtlety that will be explained later.
C original (1);
C one= original;
C two= one;
original.set_x(2);
int result= two.get_x();
// still sees 1
cow<C> original (new C (1));
cow<C> one= original;
cow<C> two= one;
original->set_x(2);
int result= two->get_x();
// still sees 1
Working strictly with values, as in the listing to the left, Listing 2A,
it is obvious exactly where a copy is made, and if all the operations are
on a single thread, it is equally clear when relative to other operations. So it is obvious that result
will obtain 1, since that’s what it was when two copied it from one,
because that’s what it was when one copied it from original. Setting a new
value in original, well after it was used to initialize one, is irrelevent.
Working with cow handles, as on the right, Listing 2B,
how does it work? Presumably no copy is made for one because
it ends up not being used2.
But what about two? The
value does not get duplicated until after the original has changed. Yet we still get the same result!
After all three cow handles are created, all three are pointing to the same single C instance, and all three
are in the pending state. The handle two is not remembering that it came from one,
but rather all three are peers, sharing in the ownership (owned reference count is 3) of a single instance of C.
When a member function is called through two (via operator->), it makes a copy of
the original object. The original object drops its owned reference count to 2, and the orignal handle is
the sole owner of the fresh clone. It is this newly-created copy that is passed to set_x. So, both
two and one are unaffected.
When two->get_x() is called, the same thing happens. It makes a copy, another copy of the
original object, and then calls get_x() on that. So it still returns 1, as expected.
Even if we went out of our way to avoid making a copy to call a meer accessor, by writing two->const_object()->get_x()
instead, it would still return 1 because the still-shared instance of C was never
affected by the call to set_x.
To summarize, a cow initialized (or assigned) from another that’s still in the pending
state is unaffected by subsequent commitment of the original. It will work just like pure value copies, with the value being
“sensed” at the time the assignment is made, not when it is actually performed due to delayed copying.
Here is another example which illustrates the same concept.
C original (1);
C one= original;
original.set_x(2);
C two= one;
int result1= one.get_x();
// still sees 1
int result2= two.get_x();
// still sees 1
cow<C> original (new C(1));
cow<C> one= original;
original->set_x(2);
cow<C> two= one;
int result1= one->get_x();
// still sees 1
int result2= two->get_x();
// still sees 1
Again, the pure value-based code on the left (Listing 3A)
is obvious. The variable one is made before
the original is changed, and the variable two is made after the original
is changed. But, changing orignal has no effect on one, so it doesn’t mean
anything when two is initialized.
With the cow handles, the result is again the same. When set_x is called,
the two handles are separated. The change to original does not affect the value in one.
Now look at another example, where the right-hand-side is no longer in the pending state.
C original (1);
original.set_x(2);
C one= original;
C two= original;
orignal.set_x(3);
int result1= one.get_x(); // gets 2
original.set_x(4);
int result2= two.get_x(); // gets 4
result1= one.get_x(); // still 2
cow<C> original (new C(1));
original->set_x(2); // no longer pending
cow<C> one= original;
cow<C> two= original;
orignal->set_x(3);
int result1= one->get_x(); // gets 3
original->set_x(4);
int result2= two->get_x(); // gets 4
result1= one->get_x(); // still 3
Now, the results are different using delayed copies then they were using pure value semantics.
If a cow is made from an original that is in the performed state, then additional changes
made to the original are seen by the copy, as long as the copy is still pending. When the copy is
performed, it clones the current value from the original.
The semantics of cow handles are not exactly like the semantics of ordinary values.
However, it is simple and understandable, and it is guaranteed not to cause deadlocks, and it will never destroy an
object that’s in the middle of a member function call, will never destroy an object that it’s given out
the raw pointer to, and is very efficient.
There are situations where the subtlety of pending vs. performed states will simply not matter. To guarantee
scenareos work as you planned, you can lock a cow handle (see examples
for the lock function). Sometimes, this
is exactly what you want and will be counting on it. Understanding the state model then becomes essential. If these
operations are spread over multiple threads, then exactly when a copy is made, relative to operations being done on
other threads, is flexible anyway.
A cow as a global variable:
Problems only occur when the same cow object is
accessed on multiple threads. Having multiple handles (of any flavor) to the same underlying object, and
using a different handle on each thread, is never a problem.
No copy is made for one:
Actually, just as many copies are made, and one ends up pointing to its own copy in the end!
Indeed, no copy was made for one. Rather, copies were made for everyone else, leaving
one as the only remaining owner of the initially created instance.