Shallow copy vs. deep copy

The copy of a value variable is a simple concept. If we say that 'b' is a copy of variable 'a', we mean that the value stored in 'a' has been copied to the memory location reserved for 'b'.

Pointer variables require a clarification. Do we want the copy to be just an alias, so that the address it stores refers to the same object pointed to by the original (shallow copy), or do we want it to refer to a completely new object, a logical copy of the original (deep copy)?

As we can imagine, shallow copy has the advantage of being fast and simple. On the other hand, we should be well aware that we are dealing with just an alias for the original object, not a genuinely new one.

Deep copy is harder to achieve, since it has to be tailored to the actual class it is applied to, but it gives us a genuinely new object that can be used independently of the original one.
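
To make the difference concrete, here is a minimal sketch. StringBuilder is just a convenient mutable class picked for the illustration, and the class and method names are made up for this example:

```java
class AliasVersusCopy {
    // Returns the content seen through the original, the alias, and the
    // independent copy, after the original has been modified.
    static String[] demo() {
        StringBuilder original = new StringBuilder("abc");
        StringBuilder alias = original;                          // same object, second name
        StringBuilder independent = new StringBuilder(original); // new object, same content

        original.append("d"); // visible through the alias, not through the independent copy

        return new String[] { original.toString(), alias.toString(), independent.toString() };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "abcd abcd abc"
    }
}
```

The alias follows every change made to the original, while the independent copy keeps the content it had when it was created.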

Besides, there are a few more details that are specific to Java. Let's see some of them.

Shallow copy through Object.clone()

The base class of the Java hierarchy, Object, has a method, clone(), that can be used to perform a shallow copy. So we might expect every Java object to be cloneable, but this is not the case: we have to explicitly take a couple of actions in our class if we want that.
  • We have to override Object.clone(), widening its visibility. Object.clone() is declared protected, so client code cannot call it on an arbitrary object.
  • We have to declare that our class implements the Cloneable interface. Otherwise a call to Object.clone() would throw a CloneNotSupportedException.
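
The two steps above could look like the following sketch. Position is a hypothetical class invented for this example, not something taken from a library:

```java
// Hypothetical class, used only to illustrate the two steps above.
class Position implements Cloneable {
    int x;
    int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Widen the visibility from protected to public, and narrow the return type.
    @Override
    public Position clone() {
        try {
            return (Position) super.clone(); // field-by-field (shallow) copy done by Object.clone()
        } catch (CloneNotSupportedException e) {
            // Should not happen, since this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

class PositionCloneDemo {
    public static void main(String[] args) {
        Position p = new Position(1, 2);
        Position q = p.clone();
        q.x = 42; // p is not affected: its int fields were copied by value
        System.out.println(p.x + " " + q.x); // prints "1 42"
    }
}
```

Since Position has only primitive fields, the shallow copy is enough here. If it had a field referring to a mutable object, the clone would share that object with the original.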

Arrays are cloneable

Let's write a silly little class that wraps an integer value:
class WrInt {
    public WrInt(int v) { value = v; }
    public int value;
}
An object of this class can't be cloned, since the class does not implement Cloneable and does not provide a public override of clone().

But we can clone an array of its objects:
WrInt w1[] = { new WrInt(0), new WrInt(1), new WrInt(2) };
WrInt w2[] = w1.clone();
Now we can change an element through the first array, and see that the change is also visible through its clone:
w1[2].value++;
System.out.print(w1[2].value + " == " + w2[2].value);
The clone() method called on an array allocates another array of the same size as the original one, and then (shallow) copies each element into it. We could get the same result this way:
WrInt w3[] = new WrInt[w1.length];
System.arraycopy(w1, 0, w3, 0, w1.length);
System.arraycopy() performs a (shallow) copy from a source array (w1, here) starting at a given position (0) to a destination array (w3) starting at a given position (0), for a given number of elements (the length of w1). It is our responsibility to pass valid parameters to arraycopy(): a null array causes a NullPointerException, an invalid range an IndexOutOfBoundsException, and incompatible element types an ArrayStoreException.
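
As a small sketch of how the position and length parameters work, and of what happens when they are wrong, consider the following. The slice() helper is a name made up for this example, and an int array is used just to keep it short:

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Copies 'count' elements of src, starting at srcPos, to the front of a new array.
    static int[] slice(int[] src, int srcPos, int count) {
        int[] dst = new int[count];
        System.arraycopy(src, srcPos, dst, 0, count);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = { 10, 20, 30, 40, 50 };

        System.out.println(Arrays.toString(slice(data, 1, 3))); // prints "[20, 30, 40]"

        try {
            slice(data, 3, 4); // asks for 4 elements from index 3, but only 2 are left
        } catch (IndexOutOfBoundsException e) {
            System.out.println("invalid range passed to arraycopy()");
        }
    }
}
```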

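Is there a way of combining the elegance of clone() with the handiness of System.arraycopy()? Actually, yes. Since version 1.6 the utility class Arrays includes copyOf(), which allocates a new array on the spot and (shallow) copies the original into it. Like arraycopy(), it lets us choose how many elements we want: if the requested length is shorter than the original the copy is truncated, if it is longer the copy is padded with nulls (or with the default value, for arrays of primitives).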

Here is an example:
WrInt w4[] = Arrays.copyOf(w1, w1.length);
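
The second parameter of copyOf() does not have to match the original length. Here is a minimal sketch of that, where describe() is a helper invented for this example:

```java
import java.util.Arrays;

class CopyOfDemo {
    // Copies src into a new array of the requested length and renders it as text.
    static String describe(String[] src, int newLength) {
        return Arrays.toString(Arrays.copyOf(src, newLength));
    }

    public static void main(String[] args) {
        String[] names = { "a", "b", "c" };

        System.out.println(describe(names, 2)); // truncated: prints "[a, b]"
        System.out.println(describe(names, 5)); // padded: prints "[a, b, c, null, null]"
    }
}
```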
Performing a deep copy is a tad more complicated. We are responsible for allocating the memory correctly, and for specifying how and where the copy has to be done.

Here is a deep copy of our original array:
WrInt w5[] = new WrInt[w1.length];
for(int i = 0; i < w5.length; ++i) {
    w5[i] = new WrInt(w1[i].value);
}
Each element of the new array has to be explicitly constructed from the corresponding original one.
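
If we need this for more than one class, the loop could be factored out, passing in the code that knows how to copy a single element. This is only a sketch of one possible approach: deepCopy() is a hypothetical helper, and Box is a minimal stand-in for WrInt that keeps the example self-contained:

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

class DeepCopyDemo {
    // Minimal stand-in for the WrInt class used in the text.
    static class Box {
        int value;

        Box(int value) {
            this.value = value;
        }
    }

    // Hypothetical helper: the caller supplies how a single element is copied.
    static <T> T[] deepCopy(T[] src, UnaryOperator<T> copier) {
        T[] dst = Arrays.copyOf(src, src.length); // same type and length, elements still shared
        for (int i = 0; i < dst.length; i++) {
            if (dst[i] != null) {
                dst[i] = copier.apply(dst[i]); // replace the shared element with its own copy
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        Box[] a = { new Box(0), new Box(1), new Box(2) };
        Box[] b = deepCopy(a, box -> new Box(box.value));

        a[2].value++; // changes only the element owned by a
        System.out.println(a[2].value + " != " + b[2].value); // prints "3 != 2"
    }
}
```

The copy is only as deep as the copier function makes it. If Box itself held references to other mutable objects, the function would have to copy those too.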

Immutable

OK, but why have we used that custom WrInt class instead of standard types? We know that it wouldn't make sense to use primitive types (int, double, ...) since they are value types. But what about the wrapper classes (Integer, Double, ...) or String?

The fact is that they are immutable, so we gain nothing by performing a deep copy when they are involved. The two twin arrays generated by a shallow copy share the same elements until we assign a new value to an element of either one. We cannot modify an immutable object in place, so at that point the element is replaced with a new immutable instance, which breaks the identity with its clone: the other array keeps its reference to the old element.
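
We can see this behavior in a short sketch that uses an array of Integer. The class and method names are made up for this example:

```java
class ImmutableElementsDemo {
    // Returns { elements shared before the change, elements shared after the change }.
    static boolean[] sharedBeforeAndAfter() {
        Integer[] a = { 1000, 2000, 3000 };
        Integer[] b = a.clone(); // shallow copy: both arrays refer to the same Integer objects

        boolean before = (a[2] == b[2]); // reference comparison: same object

        a[2]++; // unboxes, adds one, boxes the result, stores the new reference in a[2]

        boolean after = (a[2] == b[2]); // a[2] now refers to a different object

        System.out.println(a[2] + " vs " + b[2]); // prints "3001 vs 3000"
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] r = sharedBeforeAndAfter();
        System.out.println(r[0] + " " + r[1]); // prints "true false"
    }
}
```

The increment never touches the Integer object that both arrays shared. It only makes a[2] refer to a different object, so b[2] still sees the old value.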
