Educated Guesswork

Understanding Memory Management, Part 2: C++ and RAII


This is the second post in my planned multipart[1] series on memory management. In part I we covered the basics of memory allocation and how it works in C, where the programmer is responsible for manually allocating and freeing memory. In this post, we'll start looking at memory management in C++, which provides a number of much fancier affordances.

Background: C++ #

As the name suggests, C++ is a derivative of C. The original version of C++ was basically an object oriented version of C ("C with classes") but at this point it has been around for 40-odd years and so has diverged very significantly (though modern C is a lot more like original C than C++ is) and accreted a lot of features beyond what you'd think of in an object oriented language, such as generic programming via templates and closures (lambdas).

Despite this, C++ preserves a huge amount of C heritage and many C programs will compile just fine with a C++ compiler;[2] in fact, C++ was originally implemented with a pre-processor called "cfront" which compiled C++ code down into C code, though that's not how things work now. This is actually a source of a lot of issues with C++, when programmers do things the C way—or even the older C++ way—even though modern C++ has better methods. We'll see some examples of this later in this post.

The most obvious change in C++ is the introduction of the idea of objects and classes. At a high level, an object is a data type that has both data and code associated with it, where code means functions. But let's start by looking at a type which just has data associated with it, but where that data is somewhat complex.

C Structs #

Complex data types are already a feature in C. For instance, consider the following example type:

struct rectangle {
int width;
int height;
};

Even if you don't know C, if you've done any programming you can probably figure out what this means: it's defining a new type that represents a rectangle and has two values, the height and the width of the rectangle, each of which is an integer (int being one of the C integer types). Obviously you could just have two variables, rectangle_width and rectangle_height, but this lets you group them together, like so:

int area(rectangle r) {
return r.width * r.height;
}

rectangle r = { 10, 2 }; // Make a rectangle of width 10 and height 2

printf("Area is %d\n", area(r));

In this example, we've defined a function called area that takes a rectangle as an argument and returns the product of the width and the height. Note that the notation for accessing one of the values inside a C struct is a.b, where a is the name of the variable containing the struct and b is the name of the field inside the struct (e.g., width).

Call by Value #

I've actually done something new here that you might not have noticed, which is that I've passed our struct to the function. All function calls in C are what's called "call by value", which is to say that C makes a copy of the data element that is available to the function but is disconnected from the original value. The called function can change its arguments without affecting the caller. Consider, for instance, the following example.

void shrink(rectangle r) {
r.width = r.width/2;
r.height = r.height/2;
printf("Inner width=%d height=%d\n", r.width, r.height);
}

rectangle r = { 10, 2 }; // Make a rectangle of width 10 and height 2
shrink(r);
printf("Outer width=%d height=%d\n", r.width, r.height);

As expected, this prints out:

Inner width=5 height=1
Outer width=10 height=2

because shrink just modified its own copy of r. Function calls are just a special case of generically how assignments in C work: they make a copy of whatever memory was associated with the source and stuff it into the target.[3]

C does provide a way for the called function to modify memory associated with the caller: the caller just passes a pointer to the callee rather than the variable itself, as in the following code:

void shrink(rectangle* rp) {
rp->width = rp->width/2;
rp->height = rp->height/2;
printf("Inner width=%d height=%d\n", rp->width, rp->height);
}

rectangle r = { 10, 2 }; // Make a rectangle of width 10 and height 2
shrink(&r);
printf("Outer width=%d height=%d\n", r.width, r.height);

Note the new notation here:

  • & takes a pointer to a variable so &r is a pointer to r
  • a->b accesses a variable in a struct when you have a pointer to the struct. This is what is known as "syntactic sugar" because you could just do (*a).b, but it's used all the time.

This snippet does what we expect, which is to say modifies the value in the outer function:

Inner width=5 height=1
Outer width=5 height=1

It's important to realize, though, that C was still doing call-by-value; it's just that the value we passed was a pointer to r rather than r itself, which allowed the function to manipulate the memory that the argument pointed to rather than its local copy of that variable.

Objects and Classes #

Everything we've seen here is still normal C, but often we want to associate a function with a type. For instance, the area function we have shown above only works with rectangles, but what if we had circles as well? We'd end up with two functions, one called area_rectangle and one called area_circle. Objects give us another option, which is to associate the function with the type, so that we can do something like this:

Rectangle r = {10, 2}; // Make a rectangle of width 10 and height 2

printf("Area is %d\n", r.area());

We've got some new syntax here, but it's basically an extension of the old syntax. Instead of referring to a data element with r.height we are now referring to the function area() with the syntax r.area(). Also we don't have to pass the data values to r.area() because it just gets them as part of the function call, which is very convenient if we also have circles, because then we can do:

Circle r = {10};

printf("Area is %d\n", r.area());

Note that the call to area() is exactly the same in both cases. This syntax hides what kind of object we are working with, which lets us reason about the logic of the program without worrying about what shape we are working with. Which area function gets called depends on the type of object (Rectangle or Circle). This type of function is called a method or a member function of the type it's associated with.

Of course, we still have to define Rectangle and Circle. The definition of Rectangle looks like this:

class Rectangle {
public:
int width;
int height;

int area() {
return width * height;
}
};

The first part of this is basically the same as struct rectangle, except for the public: line, which we can ignore for now.[4] Just as before, we have width and height. What's new here is the area() function. This is also almost exactly the same as before, except for two things:

  1. It's defined inside the class.
  2. We don't need to pass a copy of Rectangle as an argument because the width and height fields are automatically available to any member function.[5]

The definition of Circle is similar, except with the standard πr² area formula.

To recap the terminology here: the class is the type definition and an object is a given instance of the class.

Inheritance #

We won't really need this in this post, but I'd be remiss if I didn't mention one of the most important features of classes, which is inheritance. The idea here is to say that a given class, say Rectangle, is itself derived from a more general class, such as Shape. Anywhere you could use a pointer to Shape you can use a pointer to a Rectangle instead. For example, we could define a Shape as having an area() function like so:

class Shape {
public:
virtual int area() = 0;
};

Notice that we haven't provided a definition (body) for area(); instead, we have the virtual keyword in front and = 0 in place of the body. Together these mean that all classes derived from Shape have to define area() for themselves.[6] We then modify Rectangle to indicate that it is derived from Shape, and we'll need virtual in front of area here for some technical reasons which we don't need to go into.[7]

class Rectangle : public Shape {
public:
int width;
int height;

...

virtual int area() {
return width * height;
}
};

The result of all this is we can now write a function which can take any shape and do stuff, as in:

void print_area(Shape *s) {
printf("Area = %d\n", s->area());
}

If we have a Rectangle r then print_area() can be called just like you would expect:

print_area(&r);

If you've been paying attention, you'll have noticed that I said you can use a pointer to Rectangle wherever you could have used a pointer to Shape. You cannot, however, use a Rectangle wherever you would have used a Shape. If you try to assign a Rectangle to a Shape you end up with something with the properties of Shape but not Rectangle. This is called object slicing and it's usually not what you want.

Constructors and Destructors #

There's one more C++ feature we need in order to understand basic C++ memory management, and that's constructors (often abbreviated ctors) and destructors (dtors). So far we've initialized stuff just by setting the fields, but C++ lets us do more: a class can have a function that runs whenever an object of that class is created. That's not really that useful with this simple an object, but just as an example suppose we wanted to print something out for debugging purposes whenever someone created a Rectangle. Then we could do:

class Rectangle : public Shape {
public:
int width;
int height;

// Constructor
Rectangle(int w, int h) {
width = w;
height = h;

printf("Rectangle created with width=%d height=%d\n", width, height);
}

...
};

The constructor also has to initialize the fields in the object, as we've done here.[8] Then when you want to create a Rectangle you could do:

Rectangle r(10, 20);

This creates a Rectangle on the stack. If you want to create a Rectangle on the heap, you don't use malloc() but instead a new operator called new, as in:

Rectangle *r = new Rectangle(10, 20);

new tells the C++ compiler that this is an object and should run the constructor (conceptually it's like calling malloc() and then calling the constructor). If you used malloc() you would just get uninitialized memory of the right size.

C++ also supports destructors, which are functions that run before the object is destroyed. But when is an object destroyed, you might ask. Remember how I said that in C freeing an object just means that you release the memory for another use? C++, however, has a richer concept of object lifecycle: whenever a C object would just have its memory returned, C++ thinks of this as an object being destroyed. This means:

  • If the object is on the stack, when the object goes out of scope (e.g., when the function returns).
  • If the object is on the heap, when it is explicitly destroyed with delete (note: not free()).[9] If the only pointer to a heap object is a variable on the stack and that variable goes out of scope before you call delete, you get a leak, just like in C.

A destructor gets written like this:

class Rectangle : public Shape {
public:
int width;
int height;

// Destructor
~Rectangle() {
printf("Rectangle destroyed with width=%d height=%d\n", width, height);
}

...
};

The ~ prefix indicates that it's a destructor. Note that the destructor still has access to the member variables, which is why it's able to print them out. As long as they're regular variables and not pointers, it doesn't need to do anything with them, as they'll just be destroyed when the object is finally destroyed. If they're pointers, however, the destructor needs to call delete or there will likely be a memory leak (unless the data is referenced elsewhere). In either case, the destructors of the member variables will themselves be run as part of the destruction process.

Putting it all together, if we have the following program:

Rectangle *r = new Rectangle(10, 2);
print_area(r);
delete r;

We would expect to see:

Rectangle created with width=10 height=2
Area = 20
Rectangle destroyed with width=10 height=2

You'll notice that I'm not checking for errors when I do new, unlike with C where we had to check that malloc() hadn't failed. By default, if new isn't able to allocate memory it will crash the program[10] rather than returning an error (or rather a null pointer). The technical term for this is that new is "infallible" whereas malloc() is "fallible", thus forcing you to handle allocation failures. It's possible to tell C++ that you want new to be fallible by passing std::nothrow, as in new (std::nothrow) Rectangle(10, 20), in which case new will return nullptr (0) the way malloc() does. Infallible memory allocation is a pretty common pattern in newer languages, many of which don't even really let you detect memory failure; they just crash the program. Whether this is good or bad is a matter of opinion.

RAII #

We now have the pieces we need to significantly improve memory allocation. Let's go back to our previous program and instead of just having a raw pointer, we're going to define a class that holds the list of lines. It looks like this:[11]

class Data {
public:
char **lines;
size_t num_lines;

Data() {
lines = nullptr;
num_lines = 0;
}

~Data() {
for (size_t i=0; i<num_lines; i++) {
free(lines[i]);
}
free(lines);
}
};

This is the same data structure as before, except that we've:

  1. Moved the local variables into the class.
  2. Put the initialization logic in the constructor and the teardown logic in the destructor.

The rest of the program remains the same, except that we have to access lines and num_lines via the data object.[12] Note that we never have to explicitly call the destructor; it just runs automatically when we return from the function. This may seem like a small improvement, but let's go back to the case we looked at in part I where we had an error handling block. Recall that the code looked like this:

  int status = OK;

...

char *l = fgets(line, sizeof(line), fp);

if (!l) {
break; // End of file (hopefully).
}
if (l[strlen(l)-1] != '\n') {
status = BAD_LINE_ERROR;
goto error;
}

...

error:
// Clean up.
fclose(fp);
for (size_t i=0; i<num_lines; i++) {
free(lines[i]);
}
free(lines);
return status;

We had to have the special cased and error prone error: block that did cleanup. Now let's look at (almost) the same code in C++:

  Data data;

...

char *l = fgets(line, sizeof(line), fp);

if (!l) {
break; // End of file (hopefully).
}
if (l[strlen(l)-1] != '\n') {
return BAD_LINE_ERROR;
}

...

// Clean up.
fclose(fp);
return OK;

By using the destructor, we've gotten rid of the potential memory leak entirely: anything that causes data to go out of scope automatically invokes the destructor, and so the memory we've allocated gets cleaned up.[13] We do, however, still have a leak: the file pointer fp, which gets cleaned up properly in the normal case but not in the error case. If we wanted, we could address this by making a new class to wrap fp, but C++ has already done this for us using the std::fstream, which gets used like this:

std::fstream fs("input.txt", std::fstream::in);
if (!fs.is_open()) {
abort();
}

fs.getline(line, 1024);

If we use std::fstream we don't need to clean up the file at all because it will just happen automatically, and the final block just looks like:

  return OK;

This style of memory management is often called "RAII", which stands for "Resource Acquisition is Initialization". RAII is not exactly winning any records for the clearest name, and mostly people just say "RAII". The idea here is that the process of creating the object (e.g., Data or fstream) allocates its resources and the process of destroying the object deallocates its resources, so as long as you have a valid copy of the object, you know it's safe to use and once the object goes out of scope, things will automatically get cleaned up. As you can see, RAII really simplifies memory management and is generally considered to be the most convenient way to do C++ memory management (though there are also vocal RAII opponents).

Note that what makes RAII work here is that the object is on the stack but it's holding resources on the heap. That way when the function returns, the object is automatically destroyed. If instead we were to allocate the object on the heap and store a pointer to it on the stack, we would still have a problem. I'll be getting to how to address this in a later post.

Containers #

Stuffing our list of stored lines into a class helps some, but it's not really ideal. We've had to make this new Data class and then we have to reach into the class to add new lines and to sort the lines. We could of course add new interfaces to Data but C++ has already done the heavy lifting for us by providing containers. A container is basically just a fancy term for an object whose purpose is to hold some number of other objects, like a list, vector, or map. Remember all_lines = [] from our original Python version? That's a container. Here's our new program rewritten with some C++ containers.

  std::vector<std::string> lines;
std::string line;
std::fstream fs("input.txt", std::fstream::in);

// 1. Read in the file.
if (!fs.is_open()) {
abort();
}

while (std::getline(fs, line).good()) {
lines.push_back(line);
}

// 2. Sort the list.
std::sort(lines.begin(), lines.end());

// 3. Print out the result.
for (size_t i=0; i<lines.size(); i++) {
printf("%s\n", lines[i].c_str());
}

The key line to look at here is the following:

  std::vector<std::string> lines;

What this does is to make a "vector" called lines which is basically a self-growing container that can be indexed like an array. lines will contain an arbitrary number of objects of type string, which, unsurprisingly, is a C++ object that contains a string of characters. This is loosely analogous to the Python code all_lines = [] except that Python lists can contain mixed types of objects, as in:

all_lines = ["abc", 1]

which contains a string and an integer; this vector can only contain strings.

Containers massively simplify things because now when we want to add a line that we read in to our list of lines it's a one-liner:

  while (std::getline(fs, line).good()) {
lines.push_back(line);
}

This replaces all the complicated machinery we had before where we had to manually make room in lines and then make a copy of the string to add to lines, because C++ does all of that for us. Moreover, we don't need to worry about the string being too big because std::getline() will automatically grow our buffer (line) to whatever size is needed, which eliminates a lot of the error cases. However, if we did have an error for some reason, then RAII would of course clean up. For instance, the following code returns an error if lines are more than 1024 characters long.

  while (std::getline(fs, line).good()) {
if (line.size() > 1024) {
return BAD_LINE_ERROR;
}
lines.push_back(line);
}

Because we are using RAII this is totally fine and both the file and the list of strings will be cleaned up properly.

The sort is a one-liner too, though the syntax is kind of gross. You can sort of see what's happening here, namely that we're providing iterators marking the start of the vector (begin()) and one past its last item (end()) and then std::sort() figures it out. The actual details are sort of subtle and out of scope for this post.

  // 2. Sort the list.
std::sort(lines.begin(), lines.end());

This leaves us with the last clause, where we iterate over the list of sorted lines and print them out. This code is the most similar to the previous version, differing mostly in that we don't have to remember how many lines there are because the .size() function lets you ask a vector how big it is:

  // 3. Print out the result.
for (size_t i=0; i<lines.size(); i++) {
printf("%s\n", lines[i].c_str());
}

The other change is that we have to use the .c_str() method to get the underlying char * to pass it to printf() because printf() doesn't know what to do with a C++ string.

This isn't really that idiomatic C++ for several reasons:

  1. C++ has its own functions to print stuff to the console and most programmers prefer those. Those functions will also take a string directly rather than needing c_str().
  2. In modern C++, you would use a range-based for loop (for (auto x : lines)), which uses iterators under the hood.

The more modern code would look like this:

  for (auto x : lines) {
std::cout << x << std::endl;
}

I've written it the less idiomatic way for two reasons. First, it's more familiar and I'm trying not to introduce too many new things at once. Understanding what's happening here requires a bunch of new concepts. Second, and more importantly, it illustrates something important about C++, which is that while the better modern techniques are available to you, you're not required to use them, and in fact C++ lets you do all kinds of unsafe stuff. For example:

  • Array-style accesses to vector elements aren't bounds checked, so if I did lines[100000] after reading one line, anything could happen, up to and including the compiler deciding to delete all your files, start mining Bitcoin, or call 911 (the technical term here is undefined behavior).
  • c_str() returns a pointer to whatever internal storage the string object is using to store its value (as of C++11), which means that we have to worry about all the same lifetime issues as before. For instance, if we were to return the value of c_str() from this function, that value would not be safe to use because it would be pointing to storage that had been destroyed when the string went out of scope.

The key point is that C++ provides safe ways to work with these objects, but it also lets you do all the old unsafe C stuff.

Shallow and Deep Copying #

Recall that I said above that when you assign one variable to another, C just copies the internal values. This includes structs, so that, for instance, when we do:

struct rectangle {
int width;
int height;
};

rectangle r = { 10, 2 };
rectangle r2 = r;

r2 just becomes a copy of r, and they're totally independent, so in the following code:

rectangle r1 = { 10, 2 };
rectangle r2 = r1;
r2.height = 10; // I'm a square!

printf("Rectangle 1: %d x %d\n", r1.width, r1.height);
printf("Rectangle 2: %d x %d\n", r2.width, r2.height);

We would get the output:

Rectangle 1: 10 x 2
Rectangle 2: 10 x 10

The situation is no different when one of the fields in a struct is a pointer. For instance:

// Wrap strdup so that we don't have to error check every
// time we use it.
char *infallible_strdup(const char *from) {
char *retval = strdup(from);
if (!retval) {
abort(); // Out of memory
}
return retval;
}

struct rectangle {
char *name;
int width;
int height;
};

rectangle r1 = { infallible_strdup("Rectangle 1"), 10, 2 };
rectangle r2 = r1;
r2.height = 10; // I'm a square!
strcpy(r2.name, "Square 1");
printf("%s: %d x %d\n", r1.name, r1.width, r1.height);
printf("%s: %d x %d\n", r2.name, r2.width, r2.height);

Attention: I added infallible_strdup() because I got tired of writing out the error checking and I thought it distracted from the main flow of the code. In a real program, you might do better.

This prints:

Square 1: 10 x 2
Square 1: 10 x 10

Wait, what? The sizes are different but the name is the same. This happens because when we did the assignment we just assigned the pointer's value, not the string's value (i.e., r1.name == r2.name), so r1.name and r2.name are pointing at the same object. strcpy() just overwrites that memory, with the result that both objects end up with name = "Square 1". By contrast, because width and height are just values, r1 and r2 each have their own separate copies, as shown below:

[Figure: The result of a shallow copy]

This is what's often called a "shallow copy" as opposed to a "deep copy", where there would be two different strings in r1 and r2. Doing a deep copy in this case obviously requires allocating new memory for r2.name and then copying the contents of the string into it (presumably via some API like infallible_strdup). If we want a deep copy in C, we need to do it explicitly. For instance:

void copy_rectangle(rectangle *to, const rectangle *from) {
to->name = infallible_strdup(from->name);
to->width = from->width;
to->height = from->height;
}

The result looks like this:

[Figure: The result of a deep copy]

Copy Constructors #

By default, C++ also does shallow copies, but it provides a facility that lets you do better. When you make one C++ object starting from another of the same type, the compiler invokes what's called the "copy constructor", which is a special method of the new object that takes the object you're copying from as an argument. For instance, if we just wanted to do a shallow copy of Rectangle it would look like this (recall that the unqualified names of member variables in methods just refer to the current object):

  Rectangle(const Rectangle& other) {
name = other.name;
width = other.width;
height = other.height;
}

This is just the same thing that happened above, but we've done it explicitly. If you don't supply your own copy constructor, C++ will make one that does a shallow copy, which is to say basically this code. But you can also provide a copy constructor that will do anything you want.[14] For instance, here's a deep copy:

  Rectangle(const Rectangle& other) {
// Deep copy of |name|
name = infallible_strdup(other.name);

// Just copy |width| and |height| because they are
// numbers and don't point to other memory.
width = other.width;
height = other.height;
}

Note that we don't need to do anything special for width and height because they aren't pointers to anything, just values. However, because we've replaced the copy constructor we do need to explicitly copy them. But for name we want to allocate new memory and copy name into it. Now let's do the same thing as before where we mess with the values in r2:

  Rectangle r1("Rectangle 1", 10, 2);
Rectangle r2 = r1;
r2.height = 10; // I'm a square!
strcpy(r2.name, "Square 1");
printf("%s: %d x %d\n", r1.name, r1.width, r1.height);
printf("%s: %d x %d\n", r2.name, r2.width, r2.height);

This has the result we want:

Rectangle 1: 10 x 2
Square 1: 10 x 10

At this point you could be forgiven for thinking that this is all just syntactic sugar. After all, copy_rectangle() and the copy constructor are basically the same code and how hard is it to just write copy_rectangle(&r2, &r1) instead of Rectangle r2 = r1? At some level this is true of course, in the sense that all programming languages are syntactic sugar on top of assembly, but this is very useful syntactic sugar.

Consider what happens if we have the following class:

class TwoRectangles {
public:
Rectangle r1;
Rectangle r2;
};

If we now do:

TwoRectangles t1 = t2;

This will just work because the default copy constructor for TwoRectangles calls the copy constructors for Rectangle when we try to make t1 from t2. By contrast, without this feature we would need to write copy_two_rectangles():

void copy_two_rectangles(TwoRectangles *to, TwoRectangles *from) {
copy_rectangle(&to->r1, &from->r1);
copy_rectangle(&to->r2, &from->r2);
}

Basically, as long as you are working with objects which contain only other objects which have copying implemented correctly then they will behave properly when you try to copy them without you having to do anything special. This isn't that big an issue in a small system but once things get large it's pretty convenient not to have to think about writing all the boilerplate to recursively copy everything. As with our area() method before, the idea is to free you to focus on the program logic.

However, this only works if the object contains objects. If it contains pointers then those pointers will be copied directly as usual without invoking the copy constructor. Fortunately, C++ has an extensive set of container classes so that you can often (though not always) get away without having to store pointers in your objects. In this specific case, if we just used the C++ string class instead of C-style char *, as shown below, then the default copy constructor would work fine and we wouldn't have to do anything (which is why I showed the worse version that uses char *):

class Rectangle {
public:
std::string name;
int width;
int height;
...

Copy Assignment Operators #

Nerd sniping alert: this section is going to go a bit into some nitpicky C++ detail. You can safely skip it without missing the main point.

Let's go back to our above code where we use the Rectangle copy constructor:

  Rectangle r1("Rectangle 1", 10, 2);
Rectangle r2 = r1;

What if we alter it slightly so that we construct r2 first and then assign r1 to r2:

  Rectangle r1("Rectangle 1", 10, 2);
Rectangle r2("Rectangle 2", 10, 2);
r2 = r1;

This is superficially similar to the previous code but actually does something quite different. Instead of invoking the copy constructor, in this case it invokes the copy assignment operator, which is used whenever you assign one object to another. The reason that the copy constructor was invoked in the first example is that r2 was still under construction, but in the second example, it's already fully constructed and so we instead invoke the copy assignment operator. In case all that's not clear, look at the following code:

  Rectangle r1("Rectangle 1", 10, 2);  // Constructor
Rectangle r2(r1); // Copy constructor
Rectangle r3 = r1; // Copy constructor (r3 is under construction)
r2 = r1; // Copy assignment operator

As with the copy constructor, the copy assignment operator can in principle do anything, but in practice what you usually want it to do is to clean up the destination object (similar to what you do in the destructor) and then copy the source object onto it, similar to what the copy constructor would do.

Here's an example assignment operator implementation:

  Rectangle& operator=(const Rectangle& other) {
if (this == &other) {
return *this;
}

// Clean up name
free(name);
name = infallible_strdup(other.name);

// Just copy the dimensions.
width = other.width;
height = other.height;

return *this;
}

There are two important things to notice about this code.

Self-assignment checks #

First, before we do anything else, we check to see if we are assigning to ourself, as in:

if (this == &other)

If so, we just return early without doing anything else. This may seem like an optimization but it's actually critical for correctness. To see this, take the assignment operator code and fill in the actual concrete values for a self-assignment of r1 to itself. In this case, the lines where we copy over the name look like this:

    // Clean up name
free(r1.name);
r1.name = infallible_strdup(r1.name);

Note that we've just freed r1.name and then right away we try to copy it. Holy use-after-free Batman!

Cleaning Up #

In the copy constructor we just assigned name = infallible_strdup(other.name), but here we have to free this->name first. Why?

The reason is that in the copy constructor we knew that the target object was uninitialized and so this->name isn't holding onto any valid memory (though it might be filled with a random pointer to nothing in particular), but when we are doing copy assignment, the target object already exists, which means that it might have something in this->name[15] and so we need to free it first to prevent a memory leak (the opposite of the use after free in the previous section).

Operator Overloading #

We just glossed over the odd syntax declaring this function:

  Rectangle& operator=(const Rectangle& other) {

What's going on here is that C++ allows for what's called operator overloading, which means that you can supply new implementations for existing "operators" like + or =. This is very useful because it allows for idiomatic code in some situations that would otherwise be confusing.

A common example here is complex numbers. These aren't built into the core C++ language (the standard library does provide std::complex), which means that the compiler doesn't natively know how to add or subtract them. You can use operator overloading to provide implementations for + and - so you can write:

c3 = c1 + c2;

rather than:

c3 = add(c1, c2);

which is what you would do in C.

In this case we are overloading the default copy assignment operator implementation, which would do a fieldwise copy, just like the default copy constructor.
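The complex-number case above can be sketched roughly as follows. The Complex type and the add_example() helper are made-up names for illustration, not anything from the standard library, and a real program would more likely just use std::complex.

```cpp
#include <cassert>

// Hypothetical minimal complex-number type, for illustration only.
struct Complex {
  double re;
  double im;
};

// Overload binary + so that c1 + c2 works on Complex values.
Complex operator+(const Complex& a, const Complex& b) {
  return Complex{a.re + b.re, a.im + b.im};
}

// Overload binary - in the same way.
Complex operator-(const Complex& a, const Complex& b) {
  return Complex{a.re - b.re, a.im - b.im};
}

// Small usage example: builds two values and returns their sum.
Complex add_example() {
  Complex c1{1.0, 2.0};
  Complex c2{3.0, 4.0};
  Complex c3 = c1 + c2;  // the compiler rewrites this as operator+(c1, c2)
  assert(c3.re == 4.0 && c3.im == 6.0);
  return c3;
}
```

Here the operators are written as free functions, but they could equally be member functions of Complex; either way the call site reads like ordinary arithmetic.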

Why do I need this anyway? #

One natural question to ask is why we need to overload the = operator. The obvious alternative is to have the compiler run the target's destructor and then the copy constructor (after checking for self-assignment, of course).

To be honest, I don't really have a clear picture of whether this is actually infeasible or whether instead it's just a matter of maintaining maximum programmer flexibility. I've spent a bunch of time searching online and had a number of somewhat frustrating conversations with ChatGPT and the overall impression I am getting is that it would violate some pre-existing commitments in C++ (ChatGPT gave me a bunch of stuff about "object identity" and performance),[16] but it's not clear to me how serious these issues are. It's certainly true that C++ has so much history that any new feature needs to exist within a complex web of existing constraints, so it's possible that this approach would violate one, and it often takes a lot of analysis to determine if that's true. If someone has a better answer, email me!

The rule of three (or five) #

If you have an object for which you need to implement your own copy constructor, then you probably need to also implement your own destructor and copy assignment operator. Rectangle provides a good example:

  • We need to implement our own destructor to free name.
  • We need to implement our own copy constructor to make a deep copy of name.
  • We need to implement the copy assignment operator to free name in the target and then make a deep copy from the source.

In C++ circles, people talk about the "rule of three", which says that if you define any one of these then you probably should define all three. In modern C++, people talk about the "rule of five", which also includes the move constructor and the move assignment operator.
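Putting the three pieces side by side, a minimal sketch of a rule-of-three Rectangle might look like this. The infallible_strdup() here is my own stand-in, assumed to abort on allocation failure, and rule_of_three_example() is a hypothetical helper that just exercises the class.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Assumed stand-in for infallible_strdup(): duplicates a C string with
// malloc() and aborts instead of ever returning nullptr.
char* infallible_strdup(const char* s) {
  char* copy = static_cast<char*>(std::malloc(std::strlen(s) + 1));
  if (copy == nullptr) {
    std::abort();
  }
  std::strcpy(copy, s);
  return copy;
}

class Rectangle {
 public:
  char* name;
  int width;
  int height;

  Rectangle(const char* n, int w, int h) {
    name = infallible_strdup(n);
    width = w;
    height = h;
  }

  // 1. Destructor: release the heap-allocated name.
  ~Rectangle() { std::free(name); }

  // 2. Copy constructor: the target is uninitialized, so just deep-copy.
  Rectangle(const Rectangle& other) {
    name = infallible_strdup(other.name);
    width = other.width;
    height = other.height;
  }

  // 3. Copy assignment: the target already owns a name.
  Rectangle& operator=(const Rectangle& other) {
    if (this == &other) {
      return *this;  // self-assignment: nothing to do
    }
    std::free(name);  // free the old name so it doesn't leak
    name = infallible_strdup(other.name);
    width = other.width;
    height = other.height;
    return *this;
  }
};

// Exercises all three; returns true if every copy owns its own buffer.
bool rule_of_three_example() {
  Rectangle r1("Rectangle 1", 10, 2);
  Rectangle r2(r1);  // copy constructor
  Rectangle r3("Other", 1, 1);
  r3 = r1;  // copy assignment
  Rectangle& alias = r1;
  r1 = alias;  // self-assignment, caught by the check above
  assert(r2.width == 10 && r3.height == 2);
  return r2.name != r1.name && r3.name != r1.name &&
         std::strcmp(r2.name, "Rectangle 1") == 0 &&
         std::strcmp(r3.name, "Rectangle 1") == 0 &&
         std::strcmp(r1.name, "Rectangle 1") == 0;
}
```

When the function returns, all three destructors run and each frees a different buffer, which is exactly what the deep copies are for.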

Moving On #

Disclaimer: The feature I am about to describe was introduced in C++ comparatively late (by which I mean in the past 15 years) and I haven't really worked with it, so I'm writing based on what I've read online. Don't write code based on this section (or really, on the rest of this post either).

C++11 introduced the concept of moving on assignment rather than copying. Consider the following somewhat contrived code.

void f() {
Rectangle r1("Rectangle 1", 10, 2);
Rectangle r2("Rectangle 1", 10, 2);
TwoRectangles two(r1, r2);
// r1 and r2 aren't used after this point.

two.do_stuff();
two.do_other_stuff();

// r1, r2, and two are all destroyed here.
}

Under normal circumstances, transferring r1 and r2 into two would involve calling the Rectangle copy constructor to copy them into two. r1 and r2 aren't used after this point but just hang around until they go out of scope at the end of the function, where they are destroyed, at the same time as two. This isn't a correctness issue because we eventually clean up, but is wasteful because we copy them unnecessarily (including allocating new memory to copy name) even though they're only used via two thereafter.

In modern C++ you can instead move r1 and r2 into two. The details are complicated, but the high-order idea is that the source of the move isn't required to continue to be usable and so you can make move more efficient than copying, in this case by just copying the pointer to name rather than allocating new memory; you just copy width and height as usual. The source is left in an "unspecified but valid state", which seems to leave a lot of room for implementation discretion.

For obvious reasons you can't just go moving stuff around any time someone assigns one variable to another: before move was introduced in C++11 the value would have been copied, and it would be very surprising to have the source variable suddenly become unusable. There are some specific circumstances where the compiler will do a move automatically, but otherwise you have to tell it you want a move by wrapping the source in a std::move() wrapper, like so:[17]

foo = std::move(bar);

Importantly, nothing stops you from using the source object after moving it, so in this case you could use bar, but with unpredictable results. You probably don't want to do this, because, as noted above, it is left in a "valid but unspecified state", but the compiler assumes you know what you're doing (in a future post we'll look at Rust, where using a value after a move is explicitly forbidden and the compiler will stop you).
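As a rough sketch of what a move might look like for a Rectangle-like class, consider the following. This is a trimmed-down, hypothetical version of the class, with copying disabled only to keep it short, and move_example() is a made-up helper. One detail the sketch makes explicit: after taking over the pointer, the move sets the source's name to nullptr, since otherwise both destructors would free the same buffer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <utility>

// Trimmed-down, hypothetical Rectangle: just enough to show a move.
class Rectangle {
 public:
  char* name;
  int width;
  int height;

  Rectangle(const char* n, int w, int h) {
    name = static_cast<char*>(std::malloc(std::strlen(n) + 1));
    if (name == nullptr) {
      std::abort();
    }
    std::strcpy(name, n);
    width = w;
    height = h;
  }

  ~Rectangle() { std::free(name); }  // free(nullptr) is a no-op

  // Copying is disabled here only to keep the sketch short.
  Rectangle(const Rectangle&) = delete;
  Rectangle& operator=(const Rectangle&) = delete;

  // Move constructor: take over other's buffer instead of allocating.
  Rectangle(Rectangle&& other) noexcept {
    name = other.name;
    width = other.width;
    height = other.height;
    other.name = nullptr;  // the source no longer owns the buffer
  }

  // Move assignment: release our own buffer, then take over other's.
  Rectangle& operator=(Rectangle&& other) noexcept {
    if (this != &other) {
      std::free(name);
      name = other.name;
      width = other.width;
      height = other.height;
      other.name = nullptr;
    }
    return *this;
  }
};

// Returns true if the move reused the original allocation.
bool move_example() {
  Rectangle r1("Rectangle 1", 10, 2);
  const char* original_buffer = r1.name;
  Rectangle moved(std::move(r1));  // move constructor, no new allocation
  assert(moved.width == 10 && moved.height == 2);
  // For this particular class we chose the moved-from state ourselves.
  return moved.name == original_buffer && r1.name == nullptr;
}
```

The nullptr in the moved-from object is a choice made by this sketch, not something the language guarantees in general; for types you didn't write, all you can rely on is that the source is still safe to destroy or assign to.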

Internal References #

In many cases you can implement move with a shallow copy by just copying the fields, because we don't need the original version to be valid. A shallow copy is obviously more efficient, but there are some situations where it doesn't work. One common example is when the object contains an internal reference. Consider the following example:

class Internal {
public:
  int a;
  int* ap;

  Internal(int i) {
    a = i;
    ap = &a;  // point at our own field a, not at the argument i
  }
};

Now ap is a pointer to the internal field a. This is obviously a contrived example, but there are real situations where it makes sense.

The result is that if you were to just assign the fields of one Internal to another, then ap in the new object will end up pointing to the field a in the original object, not the new one.

[Figure: Incorrect move. A shallow copy of an object with an internal pointer.]

If the original is destroyed, ap points to freed memory, which brings us back to use-after-free problems. Obviously, if you are using this kind of class you will need to provide smarter move constructor and move assignment implementations; the point is just that you need to do that.
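One way to write the "smarter" versions, sketched under the assumption that ap should always point at the object's own a, is to copy the value and then re-point ap rather than copying the pointer. The internal_example() helper is hypothetical and only exercises the class.

```cpp
#include <cassert>
#include <utility>

class Internal {
 public:
  int a;
  int* ap;

  Internal(int i) {
    a = i;
    ap = &a;  // points at our own field
  }

  // Copy the value, but re-point ap at this object's a.
  Internal(const Internal& other) {
    a = other.a;
    ap = &a;
  }

  // Moving gains nothing here; it needs the same fix-up as copying.
  Internal(Internal&& other) noexcept {
    a = other.a;
    ap = &a;
  }

  // Assignment only copies the value: ap already points at our own a.
  Internal& operator=(const Internal& other) {
    a = other.a;
    return *this;
  }

  Internal& operator=(Internal&& other) noexcept {
    a = other.a;
    return *this;
  }
};

// Returns true if a moved-to object points at its own field.
bool internal_example() {
  Internal first(42);
  Internal second(std::move(first));  // move constructor
  assert(second.a == 42);
  return second.ap == &second.a && *second.ap == 42;
}
```

Note that the move here is no cheaper than a copy, which is the general cost of internal references: the pointer has to be recomputed for every new object.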

C++ is full of this kind of situation, where the compiler allows things that are unwise or even dangerous and you're just supposed to know to not do them. To a great extent this is a result of the way C++ developed: it used to be that these were the only way to do things and so they're allowed even though we have better ways now. When we get to Rust we'll see that it just doesn't let you do dangerous stuff—unless you ask it very nicely—because it was designed from the ground up to be safe.

Next Up: Smart Pointers #

RAII is a powerful technique but what we've seen so far is only a partial solution. Things are (mostly) fine when working with objects but if we want to work with pointers, as in our Rectangle example, then we need to implement custom copy constructors, copy assignment operators, etc. if we want them to be safe. This is true even if we want to store an object on the heap but have a pointer on the stack. In the next post I'll be covering a technique called "smart pointers" that helps address these problems.


  1. I've hopefully learned my lesson about not committing ahead of time to the length. ↩︎

  2. In fact, the C program we showed in part I will almost compile, except that in C, you can implicitly cast from void * to any pointer type T *, whereas in C++ you cannot, so you would need to cast the return value of malloc() and realloc(). ↩︎

  3. At least as long as the assignments are of the same type. If you try to assign two values of different types, such as a signed to an unsigned integer, then C may try to convert them, but they still will end up as discrete values. ↩︎

  4. What's going on here is in C++ data and methods are "private" by default, which means they can't be accessed from outside the class. The public: line says to allow access. ↩︎

  5. Note that in some languages, such as Python or Rust, you explicitly have to reference member variables with something like self.width, but that's not how C++ works. ↩︎

  6. It's possible to provide a default implementation that derived classes can override. ↩︎

  7. A virtual function is one which is associated with the type of the object rather than with the type of the pointer pointing to it. This is what allows us to have a Shape * where Rectangle and Circle have different behavior. ↩︎

  8. Note that I had to name the function arguments w and h because in C++ the bare width means "the member of the object with the name width". This is one reason why some other languages explicitly require you to specify self. or this->. ↩︎

  9. Do not attempt to mix malloc()/free() with new/delete. Who knows what will happen, but it's probably not good. ↩︎

  10. Technically, it raises an exception which you could catch, but if you don't the program crashes. ↩︎

  11. Don't hate me for not using initialization syntax. ↩︎

  12. Common practice in working with classes would actually be to make these fields "private" so they couldn't be accessed by the rest of the code, but that's not necessary for the point I'm trying to make here. ↩︎

  13. Incidentally, my original code was main() and used exit(), but exit() turns out not to fire the destructor, because it never returns; the program just terminates. ↩︎

  14. There are, however, some rules about what's safe to do. ↩︎

  15. The alternative is that this->name is set to nullptr, meaning that there is nothing there, but free() handles this case correctly. We don't need to handle the case because it can't happen in a correctly constructed object. ↩︎

  16. After I convinced it that I wanted the compiler to do it rather than do it myself. ↩︎

  17. Don't ask what this does; you're better off not knowing about "rvalues", "lvalues", and "xvalues". ↩︎