My Threads Question

A programmer had a problem. He thought to himself, “I know, I’ll solve it with threads!”. has Now problems. two he (from Davidlohr Bueso)

That tweet got me thinking about one of my technical interview questions. I’ve been a little bored in the interviews I’ve conducted recently, so I’m going to blog about my question as a way of forcing myself to come up with some new material.

A long time ago, I had a guy on my team who liked to test people’s knowledge of C++ by asking some obscure question that involved friend classes and pure virtual functions with private scope. I don’t fully remember the question but I’m pretty sure I wouldn’t have been able to get a job from him.

I’m not a big fan of the technical interview questions that turn on whether you know some arcane fact. (I did intern at Microsoft in the 80s, and yes, I remember being asked this famous question. I’ve always thought it was dumb.) If we interview you here at Cardinal Peak, we’re going to assume you’re competent to use Google and we’ll give you the benefit of the doubt that you can look up answers to weird corner cases. What we really want to know is how deeply you understand how a computer works.

For me, the best interview question isn’t even really a simple question. I prefer to set up a scenario that involves one or two fundamental technical concepts, and then get into an interactive discussion. I’ve found that gives me far better insight into what a candidate really knows. It’s also much more representative of how we work together once you’re on the team.

So here’s my threads question.

We have a C++ class called CPThread, which we use when it is convenient to map the lifetime of a thread onto an instance of the class. That is to say, when I cause an instance of CPThread to come into existence (typically via the new operator), I want the class to spin up a thread. And likewise, when the instance goes out of existence — say, via the delete operator — I want to make sure the thread shuts down cleanly.

Just to be clear, let me show you a bit of code. First, here is a simplified .h file for CPThread:

class CPThread {
 public:
    CPThread();
    virtual ~CPThread();
 private:
    void main_loop();
    bool time_to_stop;
    pthread_t thread_id;
};

And here is an excerpt from the .cpp file:

// this function is started by a call to pthread_create in the
// constructor, so it runs in the context of the class-private
// thread
void
CPThread::main_loop()
{
    // maybe some initialization goes here
    while (!time_to_stop) {
        // do stuff.... the only rule is that you can't block for
        // longer than about 20 msec, to ensure that we keep
        // polling the value of time_to_stop
    }
    // maybe some clean-up here before the thread exits
}

// destructor: ask main_loop to exit, then wait for the thread to finish
CPThread::~CPThread()
{
    time_to_stop = true;
    pthread_join(thread_id, NULL);
}

Now, this is intentionally a simplified case. This class actually exists, but in the real world it is an abstract base class and it is cluttered with all sorts of other code. But hopefully, I’ve stripped the class down to its essence.

I always tell people that my interview question is not trying to test their knowledge of threads — remember, we’re going to assume you can read a man page if you need to. So I will tell you during the interview that the call to pthread_join is going to block until the corresponding main_loop thread has terminated.
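
The excerpt above doesn't show the constructor, so here is a minimal sketch of how the thread might be started, assuming a static trampoline function is handed to pthread_create. The names CPThreadSketch, trampoline, started, live_threads, and threads_alive_after_delete are hypothetical and exist only for illustration. The flag is declared std::atomic<bool> rather than a plain bool so that the sketch is well-defined under the C++11 memory model; that is my assumption, not something taken from the real class.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>

// Hypothetical stand-in for CPThread. The constructor body is an assumption,
// since the original excerpt does not show it.
class CPThreadSketch {
 public:
    CPThreadSketch() : time_to_stop(false), started(false) {
        // pthread_create takes a plain function pointer, so pass a static
        // trampoline and hand it `this` as the thread argument.
        int rc = pthread_create(&thread_id, nullptr,
                                &CPThreadSketch::trampoline, this);
        started = (rc == 0);
        if (started) live_threads++;
    }

    virtual ~CPThreadSketch() {
        time_to_stop = true;
        // Blocks until main_loop has returned (only if the thread exists).
        if (started) pthread_join(thread_id, nullptr);
    }

    // Illustration only: counts loop threads that have not yet exited.
    static std::atomic<int> live_threads;

 private:
    static void *trampoline(void *arg) {
        static_cast<CPThreadSketch *>(arg)->main_loop();
        return nullptr;
    }

    void main_loop() {
        while (!time_to_stop) {
            usleep(1000);  // stand-in for "do stuff", well under 20 msec
        }
        live_threads--;    // last thing the thread does before exiting
    }

    std::atomic<bool> time_to_stop;
    bool started;
    pthread_t thread_id;
};

std::atomic<int> CPThreadSketch::live_threads(0);

// Mirrors a new / delete sequence and reports how many loop threads are
// still running once delete has returned.
int threads_alive_after_delete() {
    CPThreadSketch *obj = new CPThreadSketch();
    usleep(50 * 1000);  // ... time passes ...
    delete obj;
    return CPThreadSketch::live_threads.load();
}
```

Because the destructor joins the thread before it returns, threads_alive_after_delete should report zero in this sketch.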

Ok, so here’s the question. Somewhere else in my code, I do the following:

int main()
{
    CPThread *obj = new CPThread();
    // ... time passes ...
    delete obj;
    obj = NULL;
}

Will this work? That is, at the point the call to delete obj returns, will the main_loop thread have been stopped reliably?

Most people think about this for a bit and then agree that, yes, they think the code will work. I think that’s the right answer, too, but it’s not the end of the question. I’m going to add some code. Here is the new .h:

// more complicated version
class CPThread {
 public:
    CPThread();
    virtual ~CPThread();
    virtual void print_me(char *s);
 private:
    void main_loop();
    bool time_to_stop;
    pthread_t thread_id;
    char my_string[1024];
};

And here is the new .cpp:

// this function is started by a call to pthread_create in the
// constructor, so it runs in the context of the class-private
// thread
void
CPThread::main_loop()
{
    // maybe some initialization goes here
    while (!time_to_stop) {
        // print any pending string, then mark the buffer empty
        if (my_string[0] != '\0') {
            fprintf(some_file, "%s\n", my_string);
            my_string[0] = '\0';
        }
    }
    // maybe some clean-up here before the thread exits
}

// destructor: ask main_loop to exit, then wait for the thread to finish
CPThread::~CPThread()
{
    time_to_stop = true;
    pthread_join(thread_id, NULL);
}

void
CPThread::print_me(char *s)
{
    strncpy(my_string, s, 1024);
    my_string[1023] = '\0';
}

And now in some code, I do this:

int main()
{
    CPThread *obj = new CPThread();
    // ... time passes ...
    obj->print_me("Will this work?");
    // ... more time ...
    delete obj;
    obj = NULL;
}

Will the string “Will this work?” be reliably printed from the context of the main_loop thread? At this point, most people figure out that they need a mutex around the my_string variable.

Ok, but why do we need to protect the my_string variable with a mutex when we don’t need to protect the time_to_stop variable?

Remember, in asking this question, I’m trying to get at how fully a job candidate has synthesized a bunch of concepts: C++ classes, threads, and how processors and memory work.

So, while there are many decent answers, the best answer shows an understanding that the operation of setting a bool to true, as we do in the destructor, is necessarily an atomic operation on any of the processors I know of, and therefore it doesn’t need to be protected by a mutex. In contrast, the strncpy call in the print_me function is definitely not atomic: it’s possible for strncpy to be halfway done when the main_loop thread next runs, and then without a mutex, your world spins out of orbit.
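
To make the fix concrete, here is one way the mutex might be applied. Treat it as a sketch under stated assumptions, not the real class: CPThreadLocked, flush_pending, trampoline, started, demo_print, and the FILE * constructor argument (standing in for some_file) are all hypothetical. It departs from the code in the question in three deliberate ways. The flag is a std::atomic<bool>, because the C++11 standard treats an unsynchronized write and read of a plain bool as a data race even when the hardware store itself is atomic. print_me takes a const char * so it can accept a string literal. And the loop does one last check after it exits, so a string handed over just before delete is not dropped.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical mutex-protected variant of the class in the question.
class CPThreadLocked {
 public:
    explicit CPThreadLocked(FILE *out) : some_file(out), time_to_stop(false) {
        my_string[0] = '\0';
        pthread_mutex_init(&lock, nullptr);
        // Start the thread last, once everything it touches is initialized.
        started = (pthread_create(&thread_id, nullptr,
                                  &CPThreadLocked::trampoline, this) == 0);
    }

    virtual ~CPThreadLocked() {
        time_to_stop = true;
        if (started) pthread_join(thread_id, nullptr);
        pthread_mutex_destroy(&lock);
    }

    virtual void print_me(const char *s) {
        // The copy is many separate stores, so hold the lock for all of it.
        pthread_mutex_lock(&lock);
        strncpy(my_string, s, sizeof(my_string) - 1);
        my_string[sizeof(my_string) - 1] = '\0';
        pthread_mutex_unlock(&lock);
    }

 private:
    static void *trampoline(void *arg) {
        static_cast<CPThreadLocked *>(arg)->main_loop();
        return nullptr;
    }

    // Prints and clears the pending string, if any, under the lock.
    void flush_pending() {
        pthread_mutex_lock(&lock);
        if (my_string[0] != '\0') {
            fprintf(some_file, "%s\n", my_string);
            my_string[0] = '\0';
        }
        pthread_mutex_unlock(&lock);
    }

    void main_loop() {
        while (!time_to_stop) {
            flush_pending();
            usleep(1000);  // keep polling well inside the 20 msec rule
        }
        flush_pending();   // final check so a late string is not lost
    }

    FILE *some_file;
    std::atomic<bool> time_to_stop;
    bool started;
    pthread_mutex_t lock;
    pthread_t thread_id;
    char my_string[1024];
};

// Hands one string to the worker thread and returns the line it wrote.
std::string demo_print() {
    FILE *out = tmpfile();
    if (out == nullptr) return "";
    CPThreadLocked *obj = new CPThreadLocked(out);
    obj->print_me("Will this work?");
    delete obj;  // joins the thread, so all output is written by now
    rewind(out);
    char buf[64] = "";
    if (fgets(buf, sizeof(buf), out) == nullptr) buf[0] = '\0';
    fclose(out);
    return std::string(buf);
}
```

With the lock held across both the copy and the print, the worker can never observe a half-written my_string. A second print_me call can still overwrite a string that has not been printed yet; a queue would address that, but it is outside the scope of the question.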

But don’t worry if you don’t get this exact answer, especially under the pressure of the interview. As in life, it’s the journey, not the destination — I’m really looking for how you approach the problem.