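I may soon need a thread pool implemented in C, so I forked a project on GitHub to study its code and fix the problems I found. This post records the issues I ran into along the way and how I resolved them.
My GitHub repository: C-Thread-Pool
Original author's repository: C-Thread-Pool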

Viewing the threads created by a process

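The thread_do function in thpool.c calls:
prctl(PR_SET_NAME, thread_name);
to set the thread's name to thread-pool-%d. How can we list the threads that a given process has created? ps with the -T option does it: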

[root pthread_detach]#ps -T -p $(pidof /root/C-Thread-Pool/example)
  PID  SPID TTY          TIME CMD
20464 20464 pts/1    00:00:00 example
20464 26387 pts/1    00:00:00 thread-pool-0
20464 26466 pts/1    00:00:00 thread-pool-1
20464 26479 pts/1    00:00:00 thread-pool-2
20464 26483 pts/1    00:00:00 thread-pool-3

The relevant code in thread_do is:

#if defined(__linux__)
    /* Use prctl instead to prevent using _GNU_SOURCE flag and implicit declaration */
    prctl(PR_SET_NAME, thread_name);
#elif defined(__APPLE__) && defined(__MACH__)
    pthread_setname_np(thread_name);
#else
    err("thread_do(): pthread_setname_np is not supported on this system");
#endif

This code distinguishes between Linux and Mac OS X. On Linux, the man pages show that both prctl and pthread_setname_np limit a thread name to 16 bytes, including the terminating null byte.

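pthread_detach and pthread_join

pthread_detach and pthread_join are the two ways of reclaiming a thread's resources.
With pthread_join, the calling thread (typically the main thread) reclaims the target thread's resources; a thread marked with pthread_detach releases its own resources when it terminates. If the main thread calls pthread_join and the target thread has not finished yet, the caller blocks. In some web servers, where the main thread creates a child thread to handle each incoming connection, the main thread does not want to block in pthread_join and delay later requests, so pthread_detach() is generally used instead. To detach from inside the thread itself, pass pthread_self(); to detach from the main thread, pass the thread ID.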

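We use the example from "POSIX : Detached vs Joinable threads | pthread_join() & pthread_detach() examples" together with valgrind to verify what pthread_detach and pthread_join do. The source code on that page is missing the #include for the iostream header; either add it or replace std::cout with printf.
The source code is as follows: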

#include <iostream>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
 
#include <unistd.h>
 
void * threadFunc(void * arg)
{
    std::cout << "Thread Function :: Start" << std::endl;
    // Sleep for 2 seconds  purposely
    sleep(2);
    std::cout << "Thread Function :: End" << std::endl;
    // Return value from thread
    return (void *)0;
}
 
int main()
{
    // Thread id
    pthread_t threadId;
 
    // Create a thread that will run threadFunc()
    int err = pthread_create(&threadId, NULL, &threadFunc, NULL);
    // Check whether the thread was created successfully
    if (err)
    {
        std::cout << "Thread creation failed : " << strerror(err);
        return err;
    }
    else
        std::cout << "Thread Created with ID : " << threadId << std::endl;
    // Do some stuff
 
    
    return 0;
}

Result of running it under valgrind:

==1886== Memcheck, a memory error detector
==1886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1886== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==1886== Command: ./fork
==1886==
Thread Created with ID : 110671616
==1886==
==1886== HEAP SUMMARY:
==1886==     in use at exit: 288 bytes in 1 blocks
==1886==   total heap usage: 3 allocs, 2 frees, 74,016 bytes allocated
==1886==
==1886== 288 bytes in 1 blocks are possibly lost in loss record 1 of 1
==1886==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1886==    by 0x40134A6: allocate_dtv (dl-tls.c:286)
==1886==    by 0x40134A6: _dl_allocate_tls (dl-tls.c:530)
==1886==    by 0x4E44227: allocate_stack (allocatestack.c:627)
==1886==    by 0x4E44227: pthread_create@@GLIBC_2.2.5 (pthread_create.c:644)
==1886==    by 0x108B2B: main (fork.c:25)
==1886==
==1886== LEAK SUMMARY:
==1886==    definitely lost: 0 bytes in 0 blocks
==1886==    indirectly lost: 0 bytes in 0 blocks
==1886==      possibly lost: 288 bytes in 1 blocks
==1886==    still reachable: 0 bytes in 0 blocks
==1886==         suppressed: 0 bytes in 0 blocks
==1886==
==1886== For counts of detected and suppressed errors, rerun with: -v
==1886== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Now let's use pthread_join or pthread_detach and check with valgrind again for a memory leak. The code modified to use pthread_detach is as follows:

#include <iostream>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
 
#include <unistd.h>
 
void * threadFunc(void * arg)
{
    pthread_detach(pthread_self());
    std::cout << "Thread Function :: Start" << std::endl;
    std::cout << "Thread Function :: End" << std::endl;
    // Return value from thread
    return (void *)0;
}
 
int main()
{
    // Thread id
    pthread_t threadId;
 
    // Create a thread that will run threadFunc()
    int err = pthread_create(&threadId, NULL, &threadFunc, NULL);
    // Check whether the thread was created successfully
    if (err)
    {
        std::cout << "Thread creation failed : " << strerror(err);
        return err;
    }
    else
        std::cout << "Thread Created with ID : " << threadId << std::endl;
    
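    // Sleep for 2 seconds: when main() returns the whole process exits and the
    // other threads are killed, so give the detached thread time to finish
    sleep(2);

    // threadFunc() has already detached itself via pthread_detach(pthread_self()).
    // Calling pthread_detach(threadId) here in main is the alternative; do only one
    // of the two, since detaching an already detached thread is unspecified behavior.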
    
    
    return 0;
}

Output:

[root workspace]#valgrind --leak-check=full ./fork
==2985== Memcheck, a memory error detector
==2985== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2985== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2985== Command: ./fork
==2985==
Thread Created with ID : 110671616
Thread Function :: Start
Thread Function :: End
==2985==
==2985== HEAP SUMMARY:
==2985==     in use at exit: 0 bytes in 0 blocks
==2985==   total heap usage: 3 allocs, 3 frees, 74,016 bytes allocated
==2985==
==2985== All heap blocks were freed -- no leaks are possible
==2985==
==2985== For counts of detected and suppressed errors, rerun with: -v
==2985== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[root workspace]#

Using pthread_join to fix the memory leak:

#include <iostream>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
 
#include <unistd.h>
 
void * threadFunc(void * arg)
{
    std::cout << "Thread Function :: Start" << std::endl;
    // Sleep for 2 seconds  purposely
    sleep(2);
    std::cout << "Thread Function :: End" << std::endl;
    // Return value from thread
    return (void *)0;
}
 
int main()
{
    // Thread id
    pthread_t threadId;
 
    // Create a thread that will run threadFunc()
    int err = pthread_create(&threadId, NULL, &threadFunc, NULL);
    // Check whether the thread was created successfully
    if (err)
    {
        std::cout << "Thread creation failed : " << strerror(err);
        return err;
    }
    else
        std::cout << "Thread Created with ID : " << threadId << std::endl;
    
    void * ptr = NULL;
    std::cout << "Waiting for thread to exit" << std::endl;
    
    // Wait for thread to exit
    err = pthread_join(threadId, &ptr);
    if (err)
    {
        std::cout << "Failed to join Thread : " << strerror(err) << std::endl;
        return err;
    }
 
    if (ptr)
        std::cout << " value returned by thread : " << *(int *) ptr
                << std::endl;
    
    return 0;
}

Output:

[root workspace]#valgrind --leak-check=full ./fork
==3776== Memcheck, a memory error detector
==3776== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3776== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==3776== Command: ./fork
==3776==
Thread Created with ID : 110671616
Waiting for thread to exit
Thread Function :: Start
Thread Function :: End
==3776==
==3776== HEAP SUMMARY:
==3776==     in use at exit: 0 bytes in 0 blocks
==3776==   total heap usage: 3 allocs, 3 frees, 74,016 bytes allocated
==3776==
==3776== All heap blocks were freed -- no leaks are possible
==3776==
==3776== For counts of detected and suppressed errors, rerun with: -v
==3776== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

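The volatile keyword

thpool.c declares many variables with the volatile keyword. What does volatile do?

In programming, particularly in C, C++, C#, and Java, a variable or object declared with the volatile keyword usually has special properties related to optimization and multithreading. Generally, volatile prevents the compiler from optimizing accesses to a variable or object that, as far as the compiler can tell, cannot be changed "by the code itself". In C, for example, volatile tells the compiler that the variable may change at any time, so the compiled program accesses the variable's memory location every time it reads or writes the variable. Without volatile, the compiler may optimize those reads and writes and temporarily use a value held in a register; if something else updates the variable in the meantime, the program sees an inconsistent value.

In C, the actual definition and scope of the volatile keyword are often misunderstood. Although C++, C#, and Java all inherited the keyword from C, its usage and semantics differ greatly among these languages.

In this example, the code sets foo to 0 and then polls it repeatedly until its value becomes 255: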

static int foo;
 
void bar(void) {
    foo = 0;
 
    while (foo != 255)
         ;
}

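An optimizing compiler will notice that no other code can change the value of foo and will assume that it always remains 0. The compiler will therefore replace the function body with an infinite loop similar to this: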

void bar_optimized(void) {
    foo = 0;
 
    while (1)
         ;
}

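However, foo might represent a location that other parts of the computer system can change at any time, such as a hardware register of a device connected to the CPU. The code above would never detect such a change. Without the volatile keyword, the compiler assumes that the current program is the only part of the system that can change the value (which is by far the most common case). To prevent the compiler from optimizing the code as above, use the volatile keyword: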

static volatile int foo;
 
void bar (void) {
    foo = 0;
 
    while (foo != 255)
        ;
}

References:
Volatile variable
why is volatile needed in C

puts vs printf

puts appends a newline after the string it prints. printf is more powerful (it supports format strings) but does not append a newline.
difference between printf and puts

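Mutexes and condition variables

The main advantage of threads is that they can share information through global variables. This convenience comes at a price: we must ensure that multiple threads do not modify the same variable at the same time, and that no thread reads a variable while another thread is modifying it. The term critical section refers to a piece of code that accesses a shared resource and whose execution should be atomic; that is, other threads accessing the same shared resource should not interrupt its execution.
A mutex prevents multiple threads from accessing the same shared variable at the same time. A condition variable lets one thread notify other threads about a change in the state of a shared variable (or other shared resource), and lets those other threads wait for that notification. A condition variable is always used together with a mutex: the condition variable signals the state change of the shared variable, while the mutex provides mutual exclusion for access to it.
Condition variables and mutexes are naturally linked. The thread pool code uses them as follows:
1. The thread locks the mutex before checking the state of the shared variable.
2. It checks the state of the shared variable.
3. If the shared variable is not in the expected state, the thread unlocks the mutex before waiting on the condition variable and going to sleep, so that other threads can access the shared variable.
4. When the thread is woken by a notification on the condition variable, it must lock the mutex again, because it will typically access the shared variable immediately.
pthread_cond_wait() automatically performs the mutex unlock in step 3 and the relock in step 4.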