Linux 内核链表 list.h 的使用 - HelloWorld开发者社区

C 语言本身并不自带集合（Collection）工具，当我们需要把结构体（struct）实例串联起来时，就需要在结构体内声明指向下一实例的指针，构成所谓的“链表”。而为了实现对链表的操作，我们需要另外实现一系列的函数，例如添加、删除、搜索、复制等等。而利用 Kernel 源代码中自带的 list.h，则可以方便地实现任意类型结构体的串联。

编程需求

假设我有一个表示学生资料的结构体：

#define MAX_STRING_LENGTH 50

typedef struct student {
  char first_name[MAX_STRING_LENGTH];
  char last_name[MAX_STRING_LENGTH];
  unsigned int age;
} student_t;

传统的做法，当我们需要将一系列学生的数据串联起来，那我们需要在该结构体内部添加一枚指针：

typedef struct student {
  char first_name[MAX_STRING_LENGTH];
  char last_name[MAX_STRING_LENGTH];
  unsigned int age;
  struct student *next; /* Look at dis ;p */
} student_t;

几乎每位 C 语言学习者都有在教科书中见过此类实现方法。但是这样做的缺点是：我们需要额外编写一系列函数，实现对该类型链表的操作。然而，稍微复杂点的项目内，有个十几个结构体那是很平常的事情。如果我们想为它们都实现链表功能，那我们就需要为每个类型的结构体编写一套函数，然后我们就累 shi 了。

有没有一种更为通用的办法呢？不需要顾及结构体本身的类型，就可以简简单单地把它们串起来。有点类似 Java 中的 LinkedList，如果我需要把一堆 Student 类的对象管理起来，那就很简单：

LinkedList<Student> lstStudents = new LinkedList<Student>();

完事儿！接着直接把对象往 lstStudents 里面放就可以了。

list.h 简介

Linux 内核的源代码中有一个工具头文件 list.h，里面定义了一系列的宏（macro），帮助我们实现对任意类型结构体的串联。然而原始版本的 list.h 是供内核使用的，并不能直接被使用在用户空间（user space）的程序内。好在有网友对其进行了改写，使得其可以被使用在一般应用程序内。因为博客系统的限制，我无法将源代码一并贴上，大家可以自行下载：list.h 。较之 Github 上最新版的 list.h，该修改版提供的宏不是很全面，你可以根据自己的需要，从 Github 摘取、添加新宏。

list.h 的使用

为了利用 list.h 实现链表功能，首先我们需要对我们的结构体做一点小小的改动：

#include "list.h"

typedef struct student {
  char first_name[MAX_STRING_LENGTH];
  char last_name[MAX_STRING_LENGTH];
  unsigned int age;
  struct list_head node_student;
} student_t;

首先毫无悬念的，我们需要将 list.h 包含到我们的程序，接着在结构体内，我们添加一个类型为 struct list_head 的字段 node_student。这个字段就像是链条上的环扣，一环一环地将所有实例串联起来。接下去我们需要一个函数，来创建该结构体的实例：

student_t *make_student(const char *fn, const char *ln, unsigned int a) {
  student_t *stu = NULL;

  if ((stu = calloc(1, sizeof(struct student))) == NULL) {
    return NULL; /* Failed to allocate memory. */
  }
  if (strlen(fn) > (MAX_STRING_LENGTH - 1)) {
    return NULL; /* First name too long. */
  }
  if (strlen(ln) > (MAX_STRING_LENGTH - 1)) {
    return NULL; /* Last name too long. */
  }

  strcpy(stu->first_name, fn);
  strcpy(stu->last_name, ln);
  stu->age = a;
  return stu;
}

函数代码比较简单，就不费口舌解释了。

添加元素

现在在我们的 main 函数内，我们将创建一系列 student_t 结构体的实例，并把它们串联起来：

struct list_head class;
INIT_LIST_HEAD(&class);

student_t *stu = NULL;

/* Create students. */
if ((stu = make_student("Pierre", "Dupont", 16)) == NULL) {
  fprintf(stderr, "Failed to create Pierre.\n");
  return -1;
}
list_add_tail(&stu->node_student, &class);

if ((stu = make_student("Hugot", "Badeaux", 18)) == NULL) {
  fprintf(stderr, "Failed to create Hugot.\n");
  return -1;
}
list_add_tail(&stu->node_student, &class);

if ((stu = make_student("Celine", "Brousseau", 17)) == NULL) {
  fprintf(stderr, "Failed to create Celine.\n");
  return -1;
}
list_add_tail(&stu->node_student, &class);

这里我们首先创建一个链表 class，并用宏 INIT_LIST_HEAD 对其进行初始化。紧接着我们创建三名学生，分别叫：Pierre.Dupont，Hugot.Badeaux 和 Celine.Brousseau。每创建一名学生，我们就用宏 list_add_tail，将其添加到链表 class 内。list_add_tail 取两个参数，首先是结构体内用于串联实例的“环扣”node_student 的指针，其次是链表本身的指针，即 class。

链表的遍历

完成对链表的初始化之后，我们试着打印出链表内所有的学生信息：

/* Print all students in class. */
printf("All the students in class.\n");
list_for_each_entry(stu, &class, node_student) {
  printf("First name: %s\n", stu->first_name);
  printf("Last name: %s\n", stu->last_name);
  printf("Age: %u\n\n", stu->age);
}

对链表的遍历是通过宏 list_for_each_entry 来实现的，它取三个参数，分别是：一个结构体类型的指针，链表的指针，以及结构内“环扣”的名称，即 node_student。程序编译时，该宏会被替换成一个 for 循环，遍历链表内所有的元素，并且打印在终端上。

导出链表

现在学校组织一次春游，为此我们利用宏 LIST_HEAD 添加了另一个新的链表 bus。使用 LIST_HEAD 来创建空链表相对简单，不需要先声明，再用 INIT_LIST_HEAD 进行初始化。现在，我们需要把班级上所有的学生都移到公车 bus 内：

LIST_HEAD(bus);

/* Moving all students into bus. */
printf("Moving all the students into the bus.\n");
list_splice_init(&class, &bus);
if (list_empty(&class)) {
  printf("No more student in class.\n");
}
printf("\n");
  
/* Print all student in bus. */
list_for_each_entry(stu, &bus, node_student) {
  printf("%s %s (%u) is in the bus.\n", stu->first_name, stu->last_name, stu->age);
}
printf("\n");

宏 list_splice_init 取两个参数，分别为原链表指针和新链表指针。使用该宏之后，class 链表内的所有元素就会被导出到 bus 链表内。接着我们再用宏 list_empty 测试链表 class 是否为空。顺利完成对学生的移动之后，我们再一次使用 list_for_each_entry，逐一打印出新链表 bus 内的所有学生。不出意外的话，我们会发现 class 链表已为空，且所有学生都被成功转移到了新链表 bus 内。

删除元素

新的麻烦来了，在校车即将出发之际，妹子 Celine 突然决定不去了，要回家。那我们只能将 Celine 从校车中移出，然后再重新清点学生人数。

student_t *tmp = NULL;

/* Celine wanna go home. */
printf("Celine wants to go home, remove her.\n");
list_for_each_entry_safe(stu, tmp, &bus, node_student) {
  if (strcmp(stu->first_name, "Celine") == 0) {
    list_del(&stu->node_student);
    free(stu);
  }
}
printf("\n");

/* Print remaining students in bus. */
printf("Remaining students in bus.\n");
list_for_each_entry(stu, &bus, node_student) {
  printf("%s %s (%u) is in the bus.\n", stu->first_name, stu->last_name, stu->age);
}
printf("\n");

我们首先用宏 list_for_each_entry_safe 遍历链表 bus 内的所有学生，当我们找到 Celine 的时候，则调用宏 list_del 将其从链表内删除。list_del 只取一个参数，即目标元素内“环扣”的指针。此外 list_for_each_entry 和 list_for_each_entry_safe 的唯一区别在于，list_for_each_entry_safe 可以在遍历的过程当中删除链表内的元素，且不会造成错误。较之 list_for_each_entry，list_for_each_entry_safe 需多取一个参数 tmp，其同样为结构体类型的指针，用于在遍历过程，临时保存链表元素。在删除 Celine 之后，我们发现 Pierre 和 Hugot 还老老实实地坐在校车内。

程序的最后，校车成功抵达目的地，我们将所有剩余的学生（Pierre 和 Hugot）从校车内移出，并释放系统资源：

/* End of demo. */
printf("End of demo, free all resources.\n");
list_for_each_entry_safe(stu, tmp, &bus, node_student) {
  printf("Removing student: %s %s\n", stu->first_name, stu->last_name);
  list_del(&stu->node_student);
  free(stu);
}

return 0;

编译完成之后，试着运行程序，终端输出为：

All the students in class.
First name: Pierre
Last name: Dupont
Age: 16

First name: Hugot
Last name: Badeaux
Age: 18

First name: Celine
Last name: Brousseau
Age: 17

Moving all the students into the bus.
No more student in class.

Pierre Dupont (16) is in the bus.
Hugot Badeaux (18) is in the bus.
Celine Brousseau (17) is in the bus.

Celine wants to go home, remove her.

Remaining students in bus.
Pierre Dupont (16) is in the bus.
Hugot Badeaux (18) is in the bus.

End of demo, free all resources.
Removing student: Pierre Dupont
Removing student: Hugot Badeaux

一切正如我们预期的那样，我们实现了对链表的添加、遍历、导出和删除。

注意事项

最后，使用 list.h 时需要注意代码的可读性，先看看下面这个例子：

typedef struct school {
  struct list_head list_students;
  struct list_head list_teachers;
  struct list_head list_guards;
  struct list_head list_classrooms;
  struct list_head list_bus;
} school_t;

这个结构体的本意很明显，就是想包含一所学校内所有的人员和器材等等。但是单凭这样一个结构体，我们无法知道每一个链表内元素的具体类型。尤其是当你把一个项目丢给新接手的开发人员时，他看到这样一个结构体，会比较头疼。而且很多项目内，结构体的名字并非都那么浅显易懂。

当他想使用宏 list_for_each_entry 或者 list_for_each_entry_safe 遍历链表时，如果没有具体注解，他甚至不知道该用什么结构体类型。所以，一定要有详细的注解：

typedef struct school {
  struct list_head list_students; /* struct student */
  struct list_head list_teachers; /* struct teacher */
  struct list_head list_guards; /* struct guard */
  struct list_head list_classrooms; /* struct classroom */
  struct list_head list_bus; /* struct bus */
} s

综上所述，我个人的意见：list.h 很好用，但是要慎用。