Types of process IDs

In the kernel, the kinds of process ID are described by enum pid_type, which is defined in include/linux/pid.h:

enum pid_type
{
    PIDTYPE_PID,
    PIDTYPE_TGID,
    PIDTYPE_PGID,
    PIDTYPE_SID,
    PIDTYPE_MAX,
};

/*
* What is struct pid?
*
* A struct pid is the kernel's internal notion of a process identifier.
* It refers to individual tasks, process groups, and sessions. While
* there are processes attached to it the struct pid lives in a hash
* table, so it and then the processes that it refers to can be found
* quickly from the numeric pid value. The attached processes may be
* quickly accessed by following pointers from struct pid.
*
* Storing pid_t values in the kernel and referring to them later has a
* problem. The process originally with that pid may have exited and the
* pid allocator wrapped, and another process could have come along
* and been assigned that pid.
*
* Referring to user space processes by holding a reference to struct
* task_struct has a problem. When the user space process exits
* the now useless task_struct is still kept. A task_struct plus a
* stack consumes around 10K of low kernel memory. More precisely
* this is THREAD_SIZE + sizeof(struct task_struct). By comparison
* a struct pid is about 64 bytes.
*
* Holding a reference to struct pid solves both of these problems.
* It is small so holding a reference does not consume a lot of
* resources, and since a new struct pid is allocated when the numeric pid
* value is reused (when pids wrap around) we don't mistakenly refer to new
* processes.
*/

PID is the identifier the kernel uses to tell individual tasks apart. Whenever fork or clone creates a new process, the kernel assigns it a new, unique PID value.

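The process ID that user space obtains from getpid() is not this PID value: getpid() returns the TGID described next, while the kernel's per-task PID is what gettid() returns.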

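TGID identifies a thread group (threads are lightweight processes). When clone is called with the CLONE_THREAD flag, the new task becomes a thread of the calling process, and both belong to the same thread group. The ID of that group is the TGID. For the thread group leader (the main thread), the TGID is the same as the PID.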
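The relationship between getpid(), the per-task PID and the TGID can be observed from user space. The following is a minimal sketch that is not part of the original text: it assumes Linux with glibc, reads the per-task PID through the raw gettid system call, and the names struct task_ids, current_ids and ids_in_new_thread are invented for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The two IDs as seen by one thread (hypothetical helper type). */
struct task_ids {
    pid_t tgid; /* what getpid() returns: the thread group ID */
    pid_t pid;  /* what gettid returns: the kernel's per-task PID */
};

/* Collect both IDs for the calling thread. */
static struct task_ids current_ids(void)
{
    struct task_ids ids;

    ids.tgid = getpid();
    ids.pid = (pid_t)syscall(SYS_gettid);
    return ids;
}

/* Thread entry point: store the new thread's IDs in *arg. */
static void *thread_main(void *arg)
{
    *(struct task_ids *)arg = current_ids();
    return NULL;
}

/* Run one extra thread and report its IDs in *out; returns 0 on success. */
static int ids_in_new_thread(struct task_ids *out)
{
    pthread_t thread;

    assert(out != NULL);
    if (pthread_create(&thread, NULL, thread_main, out) != 0)
        return -1;
    return pthread_join(thread, NULL) == 0 ? 0 : -1;
}
```

Called from the main thread, current_ids() should report tgid equal to pid, while ids_in_new_thread() should report the same tgid with a different pid. On older toolchains the program needs to be built with -pthread.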

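PGID: independent processes can be combined into a process group (with the setpgrp system call). A process group makes it simple to send a signal to all of its members. Processes connected by a shell pipeline, for example, are in the same process group. The ID of the process group is the PGID.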

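SID: several process groups can be combined into a session (with the setsid call), which is used in terminal programming. All processes in a session share the same SID, and the kernel records it per task: the global SID is held in task_struct->signal->__session, and the global PGID in task_struct->signal->__pgrp.

The helper functions set_task_session and set_task_pgrp can be used to modify these values. Note that these fields and helpers belong to older kernels (around 2.6.24); later kernels removed them and derive both IDs from struct pid.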
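From user space, the effect of creating a new process group or session can be checked with the standard POSIX calls. The sketch below is an illustration added here, not kernel code: holds_in_child, new_group_has_own_pgid and new_session_has_own_sid are hypothetical helper names, and each check runs in a forked child so that the calling process keeps its own group and session.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run check() in a forked child.
 * Returns 1 if check() returned nonzero, 0 if it returned zero,
 * and -1 if fork or waitpid failed.
 */
static int holds_in_child(int (*check)(void))
{
    int status;
    pid_t child;

    assert(check != NULL);
    child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(check() ? 0 : 1);
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* setpgid(0, 0) makes the caller the leader of a new group: PGID == PID. */
static int new_group_has_own_pgid(void)
{
    return setpgid(0, 0) == 0 && getpgrp() == getpid();
}

/* setsid() makes the caller the leader of a new session and of a new
 * process group: SID == PGID == PID. */
static int new_session_has_own_sid(void)
{
    pid_t sid = setsid();

    return sid == getpid() && getsid(0) == sid && getpgrp() == sid;
}
```

A caller would pass either check to holds_in_child(), for example holds_in_child(new_session_has_own_sid). The fork matters for setsid(): the call fails for a process that is already a group leader, and a freshly forked child never is.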

PID namespaces

struct pid_namespace is defined in include/linux/pid_namespace.h:

struct pid_namespace {
    struct idr idr;
    struct rcu_head rcu;
    unsigned int pid_allocated;
    struct task_struct *child_reaper;
    struct kmem_cache *pid_cachep;
    unsigned int level;
    struct pid_namespace *parent;
#ifdef CONFIG_BSD_PROCESS_ACCT
    struct fs_pin *bacct;
#endif
    struct user_namespace *user_ns;
    struct ucounts *ucounts;
    int reboot; /* group exit code if this pidns was rebooted */
    struct ns_common ns;
} __randomize_layout;
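  • child_reaper: points to the init process of this namespace. Every namespace has one process that plays the role the global init process plays for the whole system.
  • level: the depth of this namespace in the hierarchy. The initial namespace has level 0, its children have level 1, and so on. Processes in a child namespace are visible to the parent namespace, but not the other way round. From level, the kernel can tell how many IDs a process is associated with: one for each namespace from its own up to the initial one, that is, level + 1.
  • parent: pointer to the parent namespace.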
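The link between level, parent and the number of IDs per process can be sketched with a small user-space model. This is an illustration under stated assumptions, not kernel code: struct model_ns keeps only the two fields discussed above, and model_ns_init and model_ids_per_task are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct pid_namespace (hypothetical). */
struct model_ns {
    unsigned int level;      /* 0 for the initial namespace */
    struct model_ns *parent; /* NULL for the initial namespace */
};

/* Set up ns as a child of parent, or as the initial namespace
 * when parent is NULL. */
static void model_ns_init(struct model_ns *ns, struct model_ns *parent)
{
    ns->parent = parent;
    ns->level = parent ? parent->level + 1 : 0;
}

/* Count the namespaces a task created in ns is visible from by
 * walking the parent chain up to the initial namespace. */
static unsigned int model_ids_per_task(const struct model_ns *ns)
{
    unsigned int start_level = ns->level;
    unsigned int count = 0;

    for (; ns != NULL; ns = ns->parent)
        count++;

    /* The walk always agrees with the shortcut level + 1. */
    assert(count == start_level + 1);
    return count;
}
```

With an initial namespace, a child and a grandchild set up through model_ns_init(), a task in the grandchild gets three IDs, one per level. The walk is only there to show why level + 1 is the right count.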

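The PID structures

pid and upid

struct pid is the kernel's internal representation of a PID. struct upid holds the information that is visible in one specific namespace.

upid is defined as follows: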

struct upid {
    int nr;
    struct pid_namespace *ns;
};
  • nr: the numeric value of the ID
  • ns: pointer to the namespace in which that value is valid

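All upid instances are kept in a lookup structure keyed by number and namespace: a hash table in older kernels, and the per-namespace idr seen in struct pid_namespace above in the version listed here.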

struct pid is defined as follows:

struct pid
{
    refcount_t count;
    unsigned int level;
    spinlock_t lock;
    /* lists of tasks that use this pid */
    struct hlist_head tasks[PIDTYPE_MAX];
    struct hlist_head inodes;
    /* wait queue for pidfd notifications */
    wait_queue_head_t wait_pidfd;
    struct rcu_head rcu;
    struct upid numbers[1];
};
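  • count: the reference count of this struct pid, that is, how many users currently hold a reference to it.
  • level: the depth of the namespace in which this PID was allocated. The PID is visible in level + 1 namespaces, from that namespace up to the initial one.
  • tasks[PIDTYPE_MAX]: each array element is a list head, one per ID type in enum pid_type (PIDTYPE_PID, PIDTYPE_TGID, PIDTYPE_PGID and PIDTYPE_SID; PIDTYPE_MAX is the number of types). All task instances that share a given ID are linked together through the corresponding list.
  • numbers[1]: an array of upid instances, one per namespace level in which the PID is visible, which is how a single PID can belong to several namespaces. Because the array is the last member of the structure, further entries can be appended by allocating more space.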
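How level and numbers work together can be sketched in user space. The model below is an illustration, not the kernel's implementation: toy_ns, toy_upid, toy_pid and toy_alloc_pid are hypothetical, the trailing array is written as a C99 flexible array member, and toy_pid_nr_ns follows the lookup logic of the kernel's pid_nr_ns() as far as I understand it.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (hypothetical). */
struct toy_ns {
    unsigned int level;
    struct toy_ns *parent;
};

struct toy_upid {
    int nr;
    struct toy_ns *ns;
};

struct toy_pid {
    unsigned int level;
    struct toy_upid numbers[]; /* level + 1 entries */
};

/*
 * Allocate a toy_pid for a task created in namespace ns.
 * nrs[i] is the number the task gets at level i (0 = initial namespace).
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static struct toy_pid *toy_alloc_pid(struct toy_ns *ns, const int *nrs)
{
    struct toy_pid *pid;
    struct toy_ns *cur;

    pid = malloc(sizeof(*pid) + (ns->level + 1) * sizeof(pid->numbers[0]));
    if (pid == NULL)
        return NULL;

    pid->level = ns->level;
    /* Fill one entry per level, from the deepest namespace up to level 0. */
    for (cur = ns; cur != NULL; cur = cur->parent) {
        assert(cur->level <= ns->level);
        pid->numbers[cur->level].nr = nrs[cur->level];
        pid->numbers[cur->level].ns = cur;
    }
    return pid;
}

/* Number of pid as seen from ns, or 0 if pid is not visible there. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_ns *ns)
{
    if (pid != NULL && ns->level <= pid->level) {
        const struct toy_upid *upid = &pid->numbers[ns->level];

        if (upid->ns == ns)
            return upid->nr;
    }
    return 0;
}
```

For a task created in a level 1 namespace, the same toy_pid yields one number when queried from the initial namespace, another from its own namespace, and 0 from a sibling namespace at the same level, because the entry at that level points to a different namespace.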