Framist's Little House

◇ 自顶而下 - 面向未来 ◇

0%

【操作系统概念 - 作业 4】Threads

Operating System Concepts Exercises 4

Threads

操作系统作业 4

  • 网上学习 Linux 中的信号(signal)机制
  • 4.2, 4.4,
  • 4.7, 4.8, 4.9, 4.14, 4.15, 4.16, 4.17
  • 4.21

每题最后一个引用块是老师提供的参考答案

网上学习 Linux 中的信号(signal)机制

Linux 信号(signal) - 简书 (jianshu.com)

Linux 信号(signal) 机制分析 - 苏博 - 博客园 (cnblogs.com)

……

Practice Exercises

4.2, 4.4

4.2 What are two differences between user-level threads and kernel-level threads? Under what circumstances is one type better than the other?

What are two differences between user-level threads and kernel-level threads?

用户级线程和内核级线程之间的两个区别是什么?

答:

User-level threads Kernel-level threads
The existence of user-level threads is unknown to the kernel. The existence of the kernel-level threads is known to the kernel.
User-level threads are managed without kernel support. Kernel-level threads are managed by the operating system.
user-level threads are faster to create than are kernel-level threads. Kernel-level threads are user-level threads.
User-level threads are scheduled by the thread library. Kernel-level threads are scheduled by the kernel.

Under what circumstances is one type better than the other?

在哪种情况下哪种类型比另一种更好?

答:

  • Circumstances where kernel-level threads are better than user-level threads:

If the kernel is single-threaded, then kernel-level threads are better than user-level threads, because any user-level thread performing a blocking system call will cause the entire process to block, even if other threads are available to run within the application.

针对内核是单线程的情况,内核线程优于用户线程。因为任何执行阻塞系统调用的用户线程都会导致整个进程阻塞,即使应用程序中可以运行其他线程。

For example a process P1 has 2 kernel level threads and process P2 has 2 user-level threads. If one thread in P1 gets blocked, its second thread is not affected. But in case of P2 if one thread is blocked (say for I/O), the whole process P2 along with the 2nd thread gets blocked.

例如,有两个进程 a 和 b。进程 a 有 2 个内核线程,而进程 b 有 2 个用户线程。如果 a 中的一个线程被阻塞,它的第二个线程不会受到影响。但是对于 b,如果一个线程被阻塞 (比如 I/O),整个进程 b 和第二个线程就会被阻塞。

In a multiprocessor environment, the kernel-level threads are better than user-level threads, because kernel-level threads can run on different processors simultaneously while user-level threads of a process will run on one processor only even if multiple processors are available.

针对多处理器环境,内核线程优于用户线程,因为内核线程可以在不同的处理器上同时运行,而即使有多个处理器可用,进程的用户线程也只能在一个处理器上运行。

  • Circumstances where user-level threads are better than kernel-level threads:

If the kernel is time shared, then user-level threads are better than kernel-level threads, because in time shared systems context switching takes place frequently. Context switching between kernel level threads has high overhead, almost the same as a process whereas context switching between user-level threads has almost no overhead as compared to kernel level threads.

对于时间共享型的内核,用户线程优于内核线程,因为共享系统上下文切换经常发生。内核线程之间的上下文切换具有很高的开销,几乎与进程相同,而与内核线程相比,用户线程之间的上下文切换几乎没有开销。

Answer:

User-level threads are threads that the OS Kernel is not aware of. They exist entirely within a process, and are scheduled to run within that process’s time slice (managed by the threading library itself). User-level threads are much faster to switch between, as there is no context switch. User-level threads can’t be ran in parallel on the different CPUs.

用户级线程是操作系统内核所不知道的线程。它们完全存在于一个进程中,并被安排在该进程的时间片中运行(由线程库本身管理)。用户级线程的切换速度更快,因为没有上下文切换。用户级线程不能在不同的 CPU 上并行运行。

The OS kernel is aware of kernel-level threads. Kernel threads are scheduled by the OS’s scheduling algorithm, and each thread can be granted its own time slice. Kernel-level threads can be ran in parallel on the different CPUs.

操作系统内核知道内核级线程。内核线程由操作系统的调度算法来安排,每个线程都可以被授予自己的时间片。内核级线程可以在不同的 CPU 上并行运行。

Many systems implement threading differently:

A many-to-one threading model maps many user threads directly to one kernel thread, the kernel thread can be thought of as the main process.

A one-to-one threading model maps each user thread directly to one kernel thread, this model allows parallel processing on the multiprocessor systems.

许多系统以不同方式实现线程。
多对一线程模型将许多用户线程直接映射到一个内核线程,内核线程可以被认为是主进程。
一对一线程模型将每个用户线程直接映射到一个内核线程,这种模型允许在多处理器系统上进行并行处理。

4.4 What resources are used when a thread is created? How do they differ from those used when a process is created?

What resources are used when a thread is created?

创建线程时会使用哪些资源?

答:

  1. Code section
  2. Data section
  3. Other operation system resources such as open files and signals with other threads belonging to the same process whereas each and every process has separate code section, data section and operating system resources.

How do they differ from those used when a process is created?

它们与创建进程时使用的那些有何不同?

答:

A thread is a basic unit of CPU utilization, also known as light weight process. So, creating either a user or kernel thread involves allocating a small data structure to hold a thread ID, a program counter, a register set, stack and priority, whereas creating a process involves allocating the large data structure called as PCB(Process Control Block) which includes a memory map, list of open files, and environment variables. Allocating and managing the memory map is typically the most time-consuming activity.

Answer:

Creating either a user or kernel thread involves allocating a small data structure (TCB, thread control block) to hold a register set, stack, state and priority. Because a thread is smaller than a process, thread creation typically uses fewer resources than process creation, and is more efficient.

创建一个用户或内核线程需要分配一个小的数据结构(TCB,线程控制块)来保存一个寄存器集、堆栈、状态和优先级。因为线程比进程小,所以创建线程通常比创建进程使用更少的资源,而且效率更高。

Creating a process requires allocating a process control block (PCB), a rather large data structure. The PCB includes a memory map, list of open files, and environment variables. Allocating and managing the memory map is typically the most time-consuming activity.

创建一个进程需要分配一个进程控制块(PCB),一个相当大的数据结构。PCB 包括一个内存地图、打开的文件列表和环境变量。分配和管理内存图通常是最耗时的活动。

Exercises

4.7, 4.8, 4.9, 4.14, 4.15, 4.16, 4.17

4.7 Under what circumstances does a multithreaded solution using multiple kernel threads provide better performance than a single-threaded solution on a single-processor system?

答:

执行一些高并发式的任务

Answer:

When a kernel thread suffers a page fault, another kernel thread can be switched in to use the interleaving time in a useful manner.

当一个内核线程遭遇页面故障时,另一个内核线程可以被切换进来,以有用的方式使用交错时间。

A single-threaded process, on the other hand, will not be capable of performing useful work when a page fault takes place. Therefore, in scenarios where a program might suffer from frequent page faults or has to wait for other system events, a multithreaded solution would perform better even on a single-processor system.
另一方面,一个单线程的进程,在发生页面故障时,将无法执行有用的工作。因此,在程序可能频繁发生页面故障或必须等待其他系统事件的情况下,即使在单处理器系统上,多线程的解决方案也会表现得更好。

4.8 Which of the following components of program state are shared across threads in a multithreaded process?

多线程进程中的线程之间共享以下哪些程序状态组件?

答:

  • a. Register values
  • b. Heap memory
  • c. Global variables
  • d. Stack memory

Answer:

The threads of a multithreaded process share heap memory and global variables. Each thread has its separate set of register values and a separate stack.

一个多线程进程的线程共享堆内存和全局变量。每个线程都有其独立的寄存器值和独立的栈。

4.9 Can a multithreaded solution using multiple user-level threads achieve better performance on a multiprocessor system than on a single-processor system? Explain

答:

No.

A multithreaded system comprising of multiple user-level threads cannot make use of the different processors in a multiprocessor system simultaneously. The operating system sees only a single process and will not schedule the different threads of the process on separate processors. Consequently, there is no performance benefit associated with executing multiple user-level threads on a multiprocessor system.

一个由多个用户级线程组成的多线程系统不能同时利用多处理器系统中的不同处理器。操作系统只看到一个单一的进程,不会将进程的不同线程安排在不同的处理器上。因此,在多处理器系统上执行多个用户级线程没有任何性能优势。

4.14

A system with two dual-core processors has four processors available for scheduling. A CPU-intensive application is running on this system. All input is performed at program start-up, when a single file must be opened. Similarly, all output is performed just before the program terminates, when the program results must be written to a single file. Between startup and termination, the program is entirely CPU-bound. Your task is to improve the performance of this application by multithreading it. The application runs on a system that uses the one-to-one threading model (each user thread maps to a kernel thread).

一个有两个双核处理器的系统有四个处理器可供调度。一个 CPU 密集型的应用程序正在这个系统上运行。所有的输入都是在程序启动时进行的,此时必须打开一个文件。同样,所有的输出都是在程序终止前进行的,此时程序的结果必须被写入一个文件。在启动和终止之间,程序是完全受 CPU 控制的。你的任务是通过多线程来提高这个程序的性能。该程序运行在一个使用一对一线程模型的系统上(每个用户线程映射到一个内核线程)。

How many threads will you create to perform the input and output? Explain.

答:

1 threads, 1 threads.

It only makes sense to create as many threads as there are blocking system calls, as the threads will be spent blocking. Creating additional threads provides no benefit. Thus, it makes sense to create a single thread for input and a single thread for output.

创建的线程数量与阻塞的系统调用一样多才有意义,因为这些线程将被用于阻塞。创建额外的线程不会带来任何好处。因此,为输入创建一个线程,为输出创建一个线程是有意义的。

How many threads will you create for the CPU-intensive portion of the application? Explain.

答:

4 threads.

Explain:

Threads count depends upon the priority and requirements of the application. So only thread is enough for this kind of application and this thread is going to handle both input and output operation.

  • It is a concurrency approach. Here, it only makes sense to create as many threads as there are blocking system calls, as the threads will be spent blocking.
  • It doesn’t provides any benefits to create an additional threads.
  • Thus, only a signal thread creation makes sense for input and a single thread for output.
  • Four threads are created to perform the CPU-intensive portion of the application. It is because, there should be as many threads as there are processing cores.
  • It would be the waste of processing resources to use fewer threads.
  • Also any number greater than four would be unable to run.

Four. There should be as many threads as there are processing cores. Fewer would be a waste of processing resources, and any number > 4 would be unable to run.

四。有多少个线程就应该有多少个处理核心。更少的会浪费处理资源,任何>4 的数量都会无法运行。

4.15 Consider the following code segment:

1
2
3
4
5
6
7
pid_t pid;
pid = fork();
if (pid == 0) { /* child process */
fork();
thread create( . . .);
}
fork();

a. How many unique processes are created?

答:

  • The statement pid = fork(); before the if statement creates one process. The parent process say p creates this process. Let it be p1.
  • The statement fork(); in the if statement creates one process. The parent process p creates this process. Let it be p2.
  • After the if statement, parent process p, process p1 and process p2 will execute fork(); creating three new processes.
    • One process is created by parent process p.
    • One process is created by process p1.
    • One process is created by process p2.

Hence, 5 unique processes (p1, p2, p3, p4, p5) will be created. If the parent process is also considered, then 6 unique processes (p, p1, p2, p3, p4, p1, p5) will be created.

b. How many unique threads are created?

答:

  • Thread creation is done in if block. Only child process p1 is executed in the if block. Therefore, process p1 will be created one thread.
  • In the if block one process p2 is created using fork(). Therefore, process p2 will also create a thread.

Hence, 2 unique threads will be created.

Answer:

a) 6 processes

b) 2 threads

image-20210701012014413

4.16*

As described in Section 4.7.2, Linux does not distinguish between processes and threads. Instead, Linux treats both in the same way, allowing a task to be more akin to a process or a thread depending on the set of flags passed to the clone() system call. However, other operating systems, such as Windows, treat processes and threads differently. Typically, such systems use a notation in which the data structure for a process contains pointers to the separate threads belonging to the process. Contrast these two approaches for modeling processes and threads within the kernel.

**(不要求)**如第 4.7.2 节所述,Linux 不区分进程和线程。取而代之的是,Linux 以相同的方式对待两者,从而使任务更类似于进程或线程,具体取决于传递给clone()系统调用的标志集。但是,其他操作系统(例如 Windows)对进程和线程的处理方式也有所不同。通常,此类系统使用一种表示法,其中进程的数据结构包含指向属于该进程的单独线程的指针。

对比这两种在内核中对进程和线程进行建模的方法。

Linux operating systems consider both threads and processes as tasks; it cannot distinguish between them. In contrast, windows operating system threads and processes differently.
This approach has pros and cons while modeling threads and processes inside the kernel.

Pros:

  • Linux consider this as similar, so codes belong to operating system can be cut down easily.
  • Scheduler present in the Linux operating systems do not need special code to test threads coupled with each processes.
  • It considers different threads and processes as a single task during the time of scheduling.

Cons:

  • This ability makes it harder for the Linux operating system to inflict process-wide resource limitations directly.

  • Extra steps are needed to recognize the each processes belong to appropriate threads and complexity in performing relevant tasks.

Answer:

In systems where processes and threads are considered as similar entities, some of the operating system code could be simplified. A scheduler, for instance, can consider the different processes and threads on an equal footing without requiring special code to examine the threads associated with a process during every scheduling step. However, this uniformity could make it harder to impose process-wide resource constraints in a direct manner. Instead, some extra complexity is required to identify which threads correspond to which process and perform the relevant accounting tasks.

4.17 The program shown in Figure 4.16 uses the Pthreads API. What would be the output from the program at LINE C and LINE P?

Figure 4.16:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <pthread.h>
#include <stdio.h>
#include <types.h>
int value = 0;
void *runner(void *param); /* the thread */
int main(int argc, char *argv[]) {
pid_t pid;
pthread_t tid;
pthread_attr_t attr;
pid = fork();
if (pid == 0) { /* child process */
pthread_attr_init(&attr);
pthread_create(&tid, &attr, runner, NULL);
pthread_join(tid, NULL);
printf("CHILD: value = %d", value); /* LINE C */
}
else if (pid > 0) { /* parent process */
wait(NULL);
printf("PARENT: value = %d", value); /* LINE P */
}
}
void *runner(void *param) {
value = 5;
pthread_exit(0);
}

The output of LINE C in the program:

1
CHILD: value = 5
  • The child process in the thread is forked by parent process and child process each have its own memory space.
  • After forking, the parent process waits for the completion of child process.
  • New thread is created for child process and the runner() function is called which set the value of the global vairlable to 5.
  • Thus, after execution of this line, the value displayed will be 5.

The output of LINE P in the program:

1
PARENT: value = 0
  • After completing the child process, the value of the global variable present in parent process remains 0.
  • Thus, after execution of this line, the value displayed will be 0.

Answer:

LINE C, 5

LINE P, 0

Programming Problems

4.21

Write a multithreaded program that calculates various statistical values for a list of numbers. This program will be passed a series of numbers on the command line and will then create three separate worker threads. One thread will determine the average of the numbers, the second will determine the maximum value, and the third will determine the minimum value. For example, suppose your program is passed the integers:

编写一个多线程程序,为一串数字计算各种统计值。这个程序将在命令行上传递一系列的数字,然后创建三个独立的工作线程。一个线程将确定这些数字的平均值,第二个线程将确定最大值,第三个线程将确定最小值。例如,假设你的程序被传递了整数:

90 81 78 95 79 72 85

The program will report:

1
2
3
The average value is 82 
The minimum value is 72
The maximum value is 95

The variables representing the average, minimum, and maximum values will be stored globally. The worker threads will set these values, and the parent thread will output the values once the workers have exited. (We could obviously expand this program by creating additional threads that determine other statistical values, such as median and standard deviation.)

代表平均值、最小值和最大值的变量将被全局存储。工人线程将设置这些值,一旦工人退出,父线程将输出这些值。显然,我们可以通过创建额外的线程来扩展这个程序,以确定其他统计值,如中位数和标准差)。

Answer:

和上机题(p199 Project2)类似,可以不做。


注:

翻译:deepl

参考资料:

[1] 操作系统学习笔记(三) —线程_雲的博客-CSDN 博客

[2] Criviere/os (github.com)

[3] Operating System Concepts – 9th Edition 及其答案