
Chinaunix

[FreeBSD] Draft translation: Kernel-Scheduled Entities for FreeBSD

Posted 2006-04-02 01:37
Kernel-Scheduled Entities for FreeBSD

Jason Evans

jasone@freebsd.org

January 1, 2003

Translator: DarkBlueSea

Thanks to moderator 風(fēng)絲片雨 and to gvim for their encouragement and corrections.

Abstract:
FreeBSD has historically had less than ideal support for multi-threaded application programming. At present, there are two threading libraries available. libc_r is entirely invisible to the kernel, and multiplexes threads within a single process. The linuxthreads port, which creates a separate process for each thread via rfork(), plus one thread to handle thread synchronization, relies on the kernel scheduler to multiplex "threads" onto the available processors.

Both approaches have scaling problems. libc_r does not take advantage of multiple processors and cannot avoid blocking when doing I/O on "fast" devices. The linuxthreads port requires one process per thread, and thus puts strain on the kernel scheduler, as well as requiring significant kernel resources. In addition, thread switching can only be as fast as the kernel's process switching.

This paper summarizes various methods of implementing threads for application programming, then goes on to describe a new threading architecture for FreeBSD, based on what are called kernel-scheduled entities, or scheduler activations.

1 Background
FreeBSD has been slow to implement application threading facilities, perhaps because threading is difficult to implement correctly, and because of the FreeBSD Project's general reluctance to build substandard solutions to problems. Instead, two technologies originally developed by others have been integrated.

libc_r is based on Chris Provenzano's userland pthreads implementation, and was significantly re-worked by John Birrell and integrated into the FreeBSD source tree. Since then, numerous other people have improved and expanded libc_r's capabilities. At this time, libc_r is a high-quality userland implementation of POSIX threads.

The linuxthreads port is based on Xavier Leroy's LinuxThreads, which is now an integral part of GNU libc. The linuxthreads port is completely compatible with LinuxThreads as it runs on Linux.

Both of these libraries have scalability and performance problems for certain types of threaded programs.

2 Threading Architectures
Following is a short description of several threading architectures, along with a representative implementation for each of the more common ones. In all cases, threading is preemptive.

2.1 Userland (ala FreeBSD's libc_r)
Userland threads are implemented entirely in an application program, with no explicit support from the kernel. In most cases, a threading library is linked in to the application, though it is also possible to hand code user threading within an application.

In order for userland threading to work, two main issues have to be resolved:

Preemptive scheduling:

    Threads must be periodically preempted so that all runnable threads of sufficient priority get to run. This is done by a combination of a timer signal (SIGALRM for libc_r), which allows the userland threads scheduler (UTS) to run, and setjmp()/longjmp() calls to switch between threads.

Process blocking:

    Normally, when a process makes a system call that cannot be completed immediately, the process blocks and another process is scheduled in order to make full use of the processor. However, a threaded program may have multiple runnable threads, so blocking in a system call should be avoided. This is accomplished by converting potentially blocking system calls to non-blocking. This works well in all cases except for operations on so-called fast devices such as local filesystems, where it is not possible to convert to a non-blocking system call. libc_r handles non-blocking and incomplete system calls by converting file descriptors to non-blocking, issuing I/O requests, then adding file descriptors to a central poll()-based event loop.
    (The poll() system call examines a set of file descriptors to see if some of them are ready for I/O.)

Userland threads have the advantage of being very fast in the simple case, though the complexities of call conversion eats into this performance advantage for applications that make many system calls.

As libc_r currently exists, there are several problems:

   1. The central poll() loop does not scale well to large numbers of threads. This could probably be solved by converting to the kqueue() interface [Lemon].

2. The entire process blocks on I/O to fast devices. This problem could probably be solved by integrating asynchronous I/O into libc_r, which would first require stabilizing the AIO implementation, then require switching from a poll()-based event loop to a kqueue()-based event loop in order to be able to integrate I/O completion notifications into the same loop.

3. Since all threads are multiplexed onto a single process, libc_r cannot take advantage of multiple CPUs in an SMP system. There is no reasonable retrofit to libc_r which solves this scalability problem.
因?yàn)樗芯程并發(fā)的使用一個(gè)處理器,libc_r在對稱多處理器系統(tǒng)中沒有優(yōu)勢,沒有合適的修改可以使libc_r 解決這個(gè)可擴(kuò)展性問題

Figure 1: Userland threading model


2.2 Process-based (ala Linux's LinuxThreads)
Process-based threading is confusingly referred to as kernel threading much of the time. True kernel threads are threads that run within the kernel in order to run various portions of the kernel in parallel. Process-based threads are threads that are based on some number of processes that share their address space, and are scheduled as normal processes by the kernel. LinuxThreads implements process-based threading.

New processes are created such that they share the address space of the existing process(es), as shown in Figure 2. For Linux, this is done via the clone() system call, whereas for FreeBSD it is done via a special form of the rfork() system call.

As soon as a second thread is created via pthread_create(), LinuxThreads creates an additional process that is used exclusively for synchronization among processes. This means that for a program with n threads, there are n + 1 processes (if at some point in time n > 1).

LinuxThreads is elegant in that it is simple. However, it has at least the following POSIX compliance issues that cannot be easily addressed:

Each thread runs in a separate process. Each process has a unique process ID (pid), but POSIX requires all threads to appear to have the same pid. Fixing this would require significant modifications to the kernel's data structures.

The thread priority semantics specified by POSIX cannot be implemented, because each thread is actually a process, which makes thread contention at the application level impossible. All thread contention is by definition at the system level. This has the additional disadvantage that multi-threaded applications compete unfairly with single-threaded applications, which makes running a mixture of applications difficult on a single machine.

Process-based threading also has some inherent performance and scalability issues that cannot be overcome:

Switching between threads is a very expensive operation. It requires switching to kernel mode, switching the old thread (process) out, and then running the new thread (process). There is no solution to this problem except to optimize process switching, which defies optimization beyond a certain point. This problem is aggravated by cache locality considerations.

Each thread (process) requires all the kernel resources typically associated with a process. This includes a kernel stack, which means that applications with many threads require large amounts of kernel resources.


Figure 2: Process-based threading model



2.3 Multi-level (ala Solaris's LWPs)
Multi-level threading is a hybrid of user-level and process-based threading. Threads are multiplexed onto a pool of processes. The size of the process pool is normally determined automatically by heuristics internal to the threading library that take into account issues such as the number of processors, number of threads, and processor binding of threads.

The idea of multi-level threading is to achieve the performance of userland threading and the SMP scalability of process-based threading. Ideally, most thread scheduling is done by a UTS to avoid the context switch overhead of kernel calls, but multiple threads can run concurrently by running on more than one process at the same time.

In practice, multi-level threading's main shortcoming is its complexity. Ideally, the advantages of userland and process-based threading are combined, without any of the disadvantages, but in practice, some of the disadvantages tend to slip in. The overhead of the multi-level scheduling compounds this.

Multi-level threading does not require light-weight processes (LWPs) in order to work, but Solaris uses LWPs to address the POSIX compliance issues mentioned above for purely process-based threading. Also, in theory, LWPs are light-weight, though Solaris's LWPs no longer generally meet this criterion. That is, by the time Sun got the kinks worked out, LWPs were no longer light-weight.

Figure 3: Light-weight process-based threading model


2.4 Scheduler Activations

This is a very brief overview of scheduler activations (SAs) as presented in [Anderson], and is meant only as a basis for the more complete treatment of kernel-scheduled entities (KSEs) for FreeBSD later in this paper. There are many details left out in this section, which are treated in detail in the original paper.

The SAs approach, like multi-level threading, strives to merge the advantages of userland and process-based threading while avoiding the disadvantages of both approaches. SAs differ from multi-level scheduling in that additional kernel facilities are added in order to provide the UTS with exactly the information and support it needs in order to control scheduling. Simply put, SAs allow the kernel and the UTS to do their jobs without any guess work as to what the other is doing.

A process that takes advantage of SAs has a significantly different flow of control than a normal process, from the perspective of the kernel. A normal process has a number of data structures associated with it in the kernel, including a stack and process control block (PCB). When a process is switched out, its machine state is saved in the PCB. When the process is run again, the machine state is restored from the PCB and the process continues running, whether in kernel or user mode.

A process that is using SAs does not have a kernel stack or PCB. Instead, every time a process is run, a SA is created that contains a kernel stack and thread control block (TCB), and the process runs in the context of the SA. When the SA is preempted or blocked, machine state is stored in the SA's TCB, and the kernel stack is optionally used for completion of a pending system call. See Figure 4.

It is possible to run more than one SA for the same process concurrently on multiple processors. However, the UTS needs to know at all times exactly what processors it is running on, so that it can make informed scheduling decisions. As part of the solution to this problem, every time a SA is started, it initially starts executing in the UTS. Additionally, the kernel makes upcalls to the process to notify it of important events that may affect thread scheduling decisions. The following upcalls are necessary:

sa_new(cpu_id):
    Execute a thread on the processor with ID cpu_id.

sa_preempt(sa_id):
    A thread was preempted, with SA ID sa_id.

sa_block(sa_id, pcb):
    A thread blocked in the kernel, with SA ID sa_id and machine state pcb.

sa_unblock(sa_id, pcb):
    A thread that was blocked in the kernel has completed, with SA ID sa_id and machine state pcb.

Also, the following system calls are necessary:

sa_alloc_cpus(ncpus):
    Allocate ncpus additional CPUs, if possible.

sa_dealloc_cpu():
    Remove a CPU (reduce concurrency by one).

Figure 4: Scheduler activations



3 Kernel-scheduled entities

Kernel-scheduled entities (KSEs) are similar in concept to scheduler activations, but support for threads with system-level scheduling contention is added. Figure 5 shows how KSEs and userland interact. Scheduling contention is controlled via KSE groups (KSEGs). A process has one or more KSEGs associated with it. Each KSEG has a concurrency level associated with it, which controls the maximum number of concurrent KSEs that can be run for that KSEG. Each KSEG is a separate entity from a timesharing perspective.

Figure 5: Kernel-scheduled entities


A process starts out with one KSEG that has a concurrency level of one. As new threads are created, the concurrency level of that KSEG can be adjusted, up to the maximum number of CPUs available for execution of the process. If a thread with system-level scheduling contention is desired, a new KSEG with a concurrency level of one can be created for that thread.

Since each KSEG is a separate entity from a timesharing perspective, allowing a process to create an arbitrary number of KSEGs would give the process an unfair scheduling advantage. Therefore, instead of enforcing process limits, KSEG limits are enforced. In the common case of single-threaded applications, there is a one to one correspondence between the two methods of enforcing resource limits, but in the case of multi-threaded applications, this prevents a single user from being able to gain a larger portion of the CPU time than would be possible with multiple single-threaded applications.

The KSE architecture makes a distinction between a KSE and a thread. KSEs are used in the kernel in the scheduler queues, and as a general handle, much the same way as a process is used in current BSD kernels. Given a pointer to a KSE, it is possible to access its associated KSEG, proc, and its current thread. A Thread contains the state of a suspended thread of execution. When a running KSE is blocked, the execution state is saved in the Thread so that when it is possible to continue execution, the thread can be re-attached to a KSE and continued. When a KSE is preempted, the execution state is saved in a Thread so that any userland state can be handed to the UTS.


KSEs are only evident within the kernel. The interface with userland only deals with processes, KSEGs, and Threads. KSEs themselves are irrelevant to userland because they serve essentially as an anonymous handle that binds the various kernel structures together.

3.1 Operation without upcalls

Unless upcalls are explicitly activated by a program, execution is almost identical to the traditional BSD process model. The proc structure is still broken into four components, and the KSE is used as the handle to the process most places in the kernel, but otherwise, little is changed. Figure 6 shows the linkage between the four data structures that comprise the single-threaded component. The dashed lines denote linkage that only exists if upcalls are not activated.

Figure 6: KSE data structure linkage for a single-threaded process



3.2 Operation with upcalls

At the time upcalls are activated via a system call, program flow changes radically. Non-blocking system calls will behave normally, but blocking system calls, preemption, and running will cause the program to receive upcalls from the kernel. Figure 7 shows the data structure linkage for a process running on a 4 CPU machine that has real-time, system scope, and timeshared threads. The KSEs that have an associated Thread are currently running.

Figure 7: KSE data structure linkage for a multi-threaded process on a 4 CPU machine


3.3 APIs

KSEs require the ability to make upcalls to userland. This is a very different flow of control than normal process execution uses. Normal processes don't know when they are preempted or resumed; execution appears seamless. With KSEs, the process can be notified of every preemption and resumption, which requires upcalls.

Basically, there is only one upcall, with information as to what events caused it available from a pre-agreed mailbox.
The following system calls (or versions thereof) are necessary:

void kse_new(mailbox_addr, flags):
    Start using a new KSE. mailbox_addr points to a structure that contains the necessary data for the kernel to make upcalls. Initially, there is only one KSEG (ID 0), which has a concurrency level of 1. If the flags specify, a new KSEG is created to hold the new KSE.

int kse_yield():
    The KSE returns to the kernel and does not return.

int kse_wakeup(kse_id):
    Trigger an upcall from this KSE if it is currently inactive.

int thread_cancel(thread_id):
    If the thread in question is stopped somewhere, an attempt is made to abort the operation (similar to signals).

int kse_bind(int kse_id, int cpu_id):
    Bind the KSE with ID kse_id to the CPU with ID cpu_id. This system call returns the CPU ID that the KSE is bound to, or -1 if there is an error (e.g. invalid CPU ID)

Figure 7 shows the basic linkage between processes (proc), KSEGs (kseg), KSEs (kse), and Threads (thread). The diagram corresponds to a four processor machine. Two of the processors are running KSEs on behalf of bound KSEGs, and the other two processors are running KSEs on behalf of a softly-affined KSEG.

In addition to the new system calls, the getpriority() and setpriority() system calls need to be extended to handle KSEGs. It may be necessary to do so in a way that is not backward compatible, since at a system level, a KSEG can only be uniquely specified using the process ID and the KSEG ID.
3.4 Kernel Scheduler

A number of changes to the kernel scheduler are mandatory in order to support KSEs. Chief among these changes are:

    * KSEs are placed in the scheduler queues, rather than processes. This enables concurrent execution of KSEs that are associated with the same process.

    * Timesharing calculations are done with KSEGs, rather than processes. This allows multiple KSEGs in one process to compete for CPU time at a system-wide level.

    * The state for blocked (incomplete) system calls is stored in Threads. This means that the sleep queue consists of Threads rather than KSEs. In other words, the scheduler deals with KSEs in some places, and Threads in other places.

Additionally, soft processor affinity for KSEs is important to performance. KSEs are not generally bound to CPUs, so KSEs that belong to the same KSEG can potentially compete with each other for the same processor; soft processor affinity tends to reduce such competition, in addition to well-known benefits of processor affinity.

3.5 Userland threads scheduler

The KSE-based UTS is actually simpler than is possible for a userland-only threads implementation, mainly because there is no need to perform call conversion. The following is a simplified representation of the core UTS logic.
   1. Find the highest priority thread that is mapped to the kseg that this Thread is part of. Optionally heuristically try to improve cache locality by running a thread that may still be partially warm in the processor cache.
   2. Set a timer that will indicate the end of the scheduling quantum.
   3. Run the thread.
Of course, there are a number of ways to enter this code, such as getting a new Thread, running a thread to the end of a quantum, or rescheduling due to another thread blocking. However, the fact remains that the UTS logic is quite simple.

3.5.1 Temporary priority inversion
The UTS always has the information it needs to make fully informed scheduling decisions. However, in the case of priority-based thread scheduling, there are some circumstances that can cause temporary scheduling inversions, where a thread may continue to run to the end of its quantum despite there being a higher priority runnable thread. This can happen when:

   1. A new thread is created that has a lower priority than its creator, but a higher priority than a thread that is concurrently running on another processor.

   2. A KSE (running thread A) is preempted by the kernel, and the upcall notification causes preemption of thread B, which is higher priority than thread A, though thread C is also running on another processor and has a lower priority than both A and B. In this case, A will be scheduled and C will continue to run, even though thread B is higher priority than C. Note that there are much more convoluted examples of this same basic idea.

Such temporary inversions are technically violations of the policy that lower priority threads never run in the presence of higher priority threads, but there are two reasons not to do anything about it:

   1. Solutions to this problem require additional system calls in which the UTS explicitly asks the kernel to preempt KSEs. This is expensive.
   2. Temporary inversions are logically equivalent to the race condition where the UTS determines that a thread should be preempted in favor of scheduling another thread, while the thread races to complete its quantum. It is not important whether the UTS or the thread wins the race; allowing the lower priority thread to complete its quantum without competition from the UTS simply allows the thread to always win.

3.6 Initialization and upcalls
The code that is necessary to start up KSEs in a userland program looks something like:

void
foo(void)
{
        struct kse_mailbox       km;

        kse_init(&km);
        /* Handle upcall and longjmp() to UTS. */
}

Each upcall appears to be a return from kse_init(). The stack on which kse_init() is called must never be unwound, since the kernel fakes up a context at the point where kse_init() is called for each upcall. The argument to kse_init() points to a data structure that contains enough information for the kernel to determine where to write events. Due to the asynchronous completion of blocked Threads, it is possible for a single upcall to report a large number of thread_unblock() events, the number of which is only (potentially) bounded by the resource limit settings for the process.

Since thread_unblock() events include the userland Thread execution state, the amount of space needed to report all events during an upcall can vary wildly; static event buffers are inadequate for such a dynamic problem. Therefore, km contains a pointer to a chain of objects with embedded TCBs, and the kernel stores the TCBs from thread_preempt() and thread_unblock() events in the chain elements. Figure 8 shows the basics of how struct kse_mailbox is laid out.

Figure 8: TCB chaining for upcall event storage
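A minimal sketch of the chaining in Figure 8 might look like the following; the field and function names are invented for illustration and do not match the real struct kse_mailbox.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: each chain element embeds a TCB. */
struct tcb {
    struct tcb *tcb_next;   /* link in the event chain */
    int         tcb_tid;    /* Thread ID this saved context belongs to */
    /* saved register state would follow */
};

struct kse_mailbox {
    struct tcb *km_events;  /* TCBs from thread_preempt()/thread_unblock() */
};

/* Kernel side: post a preempted or unblocked context onto the chain. */
static void
mailbox_post(struct kse_mailbox *km, struct tcb *t)
{
    t->tcb_next = km->km_events;
    km->km_events = t;
}

/* UTS side: drain the chain, returning how many events were reported. */
static int
mailbox_drain(struct kse_mailbox *km)
{
    int n = 0;
    struct tcb *t;

    for (t = km->km_events; t != NULL; t = t->tcb_next)
        n++;                /* a real UTS would requeue each thread here */
    km->km_events = NULL;
    return n;
}
```

Because events are stored in a linked chain rather than a static buffer, a single upcall can report an arbitrary number of thread_unblock() events, bounded only by the process's resource limits.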



3.6.1 Per-upcall event ordering

The techniques described in the previous section for storing events separate the event types that pass TCBs from those that don't. This means that event ordering is not implicit, and would require extra work to preserve. However, it turns out that in most cases, order doesn't matter, and in the cases that it does matter, the UTS can always determine order.

As an example, suppose thread_block() and thread_unblock() events occur for the same Thread ID in the same upcall. In one interpretation, where the thread_block() event occurred first, the UTS last knew the corresponding thread to be running, so the two events simply mean that the thread can be continued. In the other interpretation, where the thread_unblock() event occurred first, the Thread ID has been recycled. However, this isn't possible, since there must have been an intervening thread_new() event, which couldn't have happened until after the Thread unblocked.

3.7 Upcall parallelism
This section is not yet adequately fleshed out. Issues to consider:

    * How do upcalls from multiple processors keep from stomping on each other? For example, if we have only one chain of TCBs, how does it get safely used?
    * What if the UTS doesn't process the TCBs before another upcall happens? Do we run the risk of overflowing the TCB chain?
    * How do we handle races between the UTS and the kernel? For example, signals cannot be cleared until delivered to a thread, and the kernel must look at the userland bitmap to see if a signal is still pending.
3.8 Kernel scheduler
The kernel scheduler needs a major overhaul in order to support KSEs. Support for the following features is necessary:
    * Soft affinity. The scheduler should try to keep processes running on the same processors if there is the possibility of reaping benefits from increased cache locality.
    * Hard affinity (binding). Some programs may wish to bind a KSE (pedantically speaking, a KSEG is bound) to a particular processor in order to improve cache locality, or to tighten the bounds on a real-time task. Bound KSEs are run only on one CPU, unless priority inheritance forces them to be run on another CPU in order to avoid deadlock due to priority inversion.
    * Increased parallelism. The current scheduler can only be executed by one processor at a time, which makes it a severe bottleneck, depending on the type of system load.
Part of the solution is to implement per-CPU scheduling queues. Each CPU has an associated queue for softly affined KSEs, as well as a queue for bound KSEs. In addition, there is a global queue for fixed-priority KSEs, such as interrupt threads and real-time KSEs. Fixed-priority KSEs prefer to run on the same CPU as the last time they were run, but scheduling issues keep more complex affinity algorithms from being beneficial for this class of KSEs. See Figure 9 for a simplistic representation of the various scheduling queues. The diagram ignores some details such as split queues for various priorities.

Each CPU schedules from its own queues, and resorts to stealing runnable softly affined KSEs from other CPUs if there are no runnable KSEs. Every so often (exact number to be determined by benchmarking, but a likely number is 5 seconds), CPU loads are calculated, and if the loads are too disparate, the under-loaded CPU(s) steal additional KSEs from the overloaded CPU(s) in an attempt to balance the load. Instantaneous per-CPU load is defined as the number of runnable timesharing KSEs in a CPU's queues. Instantaneous system load is the sum of the instantaneous per-CPU loads. Note that normal (non-balancing) KSE stealing is adequate to keep the system busy if there is enough work to do, but that without periodic load balancing, time sharing becomes unfair if there aren't approximately equal CPU loads.

This is essentially the approach that was taken in DEC OSF/1 3.0 [Denham], and it seems to have worked well there. One notable difference between our kernel and OSF/1 is that our interrupts are backed by threads, whereas OSF/1 keeps spl()s (IPLs in their terminology). However, this doesn't affect the design, since OSF/1 already has to handle real-time scheduling.
Figure 9: Kernel scheduler queues


4 Summary
Discussion about improved threads support in FreeBSD has been ongoing for several years. The KSE project aims to implement a kernel mechanism that allows the kernel and userland to support threaded processes and communicate with each other effectively, so that the necessary information is available for both to do their jobs efficiently and correctly.
Glossary

KSE:
    Kernel-scheduled entity. This gets scheduled.

Thread:
    A runnable context. This gets suspended.

KSEG:
    Kernel-scheduled entity group.

PCB:
    Process control block.

SA:
    Scheduler activation.

TCB:
    Thread control block.

UTS:
    Userland threads scheduler. Userland, multi-level, and scheduler activation-based threads libraries all have a UTS.

Bibliography

Anderson
    Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, and Henry M. Levy, Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism, ACM Transactions on Computer Systems, Vol. 10, No. 1, February 1992, Pages 53-79.

Boykin
    Joseph Boykin, David Kirschen, Alan Langerman, and Susan LoVerso, Programming under Mach, Addison-Wesley Publishing Company, Inc. (1993).

Butenhof
    David R. Butenhof, Programming with POSIX threads, Addison Wesley Longman, Inc. (1997).

Denham
    Jeffrey M. Denham, Paula Long, and James A. Woodward, DEC OSF/1 Symmetric Multiprocessing, Digital Technical Journal, Vol. 6, No. 3.

Kleiman
    Steve Kleiman, Devang Shah, and Bart Smaalders, Programming with Threads, SunSoft Press (1996).

Lemon
    Jonathan Lemon, Kqueue: A generic and scalable event notification facility, BSDcon Conference Proceedings (2000), http://people.freebsd.org/~jlemon/kqueue.pdf.

Mauro
    Jim Mauro and Richard McDougall, Solaris Internals, Sun Microsystems Press (2001).

McKusick
    Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman, The Design and Implementation of the 4.4BSD Operating System, Addison-Wesley Publishing Company, Inc. (1996).

Vahalia
    Uresh Vahalia, UNIX $^{TM}$ Internals: The New Frontiers, Prentice-Hall, Inc. (1996).


#5, posted 2006-04-02 10:05
Since there is a lot of material and it is heavily theoretical, I am still unsure about many of the details myself; pointers from the experts here would be much appreciated.

#6, posted 2006-04-02 10:47
Perhaps it would be better to walk through the code directly.

#7, posted 2006-04-02 11:02
Quoting DarkBlueSea (2006-4-2 01:37): "Kernel-Scheduled Entities for FreeBSD ... Jason Evans ... Abstract: FreeBSD has histori ..."

"FreeBSD has historically had less than ideal support for multi-threaded application programming"
The very first sentence has a problem: "ideal" means "理想的". The original roughly means "以前的FreeBSD缺乏對多線程應用程序編程的良好支持" (FreeBSD has historically lacked good support for multi-threaded application programming).

Still, I support the poster sharing the results of his translation. It does not matter that it is not yet accurate everywhere; everyone can polish it together.

#8, posted 2006-04-02 11:05
Quoting lileding (2006-4-2 10:47): "Perhaps it would be better to walk through the code directly."

Code can indeed explain everything, but now that there is a paper that deepens understanding and makes study easier, there is no need to keep brandishing code analysis to put people down. That said, I would not mind at all if you wrote up a code analysis for everyone to read.

#9, posted 2006-04-02 11:49
Thanks to the brothers above for the tremendous encouragement.

#10, posted 2006-04-02 14:46
Quoting the translated abstract ("從前,FreeBSD缺乏對多線程編成的良好支持。……") and the translated paragraph on SAs ("A process that is using SAs does not have a kernel stack or PCB. ... See Figure 4."):

There is a typo in the quoted translation ...