Performance Tuning (3): CPU Performance Analysis

愛之旅

Posted 2008-01-23 18:01

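1: CPU architecture and how it works
2: The operating system and processes
3: Metrics for measuring how busy the CPU is
4: Symptoms that the CPU has become the system performance bottleneck
5: Which processes are the big consumers of CPU resources?
6: Analyzing CPU utilization with sar
7: Analyzing the run queue length with sar
8: Analyzing system calls with sar
9: Measuring the execution efficiency of a command or program with time
10: Finding the most CPU-hungry processes with top
11: Checking overall system status with uptime
12: Analyzing CPU utilization with GlancePlus
13: Performance tuning for CPU-bound systems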
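CPU architecture and how it works
When we say CPU we generally mean a microprocessor. A CPU typically consists of the following main components:
CPU (central processing unit)
cache: cache is high-speed memory with an access time of typically 10-20 nanoseconds (ns), so the CPU can access the cache once per clock cycle, whereas ordinary main memory has an access time of 80-90 ns. The size of the cache has a large effect on CPU performance.
TLB (translation lookaside buffer): the TLB is a high-speed cache that holds recently accessed virtual addresses paired with their corresponding physical addresses, so that it can translate virtual addresses into physical addresses. The TLB is a subset of the system translation tables held in memory. A TLB entry usually points to a memory page rather than to an individual memory address. The size of the TLB has a large effect on CPU performance.
coprocessor
Different CPUs generally have different clock frequencies and cache sizes.
In one clock cycle the CPU can generally fetch one instruction from the cache and execute it. In theory, therefore, the higher the clock frequency, the more instructions can be executed per unit of time. Some current CPUs can execute several instructions per clock cycle; the PA8500, for example, can execute 4.
Cache size limits how efficiently the CPU runs: however fast the clock is, a CPU that cannot get its data can only spin idle. Cache size therefore matters a great deal. The cache is further divided into a data cache and an instruction cache, which hold the data and the instructions that have been prefetched from memory for upcoming execution.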
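To see why the cache matters so much, the access times quoted above can be combined into an average access time. The following is a minimal sketch under a simplified model in which a cache hit costs the cache access time and a miss costs the main memory access time. The hit ratios are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
def effective_access_time_ns(hit_ratio, cache_ns, memory_ns):
    """Average access time when hit_ratio of all accesses are served by the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


# Midpoints of the ranges quoted above: 15 ns cache, 85 ns main memory.
# The hit ratios below are assumed values chosen only for illustration.
for hit_ratio in (0.50, 0.90, 0.99):
    average = effective_access_time_ns(hit_ratio, 15.0, 85.0)
    print(f"hit ratio {hit_ratio:.2f}: {average:.1f} ns on average")
```

In this model, the closer the hit ratio gets to 1, the closer the average gets to the cache access time, which is why a larger cache tends to help.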
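Virtual addressing
The virtual address space of a system is generally far larger than its physical address space. On a 64-bit system, for example, the addressable space can in theory reach 2 to the 64th power bytes (2**64 = about 18,447 PB), but because of cost the physical memory actually installed is at most a dozen or so GB.
Every process has its own unique virtual address space. For a process to run, however, its virtual addresses must be mapped to physical addresses, which requires the TLB, the cache and main memory to work together. If the required information is not in memory, a page fault occurs.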
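The size of a 64-bit address space quoted above can be checked with a few lines of arithmetic. This sketch assumes byte addressing and shows both decimal petabytes (10**15 bytes) and binary pebibytes (2**50 bytes), since the two units give different figures.

```python
ADDRESS_BITS = 64

total_bytes = 2 ** ADDRESS_BITS
petabytes = total_bytes / 10 ** 15   # decimal petabytes
pebibytes = total_bytes // 2 ** 50   # binary pebibytes

print(f"2**{ADDRESS_BITS} = {total_bytes} bytes")
print(f"about {round(petabytes)} PB (decimal), exactly {pebibytes} PiB (binary)")
```

The 18,447 PB figure corresponds to the decimal unit.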
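Pipelining
The TLB and the cache try to deliver the information the CPU needs within a single clock cycle. Even if that succeeds 100% of the time, however, the CPU must still spend one clock cycle fetching the next instruction and another clock cycle executing it, so CPU utilization is only 50%. To keep the CPU busier, the usual approach is pipelining. The PA8500, for example, uses a 7-stage pipeline.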
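The 50% figure and the benefit of overlapping stages can be illustrated with a small cycle count. This is a sketch of an idealized pipeline only: it assumes one cycle per stage and no stalls, branch penalties or cache misses, which a real processor does not guarantee. The function names are made up for this example.

```python
def cycles_without_pipeline(instructions, stages):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def cycles_with_pipeline(instructions, stages):
    """Stages overlap: the first instruction fills the pipeline, then one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def utilization(instructions, cycles):
    """Instructions completed per cycle, expressed as a percentage."""
    return 100.0 * instructions / cycles


n = 1000
# Two-step fetch/execute with no overlap, as described above.
print(f"{utilization(n, cycles_without_pipeline(n, 2)):.1f}%")
# Ideal 7-stage pipeline.
print(f"{utilization(n, cycles_with_pipeline(n, 7)):.1f}%")
```

With overlap, the start-up cost of filling the pipeline is paid once, so for long instruction streams the ideal completion rate approaches one instruction per cycle.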
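The operating system and processes
HP-UX is a multi-user, multitasking UNIX operating system. Its performance depends on the number of users, the kinds of tasks they run, and the hardware and software configuration.
HP-UX has two execution levels:
User level: system users interact with the operating system here, for example by running applications and system commands. User level reaches kernel level through the system call interface.
Kernel level: the operating system runs a number of functions automatically, mainly ones that operate on hardware.
In the operating system, user programs run as processes. A process can be in one of the following states:
SRUN
SSLEEP
SZOMB
SIDL
SSTOP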
CPU scheduling
Once the data a process needs has been brought into memory, the process waits for the CPU scheduler to allocate CPU time to it. In HP-UX, each process generally gets a fixed time slice in which to run; the time slice is one tenth of a second (1/10 s).
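The effect of a fixed time slice can be sketched with a toy round-robin loop. This is an illustration only, not the HP-UX scheduler: it ignores priorities, I/O waits and context switch cost, and the process names and CPU requirements are invented for the example.

```python
from collections import deque

TIME_SLICE_MS = 100  # 1/10 second, the time slice stated above


def round_robin(cpu_needs_ms, time_slice_ms=TIME_SLICE_MS):
    """Return {name: completion time in ms} for a simple round-robin scheduler."""
    queue = deque(cpu_needs_ms.items())
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(remaining, time_slice_ms)  # run for one slice or until done
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # back to the end of the queue
        else:
            finished[name] = clock
    return finished


# Hypothetical processes and the CPU time each one needs, in milliseconds.
print(round_robin({"a": 250, "b": 100, "c": 150}))
```

Short jobs finish early even when they start behind a long job, because no process holds the CPU for longer than one slice at a time.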
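Because HP-UX is a multitasking operating system, it needs a mechanism for controlling the order in which processes execute, and that mechanism is the interrupt. The clock interrupt handler is the system software that services clock interrupts. Specifically, it gathers system and accounting statistics and does context switching. System performance also depends on how often these interrupts occur.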
Processes and priorities
Every process has its own priority:
Real-time priorities: -32 to 127. A process that is to run at a real-time priority must have it set with the rtprio command;
Timeshare system priorities: 128 to 177;
Timeshare user priorities: 178 to 251;
Priorities 252 to 255 are used by the system as virtual memory management priorities for process deactivation.
The initial priority of a timeshare process is assigned by the system and is a fixed value. Users can change the priority of a timeshare process by changing its nice value, because as a process executes its priority is lowered according to its nice value, and while it waits to execute its priority is raised again according to its nice value. The system default nice value is 20.
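The priority ranges listed above can be written down as a small lookup, which is handy when reading priority values from tools such as top or GlancePlus. The ranges are taken directly from the list above; the function and table names are made up for this sketch.

```python
# (lowest, highest, band name), using the ranges given in the text above.
PRIORITY_BANDS = [
    (-32, 127, "real-time"),
    (128, 177, "timeshare system"),
    (178, 251, "timeshare user"),
    (252, 255, "virtual memory management"),
]


def priority_band(priority):
    """Return the name of the band that a numeric priority falls into."""
    for lowest, highest, name in PRIORITY_BANDS:
        if lowest <= priority <= highest:
            return name
    raise ValueError(f"priority {priority} is outside the documented ranges")


for value in (100, 154, 200, 253):
    print(value, priority_band(value))
```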
In system performance analysis we care not only about how long a process takes to complete, but also about where that time is spent and how much of it is spent in each place.
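Metrics for measuring how busy the CPU is
Before deciding whether the system has enough CPU resources, we need to know who is using the CPU, how much of it, and for how long. The following are commonly used metrics of how busy the CPU is:
1) CPU use by users
CPU running normal user processes
CPU running niced processes
CPU running real-time processes
2) CPU use by the system
System calls
I/O management: interrupts and drivers
Memory management: paging and swapping
Process management: context switches and process start
3) WIO: the percentage of time the CPU is idle because processes are waiting for I/O, mainly block I/O, raw I/O and VM paging/swapins;
4) CPU idle percentage, that is, idle time other than the WIO above;
5) Percentage of CPU spent on context switching (Context Switch CPU utilization)
6) nice
7) real-time
8) Run queue length, that is, the number of processes in the runnable state; what we really care about, though, is how long these processes spend waiting for the CPU scheduler to run them;
9) Load average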
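Percentages such as user, system, WIO and idle are typically derived from cumulative time counters sampled at two points in time. The sketch below shows that calculation. The counter names and values are hypothetical; it does not read any real HP-UX kernel interface.

```python
def cpu_percentages(before, after):
    """Convert two snapshots of cumulative CPU tick counters into percentages."""
    deltas = {state: after[state] - before[state] for state in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no ticks elapsed between the two snapshots")
    return {state: 100.0 * ticks / total for state, ticks in deltas.items()}


# Hypothetical cumulative tick counts taken a few seconds apart.
first = {"usr": 1000, "sys": 400, "wio": 100, "idle": 8500}
second = {"usr": 1250, "sys": 525, "wio": 125, "idle": 8600}
print(cpu_percentages(first, second))
```

The four percentages always sum to 100, so a rise in one state necessarily shows up as a fall in the others.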
Symptoms that the CPU has become the system performance bottleneck
The CPU is like a human brain, carrying out whatever tasks it is given. When there are too many tasks the CPU cannot keep up and its efficiency drops. Just as an illness has typical symptoms, a CPU that has become the performance bottleneck shows some typical symptoms:
Slow response time
Zero percent idle CPU
High percent user CPU
High percent system CPU
Large run queue size sustained over time
Processes blocked on priority
Note that these symptoms do not prove that CPU resources are insufficient. Some of them may well be caused by a shortage of another resource. When memory is short, for example, the CPU is kept busy with memory management; on the surface CPU utilization is 100% and the CPU even appears inadequate, but concluding from this that adding CPUs will solve the problem would be a serious mistake.
So, once again, conclusions should only be drawn after analyzing the system with different tools and from different angles, and even then experience plays an irreplaceable role.
Which processes are the big consumers of CPU resources?
Not all processes in the operating system use CPU resources in the same way. Some processes typically need more CPU time slices than others to complete their work. The following are typical heavy consumers of CPU resources:
Process creation
Terminal character processes (MUX- and LAN-based)
Compute-intensive processes and real-time processes
X-terminals and X-servers
Analyzing CPU utilization with sar
The command forms for analyzing CPU utilization with sar are:
# sar -u, in which case the data is collected periodically in the background by sa1;
# sar -u 5 100, which samples every 5 seconds, 100 times in total;
sar -u: Report CPU utilization (the default); portion of time running in one of several modes. On a multi-processor system, if the -M option is used together with the -u option, per-CPU utilization as well as the average CPU utilization of all the processors are reported. If the -M option is not used, only the average CPU utilization of all the processors is reported:
cpu: cpu number (only on a multi-processor system with the -M option);
%usr: user mode;
%sys: system mode;
%wio: idle with some process waiting for I/O (only block I/O, raw I/O, or VM pageins/swapins indicated);
%idle: otherwise idle;
Interpreting the results
First look at the %idle column. If it is close to zero, look at the corresponding %wio column. If %wio is greater than 7, the disks or other I/O may be the problem and further analysis is needed:
Use iostat to see how busy each disk is, for example # iostat -t 5 2, which samples every 5 seconds, 2 times in total;
Use sar -d to analyze the activity of each block device (disk, tape);
Use sar -b to analyze buffer cache activity;
Use sar -w to analyze the deactivation/reactivation and switching activities of the system;
If %idle is small and the corresponding %wio is also small, look at the %usr and %sys columns. A large %usr means user processes are taking a lot of CPU time; a large %sys means a lot of time is going into system management. Further analysis is needed:
Use GlancePlus to examine the process using the most CPU time on its own and find out why it uses so much.
If %sys is large, use sar -c to break down the system calls and see what they are mainly doing. Also check for other bottlenecks: paging, for example, can also drive %sys up, in which case sar -q shows the run queue length and GlancePlus and vmstat show memory usage;
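The decision steps above can be sketched as a small triage function over one line of sar -u output. Several things here are assumptions: the sample lines are invented, the parser assumes each data row ends with the four columns %usr %sys %wio %idle in that order, the value used for "close to zero" idle (5) is a guess, and comparing %usr with %sys is a simplification of "large". Only the %wio threshold of 7 comes from the text.

```python
def parse_sar_u_line(line):
    """Parse one data row of sar -u output (assumed to end with %usr %sys %wio %idle)."""
    fields = line.split()
    usr, sys_, wio, idle = (float(value) for value in fields[-4:])
    return {"usr": usr, "sys": sys_, "wio": wio, "idle": idle}


def triage_cpu(sample, idle_low=5.0, wio_high=7.0):
    """Suggest the next analysis step, following the order described above."""
    if sample["idle"] > idle_low:          # idle_low is an assumed threshold
        return "CPU has idle time; no CPU bottleneck indicated"
    if sample["wio"] > wio_high:           # threshold of 7 taken from the text
        return "check disk and other I/O (iostat, sar -d, sar -b, sar -w)"
    if sample["usr"] >= sample["sys"]:
        return "user processes dominate; inspect the top process with GlancePlus"
    return "system time dominates; break down system calls with sar -c"


# Hypothetical sample rows: time, %usr, %sys, %wio, %idle.
for row in ("14:05:10      62      30       5       3",
            "14:05:15      10       8      80       2",
            "14:05:20      20      72       4       4"):
    print(row.split()[0], triage_cpu(parse_sar_u_line(row)))
```

A single sample is rarely enough; in practice the same check would be applied across a whole sampling run before drawing any conclusion.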
Analyzing the run queue length with sar
The command forms for analyzing the run queue length with sar are:
# sar -q, in which case the data is collected periodically in the background by sa1;
# sar -q 5 100, which samples every 5 seconds, 100 times in total;
sar -q: Report average queue length while occupied, and percent of time occupied. On a multi-processor machine, if the -M option is used together with the -q option, the per-CPU run queue as well as the average run queue of all the processors are reported. If the -M option is not used, only the average run queue information of all the processors is reported:
cpu: cpu number (only on a multi-processor system with the -M option);
runq-sz: Average length of the run queue(s) of processes (in memory and runnable);
%runocc: The percentage of time the run queue(s) were occupied by processes (in memory and runnable);
swpq-sz: Average length of the swap queue of runnable processes (processes swapped out but ready to run);
%swpocc: The percentage of time the swap queue of runnable processes (processes swapped out but ready to run) was occupied.
Interpreting the results:
The smaller these numbers, the better.
If runq-sz is greater than 4, or %swpocc is greater than 5, the CPU or memory may be the problem and further analysis is needed:
Use sar -u to analyze CPU usage;
Use sar -w to analyze the deactivation/reactivation and switching activities of the system;
GlancePlus can also be used;
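The two thresholds above are easy to apply mechanically to sar -q output. The sketch below assumes each data row ends with the four columns runq-sz %runocc swpq-sz %swpocc in that order; the sample rows are invented, and the function names are made up for this example.

```python
def parse_sar_q_line(line):
    """Parse one data row of sar -q output (assumed column order, see lead-in)."""
    fields = line.split()
    runq_sz, runocc, swpq_sz, swpocc = (float(value) for value in fields[-4:])
    return {"runq-sz": runq_sz, "%runocc": runocc,
            "swpq-sz": swpq_sz, "%swpocc": swpocc}


def run_queue_warnings(sample, runq_limit=4.0, swpocc_limit=5.0):
    """Return a list of warnings using the thresholds given in the text."""
    warnings = []
    if sample["runq-sz"] > runq_limit:
        warnings.append(f"runq-sz {sample['runq-sz']} exceeds {runq_limit}")
    if sample["%swpocc"] > swpocc_limit:
        warnings.append(f"%swpocc {sample['%swpocc']} exceeds {swpocc_limit}")
    return warnings


# Hypothetical sample rows: time, runq-sz, %runocc, swpq-sz, %swpocc.
for row in ("14:05:10     5.2      98     0.0       0",
            "14:05:15     1.3      40     0.0       0"):
    print(row.split()[0], run_queue_warnings(parse_sar_q_line(row)) or "ok")
```

A warning here only says where to look next (sar -u, sar -w, GlancePlus); it does not by itself distinguish a CPU shortage from a memory shortage.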
Analyzing system calls with sar
The command forms for analyzing system calls with sar are:
# sar -c, in which case the data is collected periodically in the background by sa1;
# sar -c 5 100, which samples every 5 seconds, 100 times in total;
sar -c: Report system calls:
scall/s: Number of system calls of all types per second;
sread/s: Number of read() and/or readv() system calls per second;
swrit/s: Number of write() and/or writev() system calls per second;
fork/s: Number of fork() and/or vfork() system calls per second;
exec/s: Number of exec() system calls per second;
rchar/s: Number of characters transferred by read system calls (block devices only) per second;
wchar/s: Number of characters transferred by write system calls (block devices only) per second.
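Interpreting the results:
If the scall/s column is large, the reason for so many system calls needs careful investigation.
Look at the fork/s and exec/s columns to see whether the system is creating large numbers of new processes.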
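One way to start that investigation is to see what share of all system calls each reported category accounts for. The sketch below works on a dictionary of per-second rates keyed by the sar -c column names; the rates are invented for illustration, and no threshold for "large" is implied.

```python
def syscall_mix(sample):
    """Return each named call type as a percentage of scall/s, plus the remainder."""
    total = sample["scall/s"]
    if total <= 0:
        raise ValueError("scall/s must be positive")
    named = ("sread/s", "swrit/s", "fork/s", "exec/s")
    mix = {name: 100.0 * sample[name] / total for name in named}
    mix["other"] = 100.0 - sum(mix.values())
    return mix


# Hypothetical rates for one sampling interval.
sample = {"scall/s": 1000, "sread/s": 300, "swrit/s": 200, "fork/s": 50, "exec/s": 50}
for name, percent in syscall_mix(sample).items():
    print(f"{name:8s} {percent:5.1f}%")
```

A noticeable fork/s and exec/s share points toward process creation, which the earlier list names as a heavy CPU consumer.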
Measuring the execution efficiency of a command or program with time
The time command measures how efficiently a command executes. Its syntax is:
time command
command is executed. Upon completion, time prints the elapsed time during the command, the time spent in the system, and the time spent executing the command. Times are reported in seconds.
Execution time can depend on the performance of the memory in which the program is running.
When a process seems to be performing poorly, the simplest first step is to use time to see how its execution time is distributed, and then analyze further with other tools.
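The three figures that time reports (elapsed, user and system time) can also be collected from a script. The following sketch is not the time command itself; it uses Python's os.times() on a POSIX system to read the CPU time accumulated by child processes before and after running a command. The command being timed is an arbitrary example.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) in seconds, in the spirit of time(1)."""
    wall_start = time.perf_counter()
    before = os.times()
    subprocess.run(argv, check=False)  # waits for the child to finish
    after = os.times()
    real = time.perf_counter() - wall_start
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


# Example: time a short CPU-bound Python one-liner.
real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
print(f"real {real:.2f}s  user {user:.2f}s  sys {system:.2f}s")
```

If real is much larger than user plus sys, the command spent most of its time waiting (for I/O or for the CPU) rather than computing, which tells you which tool to reach for next.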
Finding the most CPU-hungry processes with top
The top command shows the processes that are consuming the most CPU. Its display updates dynamically as the CPU consumption of the processes changes.
Its syntax is:
top [-s time] [-d count] [-q] [-u] [-h] [-n number]
The options mean:
-s time: the screen refresh interval, time; the default is 5 seconds;
-d count: top exits by itself after refreshing the screen count times;
-q: This option runs the top program at the same priority as if it is executed via a nice -20 command so that it will execute faster (see nice(1)). This can be very useful in discovering any system problem when the system is very sluggish. This option is accessible only to users who have appropriate privileges.
-u: User ID (uid) numbers are displayed instead of usernames. This improves execution speed by eliminating the additional time required to map uid numbers to user names.
-h: Hides the individual CPU state information for systems having multiple processors. Only the average CPU status will be displayed.
-n number: Show only number processes per screen. Note that this option is ignored if number is greater than the maximum number of processes that can be displayed per screen.
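While top is running, the following keys page through the display:
j: page forward;
k: page back;
t: return to the first page;
Interpreting the results:
top gives a quick view of how the system's CPU resources are currently being used; the processes using the most CPU are the ones we must pay attention to.
The RES column (the current size of the process resident in memory) shows how much memory each process occupies.
The NICE column shows whether a nice value is being used to adjust the process's share of the workload.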
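The two checks described above (who uses the most CPU, and which processes have a non-default nice value) are simple to express once the process list is available as records. In this sketch the records are invented and stand in for rows read off a top screen; the field names are assumptions, and the default nice value of 20 is the one given earlier in this article.

```python
DEFAULT_NICE = 20  # system default nice value, as stated earlier in the article


def top_cpu_consumers(processes, count=3):
    """Return the count processes with the highest CPU percentage."""
    return sorted(processes, key=lambda proc: proc["cpu"], reverse=True)[:count]


def niced_processes(processes):
    """Return processes whose nice value differs from the system default."""
    return [proc for proc in processes if proc["nice"] != DEFAULT_NICE]


# Hypothetical rows: pid, command, nice value, resident size in KB, %CPU.
rows = [
    {"pid": 101, "command": "dbwriter", "nice": 20, "res_kb": 52000, "cpu": 41.5},
    {"pid": 202, "command": "backup", "nice": 30, "res_kb": 8000, "cpu": 12.0},
    {"pid": 303, "command": "httpd", "nice": 20, "res_kb": 16000, "cpu": 3.2},
    {"pid": 404, "command": "report", "nice": 20, "res_kb": 30000, "cpu": 55.1},
]
print([proc["command"] for proc in top_cpu_consumers(rows, count=2)])
print([proc["command"] for proc in niced_processes(rows)])
```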
Checking overall system status with uptime
uptime prints the current time, the length of time the system has been up, the number of users logged on to the system, and the average number of jobs in the run queue over the last 1, 5, and 15 minutes.
w is linked to uptime and prints the same output as uptime -w, displaying a summary of the current activity on the system.
Its syntax is:
uptime [-hlsuw] [user]
w [-hlsuw] [user]
The options mean:
-h: Suppress the first line and the heading line. This option should not be used with the -u option. This option assumes the use of the -w option to uptime.
-l: Use long output. This option assumes the use of the -w option to uptime.
-s: Use the short form of output for displaying terminal information. The terminal name is abbreviated; the login time and CPU times are suppressed.
-u: Print only the first line describing the overall state of the system. This is the default for the uptime command.
-w: Print a summary of the current activity on the system for each user. This is the default for the w command.
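The three load averages at the end of the uptime line are often the only part a monitoring script needs. The sketch below extracts them with a regular expression. The sample line is invented, and the exact layout of uptime output may differ between systems, so the pattern only relies on the text "load average:" followed by three comma-separated numbers.

```python
import re

LOAD_RE = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def parse_load_averages(uptime_line):
    """Return the 1, 5 and 15 minute load averages from one line of uptime output."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load averages found in line")
    return tuple(float(value) for value in match.groups())


# Hypothetical uptime output.
line = "  3:15pm  up 12 days,  4:02,  5 users,  load average: 0.52, 0.61, 0.48"
one, five, fifteen = parse_load_averages(line)
print(one, five, fifteen)
```

Comparing the 1 minute value with the 15 minute value gives a rough indication of whether the load is rising or falling.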
Analyzing CPU utilization with GlancePlus
HP's GlancePlus tool can analyze in detail both the processes as a whole and an individual process.
1) Analyzing overall CPU usage:
Start GlancePlus;
Press ? to open the online help screen;
Press c to open the CPU detail screen;
Press b to page back and f to page forward;
The CPU Detail Screen shows how CPU time is distributed: how much was used by users, how much by the system, and so on.
2) Analyzing the CPU usage of a single process:
Start GlancePlus;
Press ? to open the online help screen;
Press g to open the process list screen;
Press s to open the process selection screen; the busiest process is usually offered as the default;
Enter the ID of the process to examine;
Press b to page back and f to page forward;
When analyzing a single process, the values to watch are usually:
CPU Usage;
User CPU;
System CPU;
Priority;
Logical and Physical Reads and Writes;
Total RSS/VSS;
blocked on (reached by pressing shift+>);
Performance tuning for CPU-bound systems
1) Hardware-based approaches:
Upgrade to a faster CPU;
Upgrade to a larger cache;
Add more CPUs;
Distribute the applications across several systems;
Use diskless nodes;
Add a floating-point processor;
2) Software-based approaches:
Run batch jobs outside peak hours;
Nice unimportant applications;
Use rtprio to help important applications;
Use plock to help important applications;
Turn off system accounting;
Consider using Taskbroker or DCE;
Optimize the applications;
Consider using Process Resource Manager (PRM), although PRM is available only on the HP-UX platform.
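Of the software approaches above, lowering the priority of unimportant work is the easiest to script. The sketch below launches a command with a raised nice value using Python's os.nice() in the child process (POSIX only). It is an illustration of the idea rather than an HP-UX recipe: the increment of 5 is arbitrary, the child command just prints its own niceness, and the numeric scale reported by os.nice() depends on the platform and may not match the default of 20 described earlier.

```python
import os
import subprocess
import sys


def run_niced(argv, increment=10):
    """Run argv with its nice value raised by increment, i.e. at lower priority."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
        check=True,
    )


# The child prints its own niceness; os.nice(0) returns it without changing it.
result = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"], increment=5)
print("parent niceness:", os.nice(0))
print("child niceness: ", result.stdout.strip())
```

Raising the nice value needs no special privilege, whereas giving a process a more favorable priority, as rtprio does, generally does.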

This article comes from a ChinaUnix blog. To read the original, see: http://blog.chinaunix.net/u/468/showart_470100.html
