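PHP code for scraping Baidu snapshots, Baidu indexed-page counts, and Baidu hot keywords

<?php
/* Fetch the number of pages Baidu has indexed for a site */
function baidu($s) {
    $baidu = "http://www.baidu.com/s?wd=site%3A" . $s;
    $site = file_get_contents($baidu);
    // $site = iconv("gb2312", "UTF-8", $site);
    // ereg() was removed in PHP 7, so preg_match() is used instead.
    // The pattern is the literal text shown on Baidu's result page.
    if ($site === false || !preg_match("/找到相关网页(.*?)篇/", $site, $count)) {
        return 0;
    }
    // Strip "约" (about) and the thousands separators, leaving only the number.
    return str_replace(array("约", ","), "", $count[1]);
}
echo baidu("www.hzhuti.com"); // number of pages of hzhuti.com (好主题) indexed by Baidu
?>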
Fetching Baidu's hot keywords:

<?php
/**
 * @user 小杰
 * @return array Baidu's hot keywords, returned as an array
 */
function getBaiduHotKeyWord() {
    $keyArray = array();
    $templateRss = file_get_contents('http://top.baidu.com/rss_xml.php?p=top10');
    if ($templateRss === false) {
        return $keyArray;
    }
    // Keep only the <table> element embedded in the feed.
    if (preg_match('/<table>(.*)<\/table>/is', $templateRss, $_description)) {
        $templateRss = $_description[0];
        // Escape bare ampersands so the fragment parses as XML.
        $templateRss = str_replace("&", "&amp;", $templateRss);
    }
    $templateRss = '<?xml version="1.0" encoding="GBK"?>' . $templateRss;
    $xml = simplexml_load_string($templateRss);
    if ($xml === false) {
        return $keyArray;
    }
    foreach ($xml->tbody->tr as $temp) {
        if (!empty($temp->td->a)) {
            $keyArray[] = trim((string) $temp->td->a);
        }
    }
    return $keyArray;
}
print_r(getBaiduHotKeyWord());
I found this one online and modified it slightly. Save the code below to a PHP file.
Baidu indexed-page count and Baidu snapshot date:

<?php
$domain = "http://www.hzhuti.com/nokia/5230/";   /* the domain to query */
$site_url = 'http://www.baidu.com/s?wd=site%3A';
$all = $site_url . $domain;                      /* all indexed URLs for the domain */
$today = $all . '&lm=1';                         /* URLs indexed today */
/* The pattern below is literal text taken from Baidu's result page */
$utf_pattern = '/找到相关结果数(.*)个/';
$kz_pattern = '/<span class="g">(.*)<\/span>/';  /* element that holds the snapshot date */
$times = '/\d{4}-\d{1,2}-\d{1,2}/';              /* snapshot date, e.g. 2011-8-4 */
$s0 = @file_get_contents($all);                  /* result page of the site: query */
$s1 = @file_get_contents($today);
preg_match($utf_pattern, (string) $s0, $all_num);    /* match the result count */
preg_match($utf_pattern, (string) $s1, $today_num);
preg_match($kz_pattern, (string) $s0, $temp);
preg_match($times, isset($temp[0]) ? $temp[0] : '', $screenshot);
if (empty($all_num[1])) $all_num[1] = 0;
if (empty($today_num[1])) $today_num[1] = 0;
if (empty($screenshot[0])) $screenshot[0] = "No snapshot yet";
?>
<html>
 <head>
  <title>Test</title>
 </head>
<body>
 <table>
  <tr>
   <td>Date</td><td>Indexed by Baidu</td><td>Indexed by Baidu today</td><td>Baidu snapshot date</td>
  </tr>
  <tr>
   <td><?php echo date('M j, G:00'); ?></td><td><?php echo $all_num[1]; ?></td><td><?php echo $today_num[1]; ?></td><td><?php echo $screenshot[0]; ?></td>
  </tr>
 </table>
 <p>Indexed by Baidu: <a href="<?php echo htmlspecialchars($all); ?>" target="_blank"><?php echo $all_num[1]; ?></a></p>
 <p>Indexed by Baidu today: <a href="<?php echo htmlspecialchars($today); ?>" target="_blank"><?php echo $today_num[1]; ?></a></p>
 <p>Baidu snapshot date: <a href="<?php echo htmlspecialchars($all); ?>"><?php echo $screenshot[0]; ?></a></p>
</body>
</html>
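The methods above have not been thought through carefully: if the server does not allow file_get_contents to open remote URLs (for example, when allow_url_fopen is disabled), none of them will work. In that case you can use cURL instead, which is also more convenient because it lets you imitate a real user's browser.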
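As a rough illustration of the cURL approach, here is a minimal sketch. The function names fetchWithCurl and extractResultCount, the User-Agent string, and the timeout are my own assumptions and not part of the original code. The match pattern is the one used in the script above and may no longer match Baidu's current markup.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL with cURL,
// sending a browser-like User-Agent. Returns the body, or false on failure.
function fetchWithCurl($url, $timeout = 10) {
    if (!function_exists('curl_init')) {
        return false; // cURL extension is not available
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);    // overall timeout in seconds
    // Assumed User-Agent string; any realistic browser value would do.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)');
    // The handle is released when $ch goes out of scope.
    return curl_exec($ch);
}

// Hypothetical helper: pull the result count out of a fetched page.
// Returns 0 when the page is missing or the pattern does not match.
function extractResultCount($html, $pattern = '/找到相关结果数(.*?)个/') {
    if (!is_string($html) || !preg_match($pattern, $html, $m)) {
        return 0;
    }
    // Keep digits only, dropping "约" (about) and thousands separators.
    return (int) preg_replace('/\D/', '', $m[1]);
}
```

With these in place, something like extractResultCount(fetchWithCurl($all)) could stand in for the @file_get_contents($all) and preg_match pair in the script above. Keeping the parsing separate from the fetching also means the pattern can be checked against a saved page without touching the network.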