import requests
import re

url = 'http://www.shubang.net/book/66_2151.html'
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36'}

# Fetch the page and decode it as UTF-8
web_data = requests.get(url, headers=headers)
web_data.encoding = 'utf-8'
txt = web_data.text

# Two alternatives, each with its own capture group:
# group 1 = English line, group 2 = Chinese line (from the title attribute)
items = re.findall(r'line_en\" \>(.*)<|line_cn\" title=\"(.*)\"', txt)
for item in items:
    print(item)
The output looks like this:
......
('"It doesn't look new. It looks old," one of the boys said.', '')
('', '“房子一點也不新,舊死了,”其中一個男孩說。')
('It just couldn't be.', '')
('', '絕對不可能。')
('The other members of his family turned to stare at me.', '')
('', '其他人都把目光轉向了我。')
......
Questions:
1. Where do the ') , ( characters in the output above come from?
2. Why does couldn't turn into couldn' ?
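
On the first question, a small offline sketch may help. When a pattern has more than one capture group, re.findall returns one tuple per match, with an empty string for any group that did not take part in it. print() then shows the tuple's repr, which is where the parentheses, commas, and quote characters come from. The sample markup below is made up to resemble what the regex seems to expect; the real page may be structured differently.

```python
import re

# Hypothetical markup (an assumption, not copied from the real page).
sample = (
    '<p class="line_en" >It just couldn\'t be.</p>\n'
    '<p class="line_cn" title="絕對不可能。"></p>\n'
)

# Same two-group alternation as in the post, without the unneeded escapes.
pattern = r'line_en" >(.*)<|line_cn" title="(.*)"'

# Two groups -> each match is a 2-tuple; the group that did not match is ''.
pairs = re.findall(pattern, sample)

# Keep whichever group is non-empty to get plain strings.
texts = [en or cn for en, cn in pairs]

for text in texts:
    # Printing a str shows the text itself; printing a tuple shows repr(),
    # which adds the surrounding quotes and tuple punctuation.
    print(text)
```

Printing each string rather than the whole tuple also makes it easier to check whether couldn't is really cut short in the captured text, or whether the quote characters added by the tuple display only make it look that way.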