The code is as follows:

import re

# Three XHTML documents concatenated into one string.
content = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xxxx">
<html xmlns="xxxx">
XXXX
</html>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xxxx">
<html xmlns="xxxx">
XXXX
</html>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xxxx">
<html xmlns="xxxx">
XXXX
</html>"""

# Remove everything from each </html> through the next opening <html ...> tag.
# re.DOTALL lets "." match newlines, so one match can span several lines.
replacer = re.compile(r"</html>.*?<html .*?>", re.DOTALL)
result = replacer.sub("", content)
print(result)
However, after the replacement, a blank line is left where the removed text used to be.
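The blank line appears because the newline before `</html>` and the newline after the matched `<html ...>` tag both survive the substitution. One possible way around it, as a sketch rather than a definitive fix, is to let the pattern also consume the whitespace in front of `</html>`. The `\s*` prefix and the shortened `doc` sample are my own assumptions for illustration, not part of the original code:

```python
import re

# Hypothetical, shortened stand-in for one of the documents in the post.
doc = "\n".join([
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xxxx">',
    '<html xmlns="xxxx">',
    'XXXX',
    '</html>',
])
# Three documents back to back, as in the post.
content = "\n".join([doc] * 3)

# The leading \s* swallows the newline in front of </html>, so only the
# newline after the matched <html ...> tag remains and no empty line is left.
replacer = re.compile(r"\s*</html>.*?<html .*?>", re.DOTALL)
result = replacer.sub("", content)
print(result)
```

With this pattern the three `XXXX` lines end up directly under one another, followed by the single remaining `</html>`. If the whitespace around the tags in the real input differs, the pattern may need adjusting.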