NotImplementedError: To remove HTML markup, use BeautifulSoup's get_text() function

    xiaoxiao, 2022-07-07

    NotImplementedError: To remove HTML markup, use BeautifulSoup's get_text() function

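    After some digging, it turns out that nltk's clean_html() is no longer implemented: in NLTK 3 it only raises this error. Use the equivalent BeautifulSoup method instead. The code is as follows: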

    import urllib.request
    from bs4 import BeautifulSoup

    # Download the page, then strip the HTML markup from it
    response = urllib.request.urlopen('your url')
    html = response.read()
    clean = BeautifulSoup(html, 'html.parser').get_text()

     

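    If the import fails with "No module named 'bs4'", install the package first (pip install beautifulsoup4 installs the same library under its full name):

    pip install bs4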
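
To try the replacement without fetching a URL, here is a minimal sketch that runs on an in-memory string. The helper name html_to_text, the _TextCollector class and the sample markup are made up for illustration and are not part of nltk or bs4. The sketch assumes you want whitespace-trimmed text joined by single spaces, and it falls back to the standard library's HTMLParser when bs4 is not installed, which is a simpler substitute for get_text() rather than the same thing.

```python
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # available after `pip install bs4`
except ImportError:
    BeautifulSoup = None  # use the standard-library fallback below


class _TextCollector(HTMLParser):
    """Hypothetical stdlib fallback: collects the text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty, whitespace-trimmed pieces of text.
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text of an HTML string with the markup removed."""
    if BeautifulSoup is not None:
        # Naming the parser explicitly avoids bs4's "no parser specified" warning.
        soup = BeautifulSoup(html, "html.parser")
        return soup.get_text(separator=" ", strip=True)
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)


sample = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
print(html_to_text(sample))  # prints: Title Hello world
```

The separator and strip arguments keep text from neighbouring tags from running together. Calling get_text() with no arguments, as in the snippet above, keeps the original whitespace instead.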

     

     

     
