~lioploum/offpunk-users

lxml_html_clean error when installing Offpunk 2.2

This is my first post so please bear with me. When I tried installing
Offpunk (v 2.2) for the first time, I got a similar error both on
MacOS (via MacPorts) and on Ubuntu. The error message is quoted below.

I hope this message helps someone. To fix the error on Ubuntu, I
needed to install an additional package
(sudo apt-get install python3-lxml-html-clean)


The error was:

Traceback (most recent call last):
  File "/usr/bin/offpunk", line 5, in <module>
    from offpunk import main
  File "/usr/lib/python3/dist-packages/offpunk.py", line 25, in <module>
    import netcache
  File "/usr/lib/python3/dist-packages/netcache.py", line 15, in <module>
    import ansicat
  File "/usr/lib/python3/dist-packages/ansicat.py", line 19, in <module>
    from readability import Document
  File "/usr/lib/python3/dist-packages/readability/__init__.py", line 3, in <module>
    from .readability import Document
  File "/usr/lib/python3/dist-packages/readability/readability.py", line 11, in <module>
    from .cleaners import clean_attributes
  File "/usr/lib/python3/dist-packages/readability/cleaners.py", line 3, in <module>
    from lxml.html.clean import Cleaner
  File "/usr/lib/python3/dist-packages/lxml/html/clean.py", line 18, in <module>
    raise ImportError(
ImportError: lxml.html.clean module is now a separate project lxml_html_clean.
Install lxml[html_clean] or lxml_html_clean directly.


Thanks for such an inspiring project!

best regards
Marcin Warpechowski
On 24 May 23 09:59, Marcin Warpechowski wrote:
>This is my first post so please bear with me. When I tried installing
>Offpunk (v 2.2) for the first time, I got a similar error both on
>MacOS (via MacPorts) and on Ubuntu. The error message is quoted below.
>
>I hope this message helps someone. To fix the error on Ubuntu, I
>needed to install an additional package
>(sudo apt-get install python3-lxml-html-clean)


Hi Marcin, 

Thanks for your email; you are absolutely right. This bug seems to
happen only on Ubuntu 24.04 and, if my investigation is right, is not
really related to offpunk but to python3-readability.

Indeed, python3-readability depends on python3-lxml but, between
Ubuntu 23.10 and Ubuntu 24.04, python3-lxml was split into two
separate packages: python3-lxml and python3-lxml-html-clean. (It seems
many Python packages are buggy under Ubuntu 24.04 because of the
upgrade to Python 3.12, which not all packages have been tested with.)
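
Whether a given system is affected by this split can be checked from
Python itself. The following is only a minimal sketch and not part of
offpunk: the helper names module_available and diagnose are made up
for illustration, and it assumes the separate project is importable as
lxml_html_clean, the name given in the error message.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def diagnose(available=module_available) -> str:
    """Describe which half of the lxml split is present on this system."""
    if not available("lxml"):
        return "lxml is not installed"
    if not available("lxml_html_clean"):
        # This is the situation that leads to the ImportError in the traceback.
        return "lxml is installed but lxml_html_clean is missing"
    return "lxml and lxml_html_clean are both installed"


if __name__ == "__main__":
    print(diagnose())
```

If the second message is printed on Ubuntu 24.04, installing
python3-lxml-html-clean, as Marcin did, should provide the missing
module.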

I would be curious to find out whether this is an Ubuntu-specific
change or whether it also happens on Debian unstable (I should upgrade
my Debian laptop to test). This may be of interest to Étienne Mollier,
who is doing the Debian packaging.

In the meantime, I’ve updated the ubuntu_dependencies.txt in the 
repository to include python3-lxml-html-clean.

Thanks again,

Ploum

-- 
Ploum - Lionel Dricot
Blog: https://www.ploum.net
Books: https://ploum.net/livres.html
On 24 May 23 09:59, Marcin Warpechowski wrote:
>This is my first post so please bear with me. When I tried installing
>Offpunk (v 2.2) for the first time, I got a similar error both on
>MacOS (via MacPorts) and on Ubuntu. The error message is quoted below.
>
>I hope this message helps someone. To fix the error on Ubuntu, I
>needed to install an additional package
>(sudo apt-get install python3-lxml-html-clean)
>

Hi Marcin,

Étienne Mollier has confirmed that this is not an offpunk bug but a bug 
in python-readability.

The bug has been fixed in Debian, but Ubuntu has not fixed it and is
still shipping a buggy version.
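
For illustration only, a library that imports Cleaner can tolerate
both layouts with a fallback import. This is a generic sketch under
that assumption, not the actual change made in Debian or in
python-readability; the function name load_cleaner is hypothetical.

```python
def load_cleaner():
    """Return the Cleaner class from whichever location provides it, or None."""
    try:
        # New location: the standalone lxml_html_clean project.
        from lxml_html_clean import Cleaner
    except ImportError:
        try:
            # Old location: bundled inside lxml before the split.
            from lxml.html.clean import Cleaner
        except ImportError:
            # Neither location is usable on this system.
            return None
    return Cleaner
```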

If you have a Launchpad account, it might be worth reporting the bug at:

https://bugs.launchpad.net/ubuntu/+source/readability   

You may include a reference to the Debian bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1069756

Thanks again!