Parser Compatibility
By default, the package detects installed parser libraries and chooses the first one it finds, so a vanilla Workbook() instantiation should Just Work. To select a parser explicitly, pass the parser argument:
from htmxl.compose import Workbook
workbook = Workbook(parser='beautifulsoup')
workbook = Workbook(parser='lxml')
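The default detection described above might be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the `PARSER_MODULES` mapping, the `detect_parser` function, its `is_installed` parameter, and the order in which parsers are tried are all assumptions made for the example.

```python
from importlib.util import find_spec

# Hypothetical mapping of parser names to the module each one needs.
# The names and their order are assumptions, not htmxl internals.
PARSER_MODULES = {
    "beautifulsoup": "bs4",
    "lxml": "lxml",
}


def detect_parser(candidates=None, is_installed=None):
    """Return the first parser name whose backing module is available."""
    if candidates is None:
        candidates = PARSER_MODULES
    if is_installed is None:
        # find_spec returns None when a top-level module cannot be found.
        def is_installed(module):
            return find_spec(module) is not None

    for name, module in candidates.items():
        if is_installed(module):
            return name
    raise ImportError("No supported parser library is installed.")
```

Because the result depends on what happens to be installed, passing parser explicitly is the safer choice when the output must be reproducible across environments.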
Importantly, the parsers will not exhibit identical behavior when handed the same templates.
In general, the lxml parser will be stricter and more prone to requiring “correct” HTML, while beautifulsoup is more permissive and will allow erroneous HTML. That tradeoff, however, generally leads to lxml being noticeably faster with large amounts of data.
While you can look to the specific libraries for a comprehensive set of behaviors, we can identify those which we’ve seen in the wild during the use of this library.