Is your feature request related to a problem? Please describe.
I process HTML files and use the partition_html function to do so. However, I noticed that this function can extract Tables as elements, but not Images.
Describe the solution you'd like
I would like partition_html to be able to extract Images, like how shared.PartitionParameters is able to.
Describe alternatives you've considered
I have tried passing the same HTML file to shared.PartitionParameters, but this also does not extract Images. One alternative I explored was converting the HTML file to PDF. While this might be possible, there is no guarantee that the converted file will yield the same expected output.
Additional context
nil
Hi @jiarongkoh - thanks for the issue! We haven't supported image extraction from HTML in the past because images in HTML are linked rather than embedded directly in the document. We'll revisit internally though and follow up.
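In the meantime, because the images are only linked from the HTML, one possible workaround is to collect the links yourself alongside whatever partition_html returns. The snippet below is a rough sketch using only Python's standard library html.parser. It is not part of the unstructured API, and the names ImageLinkCollector and collect_image_links are made up for illustration. It gathers the src and alt attributes of each img tag and does not download or decode the image files.

```python
from html.parser import HTMLParser


class ImageLinkCollector(HTMLParser):
    """Hypothetical helper: records src and alt for every <img> tag seen."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing <img/> tags also land here.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                {"src": attr_map.get("src"), "alt": attr_map.get("alt")}
            )


def collect_image_links(html_text):
    """Return a list of {"src": ..., "alt": ...} dicts, in document order."""
    collector = ImageLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.images


if __name__ == "__main__":
    sample = '<p>Intro</p><img src="figures/chart.png" alt="Sales chart">'
    print(collect_image_links(sample))
```

The src values may be relative paths or remote URLs, so resolving and fetching them is left to the caller.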