
Any interest in new feature: Getting URLs from pdfinfo?#257

Draft
jpreiss wants to merge 1 commit into Belval:master from jpreiss:urlparse

@jpreiss jpreiss commented Feb 17, 2023

I am using your library to rasterize PDFs in my presentation viewer https://github.com/jpreiss/pypdfdeck (branch videos).

I want to add a feature where any embedded URL that starts with file:// is interpreted to mean "instead of the PDF contents, display the video from this local path when viewing this page".

The command-line pdfinfo tool can extract the URLs, but this is not exposed through the current Python interface.

To do this, it would be nice if I could lean on pdf2image to find the poppler binaries properly, etc. Therefore, I added an option to extract URLs in pdfinfo_from_path().
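
For illustration only, here is a rough standalone sketch of the idea, not the code in this PR. It assumes a poppler build whose pdfinfo supports the `-url` flag and prints one row per URL in the form `<page> <type> <url>` after a header row; that layout is an assumption and may differ between poppler versions. The names `parse_pdfinfo_urls`, `video_paths_by_page`, and `pdf_urls` are hypothetical and are not part of pdf2image.

```python
import subprocess
from typing import Dict, List, Tuple


def parse_pdfinfo_urls(output: str) -> List[Tuple[int, str]]:
    """Parse the text printed by `pdfinfo -url` into (page, url) pairs.

    Assumes each data row looks like "<page> <type> <url>". Rows whose
    first field is not an integer (such as a header row) are skipped.
    """
    urls = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[0].isdigit():
            urls.append((int(fields[0]), fields[2].strip()))
    return urls


def video_paths_by_page(urls: List[Tuple[int, str]]) -> Dict[int, str]:
    """Map page number to local path for every URL starting with file://."""
    prefix = "file://"
    return {page: url[len(prefix):] for page, url in urls if url.startswith(prefix)}


def pdf_urls(pdf_path: str, pdfinfo: str = "pdfinfo") -> List[Tuple[int, str]]:
    """Run pdfinfo on a file and return its URLs.

    Assumes the pdfinfo binary is on PATH (or passed explicitly) and
    accepts the -url flag.
    """
    result = subprocess.run(
        [pdfinfo, "-url", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_pdfinfo_urls(result.stdout)
```

A viewer could then call `video_paths_by_page(pdf_urls("slides.pdf"))` and, for each page that has an entry, play the video at that path instead of showing the rasterized page. Doing this inside pdf2image would avoid duplicating the logic that locates the poppler binaries.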

This is not ready to merge: it still needs design review, tests, an equivalent _from_bytes() version, better docs, etc. I just wanted to check whether this feature is actually desired before I finish the work.

Thanks!

fix type annotations
