Reading columns correctly. #25

@bes827

I would like to thank you for working on this great package. It's extremely useful and has plenty of applications. I hope you continue to work on and maintain it.

I noticed that one of the limitations (as you mentioned) is text fragmentation when the text in a PDF is laid out in columns (e.g., most scientific articles). I came across the function tabulizer::extract_text(file), which can read multiple columns. I wonder whether you could use something similar in your package to fix that issue. This tabulizer function will still cause issues with tables and with image/table captions, but at least it will get the flow of the main text correct.

Thank you.
