[GNC] Third party OFX/CVS providers
Peter West
pbw at pbw.id.au
Sat Aug 6 19:40:25 EDT 2022
The other great tool for processing PDFs is PDFBox. It’s a Java JAR file, so you need to have a reasonably recent JVM installed.
https://pdfbox.apache.org/
I’m still using version 2.
Here are the command-line tools:
https://pdfbox.apache.org/2.0/commandline.html
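For illustration, a minimal sketch of a text-extraction call with the PDFBox 2.x command-line app might look like the following. The jar location, the PDFBOX_JAR variable and the pdf_to_text function name are assumptions made up for this example. ExtractText and its -sort option are described on the command-line page linked above.

```shell
# Assumed location of the PDFBox 2.x standalone "app" jar (hypothetical path);
# set PDFBOX_JAR to wherever you saved your download.
PDFBOX_JAR="${PDFBOX_JAR:-$HOME/pdfbox-app-2.jar}"

# pdf_to_text INPUT.pdf OUTPUT.txt
# Runs the ExtractText tool. The -sort option asks PDFBox to sort the
# text before writing it, instead of using the order stored in the file.
pdf_to_text() {
    java -jar "$PDFBOX_JAR" ExtractText -sort "$1" "$2"
}

# Example call (file names are placeholders):
#   pdf_to_text statement.pdf statement.txt
```

Whether -sort helps or hurts depends on the layout of the statement, so it is worth trying the extraction both with and without it.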
I have a series of scripts for various types of manipulation and text extraction.
It is worth noting that people who naïvely believe that text extraction from a PDF is simple will get burnt if they do not check the results. PDF files are not obliged to store their text in reading order. Mostly they do, until they don’t.
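One way to check the results is a sanity test on the extracted text. The sketch below assumes, purely for illustration, that every transaction line in the extracted file starts with an ISO date (YYYY-MM-DD) and that a statement lists transactions in date order. Real statements are bank specific, so the pattern would need adapting. The function name check_date_order is hypothetical.

```shell
# check_date_order FILE
# Reports any dated line whose date is earlier than the previous dated line.
# Exits non-zero if at least one such line is found.
check_date_order() {
    awk '
        BEGIN { bad = 0 }
        # Only look at lines that begin with something shaped like YYYY-MM-DD.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
            d = substr($0, 1, 10)
            # ISO dates sort correctly as plain strings.
            if (prev != "" && d < prev) {
                printf "line %d: %s appears after %s\n", NR, d, prev
                bad = 1
            }
            prev = d
        }
        END { exit bad }
    ' "$1"
}

# Example call (file name is a placeholder):
#   check_date_order statement.txt || echo "extraction order looks suspicious"
```

A clean result does not prove the extraction is right, but an out-of-order date is a cheap early warning that the text did not come out in reading order.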
I can post my scripts if anyone is interested.
There are also a number of other tools for Linux that I access on my Mac through MacPorts.
God bless you.
—
Peter West
pbw at pbw.id.au
“The kingdom of heaven is like treasure hidden in a field, which a man found and covered up. Then in his joy he goes and sells all that he has and buys that field.”
> On 7 Aug 2022, at 2:52 am, Tom Browder <tom.browder at gmail.com> wrote:
>
> On Sat, Aug 6, 2022 at 11:43 AM Glenn Fowler <gfowler1 at outlook.com> wrote:
>>
>> My scripts are in PowerShell. For GhostScript I'm just using CLI:
>
> Thanks, Glenn, that's close to what I've found for Linux:
>
> $ gs -sDEVICE=txtwrite -o output.txt input.pdf
>
> It just needs some tweaking and post-conversion parsing (very bank
> specific). I'll see how my current PDF statements look after text
> conversion.
>
> But I'll also keep looking at YNAB for a more general solution.
>
> Cheers!
>
> -Tom