Hi everyone, I’ve recently taken an interest in self-hosted solutions for document management and budgeting, specifically Paperless-ngx, Firefly III, and n8n. A bit about me: I run a Proxmox server with a freshly set up Docker LXC container. I’m still quite new to all this, but I’ve caught the homelab fever.

After spending hours on Google, I’ve come across a few services that caught my eye:

Paperless-ngx: A tool for scanning and organizing all my receipts, invoices, and documents in a searchable database.

Firefly III: A budgeting app with lots of cool features. My goal is to use it to get a better overview of my finances.

n8n: To automate the process, because I know I’m lazy and won’t keep up with manual data entry for long.

My idea: I want to scan receipts and invoices, store them in Paperless-ngx, use OCR to extract the text, total amount, and maybe even individual items, and then pass that data to Firefly III via n8n.

My questions:

Does anyone have experience with these tools? Is this a good approach, or should I consider other software?

I’ve seen that n8n is getting a lot of hype, but also has some critical, glaring issues. Is it still a good choice for this kind of automation?

Are there any tutorials or blog posts out there that cover a similar setup? I haven’t found much online. Are there any additional Docker containers I should consider, like a dedicated AI container or a special database? I only have a fairly weak PC with a 7th-gen Intel i5.

I’d love to hear your thoughts, experiences, or any concerns you might have about this project. If you know someone who has done something similar, or if there’s a hidden tutorial I’ve missed, please let me know!

  • HotChickenFeet@sopuli.xyz · 5 upvotes · 4 hours ago

    Paperless-ngx does include OCR, as well as supporting document types (which can have fields, etc) - but there is no built-in way to intelligently extract field values. You can use the REST API (from Python, for example) to access & update the data and fields. So field extraction via your own code is feasible.
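A minimal sketch of what "field extraction via your own code" could look like, assuming the OCR text of a receipt is already available as a string. The label list, the number format, and both function names are illustrative assumptions. The `custom_fields` payload shape is my understanding of the Paperless-ngx REST API and should be checked against your own instance before use.

```python
import re
from decimal import Decimal
from typing import Optional

# Matches an amount such as "12.34" or "1,234.56" that follows a "total" label.
# Which labels and number formats appear on your receipts is an assumption.
TOTAL_PATTERN = re.compile(
    r"\b(?:grand\s+total|total|amount\s+due)\b[^\d\n]*"
    r"(\d{1,3}(?:,\d{3})*\.\d{2}|\d+\.\d{2})",
    re.IGNORECASE,
)


def extract_total(ocr_text: str) -> Optional[Decimal]:
    """Return the last labelled total in the OCR text, or None if none is found."""
    matches = TOTAL_PATTERN.findall(ocr_text)
    if not matches:
        return None
    # Take the last hit: receipts often print a pre-tip total before the final one.
    return Decimal(matches[-1].replace(",", ""))


def custom_field_patch(field_id: int, total: Decimal) -> dict:
    """Build a JSON body for updating one custom field on a document.

    Assumed payload shape; field_id is the hypothetical ID of a custom
    field you created yourself in Paperless-ngx.
    """
    return {"custom_fields": [{"field": field_id, "value": str(total)}]}
```

A regex like this only covers printed totals with a recognisable label. Anything it cannot parse returns `None`, which is a natural point to route the receipt to manual review.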

    Given the large variety in receipt layouts & the potential for handwritten totals after tips - I’d encourage you to make manual inspection/correction of every processed receipt part of your workflow - or at the bare minimum to include check-in points where you verify that your end balances match 100% after all transactions have been entered, so you can detect & root-cause errors ASAP.
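The balance check-in described above can be sketched as a small reconciliation function. This is an illustration only: the function name and the convention that withdrawals are negative amounts are assumptions, and the figures are made up.

```python
from decimal import Decimal
from typing import Iterable


def reconcile(opening: Decimal, amounts: Iterable[Decimal], statement_closing: Decimal) -> Decimal:
    """Return computed closing balance minus the bank statement's closing balance.

    Amounts are signed: deposits positive, withdrawals negative (an assumed
    convention). A result of zero means the entered transactions match.
    """
    computed = opening + sum(amounts, Decimal("0"))
    return computed - statement_closing


if __name__ == "__main__":
    entered = [Decimal("-8.37"), Decimal("-20.00"), Decimal("50.00")]
    diff = reconcile(Decimal("100.00"), entered, Decimal("121.63"))
    print("balanced" if diff == 0 else f"off by {diff}")
```

Using `Decimal` instead of `float` avoids rounding noise, so any non-zero difference points at a real data-entry or OCR error.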

    • MIXEDUNIVERS@discuss.tchncs.de (OP) · 1 upvote · 3 hours ago

      Yes, I’ve read that online too. That’s why I’m thinking of using n8n:

      1. Upload a picture from some app (maybe a Telegram chat or email).
      2. Run an n8n workflow that uses Mistral OCR and an AI tool to sort the information into an upload-ready format for Firefly III.
      3. Then use some API magic to send the same picture to Paperless-ngx, so that it is stored in the right folder and archived.

      That’s the workflow, and it should be feasible. I brainstormed it with AI (say what you want), and that’s the plan I’ll try to implement first.
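As a rough illustration of the "upload-ready format for Firefly III" in step 2, here is a sketch that turns extracted receipt data into the JSON body for creating a withdrawal through the Firefly III API (`POST /api/v1/transactions`). The function name, account name, and sample values are hypothetical. In n8n, a body like this could be sent from an HTTP Request node with a personal access token as a Bearer token.

```python
import json
from datetime import date
from decimal import Decimal


def build_withdrawal(description: str, amount: Decimal, when: date,
                     source_account: str, merchant: str) -> dict:
    """Build the request body for a single withdrawal transaction."""
    return {
        "transactions": [
            {
                "type": "withdrawal",
                "date": when.isoformat(),
                # Send the amount as a string with two decimals.
                "amount": f"{amount:.2f}",
                "description": description,
                # Asset account the money leaves (hypothetical name).
                "source_name": source_account,
                # Expense account, here the merchant read from the receipt.
                "destination_name": merchant,
            }
        ]
    }


if __name__ == "__main__":
    body = build_withdrawal("Groceries", Decimal("8.37"), date(2024, 1, 15),
                            "Checking account", "Corner Shop")
    print(json.dumps(body, indent=2))
```

Keeping the payload builder separate from the HTTP call makes it easy to show the result for manual confirmation before anything is written to Firefly III.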

  • Creat@discuss.tchncs.de · 3 upvotes · 3 hours ago

    First my context: I’m also running multiple Proxmox hosts (personal and professional), and have a Paperless-ngx instance (personal/family). I tried Firefly, but the effort required to get it to a point where it would be of use to me was too high, so I dropped it. Haven’t used n8n.

    For the setup I’d just use the Proxmox community scripts, in case you haven’t heard of them. They make updates trivial and lower the bar to just trying something to basically zero.

    Paperless-ngx I actually use, because it means I can find something when I need it. Everything is automatically OCR’d and all you have to do is categorize the documents. With time, it’ll learn and do this for you. You can (manually) set up your scanner to upload files directly to the “consume” folder and it just works. PC/server power is nearly irrelevant; it just means OCR takes slightly longer, and otherwise it’s a web server. You can run this just fine on a Raspberry Pi.
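If a scanner cannot write to the “consume” folder directly, a small script can drop files there instead. This is a sketch under assumptions: the function name is made up, and the consume directory is whatever path your Paperless-ngx instance is configured to watch.

```python
import shutil
from pathlib import Path


def drop_into_consume(scan: Path, consume_dir: Path) -> Path:
    """Copy a scanned file into the consume folder and return the new path."""
    consume_dir.mkdir(parents=True, exist_ok=True)
    target = consume_dir / scan.name
    # copy2 also preserves the file's timestamps.
    shutil.copy2(scan, target)
    return target
```

Paperless-ngx then picks the file up on its own, so no API call is needed for this route.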

    I don’t have any real automation setup, so I can’t really comment on that. My advice is to just install it, see what it does and how it feels. Try to anticipate if and how much automation you need. Many aspects of all this are of the “setup once” variety, where once it’s working, you don’t have to touch it again. Try to gauge if the one time effort is worth it for you, then go from there. As I said, it was fine for paperless for me, but not for Firefly (but I might need to revisit this).

  • LordFireCrotch@lemmy.today · 5 upvotes · 4 hours ago

    I’ve been hosting my own Firefly III instance for a couple of years now. It has its quirks, but overall it’s a great home finance option. Better than some others I’ve tried.

    As an example, the creator is adamantly against future transactions and projecting future transactions. There’s the ability to create “recurring” transactions and the app has a daily cron that will create the recurring transactions on the day they’re supposed to hit. I for one want to see these future transactions and want those to show up about a month early. To do this I have to spoof a cron job to make the app think it’s creating recurring transactions for 30 days from today. This works well enough to show my balances for the future.
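One way such a spoofed cron run might look is sketched below. This is not necessarily how the commenter does it. It assumes the `firefly-iii:cron` artisan command with its `--force` and `--date` options, which should be verified against the Firefly III version you run. The function name is hypothetical.

```python
from datetime import date, timedelta
from typing import List


def spoofed_cron_command(today: date, days_ahead: int = 30) -> List[str]:
    """Build the artisan call that runs the cron job as if it were a future date."""
    target = today + timedelta(days=days_ahead)
    return [
        "php", "artisan", "firefly-iii:cron",
        "--force",                       # run even if the job already ran today
        f"--date={target.isoformat()}",  # pretend this is the current date
    ]


if __name__ == "__main__":
    # Print the command; run it inside the Firefly III container or app directory.
    print(" ".join(spoofed_cron_command(date.today())))
```

Scheduling this once a day from the host's own cron would keep the projected transactions roughly a month ahead.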

    I like that it has a pretty nice API that you can hook into from your own apps.

    I wrote an importer app that uses Plaid to connect to my bank accounts and credit cards, and it gives me the option to categorize transactions and import them into Firefly III. It’s made managing my finances incredibly simple. I used to spend hours on the weekend manually importing transactions from the previous week. Now I have something that I can tab through in a matter of minutes and see everything.