Hi everyone, I’ve recently taken an interest in self-hosted solutions for document management and budgeting, specifically Paperless-ngx, Firefly III, and n8n. A bit about me: I run a Proxmox server with a freshly set up Docker LXC container. I’m still quite new to all this, but I’ve caught the homelab fever.

After spending hours on Google, I’ve come across a few services that caught my eye:

Paperless-ngx: A tool for scanning and organizing all my receipts, invoices, and documents in a searchable database.

Firefly III: A budgeting app with lots of cool features. My goal is to use it to get a better overview of my finances.

n8n: To automate the process, because I know I’m lazy and won’t keep up with manual data entry for long.

My idea: I want to scan receipts and invoices, store them in Paperless-ngx, use OCR to extract the text, total amount, and maybe even individual items, and then pass that data to Firefly III via n8n.

My questions:

Does anyone have experience with these tools? Is this a good approach, or should I consider other software?

I’ve seen that n8n is getting a lot of hype, but also has some critical, glaring issues. Is it still a good choice for this kind of automation?

Are there any tutorials or blog posts out there that cover a similar setup? I haven’t found much online.

Are there any additional Docker containers I should consider, like a dedicated AI container or a special database? I only have a weak 7th-gen Intel i5 PC.

I’d love to hear your thoughts, experiences, or any concerns you might have about this project. If you know someone who has done something similar, or if there’s a hidden tutorial I’ve missed, please let me know!

  • HotChickenFeet@sopuli.xyz
    7 hours ago

    Paperless-ngx does include OCR and supports document types (which can have fields, etc.), but there is no built-in way to intelligently extract field values. You can use the API from Python to access & update the data and fields, so field extraction via your own code is feasible.
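
To give an idea of what "your own code" could look like, here is a minimal sketch that pulls a total out of OCR text with a regular expression. The function name `extract_total` and the keyword list are made up for illustration, and it assumes you already have the OCR text of the document as a plain string. It deliberately gives up (returns `None`) on anything it does not recognise, including amounts with thousands separators.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical keyword list; real receipts will need more variants and languages.
# Matches a line containing a "total"-like keyword that ends in an amount
# with exactly two decimals, using either "." or "," as the decimal mark.
TOTAL_PATTERN = re.compile(
    r"(?im)^.*\b(?:grand\s+total|total|amount\s+due)\b[^\d\n]*(\d+(?:[.,]\d{2}))\s*$"
)


def extract_total(ocr_text: str) -> Optional[Decimal]:
    """Return the amount on the last 'total'-like line, or None if none is found."""
    matches = TOTAL_PATTERN.findall(ocr_text)
    if not matches:
        return None
    # Take the last match: the final total usually comes after any earlier ones.
    return Decimal(matches[-1].replace(",", "."))


receipt = "SHOP\nSubtotal 12.00\nTax 2.25\nTOTAL EUR 14,25\nThank you"
print(extract_total(receipt))
```

Returning `None` instead of guessing makes it easy to route unrecognised receipts to a manual queue.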

    Given the large variety in receipt layout & potential for handwritten totals after tips - I’d encourage part of your workflow to include manual inspection/correction of every processed receipt - or at the bare minimum that you include check-in points where you verify your end balances 100% match after all transactions have been entered, so you can detect & root-cause errors ASAP.
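
A check-in point like that can be as simple as comparing a computed balance against the one on your bank statement. The sketch below is only an illustration: `balance_mismatch` is a hypothetical helper, and it assumes you can export the entered amounts as signed decimals (negative for spending).

```python
from decimal import Decimal
from typing import Iterable


def balance_mismatch(
    opening: Decimal,
    amounts: Iterable[Decimal],
    statement_closing: Decimal,
) -> Decimal:
    """Return computed closing balance minus the statement's closing balance.

    A result of zero means the entered transactions match the statement.
    """
    computed = opening + sum(amounts, Decimal("0"))
    return computed - statement_closing


entered = [Decimal("-14.25"), Decimal("-5.75")]
diff = balance_mismatch(Decimal("100.00"), entered, Decimal("80.00"))
if diff != 0:
    print(f"Balances differ by {diff}, check the latest receipts")
else:
    print("Balances match")
```

Using `Decimal` rather than floats avoids rounding noise, so any non-zero difference points at a real data-entry or OCR error.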

    • MIXEDUNIVERS@discuss.tchncs.de (OP)
      6 hours ago

      Well, that is something I have also read online. That’s the reason I’m thinking of using n8n:

      1. Upload a picture from some app (maybe something like a Telegram chat or email).
      2. A workflow in n8n runs Mistral OCR and an AI tool to sort the information into an upload-ready format for Firefly III.
      3. Then some API magic to Paperless-ngx so that the same picture is stored in the right folder and archived.

      That’s the workflow, which should be feasible. I brainstormed it with AI (say what you want), and I think that’s the plan I’ll try to implement first.
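
For step 2, the "upload-ready format" could be sketched roughly as below. This is an assumption-heavy illustration, not a tested integration: the field names follow my understanding of the JSON body for Firefly III's `POST /api/v1/transactions` endpoint and should be checked against the API docs for your version, and `build_withdrawal` plus the account name "Checking" are made up. The code only builds the payload and sends nothing.

```python
import json
from datetime import date
from decimal import Decimal


def build_withdrawal(
    total: Decimal, merchant: str, paid_on: date, source_account: str
) -> dict:
    """Build a request body for a single withdrawal (assumed Firefly III layout)."""
    return {
        "transactions": [
            {
                "type": "withdrawal",
                "date": paid_on.isoformat(),
                "amount": str(total),  # sent as a string to avoid float rounding
                "description": f"Receipt: {merchant}",
                "source_name": source_account,  # hypothetical asset account name
                "destination_name": merchant,
            }
        ]
    }


payload = build_withdrawal(
    Decimal("14.25"), "Corner Shop", date(2024, 5, 3), "Checking"
)
print(json.dumps(payload, indent=2))
```

In n8n the same structure could be assembled in a Code node or an HTTP Request node; keeping it as a small pure function makes it easy to test before wiring up the real API call.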