Add Abacus.AI #79

Open
carstenschroeder wants to merge 6 commits into invoiceradar:main from carstenschroeder:abacus-ai

@carstenschroeder
Contributor

This is a WIP.

Abacus.AI only provides the date and amount in the web portal, not a unique invoice number.
If two invoices share the same date and amount, the second one is ignored. Worse, the invoices that follow are downloaded out of order, because the corresponding table row is not skipped as well. As a result, the PDF for October 2025 is downloaded for November 2025.

image

So far, I have gotten around this by making the field attribute tuples unique via a random field and extending the invoice number by the index. But this is obviously a dirty hack.
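A deterministic alternative to the random field could look like the sketch below. It is plain Python rather than Invoice Radar's plugin format, and the function name `unique_invoice_ids` and the sample rows are made up for illustration. The idea is to append an occurrence counter to each (date, amount) pair, so that duplicate rows get distinct IDs that stay the same between runs as long as the table order does not change.

```python
from collections import Counter


def unique_invoice_ids(rows):
    """Build one ID per (date, amount) row.

    Rows sharing the same date and amount receive an occurrence suffix
    (-1, -2, ...) in table order, so the IDs are reproducible between runs.
    """
    seen = Counter()
    ids = []
    for date, amount in rows:
        seen[(date, amount)] += 1
        ids.append(f"{date}_{amount}-{seen[(date, amount)]}")
    return ids


# Hypothetical sample data: two invoices with identical date and amount.
rows = [
    ("2025-11-01", "10.00"),
    ("2025-11-01", "10.00"),
    ("2025-10-01", "10.00"),
]
print(unique_invoice_ids(rows))
# ['2025-11-01_10.00-1', '2025-11-01_10.00-2', '2025-10-01_10.00-1']
```

One caveat: if a new invoice with the same date and amount later appears above the existing ones, the suffixes shift. Counting from the oldest row instead would reduce that risk, assuming the portal only ever adds newer invoices.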

It would be great if you could create a way to extract the invoice number from the PDF in such cases.

@tobiaslins
Collaborator

We were thinking about adding a way to use the PDF hash as the ID for providers that don't generate PDFs on the fly.
Could you please check whether downloading the same invoice twice produces files with the same MD5 hash? Thanks!

@carstenschroeder
Contributor Author

Unfortunately the hashes are not equal. The PDFs are created on the fly.

PS C:\Users\Carsten\Downloads> Get-FileHash -Path ".\da5a1f13c_invoice (1).pdf" -Algorithm MD5

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
MD5             F84AF8306CDEED4BD1F3C70C60E20912                                       C:\Users\Carsten\Downloads\da...


PS C:\Users\Carsten\Downloads> Get-FileHash -Path ".\da5a1f13c_invoice.pdf" -Algorithm MD5

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
MD5             2C3503F6FF9E8B78B7D0E1CCB28F1B09                                       C:\Users\Carsten\Downloads\da...
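To repeat this comparison for more downloads or on another platform, a small script along these lines might help. It is only a sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `same_content` are illustrative, and the file paths are whatever the two downloads happen to be called.

```python
import hashlib
from pathlib import Path


def md5_of_file(path):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical MD5 digests."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Calling `same_content` on two downloads of the same invoice returns `False` whenever the PDFs differ by even a single byte, such as an embedded creation timestamp.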

@carstenschroeder
Contributor Author

carstenschroeder commented Apr 9, 2026

@tobiaslins
I worked around the issue with the duplicate dates by defining a variable called “random,” which is populated with the date plus a random number for each row. That works.

But could it be that Invoice Radar has a bug in how it indexes table rows in this specific case? In our example for Abacus.AI, the table containing the invoices has 16 rows. The first 8 rows are skipped because their invoice dates are earlier than the earliest month (12/2025). Of the remaining invoices, the first 5 have already been loaded. However, invoices 14–16 are not downloaded; instead, invoices 6–8 are downloaded (but with the metadata of 14–16).
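One way these numbers could arise is sketched below. This is purely a hypothesis and not taken from Invoice Radar's code: the functions `pick_rows_buggy` and `pick_rows_fixed` are invented for illustration. If row positions are computed within the date-filtered list but then used to index the full, unfiltered table, the three new invoices sit at positions 6–8 of the filtered list, and those positions point at rows 6–8 of the original table.

```python
def pick_rows_buggy(table, is_relevant, already_loaded):
    """Hypothetical bug: positions from the filtered list index the full table."""
    filtered = [row for row in table if is_relevant(row)]
    positions = [i for i, row in enumerate(filtered) if row not in already_loaded]
    return [table[i] for i in positions]  # wrong list is indexed here


def pick_rows_fixed(table, is_relevant, already_loaded):
    """Select the rows directly, without carrying positions between lists."""
    return [row for row in table if is_relevant(row) and row not in already_loaded]


table = list(range(1, 17))           # rows 1..16, as in the example above
is_relevant = lambda row: row > 8    # rows 1-8 are before the earliest month
already_loaded = {9, 10, 11, 12, 13}

print(pick_rows_buggy(table, is_relevant, already_loaded))  # [6, 7, 8]
print(pick_rows_fixed(table, is_relevant, already_loaded))  # [14, 15, 16]
```

The buggy variant reproduces the observed behaviour exactly, which makes an offset between the filtered and unfiltered row lists a plausible place to look, though the actual cause may be different.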

image

Even in the Developer Tools, rows 6–8 are displayed as relevant, which is incorrect.

image

Could you please help? I've been stuck on this for months.
