Automating a Parking Expenses Report
For the past year, I was consulting at a place where I could request reimbursement of parking expenses. This had to be claimed using a standard Excel template.
Doing this manually every month seemed boring, so I automated a significant part of that:
- Fetch parking costs for a certain month from my Paperless-ngx instance
- Calculate costs after taxes and a grand total
- Fill in an Excel sheet with the details
- Convert the Excel sheet to PDF and append photos of parking costs
Note: I don’t expect anybody to have the same requirements as me here, but hopefully pieces can be useful to some. The open-sourced project can be found here.
Organizing the Parking Costs
Each time I would park, I would accumulate a parking ticket. This ticket contains the following information:
- Date and time
- Car license plate
- Price
- VAT
In order to automate the report, I had to have this information in a structured way. Therefore, I turned to Paperless-ngx. Paperless-ngx is an self-hostable open-source service that helps digitalizing paper documents. Third-party mobile apps help quickly scanning documents and uploading them to the server. This is exactly what I did with the parking tickets as well: each time I received a ticket, I would scan it with the Paperless app.
Paperless-ngx uses Optical Character Recognition (OCR) to cleverly extract the date of the ticket automatically. I also created a custom label to indicate a parking ticket, and I added custom fields to indicate a ticket’s cost and VAT amount. You can see this information in the screenshot below.
Translated from Dutch: aanmaakdatum means creation date, parkeerkaart means parking ticket, BTW means VAT and bedrag means cost
Paperless-ngx will also automatically learn to label the tickets with the “parking ticket” label, based on the document’s content. That just leaves the VAT and price data points. Unfortunately I found the OCR to be too unreliable to extract that text from the documents. Therefore, I had to manually set those two data points for each ticket.
Neat, we now have all parking tickets organized on Paperless-ngx! In order to create a report, we can just query the Paperless-ngx API!
Using the Paperless-ngx API
The remaining part of this post will be code explanations. You can find the full open-sourced Python code here.
Paperless-ngx has an API we can now use to query our parking tickets. I found the filtering logic pretty hard to understand. Therefore I simply looked at what API calls the web UI makes with the query I needed. This turned out to be something like this:
response = requests.get(
f"{paperless_ngx_url}/api/documents/?page_size=50&query=created:[{start_date} TO {end_date}]&tags__id__all={FILTER_TAG_ID}&correspondent__id__in={FILTER_CORRESPONDENT_ID}",
headers={"Authorization": f"Token {token}"},
)
In the above code snippet, FILTER_TAG_ID
is the ID of the “Parking ticket” label and FILTER_CORRESPONDENT_ID
is the ID of the correspondent (I included this to filter out potential unrelated parking tickets).
Perfect, this returns a JSON list of parking tickets we can use to create the report!
Creating the Report
All that’s left is to fill in an Excel template now. This actually proved to be the most annoying part but also the most boring part. I used the OpenPyXL Python library to manipulate the Excel sheets and basically just add a row for each parking ticket. For code, check here.
To convert the Excel sheet to a PDF, I used the (apparently now deprecated) unoconv utility. It uses LibreOffice under the hood for the conversion, which adds a 2GiB dependency for the project 🫠.
Finally, to merge the report PDF with photos of the parking tickets, I used the PyPDF2 Python library.
I would now show some example output, but it contains quite some sensitive information so unfortunately I can’t… But I never had any complaints from the finance department, so it worked great!