- Advances in Artificial Intelligence Research
- Volume:4 Issue:1
Efficient and Accurate Date Extraction from Invoices: A Comprehensive Three-Step Methodology Integrating Custom Object Detection, OCR, and Refined Regular Expressions
Authors : Mehmet Hilmi Emel, Murat Terzioğlu, Ramazan Özkan
Pages : 10-17
DOI : 10.54569/aair.1401234
Publication Date : 2024-08-30
Article Type : Research Paper
Abstract : In contemporary document processing, extracting key information from diverse invoices remains a challenge that calls for new solutions. This article presents a three-step methodology for date extraction from invoices. Using LabelStudio, Python, and OpenCV, we build a dataset and train a custom object detection model with Ultralytics YOLOv8. Optical Character Recognition (OCR) then converts the image data into string data that can be processed. Finally, regular expressions refine the extracted text to obtain precise date formats. The developed system significantly improves time efficiency in date extraction from invoices.
Keywords : Custom Object Detection, OCR, Invoice Processing, YOLOv8
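
As an illustration of the third step only, the following minimal Python sketch shows how a regular expression might refine OCR output into a normalized date. It is not the authors' implementation: the pattern, the function name `extract_dates`, the day-first ordering, and the ISO output format are all assumptions made for this example, and the detection and OCR steps are taken as having already produced the input string.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical pattern (an assumption, not taken from the article):
# day, month, and four-digit year separated by '.', '/' or '-',
# tolerating stray spaces that OCR may insert around separators.
DATE_PATTERN = re.compile(
    r"\b(\d{1,2})\s*[./-]\s*(\d{1,2})\s*[./-]\s*(\d{4})\b"
)


def extract_dates(ocr_text: str) -> List[str]:
    """Return dates found in OCR text as ISO strings (YYYY-MM-DD).

    Assumes day-first ordering; impossible calendar dates are skipped.
    """
    results = []
    for day, month, year in DATE_PATTERN.findall(ocr_text):
        try:
            parsed = datetime(int(year), int(month), int(day))
        except ValueError:
            # e.g. 31/02/2024 matches the pattern but is not a real date
            continue
        results.append(parsed.strftime("%Y-%m-%d"))
    return results


if __name__ == "__main__":
    print(extract_dates("Invoice Date: 30.08.2024"))
```

Validating each match with `datetime` rather than relying on the pattern alone is one way to discard strings that look like dates but are not valid calendar dates.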