id: work_522o6sdt2jfpxejcre57oeradm
author: Mirjam Cuper
title: An Optical Character Recognition Software Benchmark for Old Dutch Texts on the EYRA Platform
date: 2020
pages: 1
extension: .pdf
mime: application/pdf
words: 541
sentences: 30
flesch: 55
summary:
  An Optical Character Recognition Software Benchmark for Old Dutch Texts on
  However, acquiring high-quality machine readable texts using currently available Optical Character
  benchmark to enable the evaluation of the performance of OCR software on old Dutch texts.
  For the pilot version of the benchmark a data set containing 2055 Dutch book pages (1630-1796)
  This data set contains both scanned pages (OCR method input data) and machine readable text (ground truth that can be used to assess the quality of the OCR method output).
  the validation data and therefore provides a fair comparison of the performance of the OCR
  Also, if new validation data is available and added to the benchmark later on, the OCR
  Various metrics could be used to assess the performance of the OCR methods in comparison to the
  visualize algorithm results on the platform, to gain more insight into algorithm performance.
cache: ./cache/work_522o6sdt2jfpxejcre57oeradm.pdf
txt: ./txt/work_522o6sdt2jfpxejcre57oeradm.txt