Minimizing Training Data for Reliable Writer Identification in Medieval Manuscripts
Cilia N. D.;
2019-01-01
Abstract
Palaeography is the study of ancient documents, and one of its most important problems is identifying the people who took part in writing a given document. To this end, expert palaeographers typically analyze handwriting features such as letter heights and widths, distances between characters, and angles of inclination. To obtain more precise measurements, and thanks to the availability of high-quality digital images, palaeographers are starting to use digital tools. In previous studies, we proposed a pattern recognition system for distinguishing the writers of medieval books and investigated the minimum amount of training data needed to achieve satisfactory accuracy. In this paper, we present a reject option that allows us to implement a highly reliable writer identification system trained on a reduced set of data. The experimental results, obtained on two sets of digital images from medieval Bibles, show that by rejecting only a few samples it is possible to strongly reduce the error rate.
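The abstract does not say how the reject option is defined, so the following is only a minimal sketch of one common form of it: a confidence threshold on the classifier's top score, below which a sample is rejected. The function names, the threshold values, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def classify_with_reject(scores: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the highest-scoring writer, or None to reject.

    Assumption: `scores` holds one confidence value per candidate writer,
    and a sample is rejected when its top score is below `threshold`.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None


def error_and_reject_rates(
    score_rows: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Compute (error rate on accepted samples, fraction of samples rejected)."""
    decisions: List[Optional[int]] = [
        classify_with_reject(row, threshold) for row in score_rows
    ]
    accepted = [(d, t) for d, t in zip(decisions, true_labels) if d is not None]
    errors = sum(1 for d, t in accepted if d != t)
    error_rate = errors / len(accepted) if accepted else 0.0
    reject_rate = (len(decisions) - len(accepted)) / len(decisions)
    return error_rate, reject_rate


# Hypothetical scores for five samples and three candidate writers.
toy_scores = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
    [0.20, 0.20, 0.60],
    [0.45, 0.50, 0.05],
]
toy_labels = [0, 1, 1, 2, 0]

for t in (0.0, 0.55):
    err, rej = error_and_reject_rates(toy_scores, toy_labels, t)
    print(f"threshold={t:.2f}  error_rate={err:.2f}  reject_rate={rej:.2f}")
```

In this toy data the two misclassified samples are also the two least confident ones, so raising the threshold removes them from the accepted set. Real data will rarely separate so cleanly; the threshold would normally be tuned on held-out samples to balance the error rate against the number of rejections.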