Academic Journal
Exploring topological data analysis for information extraction: application to recognition of Arabic machine-printed numerals
| Title: | Exploring topological data analysis for information extraction: application to recognition of Arabic machine-printed numerals |
|---|---|
| Authors: | Djamel Bouchaffra, Fayçal Ykhlef |
| Source: | Journal of Engineering and Applied Science, Vol 71, Iss 1, Pp 1-27 (2024) |
| Publisher Information: | Springer Science and Business Media LLC, 2024. |
| Publication Year: | 2024 |
| Subject Terms: | Data Analysis, Arabic machine-printed numeral recognition, Artificial intelligence, Information extraction, Topological data analysis, 02 engineering and technology, Pattern recognition (psychology), Topology, 01 natural sciences, 7. Clean energy, Statistical Topology, Arabic numerals, Betti number, Barcode filtration, Connected Component Labeling Algorithms, Chaincode representation, FOS: Mathematics, 0202 electrical engineering, electronic engineering, information engineering, 0101 mathematics, Topology (electrical circuits), Shape Analysis, Topological Methods, Invariant (physics), Discrete mathematics, 15. Life on land, Dynamic-time warping, Engineering (General). Civil engineering (General), Computer science, Numeral system, Algorithm, Topological Data Analysis in Science and Engineering, Computational Theory and Mathematics, Combinatorics, Mathematical physics, Computer Science, Physical Sciences, Computer Vision and Pattern Recognition, TA1-2040, Mathematics |
| Description: | This manuscript explores the capability of topological data analysis (TDA) based on homology theory (HT, a subfield of algebraic topology) to extract relevant information for the recognition of easily confused Arabic machine-printed numerals. Topological properties may significantly reduce the confusion between numerals such as “1” and “4” when only small data sets are available: digit 1 has no hole, whereas digit 4 has one. Our contribution is an evaluation of how much TDA and its invariant descriptors, such as Betti numbers, help in machine-printed Arabic numeral recognition. The investigation proceeds in two steps: (i) we extract Betti-number invariant features from each numeral image and partition the ten numerals into three clusters with respect to these features; (ii) we classify a test image by assigning it to its corresponding cluster and then map it to a numeral using dynamic time warping as a metric defined in the Freeman chain code space. We compare the proposed approach with major state-of-the-art methods that use TDA in various ways for character recognition. The advantages and limitations of TDA are then discussed on the basis of the numeral recognition results. |
| Document Type: | Article; Other literature type |
| Language: | English |
| ISSN: | 2536-9512; 1110-1903 |
| DOI: | 10.1186/s44147-023-00346-x |
| DOI: | 10.60692/fjpdj-hpg05 |
| DOI: | 10.60692/arpgv-qk982 |
| Access URL: | https://doaj.org/article/4ab11352a1f1403bab2a71401224badb |
| Rights: | CC BY |
| Accession Number: | edsair.doi.dedup.....a75f052e552b4362909fb45b21b68f6f |
| Database: | OpenAIRE |
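
The two steps named in the description (Betti-number features, then dynamic time warping over Freeman chain codes) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary image given as rows of 0/1, counts foreground pieces with 8-connectivity and enclosed background regions with 4-connectivity to obtain the Betti numbers b0 and b1, and takes chain codes as ready-made lists of directions 0..7 (contour tracing is not shown). The function names and the circular direction cost inside the DTW are illustrative assumptions.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, neighbours):
    """Count connected regions of cells equal to `value` with a BFS flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def betti_numbers(image):
    """Return (b0, b1): foreground pieces and holes of a binary image."""
    cols = len(image[0])
    # Pad with background so everything outside the glyph is one region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in image]
              + [[0] * (cols + 2)])
    b0 = count_components(padded, 1, N8)
    b1 = count_components(padded, 0, N4) - 1  # drop the outer background
    return b0, b1


def dtw_distance(a, b):
    """Classic dynamic time warping between two chain codes (values 0..7)."""
    def cost(u, v):
        d = abs(u - v) % 8
        return min(d, 8 - d)  # assumed cost: circular gap between directions

    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for u in a:
        cur = [inf] * (len(b) + 1)
        for j, v in enumerate(b, start=1):
            cur[j] = cost(u, v) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


# Toy glyphs (hypothetical, not taken from the paper's data set).
ONE = [[0, 0, 1, 0, 0]] * 5
FOUR = [
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
EIGHT = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1]]
```

On these toy glyphs the hole count b1 alone separates "1", "4" and "8", which is the kind of coarse grouping a Betti-number feature makes possible; a chain code comparison such as `dtw_distance` would then be needed only among the numerals that share the same Betti numbers.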