Book2speech: a low-cost machine for converting textual content into audible feedback via synthesized speech

CANAVARRO, João Victor da Silva Dias

dc.contributor.advisor1	SAMPAIO NETO, Nelson Cruz
dc.contributor.advisor1Lattes	http://lattes.cnpq.br/9756167788721062	pt_BR
dc.contributor.advisor1ORCID	https://orcid.org/0000-0003-0408-4187	pt_BR
dc.creator	CANAVARRO, João Victor da Silva Dias
dc.creator.Lattes	http://lattes.cnpq.br/1994255852377245	pt_BR
dc.date.accessioned	2025-03-07T15:11:24Z
dc.date.available	2025-03-07T15:11:24Z
dc.date.issued	2021
dc.description.abstract	Despite great advances in the digital dissemination of information, printed media remains an important and frequently used means of conveying knowledge. For the visually impaired, however, physically distributed content still poses a major barrier: alternative reading methods such as Braille are often unavailable, and literacy in these writing systems is lacking among the blind population. To overcome these difficulties, Assistive Technology solutions have been proposed in both academic and commercial settings, aiming to increase the independence, quality of life, and inclusion of this portion of the population. The present work seeks to develop a standalone reading machine that recognizes the textual content of books and similar printed material and converts it into audible feedback via synthesized speech. Based on the Raspberry Pi 3, an embedded microcomputer, and equipped with the Pi NoIR (InfraRed) camera module, Book2Speech performs image acquisition, Optical Character Recognition, and Text-to-Speech, using image and text processing modules to improve the representativeness of the synthesized voice, which is played through an external speaker. Because no document-image datasets in Brazilian Portuguese were available, the ICDAR2015 dataset, composed mostly of English text, was used to evaluate Book2Speech's performance. The metrics considered were processing time and error rate, the latter calculated from the difference between the text recognized by the machine and the reference text. In the results, the dewarping method using the L-BFGS-B algorithm as the optimizer obtained the lowest word error rate (12.32%), while the average threshold pipeline followed by dewarping with L-BFGS-B obtained the lowest character error rate (13.27%). The spelling correction methods evaluated, on the other hand, did not lead to good results and often increased Book2Speech's error rate. Finally, the system developed, along with the tools and resources used, is freely available.	pt_BR
dc.identifier.citation	CANAVARRO, João Victor da Silva Dias. Book2speech: a low-cost machine for converting textual content into audible feedback via synthesized speech. Advisor: Roberto Samarone dos Santos Araújo. 2021. 63 f. Undergraduate thesis (Bachelor's degree in Computer Science) – Faculdade de Computação, Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, 2021. Available at:. Accessed on:.	pt_BR
dc.identifier.uri	https://bdm.ufpa.br/jspui/handle/prefix/7777
dc.rights	Open Access	pt_BR
dc.source	1 CD-ROM	pt_BR
dc.subject	Assistive technology	pt_BR
dc.subject	Image processing	pt_BR
dc.subject	Spelling correction	pt_BR
dc.subject	Optical character recognition	pt_BR
dc.subject	Text-to-speech	pt_BR
dc.subject	Embedded systems	pt_BR
dc.subject.cnpq	CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO	pt_BR
dc.title	Book2speech: a low-cost machine for converting textual content into audible feedback via synthesized speech	pt_BR
dc.type	Coursework - Undergraduate - Monograph	pt_BR

File(s)

Original Bundle

Name: TCC_Boo2speechLowCost.pdf
Size: 9.64 MB
Format: Adobe Portable Document Format

Appears in Collection

Faculdade de Computação - FACOMP/ICEN