Journal of Multimedia Processing and Technologies

Print ISSN: 0976-4127
Online ISSN: 0976-4135

Snap & Hear: Comic Book Analyst for Children having Literacy and Visual Barriers

R. B. Dias Yapa, T. L. Kahaduwa Arachchi, V. S. Suriyarachchi, U. D. Abegunasekara, S. Thelijjagoda
Department of Software Engineering, Department of Information Technology, Department of Information Systems Engineering, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka

Abstract: Comic books are popular across the world because they offer a unique experience to readers of every age. This enduring appeal suggests that comic literature can survive in the twenty-first century, even with multi-dimensional movie theatres as competitors. While the biggest global filmmakers are busy adapting comic books into movies, many researchers have been investing their time in digitizing comic stories as they are, hoping to open a new era in the comic world. Most of that work, however, has focused on only one or a few components of the story. This paper is based on research that aims to give everyone the full experience of enjoying comic books, regardless of any visual or literacy barriers they face. The proposed solution is a web application that translates an input image of a comic story into text and delivers it to the user as an audio story. The story is created from extracted components such as characters, objects, speech text, and balloons, taking into account the associations among them, using image processing and deep learning technologies.
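The association step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes that characters and speech balloons have already been detected as bounding boxes with recognized text, and it uses a simple nearest-centre heuristic in place of the deep learning models the paper refers to. All names (`Box`, `Character`, `Balloon`, `nearest_speaker`, `build_story`) are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """Axis-aligned bounding box: top-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float

    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


@dataclass
class Character:
    name: str  # hypothetical label from a character recognizer
    box: Box


@dataclass
class Balloon:
    text: str  # hypothetical output of a text extraction step
    box: Box


def nearest_speaker(balloon, characters):
    """Assumed heuristic: the speaker is the character whose box centre
    is closest (Euclidean distance) to the balloon's box centre."""
    bx, by = balloon.box.center()

    def distance(character):
        cx, cy = character.box.center()
        return hypot(cx - bx, cy - by)

    return min(characters, key=distance)


def build_story(characters, balloons):
    """Order balloons top-to-bottom, then left-to-right, and emit one
    narrated line per balloon for a later text-to-speech stage."""
    ordered = sorted(balloons, key=lambda b: (b.box.y, b.box.x))
    return [
        f'{nearest_speaker(b, characters).name} says: "{b.text}"'
        for b in ordered
    ]
```

The resulting list of lines is the kind of text that could be handed to a text-to-speech component to produce the audio story. A real panel would also need panel ordering and a right-to-left reading order for manga, which this sketch omits.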

Keywords: Comics, Visual and Literacy Barriers, Recognition, Association, Image Processing, Machine Learning, Audio Story

DOI: https://doi.org/10.6025/jmpt/2020/11/1/1-10
