Abstract
Recognizing handwritten Tamil characters with deep learning models such as AlexNet and the Vision Transformer (ViT) involves several stages integrated into a single framework. The process begins with the acquisition and standardization of a diverse corpus of handwritten Tamil characters, followed by preprocessing steps (resizing, grayscale conversion, and pixel normalization) that ensure uniformity and model compatibility. Data augmentation through rotation, scaling, shearing, and flipping is then applied to increase dataset variability and mitigate overfitting. Central to the recognition pipeline is the choice of deep learning model, with AlexNet and ViT as the primary candidates. AlexNet offers a classical convolutional neural network (CNN) architecture well suited to image classification, while ViT applies the transformer architecture to images and is particularly adept at large-scale vision tasks. Training begins with partitioning the dataset into training, validation, and testing subsets to support robust model evaluation. AlexNet is trained using popular deep learning libraries such as PyTorch or TensorFlow, whereas ViT relies on implementations such as Google’s TensorFlow or Hugging Face’s Transformers library. Throughout training, the models are optimized iteratively by fine-tuning hyperparameters and adjusting the architecture to maximize performance while guarding against overfitting. Accuracy, precision, recall, and F1-score serve as the evaluation metrics, with the validation data used to check generalization. Testing on an independent test set confirms the performance assessment before deployment.
Deployment involves integrating the trained models into practical web, mobile, or desktop applications, which requires efficient inference mechanisms and user-friendly interfaces. The real-world efficacy of the recognition system depends on seamless integration, scalability, and a well-optimized user experience.
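As a small illustration of two steps named in the abstract, the sketch below shows grayscale conversion with pixel normalization, and the computation of accuracy, precision, recall, and F1-score. It is a minimal sketch in plain Python and is not the authors' implementation. The function names, the BT.601 luma weights, the division by 255, and the macro averaging over classes are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to luma values (assumed BT.601 weights)."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def normalize(gray_image, max_value=255.0):
    """Scale pixel intensities to the range [0, 1] (assumed 8-bit input)."""
    return [[pixel / max_value for pixel in row] for row in gray_image]


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1-score."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy data: a 1 x 2 RGB image and four labeled character samples.
pixels = normalize(to_grayscale([[(255, 255, 255), (0, 0, 0)]]))
scores = macro_metrics(["அ", "அ", "ஆ", "ஆ"], ["அ", "ஆ", "ஆ", "ஆ"])
print(pixels, scores)
```

In practice these steps would more likely be delegated to an image library and a metrics library. The hand-written version is shown only to make the arithmetic behind the reported metrics explicit.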
Authors
A. Muthulakshmi, M. Ragavan, S. Siddarth
Mepco Schlenk Engineering College, India
Keywords
Tamil Character, Handwritten, Character Recognition, Vision Transformer, AlexNet, Deep Learning, Neural Networks