Part-of-speech (POS) tagging is the process of classifying words into their parts of speech and labeling them with categories such as noun, verb, adjective, and adverb. These categories are defined in a tagset, which can then be used to assign a tag to each word automatically. Nouns form the first category of words generally found in a sentence, and verbs come next. A verb describes the occurrence of an event or an action. In Tamil, a verb form is inflected by suffixes based on person, number, tense, and voice. These suffixes are identified by a reverse splitting method, and the word is then tagged as a verb. In this paper, tagging of Tamil words is carried out specifically for verbs. The implementation uses a rule-based suffix stripping method in which suffixes are checked against the grammatical rules of Tamil and matching words are categorized as VERB. The input consists of Tamil words in Unicode format, which avoids the transliteration step that most work on text mining of Tamil documents relies on. Applying Tamil grammatical rules enriches the identification and tagging of words during morphological analysis, because morphophonemic rules are taken into account. This is an advantage when tagging words in Tamil documents.
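The suffix stripping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `tag_verb`, the two small suffix tables, and the example words are assumptions chosen for illustration, and a real rule set would be much larger and would handle morphophonemic (sandhi) changes. The sketch works directly on Unicode Tamil text, reads the word from the end, strips a person ending, and then looks for a tense marker before tagging the word as VERB.

```python
import unicodedata

# Illustrative person/number/gender endings as they appear in written
# Unicode Tamil (vowel sign + consonant + pulli). Assumed sample, not a full rule set.
PERSON_SUFFIXES = {
    "ேன்": "1st person singular",
    "ோம்": "1st person plural",
    "ாய்": "2nd person singular",
    "ான்": "3rd person singular masculine",
    "ாள்": "3rd person singular feminine",
    "ார்கள்": "3rd person plural",
}

# Illustrative tense markers that precede the person ending (assumed sample).
TENSE_MARKERS = {
    "க்கிற": "present",
    "கிற": "present",
    "த்த": "past",
    "ந்த": "past",
    "ப்ப": "future",
    "வ": "future",
}


def tag_verb(word):
    """Return a dict describing the word if it looks like a finite verb, else None."""
    # Normalize so that two-part vowel signs have a single canonical encoding.
    word = unicodedata.normalize("NFC", word)
    # Try longer suffixes first so a short suffix never shadows a longer one.
    for suffix in sorted(PERSON_SUFFIXES, key=len, reverse=True):
        if not word.endswith(suffix):
            continue
        remainder = word[: -len(suffix)]
        for marker in sorted(TENSE_MARKERS, key=len, reverse=True):
            # Require a non-empty stem in front of the tense marker.
            if remainder.endswith(marker) and len(remainder) > len(marker):
                return {
                    "tag": "VERB",
                    "stem": remainder[: -len(marker)],
                    "tense": TENSE_MARKERS[marker],
                    "person": PERSON_SUFFIXES[suffix],
                }
    return None


if __name__ == "__main__":
    for w in ["படித்தேன்", "படிக்கிறேன்", "படிப்பேன்", "மரம்"]:
        print(w, tag_verb(w))
```

Because the tables are matched against the raw Unicode code points, no transliteration into Latin script is needed. The stem returned by this sketch is only the leftover string; recovering the dictionary root in general requires the sandhi rules that the sketch omits.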

M Mercy Evangeline, K Shyamala
Dr. Ambedkar Government Arts College, India

Keywords : Morphological Analyzer, Tagging, Verb, Tamil Language, Classification, Identification
Published In : ICTACT Journal on Soft Computing (Volume: 11, Issue: 1)
Date of Publication : October 2020

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.