
dc.contributor.author: Herath, HML G
dc.contributor.author: Kumara, BTGS
dc.date.accessioned: 2020-12-31T20:53:01Z
dc.date.available: 2020-12-31T20:53:01Z
dc.date.issued: 2020
dc.identifier.uri: http://ir.kdu.ac.lk/handle/345/2965
dc.description.abstract: Document similarity is important in areas that deal with textual data, such as knowledge management, information extraction, natural language processing, and artificial intelligence. Several methods exist for calculating document similarity, but the results of most approaches are unsatisfactory because domain-specific and contextual similarity are not taken into consideration. This paper proposes a domain-based method for calculating document similarity that integrates context, the World Wide Web (WWW), and WordNet similarity. Context is gathered by applying a topic modeling algorithm to generate a domain context; of the many topic modeling algorithms available, Latent Dirichlet Allocation (LDA) is used here. The World Wide Web is used to capture the latest knowledge. The method makes it possible to obtain a similarity value for words in different domains. The quality of the obtained model is evaluated by comparing it against human judgment to ensure the accuracy of the calculation. The results indicate that the calculation is accurate and that the proposed model can overcome the limitations of existing measures. [en_US]
dc.language.iso: en [en_US]
dc.subject: Domain-based Similarity [en_US]
dc.subject: Topic modeling [en_US]
dc.subject: Wordnet Similarity [en_US]
dc.subject: World Wide Web [en_US]
dc.title: Domain-Based Similarity Calculation Method for Calculating Document Similarity [en_US]
dc.type: Article Full Text [en_US]
dc.identifier.journal: 13th International Research Conference General Sir John Kotelawala Defence University [en_US]
dc.identifier.pgnos: 155-162 [en_US]


