
Medical Documents Classification Based on the Domain Ontology MeSH (1207.0446v1)

Published 2 Jul 2012 in cs.IR

Abstract: This paper addresses the problem of classifying web documents using a domain ontology. Our goal is to improve the classification of medical documents by exploiting the MeSH (Medical Subject Headings) thesaurus, which allows us to generate a new, concept-based representation. The approach was tested with two well-known data mining algorithms, C4.5 and KNN, and compared with the usual stem-based representation. Enriching the document vectors with concepts and hypernyms drawn from the domain ontology significantly strengthened the representation, which is essential for good classification. Experiments on the benchmark biomedical collection Ohsumed confirm the value of the approach: ontology-based classification improves performance by 30% over the classical stem-based representation.
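
The enrichment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the term-to-concept and hypernym tables below are a tiny, hand-written, simplified stand-in for MeSH, and the names `TERM_TO_CONCEPT`, `HYPERNYM`, `enrich`, `cosine`, and `knn_predict` are hypothetical. The sketch adds concept and hypernym features to a bag-of-terms vector and classifies with a plain cosine-similarity nearest-neighbour vote.

```python
from collections import Counter
import math

# Toy, hand-written stand-in for a MeSH lookup (assumption, heavily simplified).
TERM_TO_CONCEPT = {
    "infarction": "Myocardial Infarction",
    "angina": "Angina Pectoris",
    "asthma": "Asthma",
    "bronchitis": "Bronchitis",
}
# Toy concept -> broader concept (hypernym) table, also an assumption.
HYPERNYM = {
    "Myocardial Infarction": "Heart Diseases",
    "Angina Pectoris": "Heart Diseases",
    "Asthma": "Respiratory Tract Diseases",
    "Bronchitis": "Respiratory Tract Diseases",
}


def enrich(tokens):
    """Return a term-count vector extended with concept and hypernym features."""
    vec = Counter(tokens)
    for tok in tokens:
        concept = TERM_TO_CONCEPT.get(tok)
        if concept is None:
            continue
        vec["C:" + concept] += 1          # the matched concept
        parent = HYPERNYM.get(concept)
        if parent is not None:
            vec["C:" + parent] += 1       # its broader concept
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query_tokens, k=1, use_ontology=True):
    """Majority vote among the k training documents most similar to the query."""
    represent = enrich if use_ontology else Counter
    query = represent(query_tokens)
    scored = sorted(
        ((cosine(represent(tokens), query), label) for tokens, label in train),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


train = [
    (["patient", "infarction"], "cardio"),
    (["patient", "asthma"], "resp"),
]
print(knn_predict(train, ["angina", "chest"], k=1))
```

In this toy setup the query shares no surface terms with either training document, so a term-only representation gives zero similarity to both. The shared hypernym feature is what links "angina" to the document mentioning "infarction".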

Citations (9)
