Abstract
Ontologies play a vital role in organizing and structuring knowledge across domains, enabling effective knowledge management and sharing. Domain-specific ontologies, such as the ONTO-TDM ontology for teaching domain modeling, are essential for providing a comprehensive and standardized representation of the knowledge within a discipline. To maximize the usefulness and relevance of such ontologies, however, it is crucial to automate their population with domain-specific information, which reduces manual work and ensures scalability. This paper presents a novel method for ontology population that extracts and integrates relevant information from diverse sources. The method combines the TextRank algorithm with Word2Vec to enhance keyword extraction, capturing both the textual importance and the semantic meaning of candidate terms. The extracted keywords are then annotated and used to train a machine learning classifier, which supports the integration of new instances into the ontology. Experiments show that the proposed method achieves a precision of 63.33%, a recall of 61.29%, and an F1-score of 62.28%, significantly improving keyword extraction and ontology population accuracy compared to existing methods. These results validate the method’s effectiveness in semi-automatically extracting relevant instances from diverse data sources, improving the efficiency and accuracy of ontology population and advancing automated knowledge management in domain-specific contexts.
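
To make the keyword extraction step concrete, the following is a minimal sketch of one plausible way to combine TextRank with word embeddings. The abstract does not state how the two are combined, so the scheme shown here (scaling co-occurrence edges by the cosine similarity of the word vectors) is an assumption, not the authors' implementation. The names `cosine` and `rank_keywords` are hypothetical, and the small hand-written vectors stand in for embeddings that would normally come from a trained Word2Vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_keywords(tokens, vectors, window=2, damping=0.85, iterations=50):
    """Rank words with a weighted TextRank over a co-occurrence graph.

    Assumption: an edge links two words that co-occur within `window`
    tokens, and its weight is their embedding cosine similarity
    (negative similarities are clipped to 0).
    """
    words = sorted({t for t in tokens if t in vectors})
    weights = {w: {} for w in words}

    # Build the undirected, similarity-weighted co-occurrence graph.
    for i, w1 in enumerate(tokens):
        for w2 in tokens[i + 1:i + window]:
            if w1 == w2 or w1 not in vectors or w2 not in vectors:
                continue
            sim = max(cosine(vectors[w1], vectors[w2]), 0.0)
            weights[w1][w2] = weights[w1].get(w2, 0.0) + sim
            weights[w2][w1] = weights[w2].get(w1, 0.0) + sim

    # Iterate the weighted TextRank update for a fixed number of rounds.
    scores = {w: 1.0 for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            rank = 0.0
            for neighbor, weight in weights[w].items():
                out_weight = sum(weights[neighbor].values())
                if out_weight > 0:
                    rank += weight / out_weight * scores[neighbor]
            updated[w] = (1 - damping) + damping * rank
        scores = updated

    return sorted(words, key=lambda w: scores[w], reverse=True)


# Toy input: hand-written 2-d vectors stand in for Word2Vec embeddings.
toy_tokens = ["ontology", "class", "ontology", "instance", "ontology", "property"]
toy_vectors = {
    "ontology": [1.0, 0.5],
    "class": [0.9, 0.6],
    "instance": [0.8, 0.7],
    "property": [0.7, 0.9],
}
ranked = rank_keywords(toy_tokens, toy_vectors)
print(ranked)
```

In this toy input every co-occurrence involves "ontology", so it is ranked first. In a realistic pipeline the top-ranked words would be the candidate keywords passed on to annotation and classifier training.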