Korean Title Named Entity Recognition and Dictionary Construction: Books, Movies, and TV Programs

Category: free Korean-language papers | Editor: 金一 (teaching assistant) | Updated: 2017-04-27

Named entity recognition is used to improve the performance of information retrieval, question answering, machine translation, and similar systems. Its targets are usually PLOs (persons, locations, and organizations). These are typically proper nouns or unregistered words, and traditional named entity recognizers use these characteristics to find named entity candidates. The titles of books, movies, and TV programs have different characteristics from PLO entities: a title may consist of multiple phrases, form a whole sentence, or contain special characters. This makes it difficult to find the boundaries of named entity candidates.
In this work, we propose a method to extract title named entities from news articles and automatically build a named entity dictionary of titles. For candidate identification, word phrases enclosed in special symbols within a sentence are first extracted and then verified by an SVM that uses feature words and their distances. For classification of the extracted title candidates, an SVM is used with the mutual information of word contexts.
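The candidate identification step can be illustrated with a minimal Python sketch. It covers only the extraction of symbol-enclosed phrases and the computation of feature-word distances; the SVM verification itself is not shown. The symbol pairs, the feature-word list, whitespace tokenization, and the function names `extract_candidates` and `distance_features` are all assumptions made for illustration and are not taken from the original work.

```python
import re

# Assumed symbol pairs that may enclose a title in news text (illustrative only).
SYMBOL_PAIRS = [("<", ">"), ("《", "》"), ("「", "」"), ("『", "』"),
                ("'", "'"), ('"', '"'), ("‘", "’"), ("“", "”")]

# Assumed feature words hinting at a nearby title (movie, drama, novel, book, broadcast).
FEATURE_WORDS = {"영화", "드라마", "소설", "책", "방송"}


def extract_candidates(sentence):
    """Return (text, start, end) for every phrase enclosed by a symbol pair.

    start/end are character offsets of the whole enclosed span,
    including the enclosing symbols.
    """
    candidates = []
    for open_sym, close_sym in SYMBOL_PAIRS:
        o, c = re.escape(open_sym), re.escape(close_sym)
        # Opening symbol, one or more characters that are neither symbol, closing symbol.
        pattern = o + "([^" + o + c + "]+)" + c
        for m in re.finditer(pattern, sentence):
            candidates.append((m.group(1), m.start(), m.end()))
    return sorted(candidates, key=lambda cand: cand[1])


def distance_features(sentence, start, end):
    """Map each feature word in the sentence to its signed token distance.

    Negative values are to the left of the candidate, positive to the right;
    an adjacent token has distance 1. Substring matching is used so that a
    feature word with an attached particle (e.g. "영화를") still matches.
    """
    left_tokens = sentence[:start].split()
    right_tokens = sentence[end:].split()
    features = {}
    for dist, token in enumerate(reversed(left_tokens), start=1):
        for word in FEATURE_WORDS:
            if word in token and word not in features:
                features[word] = -dist
    for dist, token in enumerate(right_tokens, start=1):
        for word in FEATURE_WORDS:
            if word in token and (word not in features or dist < abs(features[word])):
                features[word] = dist
    return features


if __name__ == "__main__":
    # Made-up example sentence, not from the evaluation data.
    sentence = "감독은 새 영화 <봄날> 을 다음 달 개봉한다고 밝혔다."
    for text, start, end in extract_candidates(sentence):
        print(text, distance_features(sentence, start, end))
```

In a full pipeline, each feature dictionary would be converted to a vector and passed to an SVM that accepts or rejects the candidate as a title. How the original work encodes these features is not stated in the abstract.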
The experiments used 19K news articles, with 90% as training data and 10% as test data. Evaluation was performed on 200 sentences randomly selected from the test data. Evaluated as separate modules, title identification achieves an F1-score of 81.17% and title classification achieves 92.92%. The integrated module achieves an F1-score of 81.09%. Dictionary construction, measured after removing duplicate extracted titles, achieves an F1-score of 71.01%.
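For readers unfamiliar with the metric, the following is a small Python sketch of how an F1-score can be computed over deduplicated titles, as in the dictionary construction evaluation. It assumes exact string matching between extracted and reference titles, which the abstract does not specify. The function name `precision_recall_f1` and the toy data are hypothetical and unrelated to the reported figures.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall and F1 over unique items.

    Both inputs are converted to sets, so duplicate extracted titles
    are counted once (assumed exact string match).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data: "B" is extracted twice but counts once after deduplication.
    extracted = ["A", "B", "B", "C"]
    reference = ["A", "B", "D"]
    print(precision_recall_f1(extracted, reference))
```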
