    ZENG Zhaolin,YAN Xin,YU Bingbing,ZHOU Feng,XU Guangyi.Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance[J].Journal of Hebei University of Science and Technology,2020,41(6):508-517
    Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance
    Received:July 16, 2020  Revised:October 23, 2020
    DOI:10.7535/hbkd.2020yx06005
    Keywords: natural language processing; Khmer; extractive summarization; deep learning; waterfall method; maximal marginal relevance (MMR)
    Funding: National Natural Science Foundation of China (61562049, 61462055)
    Author Name | Affiliation | E-mail
    ZENG Zhaolin | Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming |
    YAN Xin | Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming | kg_yanxin@sina.com
    YU Bingbing | Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming |
    ZHOU Feng | Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming |
    XU Guangyi | Yunnan Nantian Electronic Information Industry Company Limited, Kunming |
    Abstract:
          To address two weaknesses of traditional multi-document extractive summarization, namely the failure to exploit semantic information across documents and the excessive redundancy in the resulting summaries, a Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance (MMR) is proposed. First, the Khmer documents are fed into a trained deep learning model, which extracts a summary for each single document. Then, the single-document summaries are merged iteratively in a hierarchical, waterfall-like manner, and an improved MMR algorithm selects the summary sentences that form the final multi-document summary. The experimental results show that, compared with other methods, the Khmer multi-document summaries obtained by combining the deep learning method with the hierarchical MMR algorithm improve the R1, R2, R3 and RL values by 4.31%, 5.33%, 6.45% and 4.26%, respectively. The proposed method effectively improves the quality of Khmer multi-document summaries while preserving the diversity and distinctness of the summary sentences.
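    The two-stage idea in the abstract (MMR sentence selection applied repeatedly while single-document summaries are merged) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the "improved" MMR variant or the exact waterfall schedule, so the code below uses textbook MMR with bag-of-words cosine similarity, takes the centroid of the candidate pool as the relevance query, and folds summaries in one at a time. The names `cosine`, `mmr_select`, `waterfall_merge`, and the parameters `k` and `lam` are hypothetical. The single-document summaries are assumed to be given (in the paper they come from a trained deep learning model), and whitespace tokenization is a placeholder, since Khmer text does not separate words with spaces and would need a word segmenter.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, k, lam=0.5):
    """Pick up to k sentences with standard MMR (assumption: not the paper's improved variant)."""
    # Placeholder tokenization; real Khmer input would need word segmentation.
    vecs = [Counter(s.lower().split()) for s in sentences]
    # Assumption: the centroid of the candidate pool stands in for the query.
    query = sum(vecs, Counter())
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vecs[i], query)
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    # Return the chosen sentences in their original pool order.
    return [sentences[i] for i in sorted(selected)]


def waterfall_merge(single_doc_summaries, k, lam=0.5):
    """Fold summaries into a running summary, re-selecting with MMR after each merge."""
    merged = []
    for summary in single_doc_summaries:
        merged = mmr_select(merged + summary, k, lam)
    return merged


if __name__ == "__main__":
    summaries = [
        ["the flood hit the city"],
        ["the flood hit the city", "rescue teams arrived quickly"],
    ]
    print(waterfall_merge(summaries, k=2))
```

    In this sketch `lam` trades relevance against redundancy: values near 1 favor sentences close to the pool centroid, while lower values penalize sentences similar to ones already chosen. Re-running MMR after every merge keeps the running summary at no more than `k` sentences, which is one plausible reading of the iterative merging the abstract describes.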