Index module: first, the design approach to Chinese word segmentation is discussed and a segmentation algorithm is selected.
Initially, it was a Chinese word segmentation component built around the open-source project Lucene, combining dictionary-based segmentation with grammatical analysis.
The major work includes: (1) Large-scale Chinese information processing is a basic step in building a Chinese search engine; to support it, this paper proposes an improved Chinese word segmentation algorithm.
To extend the segmentation dictionary and improve segmentation accuracy, this paper presents a high-frequency Chinese word extraction algorithm based on information entropy; its results can be used to identify out-of-vocabulary words and enlarge the existing dictionary.
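The paper's exact formulation is not reproduced in this excerpt. As a hedged illustration only, the sketch below scores candidate character strings by frequency and by the entropy of their left and right neighboring characters (a common branching-entropy criterion for word discovery); high-frequency strings with diverse contexts are kept as likely words. All function names and thresholds are assumptions.

    import math
    from collections import Counter, defaultdict

    def branching_entropy(neighbors):
        """Shannon entropy of a neighbor-character distribution."""
        if not neighbors:
            return 0.0
        total = sum(neighbors.values())
        return -sum((c / total) * math.log2(c / total)
                    for c in neighbors.values())

    def extract_candidates(text, max_len=4, min_freq=5, min_entropy=1.0):
        """Return frequent substrings whose contexts are diverse enough."""
        freq = Counter()
        left, right = defaultdict(Counter), defaultdict(Counter)
        for i in range(len(text)):
            for n in range(2, max_len + 1):      # candidate lengths 2..max_len
                if i + n > len(text):
                    break
                w = text[i:i + n]
                freq[w] += 1
                if i > 0:
                    left[w][text[i - 1]] += 1
                if i + n < len(text):
                    right[w][text[i + n]] += 1
        return [w for w, f in freq.items()
                if f >= min_freq
                and branching_entropy(left[w]) >= min_entropy
                and branching_entropy(right[w]) >= min_entropy]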
Experimental results show that, under the same conditions, the improved maximum match algorithm based on a two-character-word detection bitmap segments text faster than the original algorithm.
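The cited improvement is described only at a high level. A minimal sketch of forward maximum matching with a two-character pre-check might look as follows: the bitmap records which leading character pairs begin any dictionary word, so most failing candidates are rejected before any dictionary probe. Here a Python set stands in for the bitmap; the dictionary format and window size are assumptions.

    def build_bigram_set(dictionary):
        """Record the first two characters of every multi-character entry
        (a set stands in for the bitmap of this sketch)."""
        return {w[:2] for w in dictionary if len(w) >= 2}

    def forward_max_match(text, dictionary, max_len=6):
        """Forward maximum matching with a bigram pre-check."""
        bigrams = build_bigram_set(dictionary)   # precomputed once in practice
        out, i = [], 0
        while i < len(text):
            word = text[i]                       # fall back to one character
            if text[i:i + 2] in bigrams:         # cheap filter before probing
                for n in range(min(max_len, len(text) - i), 1, -1):
                    if text[i:i + n] in dictionary:
                        word = text[i:i + n]
                        break
            out.append(word)
            i += len(word)
        return out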
A fast algorithm for generating the Chinese word segmentation directed graph is given.
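The fast generation method itself is not detailed in the excerpt. For illustration, here is a straightforward sketch of the underlying structure: each character position is a node, and an arc (i, j) exists whenever text[i:j] is a dictionary word. The names are hypothetical.

    def build_segmentation_dag(text, dictionary, max_len=6):
        """Map each start index to the end indices of dictionary words there."""
        dag = {}
        for i in range(len(text)):
            ends = [i + 1]                       # single character is always an arc
            for n in range(2, min(max_len, len(text) - i) + 1):
                if text[i:i + n] in dictionary:
                    ends.append(i + n)
            dag[i] = ends
        return dag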
This paper first introduces Chinese text segmentation, then designs a bidirectional matching segmentation algorithm on top of common segmentation algorithms, effectively reducing the impact of ambiguous words on correct segmentation.
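As an illustrative sketch only: bidirectional matching typically runs forward and backward maximum matching and keeps the result with fewer words, breaking ties by preferring fewer single characters. This reuses forward_max_match from the sketch above; the backward pass and the tie-break rule are assumptions, not necessarily the paper's design.

    def backward_max_match(text, dictionary, max_len=6):
        """Backward maximum matching: scan from the end of the text."""
        out, j = [], len(text)
        while j > 0:
            word = text[j - 1]
            for n in range(min(max_len, j), 1, -1):
                if text[j - n:j] in dictionary:
                    word = text[j - n:j]
                    break
            out.append(word)
            j -= len(word)
        return out[::-1]

    def bidirectional_match(text, dictionary, max_len=6):
        """Prefer the segmentation with fewer words, then fewer singletons."""
        fwd = forward_max_match(text, dictionary, max_len)
        bwd = backward_max_match(text, dictionary, max_len)
        key = lambda seg: (len(seg), sum(len(w) == 1 for w in seg))
        return min(fwd, bwd, key=key)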
This paper analyzes several existing Chinese word segmentation methods and proposes a keyword extraction algorithm based on a weight formula.
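The weight formula itself is not given in this excerpt. Purely as a placeholder, the sketch below ranks a document's segmented terms by a TF-IDF-style weight; this scoring function is an assumption, not the paper's formula.

    import math
    from collections import Counter

    def extract_keywords(doc_terms, corpus_doc_freq, n_docs, top_k=10):
        """Rank one document's terms by tf * idf (placeholder weight)."""
        tf = Counter(doc_terms)
        def weight(term):
            idf = math.log(n_docs / (1 + corpus_doc_freq.get(term, 0)))
            return tf[term] * idf
        return sorted(tf, key=weight, reverse=True)[:top_k]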
Using a prefix tree (trie) and dynamic programming, the algorithm speeds up Chinese word segmentation while maintaining relatively high accuracy.
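For illustration, a minimal sketch of that combination, assuming a simple per-word cost: the trie enumerates all dictionary words starting at each position, and dynamic programming picks the minimum-cost path over the resulting lattice. The cost model and trie layout are assumptions.

    def build_trie(dictionary):
        """Nested-dict trie; '$' marks the end of a word."""
        root = {}
        for w in dictionary:
            node = root
            for ch in w:
                node = node.setdefault(ch, {})
            node['$'] = True
        return root

    def segment(text, trie, unk_cost=10.0, word_cost=1.0):
        """DP over word ends: best[i] = min cost to segment text[:i]."""
        n = len(text)
        best = [0.0] + [float('inf')] * n
        back = [0] * (n + 1)
        for i in range(n):
            if best[i] == float('inf'):
                continue
            # unknown single character as a fallback arc
            if best[i] + unk_cost < best[i + 1]:
                best[i + 1], back[i + 1] = best[i] + unk_cost, i
            node, j = trie, i
            while j < n and text[j] in node:     # walk the trie from i
                node = node[text[j]]
                j += 1
                if '$' in node and best[i] + word_cost < best[j]:
                    best[j], back[j] = best[i] + word_cost, i
        out, i = [], n
        while i > 0:                             # recover the best path
            out.append(text[back[i]:i])
            i = back[i]
        return out[::-1]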
A two-pass word segmentation algorithm based on paragraph titles and first sentences is put forward.
Furthermore, the omni-segmentation (full segmentation) approach is studied, and a parallel search model for the omni-segmentation algorithm is given.
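The parallel search model is only named in the excerpt. As a hedged sketch: omni-segmentation enumerates every dictionary-consistent path through the segmentation DAG built above, and the distinct candidate words can then be looked up in the index concurrently, e.g. via a thread pool. The partitioning scheme and the lookup callback are assumptions.

    from concurrent.futures import ThreadPoolExecutor

    def all_segmentations(text, dag, i=0):
        """Enumerate every path through the DAG (omni-segmentation)."""
        if i == len(text):
            yield []
            return
        for j in dag[i]:
            for rest in all_segmentations(text, dag, j):
                yield [text[i:j]] + rest

    def omni_search(text, dag, lookup, max_workers=4):
        """Query the index for every distinct word the DAG produces;
        lookup is a hypothetical word -> postings callback."""
        words = {text[i:j] for i in dag for j in dag[i]}
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return dict(zip(words, pool.map(lookup, words)))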