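Wang Wentao, Mu Xiaofeng, Wang Lingxia. An Approach for Chinese Word Segmentation Based on Feature Embedding Neural Network[J]. Journal of South-Central University for Nationalities (Natural Science Edition), 2017, (1): 102-106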
An Approach for Chinese Word Segmentation Based on Feature Embedding Neural Network
  
Keywords: Chinese word segmentation; neural network; feature embedding
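Funding: Education Reform Project of the State Ethnic Affairs Commission (15013); Graduate Innovation Fund of South-Central University for Nationalities (2016sycxjj199)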
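Authors and affiliation:
Wang Wentao, Mu Xiaofeng, Wang Lingxia, College of Computer Science, South-Central University for Nationalities, Wuhan 430074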
Abstract:
      Traditional feature-based Chinese word segmentation models have too many parameters relative to the available training data, so feature weights are difficult to estimate accurately. To address this problem, this paper proposes a neural network approach based on feature embedding. The embedding method transforms features into low-dimensional real-valued vectors, which effectively reduces the feature dimensionality. In addition, to improve the model's performance, a linearly decaying learning rate is introduced, and regularization is studied as a way to improve the model's generalization ability. Experimental results show that the proposed model improves the efficiency of solving the Chinese word segmentation problem.
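The three ingredients named in the abstract (feature embedding, a linearly decaying learning rate, and regularization) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the decay schedule, the L2 form of the regularizer, the feature template strings such as `C0=中`, and all names (`linear_decay`, `FeatureEmbedding`, `l2_penalty`) are assumptions chosen for illustration.

```python
import random


def linear_decay(lr0, epoch, total_epochs, lr_min=0.0):
    """Learning rate that falls linearly from lr0 to lr_min.

    Assumed schedule; the paper's exact decay rule is not given in the abstract.
    """
    frac = min(max(epoch / total_epochs, 0.0), 1.0)
    return lr0 + (lr_min - lr0) * frac


class FeatureEmbedding:
    """Maps sparse feature strings to dense low-dimensional vectors."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}  # feature string -> list of floats

    def lookup(self, feature):
        # Create a small random vector the first time a feature is seen.
        if feature not in self.table:
            self.table[feature] = [
                self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        return self.table[feature]

    def embed(self, features):
        # Concatenate the vectors of all active features into one input
        # vector for the network.
        vec = []
        for f in features:
            vec.extend(self.lookup(f))
        return vec


def l2_penalty(params, lam):
    """L2 regularization term (assumed form) added to the training loss."""
    return 0.5 * lam * sum(w * w for w in params)


# Hypothetical feature templates for one character position.
emb = FeatureEmbedding(dim=4)
x = emb.embed(["C0=中", "C-1C0=的中"])
print(len(x))                    # two features, four dimensions each
print(linear_decay(0.1, 5, 10))  # learning rate halfway through training
```

With this layout the number of parameters grows with the embedding dimension rather than with one weight per feature and label pair, which is the dimensionality reduction the abstract describes.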