Building a Proper-Name Extraction System for Vietnamese Text Using Statistical Learning Methods


Nguyn Th 

i h
Lu : 1 01 10
ng dn: TS Nguy
o v: 2007


Abstract: M t ng h 
m cc thng
ng thng h 
n ting Vi ng
mt h n ting Vit s d CRF++ ca
t s kt qa thc nghic

Keywords: , Thu, c, X 


Content
INTRODUCTION

[…]

 th ting h tr
n bn ting Vit.  m cc th thu thp d liu,
d  p vt ra cho lu
n ting Vit c ng d 


tional
Random Fields (CRF- Laferty, 2001) thu perceptron  d liu
dng chui (M.Collins, 2002). 














 ,  















 hun luyn. 


















.
Luc t chu:
 Chương 1 Tổng quan
  ng h 
 cc sc tin ca h n
ca hng ca h a chn p trong tng
ng hp c thng thi trong pha lu cc v 
n ving h th

dc th
 Chương 2 Các kiến thức nền tảng về học thống kê
 cn mt s c th
perceptron.              m ca tng
 ng s tp trung ving h  chn
ting Vi.
 Chương 3 Xây dựng một hệ trích chọn tên riêng sử dụng học thống kê
 ng mt h n
ting Vit s dg c CRF++ ct s kt qu thc nghim ca
c.
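
As a rough illustration of the kind of input a CRF++-based tagger consumes, the sketch below writes labeled sentences in the column layout CRF++ reads: one token per line, whitespace-separated columns with the label last, and a blank line between sentences. The BIO label names, the capitalization feature, the underscore-joined tokens, and the function name `to_crfpp_rows` are assumptions made for this example; the thesis's actual feature set and tag set are not recoverable from this summary.

```python
def to_crfpp_rows(sentences):
    """Render labeled sentences as CRF++-style training text.

    sentences: list of sentences, each a list of (token, label) pairs.
    Tokens must not contain whitespace, since columns are whitespace-separated.
    """
    lines = []
    for sentence in sentences:
        for token, label in sentence:
            # Hypothetical orthographic feature: does the token start uppercase?
            cap = "CAP" if token[:1].isupper() else "LOW"
            # Columns: token, feature, label (the label is the last column).
            lines.append(f"{token}\t{cap}\t{label}")
        lines.append("")  # a blank line marks the end of a sentence
    return "\n".join(lines) + "\n"


# Illustrative sentence with assumed BIO labels (B-LOC = start of a location name).
example = [[("Hà_Nội", "B-LOC"), ("là", "O"), ("thủ_đô", "O")]]
print(to_crfpp_rows(example), end="")
```

A file in this layout, together with a feature template file, is what the CRF++ `crf_learn` command takes as training input; `crf_test` then labels unseen files in the same layout.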

References
[1]. Douglas E. Appelt, D. J. Israel. Introduction to Information Extraction Technology. 1999.
[2]. A. Berger. The Improved Iterative Scaling Algorithm: A Gentle Introduction. School of Computer Science, Carnegie Mellon University. 1999.
[3]. M. Collins. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. 2002.
[4]. J. Cowie, W. Lehnert. Information Extraction. Paper. 1996.
[5]. R. Dugad, U. B. Desai. A Tutorial on Hidden Markov Models. Technical Report No. SPANN-96.1, Indian Institute of Technology. 1996.
[6]. D. Freitag, S. Khadivi. A Sequence Alignment Model Based on the Averaged Perceptron. 2006.
[7]. Freund & Schapire. Large Margin Classification Using the Perceptron Algorithm. Machine Learning 37(3), 277-296, 1999.
[8]. J. Lafferty, A. McCallum, F. Pereira. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
[9]. Dong C. Liu, Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45 (1989), pp. 503-528.
[10]. Walter F. Mascarenhas. The BFGS method with exact line searches fails for non-convex objective functions. Published May 7, 2003.
[11]. A. McCallum, K. Rohanimanesh, C. Sutton. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. 2004.
[12]. A. McCallum, C. Sutton. An Introduction to Conditional Random Fields for Relational Learning. 2005.
[13]. A. McCallum, D. Freitag, F. Pereira. Maximum entropy Markov models for information extraction and segmentation. In Proc. International Conference on Machine Learning, 2000, pages 591-598.
[14]. A. McCallum, W. Li. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. 2003.
[15]. A. McCallum. Efficiently Inducing Features of Conditional Random Fields. 2003.
[16]. A. B. Poritz. Hidden Markov Models: A Guided Tour. IEEE, 1988.
[17]. L. R. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, Vol. 77, No. 2, Feb. 1989.
[18]. A. Ratnaparkhi. A maximum entropy model for part-of-speech tagging. In Proc. Empirical Methods in Natural Language Processing, 1996.
[19]. B. Roark, M. Saraclar, M. Collins, M. Johnson. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm. 2004.
[20]. Sunita Sarawagi, William W. Cohen. Semi-Markov Conditional Random Fields for Information Extraction. 2004.
[21]. H. Wallach. Efficient Training of Conditional Random Fields. University of Edinburgh, 2002.
[22]. Y. Zhang, S. Clark. Chinese Segmentation with a Word-Based Perceptron Algorithm. 2006.
[23]. n dn Ting Vit. 1999.
[24]. Nguyn Cn bii thc th tronn Ting Vit nhm h tr
Web ng ng thc th. 2005.
[25]. Nguyn C, 

y. Named Entity

Recognition in Vietnamese Free-Text and Web Documents Using Conditional
Random Fields. 2005
[26]. Tri Tran Q., Thao Pham T.X., Hung Ngo Q., Dien Dinh and Niegl Collier. Named
Entitiy Recognition in Vietnamese Document. 2007.

