Proceedings of the 12th Conference of the European Chapter of the ACL, pages 835–842,
Athens, Greece, 30 March – 3 April 2009.
© 2009 Association for Computational Linguistics
Growing Finely-Discriminating Taxonomies from Seeds
of Varying Quality and Size
Tony Veale




Guofu Li




Yanfen Hao




Abstract


         
  
    

 !  "     
     



#
$"%%
!
1 Introduction
&          

'

!&

           
(')*++,-
./(%01,,*-
"%(.et al.*++,-!2
  
$            



      !  &    


3(!!4
51,,6-!
)     
 
     3!    %'0    
        78  #  3
          
   


      (!!  59 (*+:6-
52;- #  


!

#
!0
"%
2

  
!0


      

    (5  1,,<-!  4  "%  

              
#!"


=




!


     grown
           seeds!
&
      
7>?%/%8 


 !""%
    #      2
&7
8

 sharp!;
(-
@"%

 X-ness!  /      


!
0
1
A      

!<
=

#

!>
B
#
!&

6!
2 Related Work
 #
 


      (>  
2*+::-!;5
(*++1-

#
    
   4
(*+++- !

 3(
-
!  & KnowItAll     2 et al.
(1,,<-            
5(!!7%0%0*%01C8-

 # 
!      "  (1,,D-  
5 
# 

   =
          7  E    
%/%87%/%E8
(-(-
(-!
?%(1,,<-
##

!&
"%
=

!>
 
 
        !  F et al.
(1,,B-
   
!  >        

      =  


!F et al. (1,,:-
 
    5  (*++1-      
!; 
7%/%

%/%



 E8      

E
 %/%

 
%/%
!

 

 !&
$
 reckless
 
(E-(%/%

-
=

!
&
F et al!(1,,:-!"
 
          #  
7>?

 %/%


 E87>?

 E
%/%

8!>
        

(>?

-      #      !
&
 
      
"%"!Fet al!(1,,:-

3     states countries (
-singersfish(
-

food
 sweet (G51,,D-!4
          
 "%  

"%"%
!
3 Seeds for Taxonomic Growth

> 
&
3
HI


3
0

J



0


    
3
=        



    
3
0

!  &      
      
Icola, carbonated, drinkJ!;
           cola

(treatrefreshment-#
7E8
#7
       E8!       
         
#
     
 #!  "        

$"
%%!
3.1 WordNet
& "%
           

!;"%
 {feline, felid}      
{true_cat, cat}  {big_cat, cat}    
            

              !
5 
"%
6,K  Xess 
"%(ess,
ess, ?ess!-
 female       !
% 
"% 


  !  >        
              
⟨lioness, female, lion⟩, ⟨espresso, strong, coffee⟩, ⟨messiah, awaited, king⟩, ⟨messiah, expected, deliverer⟩.
3.2 ConceptNet
        "%
 
        !  
%('1,,<-
 
"""
!
 
%

     
(-
 !'
>
%  espresso 
strong coffee("%-
bagelJewish word(usemen-
tion-!'expressionism
artistic style ("%
 artistic movement- explosion 
suicide attack(-!
%


"%
!  "          %
A,,,,>

(78-
   (!!7 8-
(!!78-
"%!
⟨Wyoming, great, state⟩, ⟨wreck, serious, accident⟩, ⟨wolf, wild, animal⟩.
3.3 Web-derived Stereotypes
G5(1,,D-
 #
 
7>?%/%8
!&



#!!
#
 
!5
*BL 
(!!787
 8!-

          

6,,,
  1,,,  3  !  5  
G59

            !
&
=
Isurgeon, skilful, ?JIvirus, malicious, ?J
Idog, loyal, ?J!&
          
 #
!
3.4 Overview of Seed Resources
%
 !&"%

            
#
    !&
%
 
"%
!> 
G5
          #  3  

3
!>#

&*!

"% % 
M

*111D **AA 6B*1
M

B*A*< *:,: *66::
M

<!*1 *!6 1!B6
M

1A,B BB, **D1
&*$&
!
""% 

  $
(-(
-
(
            
-! 4 
#
   

!B
              
#
!

4 Bootstrapping from Seeds
&
!

              
NN
!&
 &
3
 HI


3
0

J 
#
    #    (E      
-$
*! 7
3
E

8
1! 7
3
0

E8
#




 
3
0

 


!
 #
#$
A! 7E
3
0

E8
<! 7E
3
E

8
&
            

   !  ;
7
    8        #  
Ilemonade, cold, beverageJIlemon-

ade, refreshing, beverageJ!&

          
(
3
-
!
"&
#
#O ex-
pand(T')!"
) >0! /
# 1,,
#B,
#!
"
#
StK
t
S
. &
\[ K_S^{0} = S \]
\[ K_S^{1} = K_S^{0} \cup \{\, T \mid T' \in S \wedge T \in expand(T') \,\} \]
\[ K_S^{t+1} = K_S^{t} \cup \{\, T \mid T' \in K_S^{t} \wedge T \in expand(T') \,\} \]
"#
3
!
 # ex-
pand(T') !;
Fet al.(1,,:-
  reckless bootstrapping  
 #      

!&
 

3
!"
"% near-miss$
I


3
0

J"%
  

   
0

(-



0

( -!&
#
"%
 #
"%(
"%-!&

$
\[ K_S^{t+1} = K_S^{t} \cup \{\, T \mid T' \in K_S^{t} \wedge T \in filter_{near\text{-}miss}(expand(T')) \,\} \]
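As a rough illustration of how a filter composes with expansion in a single cycle, the sketch below applies a predicate to each candidate triple before it is admitted. The predicate `toy_keep` and its allow-list are arbitrary placeholders: they do not reproduce the WordNet near-miss criterion and say nothing about which triples the paper's filter accepts. `toy_expand` is likewise hypothetical.

```python
from typing import Callable, Iterable, Set, Tuple

# Assumed layout of a triple: (term, discriminator, hypernym).
Triple = Tuple[str, str, str]


def filtered_step(known: Set[Triple],
                  expand: Callable[[Triple], Iterable[Triple]],
                  keep: Callable[[Triple], bool]) -> Set[Triple]:
    """One cycle: K^(t+1) = K^t united with the candidates from
    expand(T'), T' in K^t, that pass the filter predicate `keep`."""
    grown = set(known)
    for t_prime in known:
        grown.update(t for t in expand(t_prime) if keep(t))
    return grown


# Hypothetical candidate generator, used only to exercise the function.
def toy_expand(triple: Triple) -> Iterable[Triple]:
    if triple == ("cola", "refreshing", "beverage"):
        return [("cola", "sweet", "beverage"), ("cola", "dark", "mixer")]
    return []


# Arbitrary allow-list standing in for a real filter (an assumption).
ALLOWED_HYPERNYMS = {"beverage", "drink"}


def toy_keep(triple: Triple) -> bool:
    return triple[2] in ALLOWED_HYPERNYMS


seed = {("cola", "refreshing", "beverage")}
next_level = filtered_step(seed, toy_expand, toy_keep)
print(sorted(next_level))
```

Keeping the filter as a separate predicate means the growth loop stays unchanged if the acceptance criterion is swapped for a different one.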
;*1
               
!
Q*
%
ND
 <,
"%N
  
    !  & "% near-
miss 
   
            

#!
;*$)#
B!
;1$)
#B
!
4.1 An Example
cola
$Icola, refreshing, beverageJ!>
cola
effervescent beverage
sweet beverage   nonalcoholic beverage 
!>
sugary foodfizzy drinkdark mixer!
> sensitive
beverage everyday beverage common
drink!>
 irritating food  unhealthy drink!>
       stimulating
drinktoxic foodcorrosive substance!
cola
*<*<A1
D1A+A<*,1
B!
 refreshing beverage 
         champagne
lemonadebeer!
[Figure: # Triples vs. Bootstrapping Cycle (0 to 5); series: WordNet, Simile, ConceptNet]
[Figure: # Terms vs. Bootstrapping Cycle (0 to 5); series: WordNet, Simile, ConceptNet]
5 Empirical Evaluation
&"% near-miss 

(0

-
     
(

-
 (
3
-   #
!&
            
#
#
3




#

3
!;
          >  
0  (1,,B-        
#
O

    "
%!

>0
    <,1      1*    
"%!&#
( hot
red!-#R(a|an|
the) * C
i
(is|was)R 
3



!
 not # 


#
(0

-(

Temperature  hot
-!&#+<+:+
<,1!&

            '&/
F(1,,1-!4
                  <,1
    1*      >
0

1*  "%          
!&
"%
(-
!
3B6!DL
 
"%!4 
 

61!DL!.0>
,!61D
,!AA:B*A<B
<,1!
*
We replicate the above experiments using the same 402 nouns, and assess the clustering accuracy (again using WordNet as a gold standard) after each bootstrapping cycle. Recall that we use only the D_j fields of each triple as features for the clustering process, so the comparison with the WordNet gold standard is still a fair one. Once again, the goal is to determine how closely the taxonomy that is clustered automatically from the discriminating words D_j alone resembles the human-crafted WordNet taxonomy. The clustering accuracy for all three seeds is shown in Tables 2, 3 and 4.
Cycle   E      P      # Features   Coverage
1st     .327   .629    907         66%
2nd     .253   .712   1482         77%
3rd     .272   .717   2114         82%
4th     .312   .640   2473         83%
5th     .289   .684   2752         83%
Table 2: Clustering accuracy using the WordNet seed.
Cycle   E      P      # Features   Coverage
1st     .115   .842    363         41%
2nd     .255   .724    787         59%
3rd     .286   .694   1362         74%
4th     .279   .694   1853         79%
5th     .299   .673   2274         82%
Table 3: Clustering accuracy using the ConceptNet seed.

Cycle   E      P      # Features   Coverage
1st     .254   .716    837         59%
2nd     .280   .712   1338         73%
3rd     .289   .693   1944         79%
4th     .313   .660   2312         82%
5th     .157   .843   2614         82%
Table 4: Clustering accuracy using the Simile seed.
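For readers unfamiliar with the measure, the following sketch computes cluster purity in its usual textbook sense: each cluster is credited with its most frequent gold label, and the credited items are divided by the total. It assumes that the P column above reports purity of this kind; the exact formula and clustering tool used in the experiments are not reproduced here, and the function name and toy data are illustrative only.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def purity(cluster_ids: Sequence[Hashable],
           gold_labels: Sequence[Hashable]) -> float:
    """Fraction of items carrying the majority gold label of their cluster."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have equal length")
    if not cluster_ids:
        raise ValueError("purity is undefined for an empty clustering")

    # Count gold labels separately inside each cluster.
    by_cluster: Dict[Hashable, Counter] = {}
    for cluster, label in zip(cluster_ids, gold_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its largest gold class.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)


# Toy data (made up): six nouns, two clusters, two gold categories.
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_gold = ["animal", "animal", "tree", "tree", "tree", "animal"]
print(round(purity(toy_clusters, toy_gold), 3))  # 0.667
```

A purity of 1.0 means every cluster contains items from a single gold category. The score tends to rise as the number of clusters grows, which is why it is normally compared at a fixed cluster count.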

& <,1 
#casuarina, cinchona, do-
decahedron concavity>
*

"
!"=
,!61D61!DL!
0 
 #B *,,
4%!'
                
          0    >
#B
    !    
#(
&*-
(S:1L-B ! &

 yesteryear, nonce ( -
salient(3-jag, droop,
fluting, fete, throb, poundage, stinging, rouble,
rupee, riel, drachma, escudo, dinar, dirham,
lira,dispensationhoardairstream(
-riversidecurling!;
A<
$

#
!
;A$)
!
;<$0
!&

0>$
H,!61D!
4    "%    %  
6:L6DL
    B      
61!DL
    0    >! 5

:<!AL            
66!<L0>
 and ( Tem-
peratureColor!-
D,!+L
!;
<,1
 G5(1,,:-

 (6+!:BL-!
4
#
!&
316*<
    0    >    B*A<B
 
 !&  

 
0>!
6 Conclusions
&

#

            
B
!%#
           

         
  !  4    


"%   
0>!"

!
&
        #!  ;  
G5(1,,D-
          
O=
        

  
!G
5(1,,:-
  

[Figure: Coverage vs. Bootstrapping Cycle (1 to 5); series: WordNet, Simile, ConceptNet]
[Figure: Purity vs. Bootstrapping Cycle (1 to 5); series: WordNet, Simile, ConceptNet, Poesio & Alm.]
 

 !
7D
j
C
i
8
       
7D
j
P
k
 C
i
8=
C
i

 D
j
! 7 8

 
!
>
 :1L
0> !
    B        
A1:
<,1  
:*!B+L !


!>

#
!  &      

 #F et
al. (1,,:-
!    
      #        
 
#  
       #      
!
References
>&!2.!(*+::-!0!&
0!
In Proc. of the 26th
>.>'
1*D 11<!
>  >!    0  .! (1,,B-!  
'          "!  
Proc. of the annual meeting of the Cognitive 
Society?!
4  >!    5  )! (1,,6-!   2
"% .'Q
!Computational Linguistics,A1(*-$*A <D!
0!"?!(1,,D-!>>
#    Q  T      
"! In Proc. of the 45th Annual Meeting of the ACL::: :+B!
2!4.!(*+++-!;
    ! In Proc. of the 37th Annual Meeting of the ACLBDN6<!
2  /!  F  !    !    .!
0> .!"!!&!
U>! (1,,<-!" 
    F>  (  -! In
Proc. of the 13
th
WWW Conference*,,N*,+!
5F!?!(*+:6-!52;$>.
0! In Proc. of the 5th National Conference on Artificial Intelligence    16D 1D*
00!>>
>!
50!(1,,<-!"%$"V
Proc. of GWC’2004, the 2nd Global WordNet conference.4!
5  .! (*++1-!  >  #    
! In Proc. of the 14th Int. Conf. on Computational Linguistics  BA+NB<B!
F  G!  Q  !      &!  >!
(1,,B-!&.$
! Int. Journal of Web and Grid Services*(1-1<, 166!
F  )! (1,,1-!  '&/$  >    !
Technical Report 02-017.!
$OO !!!OSOO!
FW!Q2!52!(1,,:-!
 ' "5
0')! In Proc. of the 46th Annual Meeting of the ACL.
'!4!)Q!G!(*++,-!4
   $      
3!%U$> "!
'5!0!(1,,<-%$>0
Q&! BT Technology
Journal11(<-$1** 116!
.)!4Q!;!)!
.F!?! (*++,-! "%$
 !!?'
A(<-$1ABN1<<!
%!0>!(1,,*-!&
!In Proc. of the 2nd International Conference on Formal Ontology in Information Systems (FOIS-2001)!
Q!?!%>!U!(1,,<-!'

! Advances in Neural Information Processing Systems*D!
G&!5U! (1,,D-!.'/
; !In Proc. of the 45th Annual Meeting of the ACLBDN6<!
G&!5U! (1,,:-!>;F
Q )
.!In Proc. of Coling 2008, The 22nd International Conference on Computational Linguistics.!