Sofie Van Gijsel∗ & Carl Vogel†
Abstract
Techniques from corpus linguistics are applied to the analysis of a number of European right-wing parties in an effort to extend methods for ranking parties on a left-right spectrum within and across countries and languages. Focus is placed on parties not in government, and the analysis is based on corpora derived from election manifestos published by those parties. The techniques applied are objective in that they apply statistical measures with confidence tests to objectively quantifiable linguistic features of the documents. Valid applicability of the techniques is demonstrated. The methods are then used to estimate pairwise similarity of a number of European political parties, including cross-national comparisons.
1 Introduction
We report on the application of recent corpus linguistic methods to the analysis of a number of European right-wing party manifestos. In recent manifesto research aiming at estimating the policy positions of governmental parties of a nation in an objective way, computerized approaches have been used to locate parties on a priori established policy dimensions (see any number of articles in Laver, 2001a). We focus on the relatively unresearched policy space of (often) small, non-governmental, right-wing parties of a number of European countries. An aim is to identify objective means to rank these parties on a political spectrum using only limited data available from such parties. We use an inductive method which treats election manifestos as corpora to be analyzed. Thus, instead of in ideological terms, the manifestos are compared on the basis of linguistically quantifiable features. Clearly, ideological issues enter in the selection of parties whose manifestos are examined, but beyond these pre-theoretic choices, content-free statistical techniques are used to rank the level of similarity between the parties.
An initial methodological question is whether it is legitimate to consider the right-wing manifestos, all clearly belonging to one subgenre, as ‘corpora’ which are distinguishable on the basis of significant linguistic differences. As recent research in corpus linguistics shows (Kilgarriff, 2001), in order to validly compare corpora, the distance within each corpus has to be smaller than the distance between them. To measure the within-corpus distances, we apply recently proposed authorship identification techniques (AID), attempting to assign subparts of the manifestos correctly. This way, it is possible to cross-validate recent attributional research which shows that substrings of words are excellent author discriminators. We establish the internal homogeneity of the corpora, a prerequisite for measuring the similarity levels among them.
A second question is then whether a corpus similarity measure can be applied to evaluate the distance between the different parties, both on a national and a cross-national level. The recently proposed Chi by Degrees of Freedom similarity measure (Kilgarriff (2001); hereafter, χ/d.f.) gives a ranking which we will attempt to interpret as an indication of the position of the different parties in a common policy space. The results suggest encouraging potential for new methods in analyzing manifestos in political science and other fields in which text-based induction of partially-ordered position-spaces is useful. We argue that the objective analysis of small, ‘real language’ sets of texts as corpora is an interesting, albeit challenging, field of corpus linguistics.

∗ Quantitative Lexicology and Variational Linguistics, Katholieke Universiteit Leuven, Belgium: Sofie.VanGijsel@arts.kuleuven.ac.be
† Computational Linguistics Group & Centre for Computing and Language Studies, Trinity College, U. of Dublin: vogel@tcd.ie
2 Manifesto Research in Political Science
Manifesto analysis is considered a fruitful way of gaining insight into the positions of political parties in one policy space (Mair, 2001; Laver, 2001b). The Manifesto Research Group collects and analyzes political programs by way of comparative content analysis, classifying each ‘quasi-sentence’ according to a coding scheme of 56 categories, which belong to a priori established dimensions of the policy space (e.g. economic left-right, social liberal-conservative). The rationale behind this system is salience theory (Budge, 2001), which states that the salience of an issue in the manifesto provides information about the position of the party on that issue. Yet, this theory can be criticized: for example, immigration will be a hot issue for many parties (especially for the researched right-wing parties), but mentioning the issue in the program does not automatically point at being ‘in favor’ or ‘against’ it. This method also requires a large amount of human coding effort, which is time-consuming and costly, without being completely objective. Therefore, recent methods analyze the manifestos in a more quantitative way.
A first improvement is the computerized content analysis proposed by Laver and Garry (2000). On the basis of two reference texts or manifestos of parties for which the position on a number of pre-established policy dimensions is known a priori, the researchers make up a keyword list,¹ which will then be used to code other manifestos or ‘virgin’ texts. Yet, the composition of the keyword dictionary is not only time consuming, but also, the validity of the analysis is highly dependent on the keywords, which are sensitive to both the substantive and the temporal context of the reference manifestos.² Therefore, Laver, Benoit, and Garry (2003) recently have proposed a probabilistic dictionary approach, measuring the relative frequency of all the words in the reference texts. For the analysis of ‘virgin’ texts, the policy position is then determined from the scores that the reference texts assign, on the dimension under investigation, to each word in the text. This method allows rapid analysis and reanalysis of large quantities of texts. It is also applicable to non-English texts, an advantage if manifestos are compared cross-nationally. Yet, the reliability is still highly dependent on the choice of reference texts. Positioning virgin texts on a priori established dimensions, abstracted from reference texts, might be a good approach for well-researched policy spaces, but for the analysis of the often small and non-governmental right-wing parties analyzed in this project, this is not optimal. Instead of using pre-established dimensions, we attempt to analyze the manifestos inductively into a partial ordering, treating the complete texts as corpora (also sensitive to text choice, but because of the parties analyzed, this amounts to all of the available text, rather than choice), the distances among which can be measured. The distances can only be interpreted a posteriori.

3 Authorship Identification Techniques (AID)
As explained, the internal homogeneity of the manifestos has to be established before a valid corpus linguistic comparison is possible. We use a number of AID techniques to show that the within-corpus distances are smaller than those between the manifestos. First, a short overview and discussion of AID methods used in the analysis of style, or stylometry, will be given. Oakes (1998) and Holmes (1998) (for example) provide more comprehensive overviews. The methods we adopt are outlined in §3.2; later, §4 and §5 detail our analysis.
¹ Every word which occurs at least twice as many times in the right- or the left-wing reference text is classified as a right- or left-wing keyword respectively.
² See Van Gijsel (2002, pp. 82-88) for the implementation of a keyword dictionary for Dutch, as devised by de Vries (1999). The results show that for the analysis of right-wing party manifestos of Belgium and Flanders (the Dutch-speaking part of Belgium), which entails a cross-national and temporal extension, the keyword dictionary does not give valid results.
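The keyword rule stated in footnote 1 for the Laver and Garry (2000) approach can be illustrated with a minimal sketch. It assumes that raw word counts (not relative frequencies) are compared and that no minimum frequency is required; the function name `keyword_lists` and the `factor` parameter are hypothetical and do not come from the cited work.

```python
from collections import Counter


def keyword_lists(right_counts, left_counts, factor=2):
    """Split words into right- and left-wing keywords.

    A word is a right-wing keyword if it occurs at least `factor` times
    as often in the right-wing reference text as in the left-wing one,
    and vice versa. Counts are raw counts (an assumption of this sketch).
    """
    right_counts = Counter(right_counts)
    left_counts = Counter(left_counts)
    right_keywords, left_keywords = set(), set()
    for word in set(right_counts) | set(left_counts):
        r, l = right_counts[word], left_counts[word]
        if r > 0 and r >= factor * l:
            right_keywords.add(word)
        elif l > 0 and l >= factor * r:
            left_keywords.add(word)
    return right_keywords, left_keywords
```

Words whose counts differ by less than the chosen factor end up in neither list, which is one reason such a dictionary is sensitive to the choice of reference texts.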
3.1 Overview of the AID techniques
Stylometry as an AID technique dates at least to 1851, when the logician de Morgan suggested that the authenticity of some letters of St Paul might be tested by comparing word length. Yule (1944) developed a measure of vocabulary richness, K, based on the probability that any randomly selected pair of words are identical. Over the years, a number of other vocabulary richness measures have been proposed as discriminators, such as the type-token ratio or the proportion of unique words to the total size of the vocabulary used (e.g. Morton, 1986), although more recent research (e.g. Holmes, 1998) shows that these techniques are not reliable, being highly dependent on the choice and length of the texts under analysis. Mosteller and Wallace (1964) famously attributed the purposely anonymous, disputed Federalist Papers to Madison instead of Hamilton, on the basis of a probabilistic analysis of the most frequent words. These were mainly function words, which are used rather unconsciously and are therefore effective markers of authorship. While most measures take the lexical item (or pre-terminal lexical categories such as parts-of-speech) as the unit of analysis, some recent methods focus on sublexical units, especially letter uni- and bigrams. Without requiring syntactic or lexical analysis, these elements are easily and objectively quantifiable, while being useful for texts of varying and limited length (e.g. Forsyth (1997), Khmelev and Tweedie (2001), Chaski (1998)).
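To make the vocabulary richness measures mentioned above concrete, the following sketch computes the type-token ratio and Yule's K in its common textbook formulation, K = 10^4 (Σ m² V_m − N) / N², where N is the number of tokens and V_m the number of word types occurring exactly m times. The function names are illustrative, and tokenization is assumed to have been done beforehand.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word types divided by number of tokens."""
    if not tokens:
        raise ValueError("token list is empty")
    return len(set(tokens)) / len(tokens)


def yules_k(tokens):
    """Yule's K: higher values indicate more repetition (a less rich vocabulary)."""
    n = len(tokens)
    if n == 0:
        raise ValueError("token list is empty")
    type_freq = Counter(tokens)             # word type -> frequency
    spectrum = Counter(type_freq.values())  # m -> V_m (types occurring m times)
    s2 = sum(m * m * v_m for m, v_m in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)
```

Both measures depend on the length of the sample, which is in line with the reliability concerns cited above.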
In literary stylistics, the Cusum technique (Farringdon, 1996) has been developed; it graphically plots the average sentence length of an author’s sample, superimposed by plots for the frequency of selected ‘linguistic habits’ of the author, such as the use of two and three letter words. The technique has been criticized for being labor intensive and highly subjective, e.g. with regard to the choice of the (limited) number of sentences analyzed, the choice of selected linguistic habits and the interpretation of the plots (Canter, 1992; Chaski, 1998). Foster’s (2001) analysis of the ‘literary DNA’ of a writer is akin to the Cusum method and can similarly be criticized for being subjective and unscientific. Foster claims to uncover authorship on the basis of ‘external’ (e.g. the historical background of a writer) and ‘internal’ evidence (e.g. characteristics such as punctuation habits), but his recent incorrect attribution of ‘A Funeral Elegy’ to Shakespeare instead of to John Ford reveals the unreliability of his method.

3.2 The AID techniques implemented
We have explored AID methods availing of letter unigrams and bigrams, since they could be applied cross-linguistically, and without subjective content-based judgements, to small, unequally-sized texts. Thus, letter unigrams and letter bigrams were counted. Further, word unigrams were counted, to test if substrings give better results than word counts.
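The three kinds of counts can be sketched as follows. This is only an illustration, not the software actually used: it assumes lowercasing, ignores digits and punctuation, and does not let letter n-grams cross word boundaries, none of which is specified in the text.

```python
import re
from collections import Counter

# Runs of Unicode letters (no digits, underscores or punctuation).
WORD_RE = re.compile(r"[^\W\d_]+")


def word_unigrams(text):
    """Frequency of each lowercased word."""
    return Counter(WORD_RE.findall(text.lower()))


def letter_ngrams(text, n=1):
    """Frequency of each letter n-gram, counted within words only."""
    counts = Counter()
    for word in WORD_RE.findall(text.lower()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts
```

Calling `letter_ngrams(text, 1)` and `letter_ngrams(text, 2)` gives the letter unigram and bigram profiles; because the pattern matches any Unicode letter, the same code applies to Dutch, German or English texts.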
McCombe (2002) sought cross-validation of a number of AID techniques and confirmed recent work (e.g. Chaski, 1998) in that letter uni- and bigrams perform remarkably better than, in that order, word unigram frequency, syntactic tagging, higher n-grams or keywords as metric bases for predicting authorship of disputed texts. We used McCombe’s software to test the validity of different AID methods in assigning arbitrarily selected subparts of the manifestos to the correct party. For detailed and user-oriented descriptions of its functionality see McCombe (2002) or Van Gijsel (2002). The program takes an input file consisting of names of plain text files, labeled to encode one or more uncontested categories, or as files to be categorized. Given input parameters (e.g. letter vs. word n-gram analysis, the value of n in n-gram, etc.), the texts are concordanced and frequency analyzed. The program’s output is a pairwise ranking, giving the similarity of the various corpora in reverse magnitude, as calculated by χ/d.f.³ Here, three rankings are given (letter uni- and bigrams and word unigrams), constituting a rank list.

³ The χ/d.f. measure instead of simply χ² is used, since this takes into account both the χ² value and the frequency information of the corpora. This is useful for natural language corpora, like the manifestos, which are inherently non-randomly distributed (Kilgarriff & Salkie, 1996).
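A chi-square-by-degrees-of-freedom score for two frequency profiles might be computed along the following lines. This is a hedged sketch, not McCombe's implementation: it assumes a plain 2×k contingency table over the `top_n` most frequent features of the joint corpus, with row totals taken over those features only; both the function name and the `top_n` parameter are illustrative.

```python
from collections import Counter


def chi_by_df(counts_a, counts_b, top_n=26):
    """Chi-square over the top_n joint features, divided by (k - 1)."""
    counts_a, counts_b = Counter(counts_a), Counter(counts_b)
    joint = counts_a + counts_b
    features = [f for f, _ in joint.most_common(top_n)]
    total_a = sum(counts_a[f] for f in features)
    total_b = sum(counts_b[f] for f in features)
    if len(features) < 2 or total_a == 0 or total_b == 0:
        raise ValueError("need two non-empty corpora and at least two features")
    grand = total_a + total_b
    chi2 = 0.0
    for f in features:
        column = counts_a[f] + counts_b[f]
        for observed, row_total in ((counts_a[f], total_a), (counts_b[f], total_b)):
            # Expected count if the feature were spread proportionally to corpus size.
            expected = row_total * column / grand
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / (len(features) - 1)
```

Because the expected counts scale with the size of each corpus, the score can be used on unequally sized texts; lower values indicate more similar profiles, so a ranking by this score is an inverse similarity ranking.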
This rank list is the input for two statistical tests, which compare results of the tests as run with a range of parameter values. The first is the ratio between the average of the similarity scores for all the pairs of corpora in the same uncontested category and the average similarity scores for all the pairs of corpora in distinct uncontested categories.⁴ The larger the ratio, the more suggestive the measure is. McCombe (2002, p. 37) notices that the ranking of the assignment scores is often a more direct indication of the attributional result. A second test is the Mann-Whitney test (also called the Wilcoxon rank sums test; see Oakes, 1998),⁵ which gives an overall significance measure for each of the three methods, while also outputting a more detailed list of significance measures for each of the three methods, showing the probability of the assignment of each of the anonymously coded texts to the different authors.

4 Analysis of the Manifestos Using AID Techniques

4.1 Data Collection
The manifestos were collected by downloading the texts from their respective party websites. To keep the human intervention to a minimum, the (thematic) subparts of the websites were kept intact as separate files of comparable size, but the number of themes by party varied. In this paper, the analysis of the Dutch language manifestos is addressed, both on a national and a cross-national level. For The Netherlands we analyzed the manifestos of the parties Lijst Pim Fortuyn (List Pim Fortuyn, LPF) and Leefbaar Nederland (Liveable Netherlands, LN). The LPF-manifesto consists of a single text of a little under 4,000 words, while the LN-text contains 10 subparts (just over 10,000 words in total). The Belgian party manifesto of the Vlaams Blok (Flemish Block, VB) was downloaded in 13 chunks, amounting to more than 20,000 words.⁶

4.2 Analysis of the Manifestos in One Nation
We analyze LPF and LN as within-Netherlands Dutch-language parties. Distinguishing the two parties is a potentially difficult task, since they originally formed one party, the populist party LN, founded in June 2001, with Pim Fortuyn as party leader. After being ousted for blatant anti-Muslim comments, Fortuyn launched his own national party, LPF. While it is often claimed that LN is a populist rather than an extreme right party, LPF can be expected to be slightly more right-wing (Buyse, 2002). Yet, Fortuyn was openly homosexual and advocated liberal social values, which are very different from traditional right-wing values. In order to check if the AID methods could distinguish between the two manifestos, a subpart of the LN-manifesto was coded ‘anonymous’, while the other subparts (i.e. the other 9 LN-subparts and the LPF-part) were given an arbitrary code (i.e. l for LN and p for LPF). The task given to the program is to assign the subpart to LN instead of to LPF, using AID methods.
We concordanced the manifesto subparts using letter unigrams, letter bigrams and word unigrams. Then, both the similarity ratio and Mann-Whitney were calculated.
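The two summary statistics can be sketched on a table of pairwise scores. The sketch makes several assumptions that are not in the text: scores are treated as distances (as with χ/d.f.), the ratio is taken as mean between-category distance over mean within-category distance so that larger values indicate better separation, and the Mann-Whitney p-value uses a one-sided normal approximation without tie or continuity correction, which is rough for very small samples. All function names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist, mean


def split_scores(distance, category):
    """Separate pairwise distances into within- and between-category lists.

    distance: dict mapping frozenset({text_a, text_b}) -> distance score
    category: dict mapping text name -> category label
    """
    within, between = [], []
    for a, b in combinations(sorted(category), 2):
        score = distance[frozenset((a, b))]
        (within if category[a] == category[b] else between).append(score)
    return within, between


def separation_ratio(within, between):
    """Mean between-category distance over mean within-category distance."""
    return mean(between) / mean(within)


def mann_whitney_one_sided(within, between):
    """U statistic and approximate p-value for 'within distances are smaller'."""
    u = sum(1.0 if w < b else 0.5 if w == b else 0.0
            for w in within for b in between)
    n1, n2 = len(within), len(between)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 1 - NormalDist().cdf((u - mu) / sigma)
    return u, p
```

A rank-based test is a natural fit here because the χ/d.f. scores are only interpreted through their ranking, not their absolute magnitude.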
                  Ratio   Ranking                    Mann-Whitney
Letter unigrams   1.299   ln4 fits in category l     p < 0.0005
                          ln4 fits in category p
Letter bigrams    1.131   ln4 fits in category l     p < 0.025
                          ln4 fits in category p
Word unigrams     1.028   ln4 fits in category l     p < 0.25
                          ln4 fits in category p

Table 1: Results of AID-tests classification of ‘anonymous’ subtext ln4 to LN (l) vs. LPF (p)
⁴ In authorship attributions, the category corresponds to author identity.
⁵ This is inspired by the proposal of Kilgarriff (1996) for equally-sized subcorpora.
⁶ For the corpus analysis in one nation (§4.2) and in one language (§4.3), repeated tests for several subparts, for a communist party and for manifestos of Germany, Austria and Great-Britain gave similar results (Van Gijsel, 2002).
The ranking indicates that all three tests correctly attribute the subpart to the LN manifesto, with a higher ratio measure for letter unigrams, followed by letter bigrams, indicating that letter unigrams perform best. Similarly, the output of Mann-Whitney shows that the attribution is highly significant for letter unigrams (p < 0.0005),⁷ while letter bigrams are also significant (p < 0.025). By this test, a word unigram count is not significant (p < 0.25). These results cross-validate recent AID work, specifically McCombe’s (2002) results. More importantly, the consistent correct attribution points at the internal homogeneity of the manifestos, which can therefore be considered fully-fledged corpora.

4.3 Analysis of the Manifestos in One Language
Since we intended to compare right-wing parties cross-nationally, in a second step, the manifestos in one language were analyzed similarly. Supplementing the manifestos of The Netherlands, the manifesto of the traditionally fascist Flemish party VB (Flemish Block) is analyzed. The attributional results for subparts of the VB-manifesto are consistent, validating it as a corpus. To further verify if the AID methods are robust enough to cope with the interference of inherently nation- and context-dependent elements in the manifestos, a communist manifesto, of the Flemish party PvdA (Partij van de Arbeid/Labour Party), was included as a dummy. Although the attributional results for subparts of PvdA are not significant, repeated tests make clear that in the output ranking, the communist party is not more closely related to the other Flemish party, VB, than to the Dutch parties, suggesting that a cross-national extension of the right-wing manifesto analysis is viable. Thus, internal homogeneity of the manifestos on a cross-national level shows the within-corpus differences to be smaller than between-corpora differences, legitimating measurement of distances among the representative corpora, so that in a next step, the similarity among the corpora could be measured as an indication of the parties’ political and ideological positions.

5 Placing Right-Wing Parties in a Left-Right Spectrum
In this section we discuss the distance between the parties as measured by treating the manifestos in their entirety as corpora. In general, statistical methods to reliably measure the distance between small, unequally sized corpora are scarce; Kilgarriff (2001) proposed χ² as a ‘single measure’ of distance between internally homogeneous corpora. The pairwise similarity ranking, based on χ/d.f., is interpreted as indicating the level of similarity among the manifestos. Here, the manifestos in their entirety are compared, by letter unigrams, which consistently emerged as the clearest method to distinguish between them. Although the analysis would clearly benefit from a better similarity measure, enabling the direct statistical comparison of a number of corpora cross-linguistically, this measure will be interpreted as indicating the distances among the texts.
          LN-VB        LPF-VB      LN-LPF
χ/d.f.    15.61        8.18        2.97
p-value   p < 0.0001   p < 0.001   p > 0.05

Table 2: Results of the inverse similarity ranking of the Dutch parties
The inverse similarity ranking shows that the difference between LPF and LN is not significant, on the basis of letter unigram frequencies. The difference between LPF and VB was significant at a 0.001 level and between LN and VB even at a 0.0001 level. These figures tie in with background knowledge: LN and LPF are both populist, ‘new style’ right wing parties, combining strong anti-immigration views with liberal social values, while VB is a traditional fascist party. Further, as was said before, LPF is more right-wing than LN, which is also clear from the higher similarity score for LPF with VB.⁸

⁷ Note that p measures the probability that the similarity judgement is due to mere chance.
Similar analysis was carried out for the manifestos translated into English (Van Gijsel, 2002, pp. 93-95),⁹ enabling extension of the cross-national analysis. Again, the inverse similarity ranking brings out the difference between ‘traditional’ right-wing parties, like for example BNP, and the more populist, new-style right-wing parties, like FPÖ, which eclectically combine strong anti-immigration views with liberal social values; a difference which could not be taken into account with an a priori analysis trying to position the parties on pre-established policy dimensions.¹⁰

6 Conclusion
We have described our attempts to locate a number of European right-wing parties in a single cline, analyzing their manifestos using tools of corpus linguistics. To verify applicability of corpus techniques, we applied AID methods to establish that intra-category differences are smaller than inter-category distances among the texts. This confirmed again that AID methods using letter frequencies are highly reliable, and verifies the internal homogeneity of the manifestos as corpora. The results, which show that the manifesto analysis as measured by χ/d.f. differentiates ‘traditional’ and ‘new-style’ right-wing parties, demonstrate that a fully computerized analysis (specifically lacking content analysis) can give insight into the relatively unresearched policy space of right-wing parties. However, the analysis could benefit from methodological improvements and a cross-linguistic extension of the statistical measure. This work illustrates the use and limits of automated corpus linguistic techniques for small, unequally-sized ‘real language’ data sets.
⁸ Setting aside political content of ‘left’ or ‘right’ and taking ∼ to represent similarity; <, strict difference, we can make the following inference from pairwise comparisons: LPF ∼ LN, LPF […]
[…] It can be remarked that cross-linguistically, only impressionistic conclusions are possible, linking a non-English manifesto such as for example the LN-text, which is known to be ‘populist’ and closely related to LPF, rather to FPÖ, which is closer to LPF, than to BNP.

References

Budge, I. (2001). Validating the MRG approach. In Laver, M. (Ed.), Estimating the Policy Position of Political Actors, pp. 3–9. London: Routledge/ECPR Studies in European Political Science.
Buyse, A. (Ed.). (2002). Nieuw Radicaal Rechts in Europa. Antwerpen/Amsterdam: Houtekiet.
Canter, D. (1992). An Evaluation of the “Cusum” Stylistic Analysis of Confessions. Expert Evidence, 1(3), 93–99.
Chaski, C. (1998). A Daubert-Inspired Assessment of Current Techniques for Language-Based Author Identification. In ILE Technical Report 1098, pp. 97–148.
de Vries, M. (1999). Governing with Your Closest Neighbour: An Assessment of Spatial Coalition Formation Theories. Ph.D. thesis, UB Nijmegen.
Farringdon, J. M. (1996). Analysing for Authorship. Cardiff: University of Wales Press. With contributions by Morton, A. Q., M. G. Farringdon and M. D. Baker.
Forsyth, R. S. (1997). Short Substrings as Document Discriminators: An Empirical Study. Paper presented at ACH-ALLC ’97.
Foster, D. (2001). Author Unknown. On the trail of Anonymous. Macmillan: London, Basingstoke and Oxford.
Holmes, D. I. (1998). The Evolution of Stylometry in Humanities Scholarship. Literary and Linguistic Computing, 13(3), 111–117.
Khmelev, D. & Tweedie, F. J. (2001). Using Markov Chains for Identification of Writers. Literary and Linguistic Computing, 16(3), 299–308.
Kilgarriff, A. & Salkie, R. (1996). Corpus Similarity and Homogeneity via Word Frequency. In Proceedings of Euralex 96.
Kilgarriff, A. (1996). Which words are particularly characteristic of a text? A survey of statistical approaches. In Language Engineering for Document Analysis and Recognition. Proceedings, AISB Workshop, Falmer, Sussex.
Kilgarriff, A. (2001). Comparing Corpora. International Journal of Corpus Linguistics, 6(1), 97–133.
Laver, M. (Ed.). (2001a). Estimating the Policy Position of Political Actors. Routledge.
Laver, M. (2001b). Position and Salience in the Policies of Political Actors. In Laver, M. (Ed.), Estimating the Policy Position of Political Actors, pp. 66–75. London: Routledge/ECPR Studies in European Political Science.
Laver, M., Benoit, K., & Garry, J. (2003). Extracting Policy Positions from Political Texts Using Words as Data. American Political Science Review, 97.
Laver, M. & Garry, J. (2000). Estimating Policy Positions from Political Texts. American Journal of Political Science, 44(3), 619–634.
Mair, P. (2001). Searching for Positions of Political Actors. In Laver, M. (Ed.), Estimating the Policy Position of Political Actors, pp. 33–49. London: Routledge/ECPR Studies in European Political Science.
McCombe, N. (2002). Methods of Author Identification. B.A. (Mod) CSLL Final Year Project, TCD.
Morton, A. Q. (1986). Once. A Test of Authorship Based on Words Which Are Not Repeated in the Sample. Literary and Linguistic Computing, 1(1), 1–8.
Mosteller, F. & Wallace, D. (1964). Applied Bayesian and Classical Inference: The Case of the Federalist Papers. Reading: Addison-Wesley.
Oakes, M. P. (1998). Statistics for Corpus Linguistics. Edinburgh Textbooks in Empirical Linguistics. Edinburgh: Edinburgh University Press.
Van Gijsel, S. (2002). A Corpus Linguistic Analysis of European Right-Wing Party Manifestos. Master’s thesis, Centre for Language and Communication Studies, Trinity College, University of Dublin.
Yule, G. (1944). The Statistical Study of Literary Vocabulary. Cambridge: Cambridge University Press.