CS 360: Natural Language Processing
Winter 2004
Welcome to CS 360! This class is an introduction to the field of
natural language processing, or computational linguistics, with
particular emphasis on statistical methods. A general CS background is
assumed; basic information theory and linguistics will be covered.
This class meets 3rd hour on MWRF, in D-205.
Course handouts
Relevant links
Papers being read
- Project 1:
- Demb07:
Vera Demberg, "A Language-Independent Unsupervised Model
for Morphological Segmentation". ACL '07. For implementation.
- Kesh05:
Keshava and Pitler, "A simpler, intuitive approach to morpheme
induction", MorphoChallenge '05 (the RePortS paper).
- HMM/POS papers:
- Ha06:
Reduced n-gram models for English and Chinese corpora (to be
presented by Yvonne, Wednesday)
- Moon07:
Part-of-speech tagging for Middle English through alignment and
projection of parallel diachronic texts (to be presented by Nic,
Thursday)
- Gold07:
A fully Bayesian approach to unsupervised part-of-speech
tagging (to be presented by Matt, Friday)
- Huan07:
Mandarin part-of-speech tagging and discriminative reranking (not
chosen)
- John07:
Why doesn't EM find good HMM POS-taggers? (not chosen)
- Parsing papers:
- Reic07:
Self-Training for Enhancement and Domain Adaptation of
Statistical Parsers Trained on Small Datasets (Nic, Friday 15th)
- Segi07:
Fast unsupervised incremental parsing (Matt, Monday 18th)
- Fili07:
Recovery of empty nodes in parse structures (Yvonne, Wednesday 20th)
- Smit07:
Bootstrapping feature-rich dependency parsers with entropic priors (not
chosen)
- Hall07:
k-best spanning tree parsing (not chosen)
- Project 2:
- Klem06:
Alexandre Klementiev and Dan Roth, "Named entity transliteration and
discovery from multilingual comparable corpora". NAACL '06. For
implementation.
- system data flow plan
- WSD/clustering papers:
- Both
- Cuce07:
Cucerzan, "Large-Scale Named Entity Disambiguation
Based on Wikipedia Data" and
- Miha07:
Mihalcea, "Using Wikipedia for Automatic Word Sense Disambiguation"
(Matt, Monday)
- Snow07:
Snow et al., "Learning to merge word senses" (Yvonne, Wednesday)
- Spec07:
Specia et al., "Learning Expressive Models for Word Sense
Disambiguation" (Nic, Friday)
- Rose07:
Rosenberg and Hirschberg, "V-Measure: A conditional entropy-based
external cluster evaluation measure" (not chosen)
- Zhu07:
Zhu and Hovy, "Active Learning for Word Sense Disambiguation with
Methods for Addressing the Class Imbalance Problem" (not chosen)
Lecture "slides"
- 4 Jan: left, middle, right
- 7 Jan: left, middle, right
- 9 Jan: left, middle, right
- 11 Jan: left, middle, right
- 14 Jan: middle, right
- 16 Jan: left, middle, right
- 18 Jan: middle, right
- 21 Jan: left, middle
- 23 Jan: left, middle, right
- 25 Jan: left, middle, right
- 28 Jan: left, middle left, middle right
- 4 Feb: left left, left (right), middle, right
- 7 Feb: left, middle
- 11 Feb: left, middle left, middle right
- 15 Feb: middle, right
- 20 Feb: middle, right
- 21 Feb: middle
- 22 Feb: middle
- 27 Feb: middle
Homeworks, projects
Don Blaheta / dblaheta@knox.edu