Competitive grammar...
Jun. 30th, 2003 05:06 pm
Today's [ed -- Monday's] project was bizarre but entertaining. We were asked to work in teams and compete at writing grammars of English, which were then scored against each other.
Each team wrote a context-free grammar over a fixed vocabulary of English and assigned probabilities to each expansion. Then each sentence generated (and ranked as grammatical by blinded judges) was piped through each grammar, and the grammar that assigned the highest combined probability (that is, the lowest cross-entropy) was the victor.
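For the curious, here is a rough sketch of how scoring like this might work. It assumes the score is cross-entropy in bits per word, with lower being better; the post does not spell out the exact formula, and the function name and the probabilities below are made up for illustration.

```python
import math

def cross_entropy_bits_per_word(sentence_probs, sentence_lengths):
    """Total negative log2 probability divided by total word count.

    Assumption: this is the usual cross-entropy convention, not
    necessarily the exact formula the competition harness used.
    """
    total_bits = 0.0
    for p in sentence_probs:
        if p == 0.0:
            # A single sentence the grammar cannot generate makes the score infinite.
            return math.inf
        total_bits += -math.log2(p)
    return total_bits / sum(sentence_lengths)

# Hypothetical probabilities two grammars assign to the same three test sentences.
lengths = [4, 5, 3]
team_a = [1e-4, 1e-6, 1e-3]
team_b = [1e-5, 0.0, 1e-2]  # team B cannot parse the second sentence at all

score_a = cross_entropy_bits_per_word(team_a, lengths)
score_b = cross_entropy_bits_per_word(team_b, lengths)
```

Under this assumption a grammar that assigns zero probability to even one judged-grammatical sentence gets an infinite score, which is why covering every possible sentence with some non-zero probability matters so much.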
My team came in dead last. I know too much about the linguistics, and we didn't scope the project work properly. By the time they announced "five minutes to begin the competition," we had uncovered massive holes in our ideas and just barely managed to add enough rules to assign non-zero probabilities to all possible sentences.
Sadly, our grammar was not successful in generating many sentences.
I've never played competitive grammar building before. I want a copy of the harness code for my computational linguistics students...
Crikey!
Date: 2003-07-01 03:07 pm (UTC)
How did you notate/enter the rules in this context-free-grammar? Or is the answer over my non-linguistics-studying head? :) And what kind of harness code was involved?
Re: Crikey!
Date: 2003-07-06 09:01 am (UTC)
I hope it's not too late to answer this question... I've had kinda poor internet access, despite this being a computer language conference.
The rules looked like this (sorta):
The first column indicates relative weight (likelihood in text generation, score in parsing), the second column indicates the symbol.
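To give a flavor of the layout just described, here is a small sketch in Python. The rule lines are made up for illustration and are not the ones from the exercise. It also assumes that everything after the symbol on a line is that rule's expansion, which the description above does not state outright. `parse_rules` and `rule_probability` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical rules: weight, symbol, then (by assumption) the expansion.
RULES_TEXT = """
# weight  symbol  expansion
1   S      NP VP
3   NP     Det Noun
1   NP     Proper
1   VP     Verb NP
1   Det    the
1   Noun   castle
1   Proper Arthur
"""

def parse_rules(text):
    """Read 'weight symbol expansion...' lines into {symbol: [(weight, expansion)]}."""
    grammar = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        weight, symbol, *expansion = line.split()
        grammar[symbol].append((float(weight), expansion))
    return grammar

def rule_probability(grammar, symbol, expansion):
    """Turn a relative weight into a probability among the rules for one symbol."""
    total = sum(weight for weight, _ in grammar[symbol])
    for weight, rhs in grammar[symbol]:
        if rhs == expansion:
            return weight / total
    return 0.0

grammar = parse_rules(RULES_TEXT)
np_det_noun = rule_probability(grammar, "NP", ["Det", "Noun"])
```

With these made-up weights, the `NP` rule with weight 3 competes against one with weight 1, so it is picked three times out of four.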
The harness code was made up of two things:
- a parser that, given a sentence, searched this grammar for the highest-"weight" parse of that sentence
- a generator that began at the S token and randomly selected expansions ("randomly" being "according to their proportional weight").
I doubt that programmers would find this part challenging -- the hard part was trying to write a CFG for English.
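The generator half of the harness is easy to sketch. What follows is a minimal Python version under stated assumptions, not the actual harness code. The toy grammar, its vocabulary, and the `generate` function are all invented for illustration, and any symbol without rules is treated as a word.

```python
import random

# Hypothetical toy grammar: symbol -> list of (relative weight, expansion).
GRAMMAR = {
    "S": [(1, ["NP", "VP"])],
    "NP": [(3, ["Det", "Noun"]), (1, ["Proper"])],
    "VP": [(2, ["Verb", "NP"]), (1, ["Verb"])],
    "Det": [(1, ["the"]), (1, ["a"])],
    "Noun": [(1, ["castle"]), (1, ["horse"])],
    "Proper": [(1, ["Arthur"])],
    "Verb": [(1, ["rides"]), (1, ["sees"])],
}

def generate(symbol, grammar, rng):
    """Expand a symbol top-down, picking each expansion in proportion to its weight."""
    if symbol not in grammar:
        return [symbol]  # no rules for it, so treat it as a terminal word
    weights = [weight for weight, _ in grammar[symbol]]
    expansions = [rhs for _, rhs in grammar[symbol]]
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    words = []
    for child in chosen:
        words.extend(generate(child, grammar, rng))
    return words

rng = random.Random(0)  # seeded so repeated runs give the same sentences
sentences = [" ".join(generate("S", GRAMMAR, rng)) for _ in range(5)]
```

This toy grammar has no recursive rules, so generation always stops. A grammar with recursion (say, an `NP` that contains a `PP` that contains an `NP`) would need weights that keep the expected sentence length finite, or a depth limit.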