trochee

Today's [ed -- Monday's] project was bizarre, but entertaining. We were asked to work in teams, competing to write grammars of English that were scored against each other.

Each team wrote a context-free grammar over a fixed vocabulary of English and assigned a probability to each expansion. Then every generated sentence that blinded judges ranked as grammatical was piped through each team's grammar, and the grammar that assigned those sentences the highest combined probability (that is, the lowest cross-entropy) was the victor.
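For the curious, here is a rough Python sketch of how scoring like this could work. It is not the actual harness: I'm assuming the score is cross-entropy in bits per word, where lower is better, and the function name, team names, and numbers are all made up for illustration.

```python
import math

def cross_entropy_bits_per_word(sentence_probs, word_counts):
    """Total negative log2 probability of the test sentences, per word.

    sentence_probs[i] is the probability one grammar assigns to sentence i;
    word_counts[i] is the number of words in that sentence.
    """
    total_bits = 0.0
    for p in sentence_probs:
        if p <= 0.0:
            # A single sentence the grammar cannot produce makes the score infinite.
            return math.inf
        total_bits += -math.log2(p)
    return total_bits / sum(word_counts)

# Hypothetical numbers: two judged sentences of 4 and 2 words.
word_counts = [4, 2]
team_probs = {
    "team_a": [0.25, 0.5],   # made-up probabilities
    "team_b": [0.125, 0.0],  # team_b cannot parse the second sentence
}

scores = {
    team: cross_entropy_bits_per_word(probs, word_counts)
    for team, probs in team_probs.items()
}
winner = min(scores, key=scores.get)  # lowest cross-entropy wins
print(scores, winner)
```

Under this assumption, a grammar that gives any judged sentence zero probability scores infinitely badly, which is why covering every possible sentence matters so much.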

My team came in dead last. I know too much about the linguistics, and we didn't scope the project work properly. By the time they announced "five minutes to begin the competition," we had uncovered the massive holes in our ideas and just barely managed to add enough rules to assign non-zero probabilities to all possible sentences.

Sadly, our grammar was not successful in generating many sentences.

I've never played competitive grammar building before. I want a copy of the harness code for my computational linguistics students...

1 S NP VP
1 NP ProperName
1 NP Det NBar
1 NBar Adj NBar
1 NBar N
1 VP VTrans NP
1 VP VIntrans
1 VTrans kiss
1 VTrans smite
1 VTrans carry
1 VIntrans rise
1 VIntrans fall
1 Adj red
1 Adj blue
1 Adj heavy
1 ProperName Arthur
1 ProperName Gwen
1 Det a
1 Det an
1 Det the
1 N sparrow
1 N coconut
1 N wind



lx.livejournal.com

Sounds fascinating!

How did you notate/enter the rules in this context-free-grammar? Or is the answer over my non-linguistics-studying head? :) And what kind of harness code was involved?

trochee.livejournal.com

I hope it's not too late to answer this question... I've had kinda poor internet access, despite this being a computer language conference.

The rules looked like this (sorta):

The first column indicates relative weight (likelihood in text generation, score in parsing), the second column indicates the symbol being expanded, and the remaining columns give its expansion.
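If you want to play with rules in this format, something like the following Python would read them in. This is only a sketch: `RULES_TEXT`, `load_grammar`, and `normalize` are names I made up, the rules are a short excerpt, and I'm assuming one rule per line with whitespace-separated columns.

```python
from collections import defaultdict

# A short excerpt of rules, one per line: weight, symbol, expansion.
RULES_TEXT = """
1 S NP VP
1 NP ProperName
1 NP Det NBar
1 NBar Adj NBar
1 NBar N
1 Det a
1 Det an
1 Det the
"""

def load_grammar(text):
    """Map each left-hand symbol to a list of (weight, expansion) pairs."""
    grammar = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        weight = float(fields[0])
        lhs = fields[1]
        rhs = tuple(fields[2:])
        grammar[lhs].append((weight, rhs))
    return grammar

def normalize(grammar):
    """Turn relative weights into probabilities that sum to 1 per symbol."""
    probs = {}
    for lhs, expansions in grammar.items():
        total = sum(weight for weight, _ in expansions)
        probs[lhs] = [(weight / total, rhs) for weight, rhs in expansions]
    return probs

grammar = load_grammar(RULES_TEXT)
print(normalize(grammar)["NP"])
```

Because the weights are relative, three rules for the same symbol with weight 1 each come out at one third apiece after normalizing.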

The harness code was made up of two things:

a parser that, given a sentence, searched this grammar for the highest-"weight" parse of that sentence
a generator that began at the S token and randomly selected expansions ("randomly" being "according to their proportional weight").
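To make those two pieces concrete, here is a minimal Python sketch of both. It is not the actual harness code. The names `GRAMMAR`, `generate`, and `best_parse_prob` are my own, the parser is a simple memoized search rather than whatever algorithm the real harness used, and it assumes the grammar has no empty expansions and no unary cycles.

```python
import random
from functools import lru_cache

# symbol -> list of (relative weight, expansion); anything not a key is a word.
GRAMMAR = {
    "S": [(1, ("NP", "VP"))],
    "NP": [(1, ("ProperName",)), (1, ("Det", "NBar"))],
    "NBar": [(1, ("Adj", "NBar")), (1, ("N",))],
    "VP": [(1, ("VTrans", "NP")), (1, ("VIntrans",))],
    "VTrans": [(1, ("kiss",)), (1, ("smite",)), (1, ("carry",))],
    "VIntrans": [(1, ("rise",)), (1, ("fall",))],
    "Adj": [(1, ("red",)), (1, ("blue",)), (1, ("heavy",))],
    "ProperName": [(1, ("Arthur",)), (1, ("Gwen",))],
    "Det": [(1, ("a",)), (1, ("an",)), (1, ("the",))],
    "N": [(1, ("sparrow",)), (1, ("coconut",)), (1, ("wind",))],
}

def generate(symbol, rng):
    """Expand `symbol` into a list of words, picking expansions by weight."""
    if symbol not in GRAMMAR:
        return [symbol]  # a word: nothing left to expand
    weights = [weight for weight, _ in GRAMMAR[symbol]]
    expansions = [rhs for _, rhs in GRAMMAR[symbol]]
    rhs = rng.choices(expansions, weights=weights, k=1)[0]
    words = []
    for child in rhs:
        words.extend(generate(child, rng))
    return words

def best_parse_prob(words):
    """Probability of the single best parse of `words` from S (0.0 if none)."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def best(symbol, i, j):
        # Best probability of `symbol` deriving exactly words[i:j].
        if symbol not in GRAMMAR:
            return 1.0 if j == i + 1 and words[i] == symbol else 0.0
        total = sum(weight for weight, _ in GRAMMAR[symbol])
        return max(
            (weight / total) * best_seq(rhs, i, j)
            for weight, rhs in GRAMMAR[symbol]
        )

    @lru_cache(maxsize=None)
    def best_seq(rhs, i, j):
        # Best probability of the symbol sequence `rhs` deriving words[i:j].
        if len(rhs) == 1:
            return best(rhs[0], i, j)
        # Try every split point, leaving at least one word per remaining symbol.
        return max(
            (best(rhs[0], i, k) * best_seq(rhs[1:], k, j)
             for k in range(i + 1, j - len(rhs) + 2)),
            default=0.0,
        )

    return best("S", 0, len(words))

rng = random.Random()
sentence = generate("S", rng)
print(" ".join(sentence), best_parse_prob(sentence))
print(best_parse_prob("Gwen rise".split()))  # 0.0625
```

With every weight set to 1 the generator happily produces things like "an wind fall", which is exactly the kind of sentence the judges would throw out.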

I doubt that programmers would find this part challenging -- the hard part was trying to write a CFG for English.
