trochee ([personal profile] trochee) wrote 2004-04-10 06:52 pm

MI in the Matrix

Hey, [livejournal.com profile] chr0me_kitten: [livejournal.com profile] moretea has just posted the ten highest-mutual-information word-pairs in the Matrix.

I think the top three are (unsurprisingly): agent jones, mr anderson, and agent smith. I asked him to post the code for more Matrix script-analysis. I wonder what Bible Code methods would uncover in there...
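The ranking described above can be sketched in a few lines. This is a generic pointwise-mutual-information sketch, not the Perl code from the course linked below; the function name, the `min_count` filter, and the toy sentence are all mine:

```python
from collections import Counter
from math import log2

def top_pmi_bigrams(words, n=10, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information:
    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ).
    Pairs seen fewer than min_count times are dropped, since PMI
    notoriously overrates one-off pairs."""
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    total_uni = len(words)
    total_bi = total_uni - 1  # number of adjacent pairs
    scored = []
    for (x, y), c in bigrams.items():
        if c < min_count:
            continue
        p_xy = c / total_bi
        p_x = unigrams[x] / total_uni
        p_y = unigrams[y] / total_uni
        scored.append(((x, y), log2(p_xy / (p_x * p_y))))
    scored.sort(key=lambda t: -t[1])
    return scored[:n]

# Toy usage: "agent smith" is the only pair occurring twice,
# so it is the lone survivor of the frequency filter.
words = ("agent smith said mr anderson while "
         "agent jones watched agent smith").split()
pairs = top_pmi_bigrams(words, n=3)
```

On a full script, names like "agent smith" and "mr anderson" float to the top precisely because the two halves co-occur far more often than their individual frequencies predict.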

[identity profile] chr0me-kitten.livejournal.com 2004-04-10 07:47 pm (UTC)(link)
okay, now you have me wanting to read up on information theory. i wish i had a better head for math.

[identity profile] moretea.livejournal.com 2004-04-12 02:59 pm (UTC)(link)
What!

Swooning kittens!

I am so all over that. I got the code here:

http://crl.nmsu.edu/~raz/Ling5801/papers/PerlIntro/associative.html#wordcounts

It's from an old course by Chris Manning (now at Stanford); an introduction to Perl for NLP.

However, there were just a few Mac-isms in there that I edited out, since I use Linux; unless I'm on crack, the thing should now run on Windows as well. (And if I'm not mistaken the Mac-isms are OS 9 stuff, anyway -- OS X people needn't worry.)

And now that this comment has more lines than the code itself:

http://fieldmethods.com/code/mutual

pat@fieldmethods.net for the inevitable bugs... it's totally hacked together.

A much more credible package for doing mutual info, and a zillion other collocational measures besides, is the Ngram Statistics Package:

http://search.cpan.org/~tpederse/Text-NSP-0.67/Docs/FAQ.pod

;)

& chrome_kitten might be interested in this:

http://del.icio.us/patfm/information_theory

The first link is a pretty good layman's intro to MI and info theory -- the book (the second link) is great, but it definitely requires some math background (you might like it, trochee!)

cheers...