trochee ([personal profile] trochee) wrote 2008-08-15 03:20 pm

perhaps there's a future in MT

From Neil Gaiman's blog, a bit of MT poetry:
Mr. Gaiman, my name is Bruno D'Alincourt, and my question is, how you draw up its dialogues?
If you speak alone, get you.
If you use your cats.
His family.
Your friends.
Or another case to let their texts flow as if they were called in real life.
I know that the dialogues that make the story (For more fícção or description that is) more 'family' possible, as had already been counted and so many can identify with it.
Since already thank you very much.
But unless we see more, having a good morning, good afternoon and a good night.
And you are truly happy.
What all you want God to give you twice.
And do not forget what happened to the man who has everything I wanted ...
... He had a happy life for all forever.
I hope that this humble reply fan.
Me sorry for my English badly written, promise better.
Anything we see in the future.

[Neil says:] I don't really know what it is you're trying to find out, Bruno, but I think you ought to know that what the translator program turned it into was practically poetry, if it wasn't already.
isn't that pretty?

[identity profile] localcharacter.livejournal.com 2008-08-15 10:59 pm (UTC)
Sorry to get literal on you here, but from the word "ficção" you can tell it's Portuguese, which at least explains some of the problems: in Brazil, 2nd person address usually uses the 3rd person verb forms, possessives, etc.--only the pronoun itself is different. Good luck with MT on that!

[identity profile] trochee.livejournal.com 2008-08-15 11:06 pm (UTC)
I agree that it must be Portuguese. And yeah, the MT system would need a lot of context to know this is a 2nd person environment (question context might help a little, statistically speaking, but it's a very challenging problem).

No worries about getting literal -- I liked it for its non-literality, but it's also a very good example of how genre information is key. This is a personal letter, but the MT system was probably trained on parallel parliamentary proceedings (if it's statistical) or other third-person text (if non-statistical).

[identity profile] damidnara.livejournal.com 2008-08-16 04:49 am (UTC)
I've recently started reading his blog. I'm so excited he's coming to SF in a few months! Thanks for reminding me.