Proof of concept for an ML-written language

A little machine learning project I started (and left on indefinite hiatus) in 2022.

Now, I know what you're thinking: oh god, another art-abusing AI slop generator, now for conlangs too. But my whole goal with this was to make something from the ground up that doesn't learn language from how humans use it, but instead creates its own (super domain-specific) "language" purely by being forced to convey some information.

In the little demo I made, I generate 10 random numbers and use a neural network to turn them into the endpoints of 3 line segments. Those segments are rendered into an image, with some noise added and the endpoints jittered a little. Finally, another neural network tries to decipher the original 10 numbers. The networks get rewarded if the decoded numbers are close to the original 10.
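To make the setup concrete, here's a minimal sketch of the pipeline, assuming PyTorch. The layer sizes, the 32×32 resolution, the noise levels, and the soft (differentiable) line rasterizer are placeholder choices of mine, not the original implementation:

```python
import torch
import torch.nn as nn

IMG = 32      # image resolution (assumption)
N_SEGS = 3    # 3 segments x 2 endpoints x (x, y) = 12 outputs

encoder = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                        nn.Linear(64, N_SEGS * 4), nn.Sigmoid())
decoder = nn.Sequential(nn.Flatten(),
                        nn.Linear(IMG * IMG, 256), nn.ReLU(),
                        nn.Linear(256, 10))

def render(segs, sharpness=0.002):
    """Soft-rasterize segments (B, N_SEGS, 4) in [0,1] coords into (B, 1, IMG, IMG).
    Pixel intensity falls off with squared distance to the nearest segment,
    which keeps the whole pipeline differentiable."""
    B = segs.shape[0]
    xs = torch.linspace(0, 1, IMG)
    py, px = torch.meshgrid(xs, xs, indexing="ij")         # pixel centers
    p = torch.stack([px, py], -1).view(1, 1, IMG, IMG, 2)
    a = segs[..., :2].view(B, N_SEGS, 1, 1, 2)             # segment starts
    b = segs[..., 2:].view(B, N_SEGS, 1, 1, 2)             # segment ends
    ab = b - a
    # project each pixel onto each segment, clamped to the segment's extent
    t = ((p - a) * ab).sum(-1) / (ab * ab).sum(-1).clamp(min=1e-8)
    t = t.clamp(0, 1).unsqueeze(-1)
    d2 = ((a + t * ab - p) ** 2).sum(-1)                   # squared point-segment distance
    img = torch.exp(-d2 / sharpness).max(dim=1).values     # brightest segment wins per pixel
    return img.unsqueeze(1)

opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)
for step in range(10_000):
    x = torch.rand(64, 10)                                  # 10 random numbers per sample
    segs = encoder(x).view(-1, N_SEGS, 4)
    segs = segs + 0.01 * torch.randn_like(segs)             # jitter the endpoints a little
    img = render(segs)
    img = img + 0.05 * torch.randn_like(img)                # added pixel noise
    loss = nn.functional.mse_loss(decoder(img), x)          # "reward": decoded numbers close to originals
    opt.zero_grad()
    loss.backward()
    opt.step()
```

One design note on this sketch: drawing the lines with a smooth intensity falloff instead of hard pixels is what lets the decoder's loss gradient flow back through the image into the encoder's endpoints, so both networks can be trained end to end.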

These are 6 examples after some training. On the left are 10 random numbers, which are used to generate the image in the middle. Each image is decoded back into the numbers on the right.

The pie-in-the-sky goal would be to generate a mini-language that is completely alien but still reasonable and interpretable by humans. What would it mean if it came up with a Verb-Subject-Object structure?

Artistically, does it make sense to give this to an AI? You could argue that I should instead try to come up with my own exotic ways for how language could work, to train my conlanging muscles. My hope with this was that creating a novel language automatically and then studying it might be easier than coming up with one myself. But who's to say I'd know how to analyze it, or that I haven't accidentally programmed in a bunch of assumptions that force the language to behave like human languages? Whatever it comes up with won't really prove anything about language. It's safe to say this is more of an exercise in machine learning than in conlanging or linguistics.

All that being said, some things to experiment with: