One of my favorite pastimes — like many millennials — is the evolution of memes. These are images which relay a central idea or joke that is largely contextual: their value as a communicative device exists only within particular social spheres. Outside of this sphere, they deliver no such information because the users outside of the user base do not understand the origins or humor of the meme in question. You can think of memes, in some sense, as a type of social currency — a cash crop if you will.
Anyway, there are many things to be said about memes, most of which I do not want to get bogged down in. We’ll just keep it simple & say that memes are vital to the ecosystem of the World Wide Web.
What is a Mood?
When my good friend Carlo & I first came up with this idea for mood analysis, I ran into a tricky problem. How do I describe a mood to someone who is out of touch with social media? How do I make the idea of a mood accessible to a wider audience? For me, these questions constituted a mood in & of themselves.
A mood is a piece of textual information, typically a small bit about someone’s day-to-day living. This text explicitly describes a situation which we immediately connect to a particular feeling, & it is accompanied by a meme; the meme adds a humorous element to the situation & also works to make it relatable to others. To paraphrase my good friend Alex, a mood “refers to content or experiences that evoke a […] feeling, and in sharing, communicate to others something about the self of the poster/sharer. […] Moods are funny, sad, joyful, upsetting, uncomfortable, gross, specific, & confusing”, but are not necessarily limited to these things.
Use of Neural Nets
With the release of Mathematica version 11.1, about 30 different types of neural net layers have been added to the core of the language. With the high-level interface the notebook offers, it becomes really easy to build advanced neural networks. This is a consequence of the symbolic nature of the language: once a net is specified, the language fills in the needed details. For those interested, Stephen talks about how everything interfaces with the low-level library MXNet.
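As a small illustration of that symbolic style (a sketch of my own, not code from this project), a simple feed-forward net can be written in a few lines; the layer sizes here are arbitrary, & the language infers all of the connecting dimensions:

```wolfram
(* sketch: a tiny image classifier; sizes & class count are illustrative *)
net = NetChain[
  {
    LinearLayer[64],  (* fully connected layer; input dimension is inferred *)
    Ramp,             (* ReLU activation, shorthand for ElementwiseLayer[Ramp] *)
    LinearLayer[4],   (* one output per class *)
    SoftmaxLayer[]    (* turn raw scores into probabilities *)
  },
  "Input" -> NetEncoder[{"Image", {28, 28}, ColorSpace -> "Grayscale"}],
  "Output" -> NetDecoder[{"Class", Range[4]}]
]
```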
In particular, the NetTrain function trains the parameters of any net from examples, which makes it easy to build up multi-layer neural networks. The one we used for this analysis was LeNet, a type of network designed to recognize visual patterns directly from pixel images without creating large chains of composite functions, minimizing the amount of preprocessing involved. Its strength comes from the fact that the patterns it learns can be extremely varied & still be recognized, which makes for a very robust network.
Since it’s pretty much impossible to build a model around specific moods such as “tfw u accidentally throw ur car keys into the trash, take out the trash, & then realize u threw out ur keys but the garbage truck has already collected ur trash” (the sheer scope of possible moods explodes once you deal with that kind of specificity), we can at least attribute most moods to four different categories:
These seem basic enough to encompass a large swath of moods with relative ease. We use the above enumeration as the numeric representation of the mood categories so that the code can interpret them. The data contains ten memes apiece per mood. Here is a piece of the dataset:
Next, the appropriate LeNet code was written for this dataset:
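I no longer have the exact notebook code on hand, but a LeNet-style network in the Wolfram Language looks roughly like the sketch below. The four output classes follow the enumeration above; the 28×28 grayscale encoder is an assumption about how the memes were downsampled:

```wolfram
(* the classic LeNet layout: two convolution/pooling stages, then dense layers *)
lenet = NetChain[
  {
    ConvolutionLayer[20, 5], Ramp, PoolingLayer[2, 2],
    ConvolutionLayer[50, 5], Ramp, PoolingLayer[2, 2],
    FlattenLayer[],
    LinearLayer[500], Ramp,
    LinearLayer[4],   (* one output per mood category *)
    SoftmaxLayer[]
  },
  "Input" -> NetEncoder[{"Image", {28, 28}, ColorSpace -> "Grayscale"}],
  "Output" -> NetDecoder[{"Class", Range[4]}]
]
```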
Then, we build the model:
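Training is a single call to NetTrain. A sketch, assuming `lenet` is the LeNet network from the previous step & `trainingData` is the dataset expressed as a list of Image -> category rules (both names are mine):

```wolfram
(* trainingData: {meme1 -> 1, meme2 -> 3, ...} using the numeric categories *)
trained = NetTrain[lenet, trainingData, MaxTrainingRounds -> 30]
```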
As you might be able to guess from the training progress, the net doesn’t seem to be converging the way we hoped it would. Still, it’s worthwhile to check the results:
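A quick sanity check is to run the trained net (call it `trained`) over a held-out set & tally its answers; the names here are hypothetical:

```wolfram
(* testData: a list of Image -> category rules held out from training *)
predictions = trained /@ Keys[testData];
Counts[predictions]  (* one dominant class here signals a degenerate model *)
```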
Ok so, we definitely did not get the results we hoped for. That is actually way more interesting to me though, because it raises so many new questions: why did the machine try to sort every meme into the “happy” category? Is my sample size too small? Are there issues with the way machines render & interpret images that are contextually vague? Can non-human images be reconciled & contrasted with human images? And if so, can enough information/value be extracted to make meaningful inferences about the moods?
Use of Classify
In addition to neural networks, we also used the built-in function Classify, which is more automated than the family of ‘Net’ functions (NetModel, NetTrain, NetGraph, etc.) in the sense that it requires almost no manual setup; the training still happens, it’s just handled for you. You can think of NetTrain as a Linux machine & Classify as a Mac: the Mac works straight out of the box with little user configuration, while the Linux machine is designed for complete customizability.
The Classify feature has been around since the introduction of image processing in the Wolfram Language, but its capabilities have been significantly expanded upon since then. So, might as well take advantage of it!
Using the same data set with the same dimensions, we write the code to be input into Classify:
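Classify accepts the data directly as a list of example -> label rules, so the preparation is minimal. A sketch, where `memes[n]` stands for the list of ten Image objects in mood category n (the name is a placeholder of mine):

```wolfram
(* flatten the four categories into one list of Image -> category rules *)
classifyData = Flatten[Table[img -> n, {n, 4}, {img, memes[n]}]];
```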
Then, we build the ClassifierFunction:
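Building the ClassifierFunction is one call; Classify chooses its own method unless you force one with the Method option. Sketched with the hypothetical `classifyData` from above:

```wolfram
cf = Classify[classifyData]  (* returns a ClassifierFunction *)
cf[newMeme]                  (* apply it to a fresh image to get a category *)
```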
We can’t really hypothesize how the results will turn out here the way we could before, so we absolutely need to check the results to confirm an accurate model has been built:
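ClassifierMeasurements is the built-in way to do this check. A sketch, assuming `cf` is the ClassifierFunction from the previous step & `testData` is a held-out list of Image -> category rules (names are mine):

```wolfram
cm = ClassifierMeasurements[cf, testData];
cm["Accuracy"]             (* overall fraction of correctly assigned categories *)
cm["ConfusionMatrixPlot"]  (* shows which moods get mistaken for which *)
```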
While the results are not stellar here either, we can still make some important observations about the differences between the two analyses: this model did a much better job of matching images with their associated moods, but why? Or rather, how?
There must be more going on with the low-level interfaces than we can make sense of. Perhaps it has to do with the way each function understands & quantifies images? I am really not sure. I might come back to this post at a later date & work to fill in some of my knowledge gaps as I learn more about machine learning & image processing.
Machine Failure: Why Can’t Machines #Realize Things in 2017?
Ultimately, the failure of machines to fully grasp the idea of moods is not something that really needed to be demonstrated in this way: moods are highly variable, volatile, contextual, ambiguous, & in some cases incomprehensible — even for humans.
But what allows us to connect feelings to images — what allows us to relate thoughts, concerns, & people together — comes down to social cues. It isn’t just a learned pattern of behavior that lets us absorb the value of moods, but rather it is the combination of these patterns together with the added information we inherently consume while being steeped in the environment we live in; humans do not exist in a vacuum — we are affected by the unconscious messages we receive starting from childhood. This isn’t anything new, but for some reason we ostensibly have trouble balancing these ideas when it comes to artificial intelligence.
I don’t think we can reasonably expect to understand moods on the analytical level alone, & because of that I find it hard to believe we will develop machine learning methods that extract the full breadth of value moods contain.
But perhaps this is the biggest mood of all.