StoryWrangler: Exploratorium for billions of social media messages predicting political and financial turmoil


Language and intellect: these are humanity's greatest social technologies. For centuries, words and speeches have shaped civilizations and changed the course of events, whether under Hitler or during the French Revolution. Now, with advancing technology, we have moved to social platforms to express our words and ideas. A striking fact: on average, 500 million tweets are shared on Twitter each day. That is about 6,000 tweets per second, and over a year Twitter handles around 200 billion tweets.

Number of Twitter users globally
(Source: Omnicore)

These numbers have a real effect on the world we live in. Apart from fun tweets and idle chatter, Twitter also records world events, popular culture and everyday life, forming an ever-growing compendium of language change. Its impact on society has already been witnessed with #metoo and #BlackLivesMatter. So how exactly can a single tweet make Bitcoin rise, Wall Street fall, or political unrest flare up? Can this be predicted, and is there a tool for it?

A New Insane Invention

What if every soldier, home-worker, shopkeeper and teenager had recorded their opinion during the French Revolution? We would surely have a very different set of stories about the rise and reign of Napoleon, wouldn't we?

Researchers at the University of Vermont have built an instrument called StoryWrangler. It looks deep into billions of Twitter posts and provides an evolving view of what is rising and trending, whether that is anime or K-pop, Dogecoin or Elon Musk, India-China relations, the BJP or emerging diseases. StoryWrangler draws on Twitter data and tracks its changes over time to chart a temporally evolving corpus. It can potentially reveal social amplification, the sociotechnical dynamics of famous individuals (not only Musk), box office success and social unrest.

What is the story behind StoryWrangler, and how does it work?

StoryWrangler serves as a platform for research in computational social science, data journalism, natural language processing and the digital humanities. It captures casual daily conversation in a way that newspapers, articles and books cannot. It gathers phrases from more than 150 languages and analyses them as the rise and fall of ideas and stories among people around the world.

Performing these tasks was a complex process, and the StoryWrangler team had to dig all the way back to 2008. They drew on a storehouse of messages comprising roughly 10% of all tweets posted since September 2008, covering over 150 languages. One obstacle in handling the data was regional dialects, native words and slang. Still, the team managed to classify the messages by language, date and location.

N-gram example
(Source: Devopedia)

For each day t and each language l, tweets were divided into organic tweets (OT) and retweets (RT). Organic tweets are originally authored content, as opposed to the retweeted content they may refer to. Each tweet was then broken down into 1-grams, 2-grams, 3-grams and so on, up to n-grams. Now, what is 'n' here?

The challenge for the researchers was to sort out the popularity of different n-grams. An n-gram is a sequence of n consecutive "words" in a text, where a "word" can also be a character, emoji, numeral or symbol. This framework is inspired by Google's n-gram project, which breaks a large collection of books down into small pieces. In the same way, StoryWrangler parses tweets into daily frequencies of words, 2-word phrases and 3-word expressions.
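The n-gram breakdown described above can be illustrated with a minimal Python sketch. It is not StoryWrangler's actual parser: the whitespace tokenizer, the lowercasing step and the function names `tokenize`, `ngrams` and `count_ngrams` are assumptions made for illustration, and the sample tweets are made up.

```python
import re
from collections import Counter

# Assumption: a "word" is any run of non-whitespace characters, so hashtags,
# emojis, numerals and symbols all count as words.
TOKEN_PATTERN = re.compile(r"\S+")


def tokenize(tweet):
    """Split a tweet into lowercase tokens (lowercasing is a simplification)."""
    return TOKEN_PATTERN.findall(tweet.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tweets, n):
    """Count how often each n-gram appears across a collection of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tokenize(tweet), n))
    return counts


# Made-up sample tweets, for illustration only.
tweets = [
    "Black Lives Matter",
    "black lives matter 😊",
    "covid cases rising",
]

one_grams = count_ngrams(tweets, 1)
two_grams = count_ngrams(tweets, 2)
print(one_grams.most_common(3))
print(two_grams.most_common(3))
```

Running the same counting once per day and per language would give the daily word and phrase frequencies the article refers to.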

The Hashtag game of predictions

For any given n-gram, the daily frequency changes and its popularity rank rises or falls. For example, 'god' is roughly the 300th most commonly used word each day. For some words, the rank rises and falls cyclically. Interestingly, on one glorious day in 2015, "😊" was the most commonly used word! Similarly, movie titles such as 'Avengers' produce more symmetric patterns centered on their release dates.
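The idea of a daily popularity rank can be sketched in a few lines of Python. The counts below are invented, and the function name `daily_ranks` and the alphabetical tie-breaking rule are assumptions for illustration rather than StoryWrangler's own method.

```python
from collections import Counter


def daily_ranks(counts):
    """Map each n-gram to its rank for one day (1 = most used).

    Ties are broken alphabetically, which is an arbitrary choice.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered, start=1)}


# Invented counts for an ordinary day and a hypothetical movie release day.
day_counts = {
    "ordinary day": Counter({"the": 900, "god": 40, "avengers": 5}),
    "release day": Counter({"the": 950, "avengers": 120, "god": 41}),
}

rank_series = {day: daily_ranks(counts) for day, counts in day_counts.items()}
for day, ranks in rank_series.items():
    print(day, ranks)
```

Plotting an n-gram's rank across many days would produce the kind of rise-and-fall curve described above, with a spike around an event such as a release date.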

Trend of trending hashtags
(Source: University of Vermont)

Nor can it be assumed that a single word (n=1) will always trend more. Even among single words the dominant term can shift: in the first six months of 2020, 'coronavirus' gave way to 'covid' as the dominant term of reference on Twitter for the COVID-19 pandemic. Likewise, it cannot be assumed that a longer phrase (n=2 and above) will trend less. For example, #BlackLivesMatter trended more than #metoo after George Floyd's death, although it is made up of more words. To handle this, the researchers derived a formula after more than a couple of years of research and observation.

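We denote an n-gram by $\tau$ and a day's lexicon for language $\ell$, the set of distinct n-grams found in all tweets (AT) on a given date $t$, by $\mathcal{D}_{t,\ell;n}$. We write the n-gram's raw frequency as $f_{\tau,t,\ell}$ and compute its usage rate in all tweets written in language $\ell$ as

$$p_{\tau,t,\ell} = \frac{f_{\tau,t,\ell}}{\sum_{\tau' \in \mathcal{D}_{t,\ell;n}} f_{\tau',t,\ell}}$$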
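As a rough illustration of the usage rate, the sketch below assumes it is an n-gram's raw frequency divided by the total frequency of all n-grams in that day's lexicon for one language. The function name `usage_rates` and the counts are hypothetical, and the normalization is an assumption of this sketch.

```python
from collections import Counter


def usage_rates(counts):
    """Convert raw n-gram frequencies for one day and language into usage rates.

    Assumes each rate is the n-gram's raw frequency divided by the summed
    frequency of every n-gram in the day's lexicon.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: freq / total for gram, freq in counts.items()}


# Invented 1-gram counts for a single day and language.
lexicon = Counter({"covid": 60, "coronavirus": 20, "god": 20})
rates = usage_rates(lexicon)
print(rates)
```

Because the rates for one day sum to 1, they can be compared across days even when the total volume of tweets changes.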

The Future and Global Story ahead

As an online tool, it offers a powerful lens for viewing and analyzing the daily rise and fall of ideas, words, expressions and stories across the global community. Some of its early results are striking in their own right. StoryWrangler showed potential for predicting political and financial turmoil. The team also found that the use of words like 'rebellion' and 'crackdown' in various regions of the world is closely associated with a well-established index of geopolitical risk for those same places.

“The StoryWrangler gives us a data-driven way to index what regular people are talking about in everyday conversations, not just what reporters or authors have chosen; it’s not just the educated or the wealthy or cultural elites,” says applied mathematician Chris Danforth, a professor at the University of Vermont who co-led the creation of the StoryWrangler.

Revolution of Evolution

The UVM team is working with the National Science Foundation and using Twitter to demonstrate how simple chatter on distributed social media can act as a global sensor system for what happened, what is happening now and what might come next. Other social media streams, from Reddit to 4chan and Weibo, could also feed data to StoryWrangler and be used to follow the fame and fate of political leaders and sports stars. Twitter does not represent all of humanity, but StoryWrangler quantifies the 'expressions of the many', and that is why it is succeeding.

StoryWrangler is a notable scientific advance. The tool could change how we study ourselves, whether by tracing reactions to major news events and natural disasters or by following fame, fate, views, insights and much more.

The world is definitely changing with this: now it is not only the words of the powerful, the wealthy and the cultural elites that count, but those of ordinary people too!

OSD

Om Desai
M.Sc. Integrated Chemistry. Research. Blogging | Content Writing | Science & Tech. Photography
