The United States Navy wants to study how people talk to one another on online forums and platforms, and to that end it is trying to build a huge archive of at least 350 billion social media posts from across the globe.
However, the project team has not specified which social media platform or platforms it wants to collect the data from.
The project has set a number of conditions for selecting the posts. First, the posts must be publicly available. They must also come from at least 100 different countries and be written in at least 60 different languages. Finally, the US Navy wants the posts to date from between 2014 and 2016.
These details were set out in a tender document issued by the Naval Postgraduate School, which sought a company to gather and provide the data. The deadline for responding to the tender has now passed.
The US Navy also listed a number of other requirements: the posts must be gathered from at least 200 million unique users, and no single country may account for more than 30 per cent of the total posts. The US Navy also wants at least 50 per cent of the posts to be in a language other than English, and at least 20 per cent of the records to include location information for the post. However, the project and the archive would not include private messages or user information.
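The numeric requirements reported above can be read as a simple checklist. The following is only an illustrative sketch, not anything taken from the tender document: the `Post` record, its field names, the `check_requirements` function and the use of the language code `"en"` for English are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """Hypothetical record for one public post (field names are assumptions)."""
    user_id: str
    country: str
    language: str          # assumed language code, e.g. "en" for English
    year: int
    location: Optional[str] = None


def check_requirements(posts, min_posts=350_000_000_000, min_users=200_000_000,
                       min_countries=100, min_languages=60):
    """Return a pass/fail flag for each requirement reported in the article."""
    total = len(posts)
    by_country = Counter(p.country for p in posts)
    largest_country = max(by_country.values(), default=0)
    non_english = sum(p.language != "en" for p in posts)
    with_location = sum(p.location is not None for p in posts)
    return {
        "enough_posts": total >= min_posts,
        "enough_users": len({p.user_id for p in posts}) >= min_users,
        "enough_countries": len(by_country) >= min_countries,
        "enough_languages": len({p.language for p in posts}) >= min_languages,
        "date_range_ok": all(2014 <= p.year <= 2016 for p in posts),
        # Percentages are compared with integer arithmetic to avoid rounding.
        "country_cap_ok": total > 0 and largest_country * 100 <= 30 * total,
        "non_english_ok": total > 0 and non_english * 100 >= 50 * total,
        "location_ok": total > 0 and with_location * 100 >= 20 * total,
    }
```

The default thresholds are the figures quoted in the article. Passing smaller values for `min_posts`, `min_users`, `min_countries` and `min_languages` makes it possible to try the check on a small sample.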
“Social media data allows us, for the first time, to measure how colloquial expressions and slang evolve over time, across a diverse array of human societies, so that we can begin to understand how and why communities come to be formed around certain forms of discourse rather than others,” T Camber Warren, the project’s lead researcher, said in an interview with Bloomberg.
Tor, the anonymous browsing network created in 2002, was backed by the US Navy. The main aim of Tor, which is also known as The Onion Router, is to conceal where people go online. It does this by encrypting requests for web pages and bouncing them at random through a network of different computers.
(Adapted from RT.com)