911爆料网

Getting clever with big data to understand the spread of ISIS propaganda

Through a combination of big data analysis and multimodal discourse analysis, 911爆料网 researchers are providing valuable insight into how meaning is communicated and endlessly reinterpreted through digital networks on a mega-scale. The team has applied the approach to understanding the spread of ISIS propaganda.

Multimodal discourse analysis – we all do it all the time. Interpreting combinations of language and imagery within a specific context to create and convey nuanced meaning. It comes as naturally as breathing, until you need to communicate in a foreign language in a country where all the cultural cues and contexts are different. Misunderstandings and confusion abound. So how do we do it in a digital world?

An emerging problem is that people now communicate through digital media, and our understanding of multimodal discourse can’t keep up. Large volumes of information circulate and are changed and acted on very quickly. ‘Threads’ of content mutate rapidly, and unexpected outcomes occur. Foreign interference in elections via Facebook, anyone?

Within 911爆料网’s Faculty of Humanities, Professor Kay O’Halloran, Dr Sabine Tan, Dr Peter Wignell and Mr Michael Wiebrands classify and analyse multimodal discourse – understanding just how combinations of language and images function within a specific context to create meaning (see, even the italics make a difference). But the spread of electronic media has moved their task beyond the human scale. Classifying and analysing all the various dimensions of a piece of human communication is a painstaking manual effort: it cannot be easily scaled up to track the patterns of communication that precede and develop from it over time across different groups.

As O’Halloran points out, “We have the classification tools and the theory to explain how text and images work and how they combine, but we can’t scale it and automate it. Ten years ago we started working with computational scientists, image processing, speech processing, and mathematicians to automate classification and analysis of bigger data sets. But you need LOTS of manually coded data to train a model, as the theoretical framework is very complex, just like nuanced communication.”

Since it’s hard to automate the theory, the group is now collaborating with Dr Rebecca Lange, Dr Kevin Chai and Dr Rui Wang to attack the problem from the other direction.

“We’re applying state-of-the-art natural language processing, computer vision and machine learning techniques to the problem,” explains Chai.

The aim is to automate the classification of key words and key objects in images as much as possible within a multimodal discourse framework, closing the gap between highly detailed contextual analysis of small samples of multimodal texts, and aggregated decontextualised big data approaches.

They’ve tested the approach by studying how violent extremist groups such as Islamic State (ISIS) use images and text to legitimise their views, incite violence and influence potential recruits and supporters in online propaganda materials. In particular, they tracked how extremist communications are spread and re-interpreted by different audiences over time.

In a pilot study, the multidisciplinary team investigated the nature of images that appear in ISIS’s official propaganda magazines Dabiq and Rumiyah, and how this source material is then reused and recontextualised across public online media sites such as news websites, blogs, and social media.

From three years’ worth of propaganda content, 537 articles were manually categorised using multimodal discourse analysis into 20 different article types, with the 1,575 embedded images assigned to 8 categories and 69 sub-categories. The best commercially available computing tools were then put to work. Image analysis software was used to conduct a ‘reverse image’ search of a selected subset of 26 propaganda images across the public web, leading to the identification of 8,832 websites that republished the content in some form. Selecting only downloadable pages in English reduced the sample set to 3,840 websites. Natural language processing models then automatically categorised the text into a hierarchy of categories covering content (war, sport, humour, politics, etc.) and tone (formality of language, positive or negative associations).
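The filtering and categorisation steps can be illustrated with a small sketch. This is not the team's pipeline, which relied on commercial tools. The `Hit` record, the keyword lexicon and the example URLs are all hypothetical, and a toy keyword count stands in for the natural language processing models.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hit:
    """One page returned by a reverse image search (hypothetical record)."""
    url: str
    language: str       # language code such as "en"
    downloadable: bool  # whether the page text could be fetched
    text: str = ""


# Toy keyword lexicon: an assumption standing in for a trained text classifier.
CATEGORY_KEYWORDS = {
    "war": {"attack", "militants", "troops", "airstrike"},
    "politics": {"election", "government", "minister", "policy"},
    "sport": {"match", "league", "goal", "team"},
}


def filter_hits(hits):
    """Keep only downloadable English-language pages."""
    return [h for h in hits if h.downloadable and h.language == "en"]


def categorise(text):
    """Return the category whose keywords occur most often, or 'other'."""
    words = Counter(text.lower().split())
    scores = {
        category: sum(words[w] for w in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


hits = [
    Hit("https://example.org/a", "en", True, "Militants claimed the attack on troops"),
    Hit("https://example.org/b", "fr", True, "Le gouvernement a annonce une mesure"),
    Hit("https://example.org/c", "en", False),
    Hit("https://example.org/d", "en", True, "The government announced a new policy"),
]

kept = filter_hits(hits)
labels = {h.url: categorise(h.text) for h in kept}
```

In the study, the equivalent filter retained 3,840 of 8,832 websites, or roughly 43 per cent of the reverse image search results.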

Chai, Lange and Wang’s expertise was also used to analyse and visualise this network of connections to identify patterns developing between the original ISIS propaganda and its repurposing across the web over time.
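A network of this kind can be pictured as links from each source image to the sites that republished it, each with a date attached. The sketch below is a minimal illustration under that assumption. The record layout, image identifiers and site names are invented, and it does not reproduce the team's actual analysis or visualisation.

```python
from collections import defaultdict
from datetime import date

# Hypothetical republication records:
# (source image id, republishing site, site category, date seen)
records = [
    ("img_01", "news-a.example", "news", date(2015, 3, 1)),
    ("img_01", "blog-b.example", "blog", date(2015, 3, 4)),
    ("img_02", "news-a.example", "news", date(2016, 7, 9)),
    ("img_01", "news-c.example", "news", date(2016, 1, 15)),
]


def build_network(records):
    """Map each source image to the set of sites that republished it."""
    edges = defaultdict(set)
    for image, site, _category, _seen in records:
        edges[image].add(site)
    return edges


def reuse_by_year(records):
    """Count republications per (year, site category) to expose trends over time."""
    counts = defaultdict(int)
    for _image, _site, category, seen in records:
        counts[(seen.year, category)] += 1
    return dict(counts)


network = build_network(records)
reach = {image: len(sites) for image, sites in network.items()}
trend = reuse_by_year(records)
```

Here `reach` gives the number of distinct sites that reused each image, and `trend` shows how reuse shifts between site categories from year to year.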

“It made us realise how complex their manual analysis is compared to what we do in computer science,” admits Chai. “We’re very good at getting computers to recognise a car in an image, for example. But they want to know the meaning of the car in the image, its relationship to other objects (e.g., between a person and a car), and the narrative theme which is portrayed. We don’t consider any of that. The Humanities researchers are basically reminding us that there are lots of other variables we haven’t yet considered.”

Even searching for keywords in text isn’t straightforward – the meaning of a word is context-dependent, conditioned by the meanings that have gone before, and it equally colours our interpretation of what comes afterwards.

But using the big data tools available, the team investigated how extremist communications work: what types of images are effective, how different groups react to them, and how they re-use and re-publish them, giving them new meanings over time.

In the case of ISIS, distinct patterns of text and images emerged in Dabiq and Rumiyah over time. It became apparent that ISIS adapts its propaganda in accordance with its activities – for example, when ISIS was gaining territory in Iraq and Syria, there were calls for supporters to join them there, but when they subsequently retreated, propaganda promoting ‘lone wolf’ attacks was more common. The analysis of propaganda text and image relations may eventually have significant predictive power.

Among English-language websites, ISIS propaganda images recirculated most frequently on Western news and politics websites and official webpages and blogs, in predominantly formal contexts. However, the images, when considered in relation to the accompanying texts, can have the effect of inadvertently legitimising ISIS’s values and agenda. Images often have more impact than text – a very positively coded image, for example ISIS militants all celebrating, wearing army fatigues and carrying flags, can strengthen their iconographic status, and counteract the text accompanying it that may denounce their actions.

“Another really good text example was when a media outlet referred to them as the ‘ISIS government’,” says O’Halloran. “It legitimises them as some sort of democratically elected representative of the people, because that’s our contextual understanding of the word ‘government’. It gets published like that once, but then it spreads.”

The pilot study is a first step in understanding the communication strategies employed by violent extremists, and the patterns of recontextualisation in the spread of their images in online media, with the aim of informing effective strategies for countering them.

The work also complements existing efforts to track and remove online violent and extremist content. A computer can find and remove specific images of people wearing balaclavas and holding guns. But without multimodal discourse analysis it’s much harder to identify and remove images that perpetuate terrorist ideology.

The study is a prototype for bringing big data tools and multimodal discourse analysis together, carrying the theory of language and meaning into the digital world. And we need it – as modern communications revolutionise how fast information and ideas can spread and change, this approach can improve our understanding of human issues and how they evolve in any domain – from violent extremists to politics, advertising or the latest pop-culture memes.

The urgency of this work is demonstrated by the team’s latest grant, awarded by the Centre for Interdisciplinary Research (ZiF) at Bielefeld University in Germany. This project brings together international experts in multimodal discourse analysis, sociopolitical analysis and computer data analytics to develop new approaches, tools and techniques for investigating multimodal rhetoric in online news media (think Trump and Brexit). Videos will be analysed alongside text and graphics, adding to the complexity but also the value of the approach. Together, this international team will chart the future direction of this field, and make it clear that interdisciplinary collaboration is essential to tackle major issues arising from human communication in today’s digital world.
