Who has the last word? Understanding How to Sample Online Discussions


In online debates, as in offline ones, individual utterances or arguments support or attack each other, so that some subset of arguments (potentially from different sides of the debate) comes to be considered more relevant than others. However, online conversations are much larger in scale than offline ones, often with hundreds of thousands of users weighing in and collaboratively forming large trees of comments by starting from an original post and replying to each other. In large discussions, readers are often forced to sample a subset of the arguments being put forth. Since such sampling is rarely done in a principled manner, readers may miss relevant arguments and fail to get a full picture of the debate from their sample. This paper asks how users should sample online conversations to selectively favour the currently justified or accepted positions in the debate. We apply techniques from argumentation theory and complex networks to build a model that predicts the probability that an argument is normatively justified given its location in idealised online discussions of comments and replies, which we represent as trees. Our model shows that the proportion of replies that are supportive, the distribution of the number of replies that comments receive, and the locations of comments that do not receive replies (i.e., the "leaves" of the reply tree) all determine the probability that a comment is a justified argument given its location. We show that when the distribution of the number of replies is homogeneous along the tree length, for acrimonious discussions (with more attacking comments than supportive ones), the distribution of justified arguments depends on the parity of the tree level, which is the distance from the root expressed as the number of edges. In supportive discussions, which have more supportive comments than attacks, the probability of having justified comments increases as one moves away from the root.
For discussion trees that have a non-homogeneous in-degree distribution, supportive discussions show the same behaviour as before, while acrimonious discussions no longer show the parity-based distribution. This is verified with data obtained from the online debating platform Kialo. By predicting the locations of the justified arguments in reply trees, we can therefore suggest which arguments readers should sample to grasp the currently accepted opinions in such discussions. Our models have important implications for the design of future online debating platforms.
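The parity effect described above can be illustrated with a toy sketch. This is not the paper's model: it assumes a simplified rule under which a comment is justified if and only if none of its justified replies attacks it, so that leaves are always justified and supportive replies never defeat their parent. The names `Comment`, `is_justified`, `justified_fraction_by_level`, and `attack_chain` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Each reply is stored as (kind, Comment), where kind is "attack" or "support"
    # and describes the reply's stance towards this comment.
    replies: list = field(default_factory=list)


def is_justified(comment):
    # Assumed rule (not taken from the paper): a comment is justified
    # iff no justified reply attacks it. A leaf has no replies, so it is justified.
    return not any(
        kind == "attack" and is_justified(reply)
        for kind, reply in comment.replies
    )


def justified_fraction_by_level(root):
    # Map each tree level (distance from the root in edges) to the
    # fraction of comments at that level that are justified.
    counts = {}  # level -> [number justified, number of comments]
    stack = [(root, 0)]
    while stack:
        node, level = stack.pop()
        tally = counts.setdefault(level, [0, 0])
        tally[0] += is_justified(node)
        tally[1] += 1
        stack.extend((reply, level + 1) for _, reply in node.replies)
    return {level: j / t for level, (j, t) in sorted(counts.items())}


def attack_chain(depth):
    # Build a purely acrimonious thread: every comment attacks its parent.
    node = Comment()
    for _ in range(depth):
        node = Comment(replies=[("attack", node)])
    return node


if __name__ == "__main__":
    print(justified_fraction_by_level(attack_chain(3)))
```

Under this assumed rule, a chain of attacks alternates between justified and unjustified comments as one moves up from the leaf, which is the simplest instance of a parity dependence on tree level. A reply that only supports its parent leaves the parent justified.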

ACM Transactions on the Web (TWEB)
Peter Young
Former Postdoc (now Data Scientist at Accuity)
Sagar Joglekar
Former PhD student (now Research Scientist at Bell Labs Cambridge)

I am a Research Scientist at Nokia Bell Labs, Cambridge, UK, working with the social dynamics team. I am mainly interested in projects that quantify human processes from web-scale data using methods from complex networks, machine learning, and computer vision. I was a King's India scholar at King's College London, where I worked on my Ph.D. in computer science at the Department of Informatics under the guidance of Dr. Nishanth Sastry. I hold a Master of Science (M.S.) degree from the University of California at Santa Barbara, USA, majoring in signal processing and networks, and a Bachelor of Engineering (B.Eng.) from the University of Pune, India, majoring in electronics engineering.