Nishanth Sastry

Professor of Computer Science

University of Surrey

Biography

Prof. Nishanth Sastry is Joint Head of the Distributed and Networked Systems Group at the Department of Computer Science, University of Surrey. He is also a Visiting Researcher at the Alan Turing Institute, where he co-leads the Social Data Science Special Interest Group.

Prof. Sastry holds a Bachelor’s degree (with distinction) from R.V. College of Engineering, Bangalore University, a Master’s degree from the University of Texas at Austin, and a PhD from the University of Cambridge, all in Computer Science. Previously, he spent over six years in industry (Cisco Systems, India and IBM Software Group, USA) and industrial research labs (IBM TJ Watson Research Center). He has also spent time at the Massachusetts Institute of Technology Computer Science and AI Laboratory.

His honours include a Best Paper Award at SIGCOMM Mobile Edge Computing in 2017, a Best Paper Honorable Mention at WWW 2018, a Best Student Paper Award at the Computer Society of India Annual Convention, a Yunus Innovation Challenge Award at the Massachusetts Institute of Technology IDEAS Competition, a Benefactor’s Scholarship from St. John’s College, Cambridge, a Best Undergraduate Project Award from R.V. College of Engineering, a Cisco Achievement Program Award and several awards from IBM. He has been granted nine patents in the USA for work done at IBM.

Nishanth has been a keynote speaker, and his work has received coverage in print media such as The Times UK, New York Times, New Scientist and Nature, as well as television media such as BBC, Al Jazeera and Sky News. He is a member of the ACM and a Senior Member of the IEEE.

Interests

  • Computer Networks and their architecture
  • Social Networks and Computational Social Science
  • Data Analytics and Machine Learning in support of the above two

Education

  • PhD in Computer Science

    University of Cambridge

  • MA in Computer Science

    University of Texas at Austin

  • BE in Computer Science and Engineering

    R.V. College of Engineering, Bangalore University

Students and collaborators

Postdocs

Alessandro Di Stefano

PDRA working on multiplex networks and game theory

Animesh Chaturvedi

PDRA working on online harms and social media (to join shortly)

Aravindh Raman

PDRA, working on network measurements

Damiano Di Francesco Maesa

PDRA working on distributed ledgers for 5G

Frank Sardis

Managing 5G Lab infrastructure

PhD Students

Abdullahi Abubakar

PhD Student working on sharing economy applications over edge networks for developing regions

Emeka Obiodu

PhD Student, working on differentiated services for 5G

Pushkal Agarwal

PhD Student working with the UK Parliament on Digital Citizen Engagement

Tooba Faisal

PhD Student working with Vodafone on Service Level Agreements at the Network Edge

Xuehui (Rachel) Hu

PhD Student, working on third party trackers and GDPR

Visitors

Miriam Redi

Visiting Researcher, Wikimedia Research

Alumni

Changtao Zhong

Former PhD student (now Data Scientist at Twitter)

Dmytro Karamshuk

Former Postdoc (now Research Scientist at Facebook Core Data Science)

Peter Young

Former Postdoc (now Data Scientist at Accuity)

Sagar Joglekar

Former PhD student (now Research Scientist at Bell Labs Cambridge)

Recent Publications

CCCC: Corralling Cookies into Categories with CookieMonster

Browser cookies are ubiquitous in the web ecosystem today. Although these cookies were initially introduced to preserve user-specific state in browsers, they are now used for numerous other purposes, including user profiling and tracking across multiple websites. This paper sets out to understand and quantify the different uses for cookies, and in particular, the extent to which targeting and advertising, performance analytics and other uses that serve only the website and not the user add to overall cookie volumes. We start with 31 million cookies collected in Cookiepedia, which is currently the most comprehensive database of cookies on the Web. Cookiepedia provides a useful four-part categorisation of cookies into strictly necessary, performance, functionality and targeting/advertising cookies, as suggested by the UK International Chamber of Commerce. Unfortunately, we found that Cookiepedia data can categorise less than 22% of the cookies used by Alexa Top20K websites and less than 15% of the cookies set in the browsers of a set of real users. These results point to an acute problem with the coverage of current cookie categorisation techniques. Consequently, we developed CookieMonster, a novel machine learning-driven framework which can categorise a cookie into one of the aforementioned four categories with more than 94% F1 score and less than 1.5 ms latency. We demonstrate the utility of our framework by classifying cookies in the wild. Our investigation revealed that in Alexa Top20K websites, necessary and functional cookies constitute only 13.05% and 9.52% of all cookies respectively. We also apply our framework to quantify the effectiveness of tracking countermeasures such as privacy legislation and ad blockers. Our results identify a way to significantly improve the coverage of cookie classification today and reveal new patterns in the usage of cookies in the wild.

Differential Tracking Across Topical Webpages of Indian News Media

To share or not to share: reliability assurance via redundant cellular connectivity in Connected Cars

There is growing adoption of connected cars (CCs) across society, and the expectation is that 5G will better support safety-critical vehicle-to-everything (V2X) use cases. Operationally, most relationships between cellular network providers and car manufacturers or users are exclusive, providing connectivity through a single network, with at best an occasional option of a back-up plan if that network is unavailable. We question whether this setup can provide QoS assurance for V2X use cases. Accordingly, in this paper, we investigate the role of redundancy in providing QoS assurance for cellular connectivity for CCs. Using our bespoke Android measurement app, we conducted a drive-through test on 380 kilometers of major and minor roads in South East England. We measured round trip times, jitter, page load times, packet loss, network type, and uplink and downlink speeds on the four UK networks for 14 UK-centric websites every five minutes. In addition, we took the same measurements using a much more expensive universal SIM card provider that promises to fall back on any of the four UK networks to assure reliability. By comparing actual performance on the best performing network versus the universal SIM, and then projected performance of a two/three/four multi-operator setup, we make three major contributions. First, the use of redundant multi-connectivity, especially if managed by the demand-side, can deliver superior performance (up to 28 percentage points in some cases). Second, despite costing 95x more per GB of data, the universal SIM performed worse than the best performing network except for uplink speed, highlighting how the choice of parameter to monitor can influence operational decisions. Third, any assessment of CC connectivity reliability based on availability is sub-optimal as it can hide significant under-performance.

Under the Spotlight: Web Tracking in Indian Partisan News Websites

Social media has been in the vanguard of political information diffusion in the 21st century. Most studies that look into disinformation, political influence and fake news focus on mainstream social media platforms. This has inevitably made English an important factor in our current understanding of political activity on social media. As a result, there has only been a limited number of studies into a large portion of the world, including the largest multilingual and multicultural democracy: India. In this paper we present our characterisation of a multilingual social network in India called ShareChat. We collect an exhaustive dataset across 72 weeks before and during the Indian general elections of 2019, across 14 languages. We investigate the cross-lingual dynamics by clustering visually similar images together and exploring how they move across language barriers. We find that Telugu, Malayalam, Tamil and Kannada languages tend to be dominant in soliciting political images (often referred to as memes), and posts from Hindi have the largest cross-lingual diffusion across ShareChat (as well as images containing text in English). In the case of images containing text that cross language barriers, we see that language translation is used to widen the accessibility. That said, we find cases where the same image is associated with very different text (and therefore meanings). This initial characterisation paves the way for more advanced pipelines to understand the dynamics of fake and political content in a multilingual and non-textual setting.

Contact