ATR-Vis: Visual and Interactive Information Retrieval for Parliamentary Discussions in Twitter

Published: 06 February 2018

Abstract

The worldwide adoption of Twitter turned it into one of the most popular platforms for content analysis as it serves as a gauge of the public’s feeling and opinion on a variety of topics. This is particularly true of political discussions and lawmakers’ actions and initiatives. Yet, one common but unrealistic assumption is that the data of interest for analysis is readily available in a comprehensive and accurate form. Data need to be retrieved, but due to the brevity and noisy nature of Twitter content, it is difficult to formulate user queries that match relevant posts that use different terminology without introducing a considerable volume of unwanted content. This problem is aggravated when the analysis must contemplate multiple and related topics of interest, for which comments are being concurrently posted. This article presents Active Tweet Retrieval Visualization (ATR-Vis), a user-driven visual approach for the retrieval of Twitter content applicable to this scenario. The method proposes a set of active retrieval strategies to involve an analyst in such a way that a major improvement in retrieval coverage and precision is attained with minimal user effort. ATR-Vis enables non-technical users to benefit from the aforementioned active learning strategies by providing visual aids to facilitate the requested supervision. This supports the exploration of the space of potentially relevant tweets, and affords a better understanding of the retrieval results. We evaluate our approach in scenarios in which the task is to retrieve tweets related to multiple parliamentary debates within a specific time span. We collected two Twitter datasets, one associated with debates in the Canadian House of Commons during a particular week in May 2014, and another associated with debates in the Brazilian Federal Senate during a selected week in May 2015. 
The two use cases illustrate the effectiveness of ATR-Vis for the retrieval of relevant tweets, while quantitative results show that our approach achieves high retrieval quality with a modest amount of supervision. Finally, we evaluated our tool with three external users who perform searching in social media as part of their professional work.

Supplementary Material

makki (makki.zip)
Supplemental movie, appendix, image, and software files for ATR-Vis: Visual and Interactive Information Retrieval for Parliamentary Discussions in Twitter



      Published In

      ACM Transactions on Knowledge Discovery from Data  Volume 12, Issue 1
      Special Issue (IDEA) and Regular Papers
      February 2018
      363 pages
      ISSN:1556-4681
      EISSN:1556-472X
      DOI:10.1145/3178542
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 06 February 2018
      Accepted: 01 January 2017
      Revised: 01 September 2016
      Received: 01 December 2015
      Published in TKDD Volume 12, Issue 1

      Author Tags

      1. Information retrieval
      2. active learning
      3. visual analytics

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • CALDO-FAPESP
      • ELAP scholarship
      • FAPESP
      • NSERC
      • International Development Research Centre
      • CNPq
