DOI: 10.1145/3292500.3330737
research-article
Open access

Investigate Transitions into Drug Addiction through Text Mining of Reddit Data

Published: 25 July 2019 Publication History

Abstract

Rising rates of opioid abuse and the growing prevalence of online support communities underscore the need for data mining techniques that use these rapidly developing online resources to better understand drug addiction. In this work, we obtained data from Reddit, an online collection of forums, to gain insight into drug use and misuse from text snippets in users' narratives. Specifically, using users' posts, we trained a binary classifier that predicts a user's transition from casual drug discussion forums to drug recovery forums. We also proposed a Cox regression model that outputs the likelihood of such transitions. In doing so, we found that mentions of select drugs and certain linguistic features in a user's posts can help predict these transitions. Using unfiltered drug-related posts, our research identifies drugs associated with higher rates of transition from recreational drug discussion to support/recovery discussion, offers insight into modern drug culture, and provides tools with potential applications in combating the opioid crisis.



    Published In

    KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2019
    3305 pages
    ISBN:9781450362016
    DOI:10.1145/3292500
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cox regression
    2. drug addiction and recovery
    3. reddit forum
    4. text mining

    Acceptance Rates

    KDD '19 Paper Acceptance Rate 110 of 1,200 submissions, 9%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
