DOI: 10.1145/1593105.1593232
research-article

Using XML to map relationships in hacker forums

Published: 28 March 2008

Abstract

XML and the technologies that make use of it (XPath, XSL, etc.) have massive potential for information collection, storage, and visualization. One application in which their functionality stands out is the collection and visualization of social relationships in hacker forums. This information can be used by law enforcement to discover potential informants for a known suspect or by academia to study the complex relationships that exist in the malicious hacker subculture. This paper examines the motivations for, workings of, and uses of the program "XScrape" for mapping social networks in hacker forums; the techniques it describes can readily be applied to other domains.
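
The general idea of pulling relationships out of forum pages with XPath and handing them to a graph tool can be illustrated with a minimal sketch. It is not XScrape's implementation: it uses Python's standard-library ElementTree (which supports only a subset of XPath) rather than the paper's Java tooling, the forum markup and class names are invented, and treating "user A quoted user B" as a relationship is an assumption made for the example. The output is a Graphviz DOT digraph.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, already-tidied XHTML for one forum thread. The class names
# ("post", "author", "quote", "quoted-user") are invented for illustration.
THREAD_XHTML = """
<html>
  <body>
    <div class="post">
      <span class="author">alice</span>
      <p>Anyone tried the new scanner?</p>
    </div>
    <div class="post">
      <span class="author">bob</span>
      <div class="quote"><span class="quoted-user">alice</span></div>
      <p>Yes, works fine.</p>
    </div>
    <div class="post">
      <span class="author">carol</span>
      <div class="quote"><span class="quoted-user">alice</span></div>
      <div class="quote"><span class="quoted-user">bob</span></div>
      <p>Same here.</p>
    </div>
    <div class="post">
      <span class="author">bob</span>
      <div class="quote"><span class="quoted-user">alice</span></div>
      <p>Following up.</p>
    </div>
  </body>
</html>
"""


def extract_edges(xhtml):
    """Count (poster, quoted user) pairs found in well-formed XHTML."""
    root = ET.fromstring(xhtml)
    edges = Counter()
    # XPath: every <div class="post"> anywhere below the root.
    for post in root.findall(".//div[@class='post']"):
        author = (post.findtext("span[@class='author']") or "").strip()
        if not author:
            continue  # skip posts with no recognizable author
        # XPath: every quoted-user span nested inside this post.
        for quoted in post.findall(".//span[@class='quoted-user']"):
            target = (quoted.text or "").strip()
            if target:
                edges[(author, target)] += 1
    return edges


def dot_quote(name):
    """Quote a user name as a DOT string, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(edges):
    """Render the edge counts as a Graphviz DOT digraph."""
    lines = ["digraph forum {"]
    for (src, dst), count in sorted(edges.items()):
        lines.append(f'  {dot_quote(src)} -> {dot_quote(dst)} [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(extract_edges(THREAD_XHTML)))
```

Saving the printed text to a file and running Graphviz's `dot -Tpng` on it would draw the graph. Real forum pages are rarely well-formed XML, so a tidying step would normally have to come first; the sketch skips it by starting from clean input.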



    Published In

    ACMSE '08: Proceedings of the 46th annual ACM Southeast Conference
    March 2008
    548 pages
    ISBN: 9781605581057
    DOI: 10.1145/1593105

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. HTML
    2. Java
    3. XHTML
    4. XML
    5. XPath
    6. XSL
    7. XScrape
    8. data-mining
    9. forum
    10. hacker
    11. screen scraping
    12. tidy


    Conference

    ACM SE08: ACM Southeast Regional Conference
    March 28 - 29, 2008
    Auburn, Alabama

    Acceptance Rates

    Overall Acceptance Rate 502 of 1,023 submissions, 49%
