Adaptive e-mail intention finding mechanism based on e-mail words social networks

Published: 27 August 2007

Abstract

Because spam evolves rapidly, no fully successful solution for filtering it has been found. Spammers nevertheless keep spreading spam with the same intentions, such as advertising and phishing. In this investigation, we propose an E-mail Words Social Network (EWSN) mechanism for profiling users' intentions regarding interesting and uninteresting e-mails. An EWSN is constructed from the information in an individual user's mailbox and is expanded with related information from the World Wide Web (WWW) via a search engine. Based on this web information and on association rules among the words, the words and their relations are expanded into a social network of words. Separate EWSNs for interesting and uninteresting e-mails can thus be constructed to analyze user intentions. Additionally, an efficient detection mechanism based on the EWSN is proposed to classify e-mails. Finally, the adaptation algorithm of an artificial immune system is applied to the EWSN, which is thereby adapted to follow the user's confirmed classification results. The experimental results indicate that the proposed system is very helpful for classifying spam e-mails by analyzing senders' intentions. Some ideas for analyzing people's interests and profiling their backgrounds are also presented.
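The abstract gives no algorithmic detail, so the following is only a rough sketch of the general idea of a word network built from a mailbox, not the authors' EWSN. It assumes that an edge links two words that co-occur in at least `min_support` e-mails (a simple support threshold standing in for the association rules), and that an e-mail is labelled by whichever network its word pairs match more strongly. The web expansion and the artificial immune adaptation are omitted. All function names, the threshold, and the sample mailbox contents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def word_pairs(text):
    """Return the unordered word pairs of one e-mail, in canonical (sorted) order."""
    words = sorted(set(text.lower().split()))
    return combinations(words, 2)


def build_word_network(emails, min_support=2):
    """Map each word pair to the number of e-mails it co-occurs in.

    Pairs seen in fewer than `min_support` e-mails are dropped
    (assumed stand-in for an association-rule support threshold).
    """
    counts = Counter()
    for text in emails:
        counts.update(word_pairs(text))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def score(text, network):
    """Sum the edge weights of the e-mail's word pairs found in the network."""
    return sum(network.get(pair, 0) for pair in word_pairs(text))


def classify(text, interesting_net, uninteresting_net):
    """Label an e-mail by the stronger network match; ties count as interesting."""
    if score(text, uninteresting_net) > score(text, interesting_net):
        return "uninteresting"
    return "interesting"


# Hypothetical mailbox contents, for illustration only.
interesting_net = build_word_network(
    ["project meeting agenda", "meeting agenda for project review"]
)
uninteresting_net = build_word_network(
    ["cheap pills online", "buy cheap pills online now"]
)

print(classify("cheap pills", interesting_net, uninteresting_net))
print(classify("project meeting", interesting_net, uninteresting_net))
```

Resolving ties as "interesting" is a deliberate choice in this sketch: it avoids discarding an e-mail when neither network gives evidence about it.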

    Published In

    LSAD '07: Proceedings of the 2007 workshop on Large scale attack defense
    August 2007
    73 pages
    ISBN:9781595937858
    DOI:10.1145/1352664

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. artificial immune system
    2. e-mail words social network
    3. intention finding
    4. social network
    5. spam classification

    Qualifiers

    • Research-article

    Conference

    SIGCOMM07: ACM SIGCOMM 2007 Conference
    August 27, 2007
    Kyoto, Japan
