DOI: 10.1145/3539618.3591826

Multi-lingual Semantic Search for Domain-specific Applications: Adobe Photoshop and Illustrator Help Search

Published: 18 July 2023

Abstract

Search has become an integral part of Adobe products: users rely on it to learn about tool usage, shortcuts, and quick links, to add creative effects, and to find assets such as backgrounds, templates, and fonts. Within applications such as Photoshop and Illustrator, users express domain-specific search intents via short text queries. In this work, we leverage sentence-BERT models fine-tuned on Adobe's HelpX data to perform multi-lingual semantic search over help and tutorial documents. We used behavioral data (queries, clicks, and impressions) together with additional annotated data to train several BERT-based models that score query-document pairs for semantic similarity, and we benchmarked the keyword-based production system against semantic search. Subsequent A/B tests demonstrate that this approach improves engagement for longer queries while significantly reducing null results.
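The retrieval step the abstract describes — embedding queries and documents and ranking by similarity — can be sketched as below. This is a minimal illustration, not code from the paper: the toy vectors stand in for the output of a fine-tuned sentence-BERT encoder, and `cosine_scores` is a hypothetical helper name.

```python
import numpy as np

def cosine_scores(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of document vectors."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return d @ q

# Toy 4-dimensional embeddings; in practice these would come from a
# sentence-BERT model fine-tuned on query-document behavioral data.
query = np.array([1.0, 0.0, 1.0, 0.0])
docs = np.array([
    [1.0, 0.0, 1.0, 0.0],  # near-duplicate of the query
    [0.0, 1.0, 0.0, 1.0],  # unrelated document
    [1.0, 1.0, 1.0, 1.0],  # partial overlap
])

scores = cosine_scores(query, docs)
ranking = np.argsort(-scores)  # document indices, best match first
```

Because every document receives a graded score rather than a hard keyword match, a moderately similar document can still be surfaced when a query's exact terms appear in no document at all — one way a semantic ranker can reduce null results relative to keyword search.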


Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2023
3567 pages
ISBN:9781450394086
DOI:10.1145/3539618

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fine-tuning text embeddings
  2. null result recovery
  3. semantic search

Qualifiers

  • Short-paper

Conference

SIGIR '23

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

