Towards Personalized Evaluation of Large Language Models with An Anonymous Crowd-Sourcing Platform
Published In
- General Chairs: Tat-Seng Chua, Chong-Wah Ngo
- Program Chairs: Ravi Kumar, Hady W. Lauw, Roy Ka-Wei Lee
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Short-paper