Comparative Study of Language Models on Cross-Domain Data with Model Agnostic Explainability (2009.04095v1)
Abstract: With the recent influx of bidirectional contextualized transformer language models (LLMs) in NLP, a systematic comparative study of these models on a variety of datasets has become a necessity, especially since their performance has not been explored on non-GLUE datasets. The study presented in this paper compares the state-of-the-art models BERT, ELECTRA, and BERT's derivatives RoBERTa, ALBERT, and DistilBERT. We conducted experiments by fine-tuning these models on cross-domain and disparate data and present an in-depth analysis of their performance. Moreover, an explainability analysis coherent with pretraining is presented, which verifies the context-capturing capabilities of these models through a model-agnostic approach. The experimental results establish a new state of the art for the Yelp 2013 rating classification task and the Financial PhraseBank sentiment detection task, with 69% and 88.2% accuracy respectively. Finally, the study presented here can greatly assist industry researchers in choosing an LLM effectively in terms of performance or compute efficiency.
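To make the described workflow concrete, here is a minimal sketch of the two steps the abstract summarizes: fine-tuning a pretrained bidirectional transformer for sentence-level sentiment classification (Financial PhraseBank is used as the example dataset) and then probing the fine-tuned classifier with a model-agnostic explainer. This assumes the Hugging Face `transformers` and `datasets` libraries plus `lime`; the model name, hyperparameters, and the choice of LIME are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: fine-tune a pretrained transformer for sentiment
# classification, then explain a prediction with a model-agnostic method.
# Hyperparameters and library choices are illustrative, not the paper's setup.
import torch
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

model_name = "bert-base-uncased"  # swap in roberta-base, albert-base-v2, etc.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Financial PhraseBank (three-way sentiment) as hosted on the HF Hub;
# it ships only a "train" split, so we hold out a test set ourselves.
raw = load_dataset("financial_phrasebank", "sentences_allagree", split="train")
raw = raw.train_test_split(test_size=0.2, seed=42)

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args,
                  train_dataset=encoded["train"], eval_dataset=encoded["test"])
trainer.train()

# Model-agnostic explanation of a single prediction. LIME is used here purely
# as one example of such an approach; the abstract does not name a tool.
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    device = next(model.parameters()).device
    enc = tokenizer(list(texts), truncation=True, padding=True,
                    max_length=128, return_tensors="pt")
    enc = {k: v.to(device) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).cpu().numpy()

explainer = LimeTextExplainer(class_names=["negative", "neutral", "positive"])
explanation = explainer.explain_instance(
    "Operating profit rose clearly on the back of strong sales.",
    predict_proba, num_features=6)
print(explanation.as_list())  # top tokens driving the predicted sentiment
```

Because both the fine-tuning loop and the explainer operate only through the model's prediction function, the same script works unchanged for any of the compared models, which is the sense in which the explainability analysis is model agnostic.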