Ongoing

Evaluation of Personalization in Large Language Models
Historically, much attention has been given to the accuracy of ML models, including large and small (neural) language models (LLMs/SLMs). In recent years, other aspects such as fluency, factuality, coherence, and consistency have also been explored. However, another important aspect of “intelligence” is the ability to personalize in situations where the expected response inherently depends on the user’s profile (and how that profile evolves over time). This project addresses the lack of proper evaluation measures and systematic probing techniques for the degree of personalization in modern state-of-the-art LLMs/SLMs.

Harnessing Data Augmentation: Enhancing Model Performance through Diverse Data
Data augmentation plays a vital role in improving model performance by introducing diversity into training datasets. By generating synthetic data, we can enrich datasets with varied user interactions and preferences, allowing models to learn from a broader spectrum of inputs. This diversity is essential for addressing the limitations of existing datasets, particularly in capturing the nuances of user behavior and context. By effectively implementing data augmentation techniques, we enhance the adaptability and relevance of model outputs, ultimately leading to more personalized and effective solutions across a wide range of applications.

Personalized Summarization Inducer Using Temporal Knowledge Graphs
Personalized text summarization models commonly rely on static user preferences, considering only explicit inputs without adapting to the natural evolution of these preferences over time. However, capturing how a user's interests shift provides critical insights into their unique behaviors and preferences. To bridge this gap, our ongoing project involves a knowledge graph-based approach that enhances personalization by generating dynamic key phrases representing a user’s evolving interests. These key phrases not only enrich the personalization process but also assist large language models (LLMs) in generating more tailored and contextually relevant summaries. Rather than modifying existing summarization model architectures, our approach introduces these key phrases as auxiliary inputs, guiding and enriching personalization in a seamless, adaptable way.
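
As a rough illustration of the idea of feeding key phrases to an unmodified summarizer as auxiliary inputs, the sketch below ranks phrases from a timestamped interaction log and prepends the top ones to a prompt. Everything in it is an assumption made for illustration: the function names `rank_key_phrases` and `build_prompt`, the flat list of (day, phrase) pairs standing in for a temporal knowledge graph, and the exponential time-decay scoring are all hypothetical and are not taken from the project itself.

```python
from collections import defaultdict


def rank_key_phrases(interactions, now, half_life_days=30.0, top_k=3):
    """Rank phrases by time-decayed interaction weight (hypothetical scoring).

    interactions: iterable of (day, phrase) pairs, a stand-in for the
    user-to-phrase edges of a temporal knowledge graph.
    """
    scores = defaultdict(float)
    for day, phrase in interactions:
        age = now - day
        # Each interaction loses half its weight every `half_life_days`.
        scores[phrase] += 0.5 ** (age / half_life_days)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_k]]


def build_prompt(document, key_phrases):
    """Attach key phrases as an auxiliary input; the summarizer is unchanged."""
    interests = ", ".join(key_phrases)
    return (
        f"Current user interests: {interests}\n"
        "Summarize the following document with these interests in mind.\n\n"
        f"{document}"
    )


# Hypothetical log: older football clicks, recent election and climate reads.
log = [
    (0, "football"), (0, "football"), (0, "football"),
    (90, "elections"), (95, "elections"),
    (100, "climate policy"),
]
phrases = rank_key_phrases(log, now=100, top_k=2)
prompt = build_prompt("Full article text goes here.", phrases)
```

In this toy log, the three old football interactions are outweighed by the more recent ones, so the prompt reflects the user's current interests rather than their all-time counts. The decay rule is only one of many possible ways to model evolving interests.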

Auto-PerSEval: An Automated Framework for Evaluating Personalized Summarization Models
In our ongoing project, we are developing Auto-PerSEval, an innovative approach to auto-evaluate personalized summarization models. This project builds upon our previous work on PerSEval, which introduced a novel metric for assessing the degree of personalization in text summaries. Auto-PerSEval aims to automate this evaluation process, enabling real-time assessment of summarization outputs based on individual user preferences and historical interactions.

By leveraging a learnable loss function, Auto-PerSEval can dynamically adapt to the specific nuances of user behavior, offering a more refined and context-sensitive evaluation. This not only enhances the accuracy of the evaluation process but also allows models to be iteratively improved based on direct feedback about their performance.

Our methodology focuses on capturing the intricate relationship between user engagement and summarization quality, allowing for a more comprehensive understanding of how well a model meets personalized needs. With Auto-PerSEval, we aspire to create a robust framework that simplifies the evaluation of personalized summarization while maintaining a high correlation with human judgment. This advancement has the potential to transform how we assess and improve personalized summarization models, making them more responsive and relevant to users over time.

LLM Explainability: Building Trust, Transparency, and Interpretability
In our ongoing project focused on explainability, we strive to enhance the transparency and understandability of large language models (LLMs) for users across various domains. By prioritizing explainability, we aim to build trust, allowing users not only to see what an LLM generates but also to comprehend why and how specific responses are formulated. This transparency is invaluable for researchers, developers, and end-users, as it exposes the inner workings of the model, aids in diagnosing biases, and helps ensure that responses align with user expectations. By making LLMs more interpretable, our work empowers informed and responsible usage, fostering user confidence and facilitating seamless integration into real-world applications.