Large Language Models are Zero-Shot Next Location Predictors (2405.20962v3)

Published 31 May 2024 in cs.CY, cs.AI, and cs.CL

Abstract: Predicting the locations an individual will visit in the future is crucial for solving many societal issues like disease diffusion and reduction of pollution. However, next-location predictors require a significant amount of individual-level information that may be scarce or unavailable in some scenarios (e.g., cold-start). LLMs have shown good generalization and reasoning capabilities and are rich in geographical knowledge, allowing us to believe that these models can act as zero-shot next-location predictors. We tested more than 15 LLMs on three real-world mobility datasets and we found that LLMs can obtain accuracies up to 36.2%, a significant relative improvement of almost 640% when compared to other models specifically designed for human mobility. We also test for data contamination and explored the possibility of using LLMs as text-based explainers for next-location prediction, showing that, regardless of the model size, LLMs can explain their decision.

Citations (4)

View on Semantic Scholar

Summary

The paper demonstrates that LLMs, notably GPT-3.5, markedly outperform deep learning models by achieving up to 32.4% ACC@5, representing over a 600% relative improvement.
The study employs varied prompting strategies on real-world mobility datasets from NYC, Tokyo, and Ferrara, effectively leveraging historical context for prediction.
The findings emphasize that rich historical data substantially boosts prediction accuracy and offer actionable insights for applications in urban planning and emergency management.

LLMs are Zero-Shot Next Location Predictors: An Overview

This paper, authored by Ciro Beneduce, Bruno Lepri, and Massimiliano Luca, explores the novel application of LLMs in predicting the future locations an individual might visit, a problem known as next-location prediction (NL). Traditional models for NL, including Markov models and deep learning (DL) techniques, require extensive individual-level data and have notable limitations in geographical transferability and generalization. The authors investigate whether advanced LLMs, known for their generalization and reasoning capabilities, can act as zero-shot predictors for this task.

Significance of Next-Location Prediction

Next-location prediction has far-reaching implications for societal challenges such as traffic management, pollution control, and disaster response management. However, the current state-of-the-art models in NL, though effective within specific geographical or data-rich contexts, falter considerably when deployed in new or data-scarce regions. This geographical dependency poses a significant obstacle to widespread practical applications. The paper proposes that LLMs, given their rich geographical knowledge, might overcome these constraints by functioning effectively as zero-shot predictors, thus providing a potentially universal solution to NL.

Methodology

Evaluation Framework

The paper assesses several popular LLMs, including Llama2, Llama2 Chat, GPT-3.5, and Mistral 7B. These models were evaluated using three real-world mobility datasets: two from Foursquare (NYC and Tokyo) and one private dataset from Ferrara, Italy. These datasets encompass a diverse range of user mobility patterns and geographical areas, providing a robust testbed for the models.

Prompt Design and Testing

The authors crafted prompts supplying LLMs with historical and contextual visit data, asking the LLMs to predict the next location a user might visit. They explored various prompting strategies, including zero-shot, one-shot, and few-shot contexts, to determine the influence of these methods on model performance.

Results

Performance Evaluation

The evaluation metrics focused primarily on accuracy at top-k (ACC@k), with particular emphasis on ACC@1, ACC@3, and ACC@5. The results demonstrated that LLMs, particularly GPT-3.5, significantly outperformed traditional DL models in zero-shot prediction scenarios. Specifically, GPT-3.5 achieved a maximum ACC@5 of 32.4%, which represents a relative improvement exceeding 600% over DL-based NL models under comparable settings.

The paper also investigated the role of historical and contextual data. It was found that increasing the amount of historical information proportionately improved prediction accuracy, confirming that more comprehensive historical data boosts model performance. Conversely, omitting or reducing this information led to notable performance drops, underscoring the critical role of historical context in NL tasks.

Data Contamination Analysis

The authors tackled potential biases from data contamination by testing models on both public and private datasets and conducting a quiz-based verification to ascertain whether models had prior exposure to test data. The results confirmed that LLMs, even when trained on vast, publicly available datasets, did not exhibit data contamination, thus ensuring the integrity of the performance metrics reported.

Discussion

The findings from this paper suggest that LLMs can serve as potent zero-shot next-location predictors, offering a viable solution for applications requiring geographical and contextual adaptability. The models' ability to provide text-based explanations for their predictions also adds a layer of interpretability, enhancing their suitability for deployment in real-world scenarios.

Implications and Future Directions

The implications of this research are profound, both practically and theoretically. On a practical level, accurate NL predictions can revolutionize urban planning, resource allocation, and emergency management, making services more efficient and responsive. From a theoretical perspective, this research highlights the potential for LLMs to function beyond their primary domain of textual data, opening new avenues for their application in spatial and temporal prediction tasks.

Ethical Considerations and Limitations

The use of LLMs for NL prediction raises ethical considerations, particularly concerning data privacy and potential biases. LLMs could inadvertently reinforce existing biases in mobility data, leading to unequal service quality among different demographic groups. Therefore, careful consideration must be given to model training and deployment to ensure fairness and equity.

Conclusion

This paper convincingly demonstrates that LLMs, when provided with appropriately designed prompts, can act as effective zero-shot next-location predictors. The results emphasize the importance of contextual and historical data in enhancing model performance and suggest that LLMs could substantially advance the field of human mobility prediction. Future research could explore more tailored prompt designs and expand the application of LLMs to other predictive tasks requiring spatial and temporal reasoning.

PDF Markdown

Related Papers

Tweets

https://twitter.com/luca_msl/status/1797637735576191448

https://twitter.com/luca_msl/status/1809261124505809295

https://twitter.com/GptMaestro/status/1797844601702903876

https://twitter.com/RecPaperBot/status/1831104896441225333