Finding Complex Patterns in Trajectory Data via Geometric Set Cover (2308.14865v2)
Abstract: Clustering trajectories is a central challenge when faced with large amounts of movement data such as GPS data. We study a clustering problem that can be stated as a geometric set cover problem: Given a polygonal curve of complexity $n$, find the smallest number $k$ of representative trajectories of complexity at most $l$ such that any point on the input curve lies on a subtrajectory of the input that has Fr\'echet distance at most $\Delta$ to one of the representative trajectories. In previous work, Br\"uning et al.~(2022) developed a bicriteria approximation algorithm that returns a set of curves of size $O(kl\log(kl))$ which covers the input with a radius of $11\Delta$ in time $\widetilde{O}((kl)^2 n + kln^3)$, where $k$ is the smallest number of curves of complexity $l$ needed to cover the input with a radius of $\Delta$. The representative trajectories computed by this algorithm are always line segments. In applications, however, one is usually interested in more complex representative curves that consist of several edges. We present a new approach that builds upon previous work and computes a set of curves of size $O(k\log(n))$ in time $\widetilde{O}(l^2 n^4 + kln^4)$ with the same distance guarantee of $11\Delta$, where each curve may have complexity up to the given complexity parameter~$l$. We conduct experiments on tracking data of ocean currents and full-body motion data, which suggest its validity as a tool for analyzing large spatio-temporal data sets.
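To make the covering condition concrete, the following minimal sketch computes the discrete Fréchet distance between two vertex sequences. It is an illustration only: the abstract refers to the continuous Fréchet distance, and the discrete variant used here is a simpler, swapped-in measure that never underestimates the continuous one. The function name `discrete_frechet` and the tuple-based curve representation are assumptions made for this example, not part of the described algorithm.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences p and q.

    Illustrative stand-in for the continuous Frechet distance; curves are
    assumed to be non-empty lists of coordinate tuples of equal dimension.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both curves need at least one vertex")

    # d[i][j] = discrete Frechet distance between p[:i+1] and q[:j+1]
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = dist(p[i], q[j])
            if i == 0 and j == 0:
                best_prev = 0.0
            elif i == 0:
                best_prev = d[0][j - 1]
            elif j == 0:
                best_prev = d[i - 1][0]
            else:
                # Advance on p, on q, or on both simultaneously.
                best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = max(cost, best_prev)
    return d[-1][-1]


# Two parallel horizontal segments at vertical offset 1.
subtrajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
representative = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
delta = 1.0
is_covered = discrete_frechet(subtrajectory, representative) <= delta
```

In this toy setting, `is_covered` indicates whether the sampled subtrajectory would count as covered by the representative at radius `delta` under the discrete measure.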