Matrices inducing generalized metric on sequences

(2303.08725)
Published Mar 15, 2023 in cs.DM

Abstract

Sequence comparison is a basic task to capture similarities and differences between two or more sequences of symbols, with countless applications such as in computational biology. An alignment is a way to compare sequences, where a given scoring function determines the degree of similarity between them. Many scoring functions are obtained from scoring matrices. However, not all scoring matrices induce scoring functions which are distances, since the scoring function is not necessarily a metric. In this work we establish necessary and sufficient conditions for scoring matrices to induce each one of the properties of a metric in weighted edit distances. For a subset of scoring matrices that induce normalized edit distances, we also characterize each class of scoring matrices inducing normalized edit distances. Furthermore, we define an extended edit distance, which takes into account a set of editing operations that transforms one sequence into another regardless of the existence of a usual corresponding alignment to represent them, describing a criterion to find a sequence of edit operations whose weight is minimum. Similarly, we determine the class of scoring matrices that induces extended edit distances for each of the properties of a metric.
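To make the idea of a scoring matrix inducing a scoring function concrete, the following is a minimal Python sketch of an alignment-based weighted edit distance. It is not taken from the paper: the names `weighted_edit_distance`, `unit_cost_matrix`, and `GAP`, and the dictionary representation of the matrix, are assumptions made for illustration. The sketch covers only the usual alignment-based distance, not the extended edit distance defined in the paper.

```python
from itertools import product

GAP = "-"  # hypothetical gap symbol; assumed not to occur in the sequences


def weighted_edit_distance(s, t, gamma):
    """Minimum total cost of an alignment of s and t.

    gamma maps a pair (a, b) to the cost of aligning a with b, where
    (a, GAP) is a deletion of a and (GAP, b) is an insertion of b.
    """
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):  # delete every symbol of the prefix of s
        d[i][0] = d[i - 1][0] + gamma[(s[i - 1], GAP)]
    for j in range(1, m + 1):  # insert every symbol of the prefix of t
        d[0][j] = d[0][j - 1] + gamma[(GAP, t[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + gamma[(s[i - 1], t[j - 1])],  # match/substitute
                d[i - 1][j] + gamma[(s[i - 1], GAP)],           # delete
                d[i][j - 1] + gamma[(GAP, t[j - 1])],           # insert
            )
    return d[n][m]


def unit_cost_matrix(alphabet):
    """Scoring matrix with cost 0 on the diagonal and 1 elsewhere."""
    symbols = list(alphabet) + [GAP]
    return {(a, b): (0 if a == b else 1) for a, b in product(symbols, repeat=2)}


unit = unit_cost_matrix("ab")
print(weighted_edit_distance("ab", "ba", unit))

# An asymmetric matrix: deleting 'a' costs more than inserting it.
skewed = dict(unit)
skewed[("a", GAP)] = 5
print(weighted_edit_distance("a", "", skewed), weighted_edit_distance("", "a", skewed))
```

With the unit-cost matrix the function reduces to the ordinary Levenshtein distance. With the skewed matrix, the cost of going from "a" to the empty sequence differs from the cost in the opposite direction, so the induced function is not symmetric and therefore not a metric, which is the kind of failure the abstract refers to.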
