Monge-Elkan similarity
$$
sim_{ME}(x, y) = \frac{1}{|x|} \sum_{i=1}^{|x|} \max_{j=1,\dots,|y|} sim(x_i, y_j)
$$
Where $x_i$ and $y_j$ are the elements (typically tokens) of $x$ and $y$, and $sim$ can be any other similarity measure.
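The formula can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `monge_elkan` and `ratio` are hypothetical, the inner measure is assumed to be the matching ratio from the standard library's `difflib`, and returning 0.0 for empty input is an assumption the formula itself does not cover.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence


def ratio(a: str, b: str) -> float:
    """Hypothetical inner measure: difflib's matching ratio, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def monge_elkan(
    x: Sequence[str],
    y: Sequence[str],
    sim: Callable[[str, str], float] = ratio,
) -> float:
    """Average, over the elements of x, of the best match found in y."""
    if not x or not y:
        # Assumption: the formula is undefined for empty input; return 0.0.
        return 0.0
    # For each x_i take the maximum of sim(x_i, y_j) over all y_j,
    # then divide the sum of those maxima by |x|.
    return sum(max(sim(xi, yj) for yj in y) for xi in x) / len(x)


if __name__ == "__main__":
    a = "paul johnson".split()
    b = "johnson paule".split()
    print(monge_elkan(a, b))
```

Because the sum runs over the elements of $x$ only and is divided by $|x|$, the measure is not symmetric in general: swapping the arguments can change the result.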
Reading
Monge, Alvaro E. and Charles Peter Elkan. “The Field Matching Problem: Algorithms and Applications.” Knowledge Discovery and Data Mining (1996).