Fuzzy text search
Cosine similarity
$$
sim_{cos}(\vec{x}, \vec{y}) = \cos\theta = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \, \lVert \vec{y} \rVert} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}}
$$
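The formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `cosine_similarity`, `bigram_counts`, and `cosine_similarity_strings` are hypothetical, and representing a string as a vector of character-bigram counts is only one possible choice assumed here for the example. Returning `0.0` when either vector has zero length is likewise an assumption, since the formula itself is undefined in that case.

```python
import math
from collections import Counter


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))       # numerator: sum of x_i * y_i
    norm_x = math.sqrt(sum(a * a for a in x))    # ||x||
    norm_y = math.sqrt(sum(b * b for b in y))    # ||y||
    if norm_x == 0 or norm_y == 0:
        # Assumption: the formula is undefined for a zero vector; return 0.0.
        return 0.0
    return dot / (norm_x * norm_y)


def bigram_counts(s):
    """Count overlapping character bigrams (assumed string representation)."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity_strings(a, b):
    """Cosine similarity of two strings over their bigram-count vectors."""
    ca, cb = bigram_counts(a), bigram_counts(b)
    keys = sorted(set(ca) | set(cb))             # shared vector dimensions
    return cosine_similarity([ca[k] for k in keys], [cb[k] for k in keys])


if __name__ == "__main__":
    print(cosine_similarity([1, 0], [0, 1]))            # orthogonal vectors: 0.0
    print(cosine_similarity_strings("night", "nacht"))  # one shared bigram "ht": 0.25
```

With non-negative count vectors, as in the bigram case, the result always falls between 0 (no shared bigrams) and 1 (identical direction).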