| 1 | .. -*- coding: UTF-8 -*-
|
|---|
| 2 |
|
|---|
| 3 | Starting from Bayes' Theorem, assuming X and A are independent::
|
|---|
| 4 |
|
|---|
| 5 | p(X|A)·p(A) = p(X∩A) = p(A∩X) = p(A|X)·p(X)
|
|---|
| 6 |
|
|---|
| 7 | If X is "the intended word was X" and A is "the typed word was A", let m be the
|
|---|
| 8 | number of edits required to get from X to A, and assume each edit is equally
|
|---|
| 9 | likely with probability p, then we want to pick the most likely X which the
|
|---|
| 10 | user intended, given they actually typed A - i.e. we want to::
|
|---|
| 11 |
|
|---|
| 12 | find X to maximise p(X|A) = p(A|X)·p(X)/p(A)
|
|---|
| 13 | ⇔ maximise p(A|X)·p(X)
|
|---|
| 14 | ⇔ maximise p(A|X)·freq(X)
|
|---|
| 15 | ⇔ maximise pᵐ·freq(X)
|
|---|
| 16 | ⇔ maximise log(p).m + log(freq(X))
|
|---|
| 17 |
|
|---|
| 18 | Since p < 1, log(p) is negative, so we want to::
|
|---|
| 19 |
|
|---|
| 20 | minimise m + log(freq(X))/log(p)
|
|---|
| 21 |
|
|---|
| 22 | It's likely X and A aren't actually independent, and that not all edits have
|
|---|
| 23 | equal probability, but this does at least give us a formula which instinctively
|
|---|
| 24 | feels plausible and which has some theoretical justification. We should also
|
|---|
| 25 | be able to optimise it fairly well.
|
|---|
| 26 |
|
|---|
| 27 | We do need an estimate of the p though.
|
|---|