Auto-correct a search query

So your visitor is in a hurry and makes a typo in the search box. Chances are that no results will be returned.

By using the Bing Spell Check API you can auto-correct typos in the query and, with a bit of luck, still return a result.

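using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Language.SpellCheck;
using Microsoft.Azure.CognitiveServices.Language.SpellCheck.Models;
using Microsoft.Rest;

public string AutoCorrect(string query)
{
    // Very short queries are not worth a round trip to the API.
    if (query.Length < 5)
    {
        return query;
    }

    // Put your own subscription key here; never publish a real key.
    SpellCheckClient spellCheckClient = new SpellCheckClient(new ApiKeyServiceClientCredentials("<your-subscription-key>"));
    // .Result blocks the calling thread; await the call instead if your method can be async.
    HttpOperationResponse<SpellCheck> result = spellCheckClient.SpellCheckerWithHttpMessagesAsync(text: query, mode: "spell", setLang: "en").Result;

    IList<SpellingFlaggedToken> tokens = result.Body.FlaggedTokens;

    // Nothing flagged, so the query is fine as it is.
    if (tokens == null || tokens.Count == 0)
    {
        return query;
    }

    string autoCorrect = query;
    // Running shift caused by earlier replacements that were longer or shorter
    // than the original token. This assumes the tokens arrive in ascending offset order.
    int diff = 0;

    foreach (SpellingFlaggedToken problemToken in tokens)
    {
        // Skip tokens for which no suggestion came back.
        if (problemToken.Suggestions == null || problemToken.Suggestions.Count == 0)
        {
            continue;
        }

        int offset = problemToken.Offset;
        int originalTokenLength = problemToken.Token.Length;

        SpellingTokenSuggestion suggestionToken = problemToken.Suggestions[0];

        // Only apply suggestions the API is reasonably confident about.
        if (suggestionToken.Score < .5)
        {
            continue;
        }

        string suggestion = suggestionToken.Suggestion;

        int lengthDiff = suggestion.Length - originalTokenLength;

        // Offsets refer to the original query, so correct them by the shift so far.
        autoCorrect = autoCorrect.Substring(0, offset + diff) + suggestion + autoCorrect.Substring(offset + diff + originalTokenLength);

        diff += lengthDiff;
    }

    return autoCorrect;
}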
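
The diff bookkeeping in the loop is the part that is easiest to get wrong, so here is a minimal sketch of just that step, without the API call. The class OffsetDemo, the method Apply and the tuple shape are hypothetical stand-ins for the flagged tokens, not part of the Bing SDK, and the sketch assumes the fixes are listed in ascending offset order.

```csharp
using System;
using System.Collections.Generic;

public static class OffsetDemo
{
    // Each fix is a hypothetical stand-in for a flagged token:
    // where the typo starts in the ORIGINAL query, the typo itself, and its replacement.
    public static string Apply(string query, IEnumerable<(int Offset, string Token, string Suggestion)> fixes)
    {
        string corrected = query;
        int diff = 0; // how far earlier replacements have shifted the remaining text

        foreach (var fix in fixes)
        {
            // Translate the original offset into a position in the partly corrected string.
            int start = fix.Offset + diff;

            corrected = corrected.Substring(0, start)
                + fix.Suggestion
                + corrected.Substring(start + fix.Token.Length);

            // A longer suggestion pushes later tokens to the right, a shorter one pulls them left.
            diff += fix.Suggestion.Length - fix.Token.Length;
        }

        return corrected;
    }
}
```

Calling OffsetDemo.Apply("wrld map of europ", new[] { (0, "wrld", "world"), (12, "europ", "europe") }) gives "world map of europe". The second typo starts at offset 12 in the original query, but after "wrld" grew by one character it sits at position 13, which is exactly what diff accounts for.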

As with auto-correct on your phone, the result depends on how smart Bing is, but it is probably better than searching with the typo.

You could set the parameter preContextText to give context to the query. "For example, the text string petal is valid. However, if you set preContextText to bike, the context changes and the text string becomes not valid. In this case, the API suggests that you change petal to pedal (as in bike pedal)."

You could also set the parameter postContextText to give context to the query. "For example, the text string read is valid. However, if you set postContextText to carpet, the context changes and the text string becomes not valid. In this case, the API suggests that you change read to red (as in red carpet)."

The combined length of the query, preContextText, and postContextText may not exceed 10,000 characters though.
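
If you do pass context along, it may be worth checking that limit before calling the API. Below is a minimal sketch of such a check; ContextBudget and Fits are hypothetical names of my own, and I'm assuming the limit simply counts the characters of the three strings added together.

```csharp
using System;

public static class ContextBudget
{
    // Limit as stated above: query plus both context strings, counted in characters.
    public const int MaxCombinedLength = 10000;

    // Returns true when the three strings together stay within the limit.
    public static bool Fits(string query, string preContextText = "", string postContextText = "")
    {
        int total = (query ?? "").Length
            + (preContextText ?? "").Length
            + (postContextText ?? "").Length;

        return total <= MaxCombinedLength;
    }
}
```

When Fits returns false, one option is to drop or shorten the context strings and send the query on its own.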

More info about the API can be found here.

You can also find the code in a gist.
