Post by justinv » Thu Oct 21, 2010 2:51 pm

I added spelling suggestions to my site's search. When you type a misspelt word into search and it returns no results, the spelling suggestion algorithm returns the closest match from a dictionary of words built from the product titles in the db.

It's not in OpenCart extension format, but if anyone wants it, drop me a note and I'll upload the code so you can tweak it as you like.

Example here:

http://www.3gdigital.co.nz/index.php?ro ... egory_id=0

Documentation: OpenCart User Guide
Mods: Total Import PRO | CSV Import PRO | Ecom Tracking | Any Feed | Autosuggest | OpenCart CDN

Post by JAY6390 » Fri Oct 22, 2010 1:53 am

This sounds great, Justin.
I would appreciate seeing how you did this with the dictionary :)

Post by justinv » Fri Oct 22, 2010 6:27 am

Do you mean how I created the dictionary or how I used the dictionary?

To use the dictionary I just modified this guy's code: http://phpir.com/spelling-correction

To create the dictionary I cheated - I put a Perl script on a cron job that recreates the dictionary on disk daily. The format is one entry per line, word,num_occurrences, like this:

external,10
hdd,12
ata,15
multi,1

The PHP speller reads that list into an associative array ('external' => 10, 'hdd' => 12, 'ata' => 15 ...) and uses the code I linked to above for the rest.
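
For anyone without Perl, here is a minimal PHP sketch of the same idea: count words across product titles and write them out as word,num_occurrences lines. It is not the cron script described above. The function names buildDictionary and dictionaryToText, the sample titles, and the choice to lowercase and split on non-alphanumeric characters are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: count how often each word appears across product titles.
function buildDictionary(array $titles) {
	$counts = array();
	foreach ($titles as $title) {
		// Assumption: lowercase everything and split on anything that is
		// not a letter or digit, dropping empty pieces.
		$words = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);
		foreach ($words as $w) {
			$counts[$w] = isset($counts[$w]) ? $counts[$w] + 1 : 1;
		}
	}
	return $counts;
}

// Hypothetical helper: render the counts as "word,num_occurrences" lines.
function dictionaryToText(array $counts) {
	$lines = array();
	foreach ($counts as $word => $count) {
		$lines[] = $word . ',' . $count;
	}
	return implode("\n", $lines) . "\n";
}

// Made-up titles; in practice these would come from the product table.
$titles = array(
	'External HDD 500GB',
	'External ATA HDD Enclosure',
	'Multi Card Reader',
);
$counts = buildDictionary($titles);
// To write the file instead of printing it:
// file_put_contents('dict.txt', dictionaryToText($counts));
echo dictionaryToText($counts);
```

Lowercasing at build time merges entries such as "HDD" and "hdd" into one count, which matches how the speller lowercases words when it looks them up.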

Post by JAY6390 » Fri Oct 22, 2010 6:34 am

Ah very nice, will have a look into it

Post by jty » Fri Oct 22, 2010 6:47 am

Very nice. I sent you a PM. Is that what I'm supposed to do to see the code?
I have been wanting to improve Search for a couple of years, so I'm very appreciative of this. I am not a developer, so all things are hard ::)

Post by Moggin » Fri Oct 22, 2010 7:38 am

That looks really good.
I also like the list view you've created for search results.

Post by justinv » Fri Oct 22, 2010 1:44 pm

Here is the code I used to create spelling suggestions:

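<?php
class Speller {

	// Map of lowercase word => number of occurrences.
	public $dictionary = array();
	private $file;

	function __construct($dictfile) {
		$this->file = $dictfile;
		$this->train();
	}

	// Load "word,num_occurrences" lines from the dictionary file.
	function train() {
		$f = fopen($this->file, "r");
		if ($f === false) {
			return;
		}
		while (($line = fgets($f)) !== false) {
			// explode() replaces split(), which was deprecated in PHP 5.3
			// and removed in PHP 7.
			$parts = explode(",", trim($line));
			if (count($parts) < 2 || $parts[0] === '') {
				continue; // skip blank or malformed lines
			}
			// Cast to int so arsort() compares the counts numerically.
			$this->dictionary[strtolower($parts[0])] = (int) $parts[1];
		}
		fclose($f);
	}

	function correct($word) {
		$word = strtolower($word);
		if (isset($this->dictionary[$word])) {
			return $word;
		}

		$edits1 = $edits2 = array();
		foreach ($this->dictionary as $dictWord => $count) {
			// PHP turns numeric keys such as "350" into ints, so cast back.
			$dictWord = (string) $dictWord;
			// Only consider dictionary words that share the first letter.
			if (substr($dictWord, 0, 1) === substr($word, 0, 1)) {
				$dist = levenshtein($word, $dictWord);
				if ($dist == 1) {
					$edits1[$dictWord] = $count;
				} else if ($dist == 2) {
					$edits2[$dictWord] = $count;
				}
			}
		}

		// Prefer the most frequent word at distance 1, then at distance 2.
		if (count($edits1)) {
			arsort($edits1);
			return (string) key($edits1);
		} else if (count($edits2)) {
			arsort($edits2);
			return (string) key($edits2);
		}

		// Nothing better
		return $word;
	}

}

/**TEST MAIN METHOD**/
//$speller = new Speller('dict.txt');
//print_r($speller->dictionary);
//echo $speller->correct('soni');
//echo $speller->correct('arta');
//echo $speller->correct('mluti');
//echo $speller->correct('canon');

?>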
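
The commented-out test words hint at why correct() looks at distance 2 as well as distance 1. PHP's built-in levenshtein() counts a swap of two adjacent letters as two edits, so a typo like 'mluti' would never be matched at distance 1. A small sketch to check this, assuming the intended words ('sony', 'ata', 'multi') are in the dictionary:

```php
<?php
// Pairs of (typed word, intended dictionary word). That the intended
// words exist in the dictionary is an assumption for this demo.
$pairs = array(
	array('soni', 'sony'),   // one substitution
	array('arta', 'ata'),    // one deletion
	array('mluti', 'multi'), // two adjacent letters swapped
);
$distances = array();
foreach ($pairs as $pair) {
	$distances[$pair[0]] = levenshtein($pair[0], $pair[1]);
	echo $pair[0] . ' -> ' . $pair[1] . ': ' . $distances[$pair[0]] . "\n";
}
// Prints:
// soni -> sony: 1
// arta -> ata: 1
// mluti -> multi: 2
```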

And here is a sample of the dict.txt file that I trained it with. It's basically every unique word from my database and the number of times the word occurs, separated by a comma.

USB1394,1
Cassette,9
50GB,1
12MM,1
56W25H,1
FSC8008N,4
WirelessN,10
Products,2
faxesprinters,1
DIMM,21
Party,1
MG5250,1
RS232USBPar,1
LibertyECO2,1
CART319II,1
D60,1
printer,1
TK510M,1
VF62CPk,1
Platenclene,1
NetDefend,1
G08XU,1
Compad32B,1
MDR110LP,1
TN3060,1
Ports,18
AA,29
TRIMode,1
CS8800F,1
Full,30
1080P,10
350,1
LC47BK2PK,1
TZ231,1
TK810M,1
RJ45RJ11,1
can,3
GP5014x6,1
BRACKET,2

Private message me if you want me to help you integrate it into your site.

Thanks!
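
As a rough idea of what integration could look like, the sketch below corrects a multi-word query one word at a time and only offers a suggestion when something changed. It is not the code running on the example site. suggestQuery and demoCorrect are hypothetical names, and demoCorrect is a stand-in with a hard-coded map; with the Speller class you would pass array($speller, 'correct') instead.

```php
<?php
// Hypothetical helper: correct each word of a query with a per-word callback.
// Returns the suggested query, or null when no word was changed.
function suggestQuery($query, $correctWord) {
	$words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
	$fixed = array();
	foreach ($words as $w) {
		$fixed[] = call_user_func($correctWord, $w);
	}
	$suggestion = implode(' ', $fixed);
	// The corrector lowercases words, so compare against a lowercased query.
	$original = strtolower(implode(' ', $words));
	return ($suggestion === $original) ? null : $suggestion;
}

// Stand-in corrector with a tiny hard-coded map, used only for this demo.
function demoCorrect($word) {
	$map = array('mluti' => 'multi', 'arta' => 'ata');
	$word = strtolower($word);
	return isset($map[$word]) ? $map[$word] : $word;
}

// Typically you would only do this after the search returned no results.
$suggestion = suggestQuery('mluti card reader', 'demoCorrect');
if ($suggestion !== null) {
	echo 'Did you mean: ' . htmlspecialchars($suggestion) . "?\n";
}
```

Escaping the suggestion with htmlspecialchars() matters here because it is built from user input before being echoed into the page.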
