Matching text
To make multiple-choice questions more challenging, I wanted to select answers that are pretty similar to each other once the user proves good at the topic.
Matchoo already has a pretty robust understanding of how well the player is doing, and uses it to tweak the number of words shown, the speed of progress and the amount of repetition.
But how do you say these words are similar?
I set up the framework for this first, so that each game mode could specify its own scoring for the learning system to lean on when choosing what to show the player. I started with a very simple "how many letters are common to each word".
This gave me nothing. I wasn't really expecting it to, but somehow I was still disappointed!
So with the framework in place I went researching algorithms and came up with a solution that seems to give some good results.
Step 1 - Proper difference between words evaluation
The first thing I found was calculating the Levenshtein Distance between words. This creates a matrix of the edits required to change one word into another and calculates the minimal path length of those changes. We'll call this the LDist from now on.
With the relatively small dictionaries Matchoo uses, I found that this was a bit too precise. I wanted to pull in words that were fairly similar in structure and sound but might share no actual letters.
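As a rough illustration of the LDist described above, here is a minimal Python sketch. It is not Matchoo's actual code: the function name `levenshtein` is my own, and it keeps only two rows of the edit matrix at a time rather than the whole thing, which gives the same result.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes or
    substitutions needed to turn `a` into `b`."""
    # previous[j] = distance between the rows-so-far prefix of `a`
    # and the first j characters of `b`. Start with the empty prefix.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletes
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("flaw", "lawn"))       # 2
```

A lower number means the words are closer, so a game mode could sort candidate answers by this value.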
Enter SoundEx. SoundEx is a simple approach to reducing words to a set of simplified sounds. It's biased towards English, so it's something I'll have to revisit, but as we're not spellchecking or making typing suggestions I hope there's more leeway for use with other languages.
Step 2 - Sound patterns, rather than letters
SoundEx reduces words to a letter plus three digits. When autocompleting you might look for other words that have the same resultant four-character code, or close codes. What did I do with that reduction? Well, currently I'm using the LDist between them! An alternative approach would be to change the first letter into a code (1-6) and then do a numerical distance on the resultant number - not sure if that would be base 6 or 10 though.
My thought behind using the LDist again here is that it allows for out-of-order similarities a bit better than a numerical difference.
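For reference, the letter-plus-three-digits reduction can be sketched like this in Python. This follows the common American Soundex rules as I understand them (vowels separate repeated codes, H and W do not); the names `CODES` and `soundex` are my own, and Matchoo's implementation may well differ in the details.

```python
# Consonant groups that share a digit in American Soundex.
CODES = {}
for digit, group in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for letter in group:
        CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Reduce `word` to its first letter plus three digits."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")  # the first letter's code still blocks repeats
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code:
            if code != last:  # skip a code that repeats the previous one
                result += code
            last = code
        elif c not in "hw":
            last = ""  # vowels (and anything uncoded) reset the repeat check
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex("Robert"))    # R163
print(soundex("Rupert"))    # R163
print(soundex("Ashcraft"))  # A261
```

"Robert" and "Rupert" differ by two letters but reduce to the same code, which is exactly the kind of sounds-alike pairing the plain letter comparison misses.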
Using SoundEx on its own seemed a bit fuzzy though, so I summed the two results: the LDist of the original strings and the LDist of their SoundEx reductions.
Getting closer!
Step 3 - Word starts get extra attention
When testing English/Spanish words and element names, I felt that the initial letter needed some extra influence, so I added a simple Matching Letter Streak evaluation. This counts how many of the first four letters match between the two words.
Worked a treat! In fact, I'm leaving it as is; task complete for now.
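The Matching Letter Streak evaluation might look something like the Python sketch below. Two things here are my assumptions rather than anything stated above: that the streak stops at the first mismatch (as the name suggests) instead of counting every matching position, and that the comparison ignores case. The name `matching_streak` and the `limit` parameter are hypothetical.

```python
def matching_streak(a: str, b: str, limit: int = 4) -> int:
    """Count how many leading letters `a` and `b` share,
    looking at no more than the first `limit` letters."""
    streak = 0
    # zip stops at the shorter word, so short words are handled safely.
    for ch_a, ch_b in zip(a.lower()[:limit], b.lower()[:limit]):
        if ch_a != ch_b:
            break  # assumption: a streak ends at the first mismatch
        streak += 1
    return streak


print(matching_streak("carbon", "calcium"))  # 2
print(matching_streak("helium", "helium"))   # 4
print(matching_streak("gato", "pato"))       # 0
```

Unlike the two distances, a higher value here means more similar, so it would need to be subtracted or otherwise weighted when folded into the overall score.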