These are chat archives for xem/miniDiff

Jun 2016
Mathieu 'p01' Henri
Jun 09 2016 07:25
Haloa, I haven't followed the progress here, but I thought the 136b Levenshtein distance I wrote back then might be helpful and a decent base to extract a diff
Martin Kleppe
Jun 09 2016 07:28
not sure what it does or how it works but very nice :D
Martin Kleppe
Jun 09 2016 07:31
But maybe it already works with a simple word check. @xem what do you need the permissions for?
Mathieu 'p01' Henri
Jun 09 2016 07:33
The idea of doing the diff check at the "word" level is interesting.
whitespace and newline differences might need to be split or trimmed
ignoring spaces vs. tabs or CR vs. LF line breaks could be a bonus yes
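
To make the word-level idea concrete, here is a minimal sketch of how the input could be split before diffing. `splitLines` and `splitWords` are hypothetical helper names, not part of miniDiff, and the sketch assumes that every kind of whitespace should be treated the same.

```javascript
// Hypothetical helpers (not from miniDiff): normalize text before diffing.

// Split text into lines, treating CRLF, lone CR and LF as the same line break.
function splitLines(text) {
  return text.replace(/\r\n?/g, "\n").split("\n");
}

// Split one line into words, ignoring leading/trailing whitespace
// and treating tabs and runs of spaces alike.
function splitWords(line) {
  return line.trim().split(/\s+/).filter(Boolean);
}
```

With this, `splitWords("a\tb")` and `splitWords("a  b")` yield the same array, so a diff over the resulting words would not report spaces vs. tabs as a change.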
Martin Kleppe
Jun 09 2016 07:44
damn you, auto correction! "permission" = "permutations".
the permutations i'm talking about consist of trying every combination of deletions and additions in order to find the smallest transformation from s1 to s2
I explained the algorithm in the source code of (still need to implement it)
of course, if the edits were just words replaced with other words, it'd be easy. but sometimes words are just removed or just added, and that's the hard problem.
I'm not sure what c and d ("rows of distance matrix") represent in @p01's gist?
you can do it building the whole matrix of N x M cells or just using two rows of the size of the longest string, as seen in the example implementation below
it's very clever. is it easy to retrieve the letters that have changed between the two strings?
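
For readers without the gist at hand, the two-row approach can be sketched roughly as follows. This is a plain, unminified illustration and not @p01's 136b code; the names `levenshtein`, `prev` and `curr` are assumptions, and the rows here are sized to `b` rather than to the longest input.

```javascript
// Sketch of a two-row Levenshtein distance.
// a and b can be strings or arrays (e.g. arrays of words).
function levenshtein(a, b) {
  // prev starts as row 0: turning nothing into the first j items of b takes j insertions.
  let prev = [];
  for (let j = 0; j <= b.length; j++) prev[j] = j;

  for (let i = 1; i <= a.length; i++) {
    // Column 0: turning the first i items of a into nothing takes i deletions.
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // delete a[i-1]
        curr[j - 1] + 1,   // insert b[j-1]
        prev[j - 1] + cost // keep (cost 0) or replace (cost 1)
      );
    }
    prev = curr; // only the previous row is needed for the next iteration
  }
  return prev[b.length];
}
```

Since items are only compared with `===`, `levenshtein(["var", "a"], ["a"])` counts word edits the same way `levenshtein("kitten", "sitting")` counts letter edits. The two rows give the distance only; the list of edits needs the full matrix or extra bookkeeping.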
Mathieu 'p01' Henri
Jun 09 2016 08:50
Don't remember exactly. But one thing to note is that a and b don't have to be strings, they can be arrays of strings like you folks started to work with
Sorry for the lame input :p
it's great! It's basically the same principle as what I was going to do, but this approach is enlightening. thanks
I can build the matrixes shown in but... it's not easy to get the list of transformations out of it
Here's a first prototype that detects the edit between the two strings
For some mysterious reason, it fails in all the other cases... (ex: add "var " at the beginning of s1 or s2)
still doesn't work perfectly...
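
One common way to get the list of transformations out of the full matrix is to walk back from the bottom-right cell and record which neighbour each value came from. The sketch below is an illustration of that general technique, not the prototype mentioned above; `editScript` and the `op`/`item`/`to` field names are made up for this example.

```javascript
// Sketch: build the full distance matrix, then backtrack to list the edits.
// a and b can be strings or arrays of words.
function editScript(a, b) {
  const n = a.length, m = b.length;

  // d[i][j] = edit distance between the first i items of a and the first j items of b.
  const d = [];
  for (let i = 0; i <= n; i++) {
    d[i] = [i];
    for (let j = 1; j <= m; j++) {
      d[i][j] = i === 0
        ? j
        : Math.min(
            d[i - 1][j] + 1, // delete
            d[i][j - 1] + 1, // insert
            d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // keep or replace
          );
    }
  }

  // Walk back from d[n][m] to d[0][0], recording the move behind each cell.
  const ops = [];
  let i = n, j = m;
  while (i > 0 || j > 0) {
    if (i > 0 && j > 0 && a[i - 1] === b[j - 1] && d[i][j] === d[i - 1][j - 1]) {
      ops.push({ op: "keep", item: a[i - 1] }); i--; j--;
    } else if (i > 0 && j > 0 && d[i][j] === d[i - 1][j - 1] + 1) {
      ops.push({ op: "replace", item: a[i - 1], to: b[j - 1] }); i--; j--;
    } else if (i > 0 && d[i][j] === d[i - 1][j] + 1) {
      ops.push({ op: "delete", item: a[i - 1] }); i--;
    } else {
      ops.push({ op: "insert", item: b[j - 1] }); j--;
    }
  }
  return ops.reverse(); // steps were collected from the end, so flip them
}
```

For the "var " case, `editScript(["var", "a"], ["a"])` gives a delete of `"var"` followed by a keep of `"a"`, and swapping the arguments gives an insert followed by a keep. When several minimal scripts exist, the order of the branches decides which one is returned.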
Tommy Hodgins
Jun 09 2016 18:27
so at a line-level it's working great, it's just still thinking about letters in words that are sometimes larger than they need to be?
no it's not working perfectly yet at line-level, it's buggy when string1 contains words that string2 doesn't have