These are chat archives for xem/miniDiff

9th
Jun 2016
Mathieu 'p01' Henri
@p01
Jun 09 2016 07:25
Haloa, I haven't followed the progress here, but I thought the 136b Levenshtein distance I wrote back then https://gist.github.com/p01/1127070 might be helpful and a decent base to extract a diff from
Martin Kleppe
@aemkei
Jun 09 2016 07:28
Ha!
:wrench:
not sure what it does or how it works but very nice :D
Martin Kleppe
@aemkei
Jun 09 2016 07:31
But maybe it already works with a simple word check. @xem where do you need the permissions for?
permissions?
Mathieu 'p01' Henri
@p01
Jun 09 2016 07:33
The idea of doing the diff check at the "word" level is interesting.
whitespace and newline differences might need to be split or trimmed
ignoring spaces vs. tabs or CR vs. LF line breaks could be a bonus, yes
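A minimal sketch of what word-level splitting with that normalisation could look like. `tokenize` is a hypothetical helper name, not something from miniDiff or the gist, and it assumes that any run of whitespace counts as a single separator:

```javascript
// Hypothetical helper (not part of miniDiff): turn a text into word tokens,
// treating spaces, tabs and all line-break styles as equivalent separators.
function tokenize(text) {
  return text
    .replace(/\r\n?/g, "\n") // CRLF or a lone CR becomes LF
    .split(/\s+/)            // any run of whitespace separates two words
    .filter(Boolean);        // drop empty tokens from leading/trailing whitespace
}

console.log(tokenize("var a =\t1;\r\nvar b = 2;"));
```

With this approach two texts that differ only in indentation or line endings produce identical token arrays, so a word-level diff would report no change between them.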
Martin Kleppe
@aemkei
Jun 09 2016 07:44
damn you, auto correction! "permission" = "permutations".
the permutations i'm talking about consist of trying every combination of deletions and additions in order to find the smallest transformation from s1 to s2
I explained the algorithm in the source code of http://xem.github.io/miniDiff/ (still need to implement it)
of course, if the edits were just words replaced with other words, it'd be easy. but sometimes words are just removed or just added, and that's the hard problem.
I'm not sure what c and d ("rows of distance matrix") represents in @p01's gist?
you can do it building the whole matrix of N x M cells or just using two rows of the size of the longest string, as seen in the example implementation below
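For reference, a readable sketch of the two-row idea. This is not the golfed code from the gist; the names `prev` and `curr` are assumptions that play the role of the two "rows of distance matrix", and here each row is sized by the second string rather than by the longest one:

```javascript
// Sketch of Levenshtein distance keeping only two rows of the matrix.
// prev = row for the first i-1 items of a, curr = row for the first i items.
function levenshtein(a, b) {
  let prev = [];
  for (let j = 0; j <= b.length; j++) prev[j] = j; // distance from "" to b[0..j)

  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from a[0..i) to "" is i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // delete a[i-1]
        curr[j - 1] + 1,   // insert b[j-1]
        prev[j - 1] + cost // keep or replace
      );
    }
    prev = curr; // the current row becomes the previous one
  }
  return prev[b.length];
}

console.log(levenshtein("kitten", "sitting")); // 3
```

Two rows are enough to get the distance, but the history of choices is thrown away, which is why the list of edits cannot be read back from them directly.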
it's very clever. is it easy to retrieve the letters that have changed between the two strings?
Mathieu 'p01' Henri
@p01
Jun 09 2016 08:50
Don't remember exactly. But one thing to note is that a and b don't have to be strings; they can be arrays of strings, like the ones you folks started to work with
Sorry for the lame input :p
it's great! It's basically the same principle as what I was going to do, but this approach is enlightening. thanks
I can build the matrices shown in https://en.wikipedia.org/wiki/Levenshtein_distance#Iterative_with_full_matrix but... it's not easy to get the list of transformations out of it
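One common way to get the transformations out of the full matrix is to walk back from the bottom-right cell to the top-left one and record which neighbour each value came from. A hedged sketch, not the miniDiff prototype: the function name `diff` and the `{ op, ... }` record shape are assumptions made for illustration, and ties are resolved in favour of "replace" over "delete" or "insert":

```javascript
// Sketch: full Levenshtein matrix plus a backtrace that lists the edits.
// a and b can be strings or arrays of words (anything indexable with a length).
function diff(a, b) {
  const n = a.length, m = b.length;
  // d[i][j] = edit distance between the first i items of a and the first j of b
  const d = [];
  for (let i = 0; i <= n; i++) {
    d[i] = [i];
    for (let j = 1; j <= m; j++) {
      d[i][j] = i === 0
        ? j
        : Math.min(
            d[i - 1][j] + 1,
            d[i][j - 1] + 1,
            d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
          );
    }
  }
  // Walk back from the bottom-right cell, prepending one operation per step.
  const ops = [];
  let i = n, j = m;
  while (i > 0 || j > 0) {
    if (i > 0 && j > 0 && a[i - 1] === b[j - 1] && d[i][j] === d[i - 1][j - 1]) {
      ops.unshift({ op: "keep", value: a[i - 1] }); i--; j--;
    } else if (i > 0 && j > 0 && d[i][j] === d[i - 1][j - 1] + 1) {
      ops.unshift({ op: "replace", from: a[i - 1], to: b[j - 1] }); i--; j--;
    } else if (i > 0 && d[i][j] === d[i - 1][j] + 1) {
      ops.unshift({ op: "delete", value: a[i - 1] }); i--;
    } else {
      ops.unshift({ op: "insert", value: b[j - 1] }); j--;
    }
  }
  return ops;
}

// The 'add "var " at the beginning' case, at word level:
console.log(diff(["a", "=", "1"], ["var", "a", "=", "1"]));
```

The number of non-"keep" operations in the result equals the Levenshtein distance, so the output is one of the smallest transformations from the first input to the second, though not necessarily the only one.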
Here's a first prototype that detects the edit between the two strings
For some mysterious reason, it fails in all the other cases... (ex: add "var " at the beginning of s1 or s2)
still doesn't work perfectly...
Tommy Hodgins
@tomhodgins
Jun 09 2016 18:27
nice!
so at a line-level it's working great, it's just still thinking about letters in words that are sometimes larger than they need to be?
no, it's not working perfectly yet at line-level, it's buggy when string1 contains words that string2 doesn't have