javascript - Make translation function not translate result again


I have made a simplified version of a translation tool similar to Google Translate. The idea is to build a simple tool for a minority language in Sweden called "Jamska". The app is built around a function that takes a string from a textarea with the id #svenska and replaces words in the string using RegExp.

I've made an array called arr that's used as a dictionary in a loop inside the function. Each array item looks like this: var arr = [["eldröd", "eillrau"], ["oväder", "over"] ...]. The first word in each array item is in Swedish, and the second word is in Jamska. If the RegExp finds a matching word in the loop, it replaces that word using this code:

    function translate() {
        var str = $("#svenska").val();
        var newstr = "";
        for (var i = 0; i < arr.length; i++) {
            var replace = arr[i][0];      // Swedish word
            var replacewith = arr[i][1];  // Jamska word
            var re = new RegExp('(^|[^a-z0-9åäö])' + replace + '([^a-z0-9åäö]|$)', 'ig');
            str = str.replace(re, "$1" + replacewith + '$2');
        }
        $("#jamska").val(str);
    }

translate() is called in an event handler when the #svenska textarea gets a keyup, like this: $("#svenska").keyup(function() { translate(); });

The translated string is then assigned as the value of the textarea with the id #jamska. So far, so good.

I have a problem though: if the translated word in Jamska is also a word in Swedish, the function translates that word too. This happens because I'm assigning the variable str the translated version of the same variable, using: str = str.replace(re, "$1" + replacewith + '$2');. The function uses the same variable over and over again to perform the translation, so every later dictionary entry also scans the words that earlier entries have already translated.

Example: the Swedish word "brydd" is "fel" in Jamska. "fel" is also a word in Swedish, so the word I get after translation is "felht", since the Swedish word "fel" is "felht" in Jamska.
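The effect can be reproduced without the page or jQuery. The following is only a minimal sketch: pairs and translateSequentially are hypothetical names, and the two dictionary entries are the ones from the example above, listed in the order that triggers the problem.

```javascript
// Hypothetical two-entry dictionary taken from the example above.
var pairs = [["brydd", "fel"], ["fel", "felht"]];

// Same loop as translate(), with the jQuery calls replaced by a
// parameter and a return value.
function translateSequentially(str) {
    for (var i = 0; i < pairs.length; i++) {
        var re = new RegExp('(^|[^a-z0-9åäö])' + pairs[i][0] + '([^a-z0-9åäö]|$)', 'ig');
        // The second pass scans the output of the first pass.
        str = str.replace(re, '$1' + pairs[i][1] + '$2');
    }
    return str;
}

console.log(translateSequentially("brydd")); // "felht", not the intended "fel"
```

Whether a word is translated twice depends on the order of the entries: it only happens when the entry for "fel" comes after the entry that produces "fel".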

Does anyone have an idea of how to work around this problem?

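Instead of looking for each Swedish word from the dictionary in the input and replacing it with its translation, I would recommend finding every word ([a-z0-9åäö]+) in the text and replacing it either with its translation, if one is found in the dictionary, or with itself otherwise. That way each word of the input is looked up exactly once, and translated output is never scanned again: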

    //var arr = [["eldröd", "eillrau"], ["oväder", "over"] ...]
    // I'd rather use an object than an array to define the dictionary
    var dict = {
        eldröd: "eillrau",
        oväder: "over"
        // ...
    };

    var str = "eldröd test eillrau eillrau oväder over";

    var translated = str.replace(/[a-z0-9åäö]+/ig, function(m) {
        var word = m.toLowerCase();
        var trans = dict[word];
        // fall back to the (lowercased) word itself when there is no translation
        return trans === undefined ? word : trans;
    });
    console.log(translated);
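If the data already exists in the question's [swedish, jamska] pair format, the lookup object could be built from it instead of being typed by hand. This is a sketch under assumptions, not part of the answer above: translateOnce is a hypothetical name, the sample entries are the ones mentioned in the question, and the object is created without a prototype so that input words such as "constructor" are not mistaken for dictionary entries.

```javascript
// Hypothetical sample data in the question's [swedish, jamska] pair format.
var arr = [["brydd", "fel"], ["fel", "felht"], ["eldröd", "eillrau"]];

// Lookup object without a prototype: only the keys added below exist.
var dict = Object.create(null);
arr.forEach(function (pair) {
    dict[pair[0].toLowerCase()] = pair[1];
});

function translateOnce(str) {
    // Every word of the input is looked up exactly once; replacement
    // text is never scanned again.
    return str.replace(/[a-z0-9åäö]+/ig, function (m) {
        var trans = dict[m.toLowerCase()];
        // Unknown words are returned exactly as typed.
        return trans === undefined ? m : trans;
    });
}

console.log(translateOnce("brydd fel Test")); // "fel felht Test"
```

Unlike the snippet above, this variant returns untranslated words with their original casing, which is a design choice rather than a requirement.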


Update:

If the dictionary keys may be phrases (i.e. technically they are strings containing spaces), the regex should be extended to include all of these phrases explicitly. The final regex would then look like

(?:phrase 1|phrase 2|etc...)(?![a-z0-9åäö])|[a-z0-9åäö]+ 

It tries to match one of the phrases explicitly first, and only then single words. The (?![a-z0-9åäö]) negative lookahead filters out phrases that are immediately followed by a letter or digit (e.g. varken bättre eller sämreåäö).

phrases preceded letters implicitly filtered out fact match either fist 1 (and therefore not preceded letter) or it's not first , therefore previous 1 separated current spaces.

    //var arr = [["eldröd", "eillrau"], ["oväder", "over"] ...]
    // I'd rather use an object than an array to define the dictionary
    var dict = {
        eldröd: "eillrau",
        oväder: "over",
        bättre: "better",
        "varken bättre eller sämre": "vär å int viller",
        "test test": "double test"
        // ...
    };

    var str = "eldröd test eillrau eillrau oväder on test test ";
    str += "varken bättre eller sämre ";
    str += "don't trans: varken bättre eller sämreåäö";
    str += "don't trans again: åäövarken bättre eller sämre";

    // keys containing whitespace are phrases; longest first, so that a
    // longer phrase wins over a shorter one that it starts with
    var phrases = Object.keys(dict)
        .filter(function(k) { return /\s/.test(k); })
        .sort(function(a, b) { return b.length - a.length; })
        .join('|');
    var re = new RegExp('(?:' + phrases + ')(?![a-z0-9åäö])|[a-z0-9åäö]+', 'ig');

    var translated = str.replace(re, function(m) {
        var word = m.toLowerCase();
        var trans = dict[word];
        return trans === undefined ? word : trans;
    });
    console.log(translated);
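One caveat with building the pattern this way: the phrases are joined into the regex verbatim, so a key containing a regex metacharacter (a dot, a question mark, a parenthesis) would be read as pattern syntax. The sketch below shows one possible guard and is not part of the answer above. escapeRegExp, buildRegExp and translatePhrases are hypothetical helper names, and the keys "kl. 5" and "kla" with their placeholder values are invented purely for illustration.

```javascript
// Hypothetical helper: backslash-escape characters that have a special
// meaning inside a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Invented keys and placeholder values, for illustration only.
var dict = Object.create(null);
dict["kl. 5"] = "<phrase>";
dict["kla"] = "<word>";

function buildRegExp(dict) {
    var phrases = Object.keys(dict)
        .filter(function (k) { return /\s/.test(k); })
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp)
        .join('|');
    // With no phrase keys, use the single-word pattern alone; an empty
    // (?:) group would otherwise also match the empty string.
    var source = phrases
        ? '(?:' + phrases + ')(?![a-z0-9åäö])|[a-z0-9åäö]+'
        : '[a-z0-9åäö]+';
    return new RegExp(source, 'ig');
}

function translatePhrases(str, dict) {
    return str.replace(buildRegExp(dict), function (m) {
        var trans = dict[m.toLowerCase()];
        return trans === undefined ? m : trans;
    });
}

console.log(translatePhrases("kl. 5 och kla 5", dict)); // "<phrase> och <word> 5"
```

Without the escaping step, the dot in "kl. 5" would match any character, so "kla 5" would be consumed as a phrase match with no dictionary entry and "kla" would be left untranslated.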

