Trying to replace string with "\" (backslash)
Started by bbk May 22 2019 10:37 PM

22 replies to this topic

#1

bbk

I am submitting information using network.request and GET:

_G.theNetworkRequest = network.request( requestString, "GET", serverCommunicate.networkListener, params )

The requestString could look like this: https://www.somewhere.com?description=mytext

The requestString contains information entered by the user, and I need a way to handle special characters so that they do not make the request string malformed.

If the description contains characters like ^ or \ instead of "mytext", the URL is malformed.

Therefore I replace these characters with a placeholder such as "-!!!BACKSLASH!!!-", and when I read the data returned from the server I want to replace "-!!!BACKSLASH!!!-" with "\" again.

The problem is the string.gsub() function.

Here is my code:

    theDescription = "some text -!!!BACKSLASH!!!- and some more text"
    local lookfor = "-!!!BACKSLASH!!!-";  
    local  replacewith = '\\';
    theDescription = string.gsub(theDescription,lookfor,replacewith);
  • One problem is that theDescription after this substitution contains the text:

    "some text \\!- and some more text"

    In other words, gsub() does not replace the whole placeholder, but leaves "!-" at the end of it.

    (I have tried replacewith = "%\\", but it gives the same result.)

  • The other problem is that I cannot set the variable replacewith = "\" (or "%\"), because then the IDE (ZeroBrane Studio) tells me that "\" is an unfinished string and will not compile the code.



#2

XeduR @Spyric

It'll be easier to just use mime.

local mime = require("mime")

local text = "/^s.-!%)\\'" -- named "text" so the standard string library is not shadowed
print(text) -- output: /^s.-!%)\'
local encoded = mime.b64(text)
print(encoded) -- output: L15zLi0hJSlcJw==
local decoded = mime.unb64(encoded)
print(decoded) -- output: /^s.-!%)\'

Using mime's base64 encoding and decoding, you can make almost any string safe to submit in a URL (the encoded text can still contain "+", "/" and "=", which may themselves need URL escaping). All you need to do is encode at the source, send, and decode at the destination.
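
One caveat worth sketching: standard base64 output can contain "+", "/" and "=", and PHP decodes a "+" in a query string as a space. Below is a minimal sketch in plain Lua, assuming the server reverses the same mapping before decoding. toUrlSafe and fromUrlSafe are hypothetical helper names and are not part of the mime module.

```lua
-- Hypothetical helpers (not part of mime): map base64 text to a URL-safe alphabet.
local function toUrlSafe(b64)
    -- "+" becomes "-", "/" becomes "_", and the "=" padding is dropped
    return (b64:gsub("[+/=]", { ["+"] = "-", ["/"] = "_", ["="] = "" }))
end

local function fromUrlSafe(safe)
    local b64 = safe:gsub("[%-_]", { ["-"] = "+", ["_"] = "/" })
    -- restore the padding so the length is a multiple of 4 again
    return b64 .. string.rep("=", (4 - #b64 % 4) % 4)
end

local encoded = "L15zLi0hJSlcJw=="   -- the mime.b64 output shown in the post above
local safe = toUrlSafe(encoded)
print(safe)                -- L15zLi0hJSlcJw
print(fromUrlSafe(safe))   -- L15zLi0hJSlcJw==
```

The alternative is to keep standard base64 and run the result through a URL escape function before appending it to the request string.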


#3

carloscosta

You can change the placeholder to text that gsub handles without problems, for example:

local theDescription = "some text ####BACKSLASH#### and some more text"

local lookfor = "####BACKSLASH####"
local replacewith = '\\'
theDescription = string.gsub(theDescription, lookfor, replacewith)

print(theDescription)


#4

bbk

Sorry carloscosta

theDescription = "some text -###BACKSLASH###- and some more text"
local lookfor = "-###BACKSLASH###-";
local replacewith = '\\';
theDescription = string.gsub(theDescription,lookfor,replacewith);
print(" theDescription: "..theDescription)

gives this:

 

-- Output:  theDescription: some text \\#- and some more text



#5

XeduR @Spyric

Am I missing something?  :huh:

 

If your issue is with URL strings becoming malformed, all you need to do is use base64 encoding with them as in my sample code above. You don't need to play around with gsub or anything.


#6

bbk

Thanks, XeduR. Do you have an example of how to encode it using PHP 7.2, so that I can store the data in the MySQL database?



#7

XeduR @Spyric

They are directly available in the manual.

https://www.php.net/manual/en/function.base64-encode.php

 

https://www.php.net/manual/en/function.base64-decode.php


#8

bbk

Thanks again, XeduR.

I have managed to implement the decode in the app and the encode in PHP on the server.

But I still have a problem with the backslash (\).

If I enter e.g. "abc \" into a native.newTextField() and read the value:

local theValue = theCustomerDescriptionField.text -- abc \
print(theValue) -- output: abc \\


#9

XeduR @Spyric

Ah, yeah. Those occur because backslashes are used to escape characters in Lua string literals. I'm guessing that your string is being shown in escaped form somewhere, since you can't write a literal like "abc \" in Lua source code: the backslash would escape the closing quote and leave an unfinished string.

So, if you write "abc \\" in code, it will appear as "abc \" when you output it, but you can't write the literal itself as "abc \" because it just won't work (unfinished string again). You could, if you want, change the "\\" into "/" by looping through the entire string, character by character. Do you have a specific need to do this?
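
To make the distinction concrete: the doubling only exists in source code and in escaped displays, not in the string itself. Below is a minimal sketch in plain Lua. The idea that a debugger or watch view shows values in an escaped, %q-like form is an assumption and is not confirmed anywhere in this thread.

```lua
local oneBackslash = "\\"                 -- two characters in source, one character in memory
print(#oneBackslash)                      -- 1
print(string.byte(oneBackslash))          -- 92
print(oneBackslash == string.char(92))    -- true
print(oneBackslash == [[\]])              -- true, long brackets do not process escapes

local value = "abc\\test"
print(value)                              -- abc\test
print(#value)                             -- 8
print(string.format("%q", value))         -- "abc\\test" (escaped form, as a literal would be written)
```

So a value read from a text field can hold a single backslash without any trouble; only literals typed into source code need the doubled form.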
 



#10

bbk

Just to sum up again what my problem is: 

 

I enter "abc\test" into an entry field. Looking at the value of the entry field, it has changed to "abc\\test".

I try to replace a single "\", but Corona or ZeroBrane requires that I escape the character, so I replace it as follows:

    lookfor = '\\'
    replacewith = "-!!!BACKSLASH!!!-"
    theString = string.gsub(theString, lookfor, replacewith)

    print(theString) -- output:  abc-!!!BACKSLASH!!!-test

When I read the information back (from the MySQL database) I try to do the reverse:

      local lookfor = "-!!!BACKSLASH!!!-"
      local replacewith = '\\'
      theDescription = string.gsub(theDescription, lookfor, replacewith)

      print(theDescription) -- output: abc\\!-test

The print function shows two backslashes (but in the entry field it is shown as one).

However, the problem now is that string.gsub() does not replace the complete placeholder: "!-" is still left after the replacement has been done.



#11

bbk


The need for this is an app where the user should be able to enter any character into textfields and these should be saved in the database and later shown in the App and on a website.

 

I have already decided to filter out the single ^ character, and I have to replace the euro sign (€) since it cannot be saved in the database when I use PHP.



#12

XeduR @Spyric

The reason why "abc\test" from your entry field shows up as "abc\\test" is that a lone backslash cannot be written in a Lua string literal: it always starts an escape sequence. Whatever is displaying the value is most likely adding the extra backslash so that the text reads as a valid literal.

i.e. if a user inputs the string "abc\test", it may be displayed as "abc\\test", but if you later create a display object with it, the display object will still read "abc\test".

I would personally just prevent the users from submitting any such special characters. This will usually save you a lot of time and money otherwise spent worrying about these issues. Unless there is a specific need to allow the user to type in those characters, you can filter them out by using something like "if char:find("%w") or char == " " then" to check that the newly entered character is either an alphanumeric character or a space. You could add punctuation to the checks as well, etc.



#13

bbk

It might be a good idea to at least stop the user from entering some difficult characters, but it is hard to know what they all are since the user might be using different national keyboards with very different characters.

 

I have tried, however, to replace the \ with a /, but the text listener also has problems in the "editing" phase:

    local newChar = event.newCharacters
    local oldText = event.oldText
    if newChar then
      if newChar == "\\" then
        if oldText == nil then
          -- no previous text is available, so the field becomes just "/"
          event.target.text = "/"
        else
          event.target.text = oldText .. "/"
        end
      end
    end

When I enter a \, event.oldText becomes nil.

(I could maybe remember the event.oldText in a variable, but it shows the problems with the backslash.)



#14

nick_sherman

Can you compare newChar to the character code for "\" rather than a literal string?

#15

bbk

Yes, it looks like I have to do that. I am not happy about it since the customer should be able to enter any type of text.

 

But it looks like I have to implement this code:

-- theEnteredText is set to event.target.text when event.phase == "began"

local newChar = event.newCharacters
local oldText = event.oldText
if newChar then
  local theAscii = string.byte(newChar)
  if theAscii == 92 then -- 92 is the byte value of "\"
    if theEnteredText == "" then
      event.target.text = "/"
    else
      event.target.text = theEnteredText .. "/"
    end
  end
end

theEnteredText = event.target.text




#16

SGS

This might be easier:

-- Construct a URL-compatible string from a table of parameters
-- @param table params The parameters to be used in {paramName = paramValue, etc} format.
local urlOperations = require("socket.url")
local urlEncode = urlOperations.escape

function paramsToString(params)
    local parts = {}
    for name, value in pairs(params) do
        if value then
            value = urlEncode(tostring(value))
            if string.len(value) > 0 then -- No empty values are to be sent
                parts[#parts + 1] = name .. "=" .. value
            end
        end
    end

    -- Join with "&" so that no trailing separator is left on the string
    return table.concat(parts, "&")
end
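
For reference, this is roughly what URL escaping does under the hood. The following is a minimal pure-Lua sketch and not the socket.url implementation. percentEncode is a hypothetical name, and the input is assumed to be a UTF-8 byte string.

```lua
-- Hypothetical helper: percent-encode every byte that is not an unreserved URL character.
local function percentEncode(text)
    return (text:gsub("[^%w%-_%.~]", function(char)
        return string.format("%%%02X", string.byte(char))
    end))
end

print(percentEncode("abc\\test ^"))  -- abc%5Ctest%20%5E
print(percentEncode("\195\166"))     -- %C3%A6 (the two UTF-8 bytes of "æ")
```

Because the encoding works byte by byte, the backslash, the caret and multi-byte characters all end up as plain ASCII in the request string.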


#17

XeduR @Spyric

I took a few minutes to adjust the sample code from Corona docs. This is how I'd approach the issue:

[lua]
local defaultField

local textObject = display.newText( "", display.contentCenterX, display.contentCenterY-40, system.defaultFont, 32 )

local function textListener( event )
    if event.phase == "editing" then
        local acceptedString = ""
        -- event.text holds the full current contents of the field
        for i = 1, event.text:len() do
            local character = event.text:sub( i, i )
            if character:find( "%w" ) or character == " " then -- %w is alphanumeric characters
                acceptedString = acceptedString .. character
            end
        end
        textObject.text = acceptedString
    end
end

defaultField = native.newTextField( display.contentCenterX, display.contentCenterY, 180, 30 )
defaultField.inputType = "no-emoji"
defaultField:addEventListener( "userInput", textListener )
[/lua]

This way, you'd loop through the string whenever new characters are entered. In the loop, you'd only accept alphanumeric characters and spaces. If you want to accept or reject some other types of characters or specific characters, you can just add them to the check (for character classes, see https://www.lua.org/pil/20.2.html). Alternatively, you could just watch out for things that you definitely don't want and block those out.
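
The same whitelist can also be written as a single gsub over the whole string. Below is a minimal sketch in plain Lua, with keepAllowed as a hypothetical name. Note that %w works on single bytes, so multi-byte UTF-8 characters such as æ are typically stripped as well.

```lua
-- Hypothetical helper: drop every byte that is not alphanumeric or a space.
local function keepAllowed(text)
    return (text:gsub("[^%w ]", ""))
end

print(keepAllowed("abc\\ 12^3"))  -- abc 123
```

To allow more characters, add them inside the brackets, for example "[^%w %.,]" to keep periods and commas too.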



#18

bbk

Thank you all. I will try to implement it to avoid characters that cause problems.

#19

SGS

Just use URL escaping (code I posted above) and then you do not have to worry about user entry at all. 

 

Simple :)



#20

carloscosta

bbk, why did you add "-" next to the "###" in my example? I left it out for a reason: "-" is a special character in Lua patterns, so it will not work with gsub as written. That said, this placeholder approach is the worst implementation of all. I only gave you an example of how to resolve your problem by changing one character. If you copy and paste my example as is, it will work.
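
For anyone who wants to keep the "-" in the placeholder: in a Lua pattern, "!-" means "as few ! as possible", which is why the match stops early and "!-" is left behind. Below is a minimal sketch in plain Lua of a literal replace that escapes the pattern first. escapePattern, escapeReplacement and replaceLiteral are hypothetical helper names and are not part of the string library.

```lua
-- Escape every Lua pattern magic character so the text is matched literally.
local function escapePattern(text)
    return (text:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- In a gsub replacement string only "%" is special, so double it.
local function escapeReplacement(text)
    return (text:gsub("%%", "%%%%"))
end

local function replaceLiteral(subject, lookFor, replaceWith)
    return (subject:gsub(escapePattern(lookFor), escapeReplacement(replaceWith)))
end

local theDescription = "some text -!!!BACKSLASH!!!- and some more text"
local restored = replaceLiteral(theDescription, "-!!!BACKSLASH!!!-", "\\")
print(restored)  -- some text \ and some more text
```

Writing the pattern by hand as "%-!!!BACKSLASH!!!%-" has the same effect for this one placeholder.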



#21

carloscosta

You should also send the data as JSON with the POST method. It's pretty easy to use that data in PHP on the server side.

I've never had a problem with that.
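
A minimal sketch of that approach, assuming Corona's json library and network.request. buildJsonPost, the URL and the listener name are hypothetical placeholders. The encoder is passed in as an argument so that the helper itself stays plain Lua.

```lua
-- Hypothetical helper: build the params table for a JSON POST.
-- "encode" is any function that turns a Lua table into JSON text.
local function buildJsonPost(fields, encode)
    return {
        headers = { ["Content-Type"] = "application/json" },
        body = encode(fields),
    }
end

-- Assumed usage inside a Corona project (placeholder URL and listener):
-- local json = require("json")
-- local params = buildJsonPost({ description = "abc\\test ^" }, json.encode)
-- network.request("https://www.somewhere.com/save.php", "POST", networkListener, params)
```

With this layout the user's text travels in the request body instead of the URL, so characters like \ and ^ no longer affect whether the URL is well formed. On the PHP side the body would then be read from php://input and passed to json_decode rather than taken from $_POST.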


#22

bbk


Unfortunately, socket.url.escape did not work with national characters like æøå/äöü, so I have to use the string = mime.b64(string) approach that XeduR suggested. But I also had to avoid a lot of characters that can be entered with the English, Norwegian and German keyboards that we will support at the moment.

I am using a MySQL database and PHP 7.2.7, and I have problems saving e.g. the backslash (\), the percent sign (%) and the combination "/>".

On the server I therefore loop through all the request variables from GET or POST, decode the values ($value = base64_decode($value)), and convert some values back to what they should be before the data is saved.

Here is the code that I use to remove/replace characters that cause problems on the server:

removeCharsWithError = function(newChar)
  local mime = require("mime")

  local replaceChar = ""
  local theAscii = string.byte(newChar) or -1
  local theb64 = mime.b64(newChar) or -1 

 
  if theAscii == 92 then -- backslash "\"
    replaceChar = "/"

  elseif newChar =="°" then  -- degree -- works on Mac Simulator
    replaceChar = "*"
  elseif theAscii == 203  then -- degree -- Ascii value = 194 on Mac, 203 on iPad
    replaceChar = "*"
  elseif theb64 == "y5o="  then -- degree -- theb64 value = "wrA=" on Mac, "y5o=" on iPad
    replaceChar = "*"

  elseif newChar =="•" then
    replaceChar = "-"

  elseif newChar =="±" then -- works on Mac Simulator
    replaceChar = "+/-"
  elseif theb64 == "4omg" then --on iPad
    replaceChar = "+/-"
  end

  -- variations of  letters --on iPad

--  A
  if theb64 == "xIA=" then -- Overline
    replaceChar = "A"
  elseif theb64 == "xIE=" then 
    replaceChar = "a"
--  E
  elseif theb64 == "xJY=" then -- One dot
    replaceChar = "E"
  elseif theb64 == "xJc=" then 
    replaceChar = "e"

  elseif theb64 == "w4s=" then -- Two dots - German keyboard
    replaceChar = "E"
  elseif theb64 == "w6s=" then 
    replaceChar = "e"

  elseif theb64 == "xJg=" then  -- Polish E
    replaceChar = "E"
  elseif theb64 == "xJk=" then 
    replaceChar = "e"


  elseif theb64 == "xJI=" then -- Overline 
    replaceChar = "E"
  elseif theb64 == "xJM=" then 
    replaceChar = "e"


-- U
  elseif theb64 == "xao=" then -- Overline
    replaceChar = "U"
  elseif theb64 == "xas=" then 
    replaceChar = "u"

-- I
  elseif theb64 == "xKo=" then -- Overline
    replaceChar = "I"
  elseif theb64 == "xKs=" then 
    replaceChar = "i"

  elseif theb64 == "xK4=" then -- La cédille - English keyboard
    replaceChar = "I"
  elseif theb64 == "xK8=" then 
    replaceChar = "i"



-- O
  elseif theb64 == "xYw=" then -- Overline
    replaceChar = "O"
  elseif theb64 == "xY0=" then 
    replaceChar = "o"

-- S
  elseif theb64 == "xZo=" then -- accent aigu 
    replaceChar = "S"
  elseif theb64 == "xZs=" then 
    replaceChar = "s"
-- C
  elseif theb64 == "xIw=" then -- c with caron
    replaceChar = "C"
  elseif theb64 == "xI0=" then 
    replaceChar = "c"  
  elseif theb64 == "xIY=" then --  accent aigu 
    replaceChar = "C"
  elseif theb64 == "xIc=" then 
    replaceChar = "c"  
  elseif theb64 == "w4c=" then --  cedilla
    replaceChar = "C"
  elseif theb64 == "w6c=" then 
    replaceChar = "c"   

-- N        
  elseif theb64 == "xYM=" then --  accent aigu -- German keyboard
    replaceChar = "N"
  elseif theb64 == "xYQ=" then 
    replaceChar = "n"        

-- Z        
  elseif theb64 == "xbk=" then --  accent aigu -- English keyboard
    replaceChar = "Z"
  elseif theb64 == "xbo=" then 
    replaceChar = "z"   
  elseif theb64 == "xbs=" then --  with caron -- English keyboard
    replaceChar = "Z"
  elseif theb64 == "xbw=" then 
    replaceChar = "z"     
    
-- L 
  elseif theb64 == "xYE=" then --  with slash through  -- English keyboard
    replaceChar = "L"
  elseif theb64 == "xYI=" then 
    replaceChar = "l"   
  end

  return replaceChar
end
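
A long elseif chain like this can also be expressed as a lookup table keyed by the raw bytes of the character. Below is a minimal sketch in plain Lua that covers only a handful of the entries above. replacementFor is a hypothetical name, and the byte sequences are the decoded forms of the base64 keys (for example, "xIA=" is the two bytes 196 and 128).

```lua
-- Keys are the raw UTF-8 bytes of each character; values are the replacements.
local replacements = {
    ["\\"]       = "/",   -- backslash, byte 92
    ["\194\176"] = "*",   -- degree sign, base64 "wrA="
    ["\203\154"] = "*",   -- degree variant reported on iPad, base64 "y5o="
    ["\196\128"] = "A",   -- A with overline, base64 "xIA="
    ["\196\129"] = "a",   -- a with overline, base64 "xIE="
}

-- Hypothetical helper: returns "" when the character needs no replacement.
local function replacementFor(newChar)
    return replacements[newChar] or ""
end

print(replacementFor("\\"))        -- /
print(replacementFor("\196\129"))  -- a
```

Adding another character is then a one-line change to the table, and no base64 call is needed for each keystroke.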

My text listener that removes or replaces all characters that cause a problem looks like this:

local function textListener(event)
  
  if (event.phase == "began") then

    theEnteredText = event.target.text 

  elseif event.phase == "editing" then

-- Remove unwanted characters 

    local newChar = event.newCharacters
  --  local oldText = event.oldText this does not work if event.oldText contains characters like \
    local replaceChar = ""
    if newChar then

      replaceChar = removeCharsWithError(newChar)

      if  theEnteredText == "" and replaceChar ~= "" then
        event.target.text = replaceChar
      elseif replaceChar ~= ""  then 
        event.target.text = theEnteredText..replaceChar
      end 
      replaceChar = ""
    end 

    theEnteredText = event.target.text
-- END Remove unwanted characters 

  end
end

In the app I convert the values that are to be sent with this code:

(The combination "/>" is also causing a problem.)

local mime = require("mime")

local doEncode = function(theString)

  if theString ~= "" then
    -- "%" is special in Lua patterns, so "%%" in lookfor matches a single literal %
    local lookfor = '%%'
    local replacewith = "!!!PERCENT!!!"
    theString = string.gsub(theString, lookfor, replacewith)

    lookfor = '/>'
    replacewith = "/XXXgtXXX"
    theString = string.gsub(theString, lookfor, replacewith)

    theString = mime.b64(theString)
  end

  return theString

end


#23

SGS

I don't understand why you need all that.  I have a similar configuration to you and it works with all character sets (including CJK).

 

I use the utf8 library rather than the standard string library.  I then use this function to remove the problematic chars - those with a code point above 65535 and then POST to my server.

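-- Assumes Corona's utf8 plugin is enabled in build.settings
local utf8 = require("plugin.utf8")

function stripProblemChars(s)
  local res = ""

  for i = 1, utf8.len(s) do
    local c = utf8.sub(s, i, i)
    -- keep only characters with a code point up to 65535
    if utf8.codepoint(c) <= 65535 then
      res = res .. c
    end
  end
  return res
end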

I use utf8_general_ci for the database and have never had a "string problem".
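
If the utf8 library is not available, a similar filter can be sketched in plain Lua, because every code point above 65535 is encoded in UTF-8 as four bytes with a lead byte in the range 240 to 244. This assumes the input is valid UTF-8, and stripFourByteChars is a hypothetical name. Such characters tend to cause trouble because MySQL's utf8 character set stores at most three bytes per character.

```lua
-- Hypothetical helper: remove every four-byte UTF-8 sequence (code points above 65535).
local function stripFourByteChars(s)
    return (s:gsub("[\240-\244][\128-\191][\128-\191][\128-\191]", ""))
end

-- "a", an emoji (U+1F600, four bytes), "b", and the euro sign (three bytes)
local sample = "a\240\159\152\128b\226\130\172"
print(stripFourByteChars(sample) == "ab\226\130\172")  -- true
```

The euro sign survives because it is only three bytes long, while the emoji is dropped.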




Also tagged with one or more of these keywords: string.gsub, special characters