How to properly encode this URL


I am trying to get this URL using JSoup

http://betatruebaonline.com/img/parte/330/CIGUEÑAL.JPG

Even with encoding, I get an exception. I don't understand why the encoding is wrong. It returns

http://betatruebaonline.com/img/parte/330/CIGUEN%C3%91AL.JPG

instead of the correct

http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG

How can I fix this? Thanks.

private static void GetUrl() {
    try {
        String url = "http://betatruebaonline.com/img/parte/330/";
        String encoded = URLEncoder.encode("CIGUEÑAL.JPG", "UTF-8");
        Response img = Jsoup
                .connect(url + encoded)
                .ignoreContentType(true)
                .execute();

        System.out.println(url);
        System.out.println("PASSED");
    } catch (Exception e) {
        System.out.println("Error getting url");
        System.out.println(e.getMessage());
    }
}


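The encoding is not wrong. The problem is that the character "Ñ" can be represented in Unicode in two ways: precomposed (a single code point) and composite, also called decomposed (a base letter followed by a combining mark). They look the same on screen but are different sequences, so they encode to different bytes.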

precomposed Unicode: Ñ (U+00D1)                             -> %C3%91
composite Unicode:   N followed by combining tilde (U+0303) -> N%CC%83

I emphasize that BOTH ARE CORRECT; it depends on which form of Unicode you want:

String normalize = Normalizer.normalize("Ñ", Normalizer.Form.NFD);
System.out.println(URLEncoder.encode("Ñ", "UTF-8"));       // %C3%91
System.out.println(URLEncoder.encode(normalize, "UTF-8")); // N%CC%83
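Applied to the file name from the question, the same idea might look like the sketch below. The class name `NormalizedUrlEncoder` and the helper `encodeDecomposed` are made up for illustration, and the sketch assumes the server really does store the file name in decomposed (NFD) form, as the expected URL in the question suggests. The Jsoup call is left out so the example runs without extra dependencies.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.Normalizer;

class NormalizedUrlEncoder {

    // Hypothetical helper: decompose the name to NFD first, then
    // percent-encode the result as UTF-8.
    static String encodeDecomposed(String fileName) {
        String decomposed = Normalizer.normalize(fileName, Normalizer.Form.NFD);
        try {
            return URLEncoder.encode(decomposed, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://betatruebaonline.com/img/parte/330/";
        // \u00D1 is the precomposed "Ñ"; the escape avoids source-file encoding issues.
        String fileName = "CIGUE\u00D1AL.JPG";
        System.out.println(base + encodeDecomposed(fileName));
        // prints http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
    }
}
```

The resulting string can then be passed to `Jsoup.connect(...)` as in the question. Note that `URLEncoder` targets form encoding and turns spaces into `+`, so file names containing spaces would need separate handling.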
