Kitz ADSL Broadband Information

Author Topic: I typed apostrophes, but Er!  (Read 1151 times)

renluop

  • Kitizen
  • ****
  • Posts: 3326
I typed apostrophes, but Er!
« on: July 15, 2018, 03:11:20 PM »

I've just posted a ticket on another support site. All the apostrophes have changed to &#039;. For instance, don't changes to don&#039;t in the ack of my ticket.

I've noticed similar oddities elsewhere with, say, quotation marks becoming Spanish question marks, i.e. front upside down, rear right way up.

I'm curious what causes those peculiarities.
 

chenks

  • Kitizen
  • ****
  • Posts: 1106
Re: I typed apostrophes, but Er!
« Reply #1 on: July 15, 2018, 03:34:02 PM »

the site is converting the ' to its ascii code (39), written in HTML as &#039;
http://ee.hawaii.edu/~tep/EE160/Book/chap4/subsection2.1.1.1.html

an apostrophe is often used to terminate strings in code, so it will be converted to stop that from happening.
what normally happens though is that the page displaying the result should reconvert that ascii back to what it should be, but in this case it's not.

renluop

  • Kitizen
  • ****
  • Posts: 3326
Re: I typed apostrophes, but Er!
« Reply #2 on: July 15, 2018, 06:08:38 PM »

Thanks! It's funny that it was an Antivirus application support site throwing up the oddity. Amusing!

DaveC

  • Reg Member
  • ***
  • Posts: 197
Re: I typed apostrophes, but Er!
« Reply #3 on: July 15, 2018, 06:20:37 PM »

It's clearly a bug in the site - the site is "HTML-escaping" the text twice.  This is better than not escaping at all, which introduces what are known as "cross-site scripting" (XSS) bugs, but is a bug nonetheless.

Put briefly, in an HTML page, certain characters (<, >, &, ', " and others) have a special meaning.  If you want your HTML page to actually display one of those characters, you need to "escape" it, so the browser knows that it is a character to be displayed, and not part of the markup.

This escaping takes the form of "&something;".  An & itself is escaped as "&amp;", and as you've seen, a single quote (or apostrophe) is escaped as "&#039;" which the browser will understand and display as an apostrophe.

However, if "&#039;" is then escaped again, the HTML will contain "&amp;#039;", which the browser will display as "&#039;", and not as a single quote/apostrophe.
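The double escaping described above can be reproduced in a few lines of Python. This is only a sketch: `escape_html` is a hypothetical helper written to emit the same "&#039;" form seen in the ticket (the support site's real code is unknown), and `html.unescape` from the standard library stands in for what the browser does when it displays the page.

```python
import html


def escape_html(text: str) -> str:
    """Hypothetical escaper that emits &#039; for an apostrophe."""
    # The ampersand must be replaced first, otherwise the "&" in the
    # entities produced below would itself be escaped.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#039;")
    )


original = "don't"
once = escape_html(original)   # correct: escaped a single time
twice = escape_html(once)      # the bug: escaped a second time

# html.unescape decodes the entities once, as a browser does on display.
shown_once = html.unescape(once)
shown_twice = html.unescape(twice)

print(once, "->", shown_once)
print(twice, "->", shown_twice)
```

Escaped once, the text comes back as don't. Escaped twice, the reader sees the entity itself, which matches what happened in the ticket.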


Weaver

  • Senior Kitizen
  • ******
  • Posts: 11459
  • Retd s/w dev; A&A; 4x7km ADSL2 lines; Firebrick
Re: I typed apostrophes, but Er!
« Reply #4 on: July 15, 2018, 07:02:14 PM »

DaveC said it perfectly. The confused developers did not bother to keep track of whether they had already performed the required encoding, so they ended up doing it twice and mangled the text. They should have stuck a label on the relevant item which travels with it, to mark whether this vital encoding 'has not been done yet' vs 'has already been done'.
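One way to make such a label travel with the text is to give escaped text its own type. The sketch below is an illustration only: `SafeHtml` and `escape_once` are made-up names, and it uses Python's standard `html.escape`, which happens to write the apostrophe as "&#x27;" rather than "&#039;".

```python
import html


class SafeHtml(str):
    """Marker type: a string that has already been HTML-escaped."""


def escape_once(value: str) -> SafeHtml:
    """Escape plain text; pass already-escaped text through untouched."""
    if isinstance(value, SafeHtml):
        # The label says the encoding has been done, so do nothing.
        return value
    return SafeHtml(html.escape(value))


raw = "don't"
first = escape_once(raw)      # plain str in, labelled SafeHtml out
second = escape_once(first)   # label detected, no second escaping

print(first)
print(second)
```

A caveat with this simple version: ordinary string operations such as concatenation return a plain `str`, so the label is lost unless the class overrides them too.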

The upside down question marks, I’m not sure, but that kind of codswallop is sometimes seen when Unicode characters, such as the attractive curly 6- and 9-style non-ASCII single quote marks, curly double quote marks and umpteen others, are encoded as multiple bytes in UTF-8. UTF-8 is a variable-length encoding: Unicode characters that belong to the good old ASCII set are represented as a single byte, but other characters, including modern punctuation and non-English writing systems, need a string of several bytes. If software does not realise, or does not remember, that UTF-8 is in use, it will interpret the stream of bytes as multiple garbage characters, each displayed character being one of the bytes that in fact together make up a single UTF-8 character. If that is what is going on, then it is nothing to do with HTML, just more confused software not keeping track of the format things are in.
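The multi-byte mix-up can be shown with a short Python sketch. It assumes the confused software decodes the bytes as Windows-1252 (`cp1252`), which is a common culprit but only a guess here. It shows the general effect on a curly apostrophe and does not claim to explain the upside down question marks specifically.

```python
text = "don’t"  # uses the curly apostrophe U+2019, not the ASCII one

# ASCII characters take one byte each in UTF-8; the curly quote takes three.
ascii_len = len("'".encode("utf-8"))
curly_len = len("’".encode("utf-8"))

raw_bytes = text.encode("utf-8")

# Software that forgets the bytes are UTF-8 and reads them as cp1252
# turns each of the three bytes into its own garbage character.
garbled = raw_bytes.decode("cp1252")

# Decoding with the right encoding recovers the original text.
restored = raw_bytes.decode("utf-8")

print(ascii_len, curly_len)
print(garbled)
print(restored)
```

Here the single curly apostrophe shows up as the three characters "â€™", one per byte.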