Post by mystifier » Thu Nov 04, 2010 11:28 pm

I have hit the boundary of my knowledge again (which isn't hard :-[ ).

All descriptions etc. that are maintained in Admin via the HTML editor appear in the database as though htmlspecialchars (which I only found out about this morning, before I sound too clever) has been called, although it never is.

For example:
<p><strong>Product</strong></p>
is stored as:
&lt;p&gt;&lt;strong&gt;Product&lt;/strong&gt;&lt;/p&gt;

Is saving as HTML or Special Characters an editor option?
Would I have a problem if I ran a query to replace all '&lt;' with '<' and '&gt;' with '>' in all description fields?
Is it possible to make HTML data save as raw HTML tags?
or can anyone plug the gap in my ignorance as to what I should be doing?!

The object is to manage HTML content from a Back-Office system through ODBC.

Free v1.4.9 Extensions: Default Specials | Improved Search | Customer Activity Report | Customer Groups | Royal Mail With Handling | Improved Product Page | Random Products | Stock Report | All Products



Post by JAY6390 » Thu Nov 04, 2010 11:55 pm

I wouldn't run a query to change the data, no; that would lead to all sorts of problems. The issue could be overcome in one of two ways: either set yourself up an API to get the descriptions with them decoded, or decode them with your back office system. You could in theory change the editor, but that would mean you would have to update everything that uses the editor.
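
For the second option (decoding on the consuming side), a minimal PHP sketch might look like the following. The helper name decode_description() is made up for illustration, and ENT_COMPAT is an assumption about how the data was encoded; a back-office system written in another language would need its own equivalent.

```php
<?php
// Hypothetical helper (not part of OpenCart): turn a stored,
// entity-encoded description back into raw HTML.
function decode_description(string $stored): string
{
    // ENT_COMPAT is an assumption: it decodes &amp; &lt; &gt; &quot;
    // and leaves single-quote entities alone.
    return htmlspecialchars_decode($stored, ENT_COMPAT);
}

$stored = '&lt;p&gt;&lt;strong&gt;Product&lt;/strong&gt;&lt;/p&gt;';
echo decode_description($stored), PHP_EOL;
// <p><strong>Product</strong></p>
```

Because the decoding happens on read, the stored data and everything else that relies on it stay untouched.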


Post by Qphoria » Fri Nov 05, 2010 12:02 am

Back in 0.x, I believe JNeuhoff led the resistance against storing the descriptions in specialchars and preferred to store the actual html tags in the database. I also agreed that it made more sense and we made this change in 0.7.9. It looks like that change was never adopted into early 1.x so it never happened here. It would make much more sense to store the data in a proper format so that multiple consumers could reuse the data directly from the db in a standardized format without having to encode/decode.

But I too get confused by the way POSTing stuff gets passed around and automatically gets encoded. In this case, if you trace the post with liveheader monitoring, you can see that the webpage does indeed post the html for the description field:

[Image: screenshot of the header trace showing the posted description field]

But then when the controller has it in the $_POST field, the HTTP protocol or PHP engine appears to have encoded it automatically:
[Image: screenshot of the description value as it appears in $_POST]

And that is simply passed as-is to the model function to save.

You could run htmlspecialchars_decode on the data right before it is inserted into the db:

1. EDIT: admin/model/catalog/product.php

2. FIND (2 INSTANCES):

Code: Select all

description = '" . $this->db->escape($value['description'])

3. REPLACE BOTH WITH:

Code: Select all

description = '" . $this->db->escape(htmlspecialchars_decode($value['description']))

That will save it to the database in proper html format.

Interestingly, I figured since I decoded it on the way in, I would need to re-encode it on the way out, otherwise it would display funny.. but again some magic appears to be working that it doesn't try to double decode. So with that above change alone, that might be all that is needed. Maybe I've overlooked something but it seems to be working correctly with just that change AND it is stored nicely in the db.
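
One possible explanation for the "magic", assuming the storefront and the admin form run html_entity_decode() on the description before printing it (the thread does not confirm this, so check your own controllers): decoding is effectively a no-op on HTML that is already raw, so encoded, raw, and mixed values all end up the same. A small sketch with made-up sample strings:

```php
<?php
// Three ways the same description might be stored (sample data).
$encoded = '&lt;p&gt;&lt;strong&gt;Product&lt;/strong&gt;&lt;/p&gt;';
$raw     = '<p><strong>Product</strong></p>';
$mixed   = '&lt;p&gt;<strong>Product</strong>&lt;/p&gt;';

// Assumed output step: decode entities right before display.
$fromEncoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
$fromRaw     = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');
$fromMixed   = html_entity_decode($mixed, ENT_QUOTES, 'UTF-8');

var_dump($fromEncoded === $raw); // bool(true)
var_dump($fromRaw === $raw);     // bool(true)
var_dump($fromMixed === $raw);   // bool(true)
```

The catch is content that contains intentional entities: a raw description holding the text &amp;lt; would lose one level of encoding on every extra decode.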


Post by mystifier » Fri Nov 05, 2010 12:56 am

Very confusing!

I just added a product and used htmlspecialchars_decode (two new keywords in one day!) so the data is stored as html. When it comes to viewing the product on the website or loading them into ckeditor, neither seems to care less whether it is stored as html or special characters.

If something was stored as specialcharacters, I would have expected products to show on the website with literal tags (eg. '<b><u>Product Name</u></b>').

Reverting, when I edit an (html) product in ckeditor and save it, it is special characters again but the website doesn't care either way.

I need to lie down for a bit.


Post by Qphoria » Fri Nov 05, 2010 1:36 am

mystifier wrote:Very confusing!

I'm just as stumped as you in this case... the internets still hold some secret magic that even I don't understand... but perhaps this is a gift horse... and I'm not going to look it in the mouth (wow our societal phrases really make no sense)


Post by mystifier » Fri Nov 05, 2010 9:21 pm

Fields can even be a complete mixture of html and special characters, for example:
<p>text</p>

It's useful but crazy; I just don't get it ?!!?


Post by Qphoria » Fri Nov 05, 2010 10:31 pm

It's magic!


Post by affect » Wed Aug 17, 2011 6:39 pm

Just stumbled upon this topic while trying to figure out why my html tags get into the db encoded. After an hour of banging my head against a wall, I found out that all the encoding happens in the request class constructor.

Code: Select all

$_GET = $this->clean($_GET);
$_POST = $this->clean($_POST);
$_REQUEST = $this->clean($_REQUEST);
$_COOKIE = $this->clean($_COOKIE);
$_FILES = $this->clean($_FILES);
$_SERVER = $this->clean($_SERVER);

Code: Select all

	public function clean($data) {
		if (is_array($data)) {
			// Recurse into arrays, cleaning both keys and values
			foreach ($data as $key => $value) {
				unset($data[$key]);

				$data[$this->clean($key)] = $this->clean($value);
			}
		} else {
			// Encode &, <, > and double quotes as HTML entities
			$data = htmlspecialchars($data, ENT_COMPAT);
		}

		return $data;
	}
So my question is: what's the point? Shouldn't it be up to the developer to choose a way to deal with request data and securing it if needed?

Automatic encoding really confused me here and, if I remember correctly, good practice is to store the data as it originally was. If requests get encoded, storing/fetching gets tricky, and if different ways of inserting/modifying the data are used, one can end up with a mess of encoded/unencoded characters like the last poster here.
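
To see the effect in isolation, here is a standalone sketch of the same clean() logic as a plain function, run against a made-up array shaped roughly like a posted product form (the key names are illustrative, not taken from a real request):

```php
<?php
// Standalone version of the clean() logic quoted above.
function clean($data)
{
    if (is_array($data)) {
        foreach ($data as $key => $value) {
            unset($data[$key]);
            // Keys are encoded too, then the value is cleaned recursively.
            $data[clean($key)] = clean($value);
        }
    } else {
        $data = htmlspecialchars($data, ENT_COMPAT);
    }

    return $data;
}

// Hypothetical stand-in for $_POST.
$post = array(
    'product_description' => array(
        1 => array('description' => '<p><strong>Product</strong></p>'),
    ),
);

$cleaned = clean($post);
echo $cleaned['product_description'][1]['description'], PHP_EOL;
// &lt;p&gt;&lt;strong&gt;Product&lt;/strong&gt;&lt;/p&gt;

// Reversing it needs a matching decode with the same flag.
echo htmlspecialchars_decode($cleaned['product_description'][1]['description'], ENT_COMPAT), PHP_EOL;
// <p><strong>Product</strong></p>
```

So by the time a controller or model sees the description, it has already been encoded once, whatever the browser actually posted.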

MultiMerch Marketplace for OpenCart



Post by SapporoGuy » Thu Aug 18, 2011 3:54 am

the last poster here
Hmmm, wouldn't that be you? OR are you referring to ! :laugh:

Ok, jokes aside, looks like somebody who coded that in had an idea of securing data being entered into the db through the catalog side. Then forgot about that class and then had to deal with the situation since it was a mystery.

Anyway, what would be the best way to clean up the mess?
Move everything to the class or weed out the base controllers?

And no, I'm not particularly interested in leaving it up to the end user since most will have no clue about securing this.
But if this was in the class, then you could have switches to pass to allow you to change on the fly like the SSL links do.

IMHO, CFKEditer is F!keditor ... I bet there is a bunch of cleaning or what not going on inside that huge baggage code.

930sc ... because it is fun!



Post by maksfeltrin » Sat Jul 28, 2012 8:10 am

Having to build an import/export from/to CSV for a customer, I stumbled across this problem too.

I tried to modify the db methods escape() and query() (walking through $query->rows and $query->row) so as not to have htmlspecialchars-encoded data in the db tables. The problem is that we also need to know which fields are serialized (they are not only in the setting table).

The proper approach to the problem is:

1. never store htmlspecialchars-encoded data in the db, just db/sql-escaped data;
2. escape data in html output and form field values, not the get/post data itself.
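
A minimal sketch of point 2, assuming a small output helper (the name e() is made up here and is not an OpenCart function). The raw value is what would sit in the db; escaping happens only at the moment it is written into HTML:

```php
<?php
// Hypothetical output-escaping helper, applied at render time only.
function e(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Raw value, exactly as it would be stored (SQL escaping is a separate
// concern, handled by the db layer when the query is built).
$name = 'Tom & Jerry\'s "Deluxe" <Set>';

// Escaped for element content.
echo '<h1>' . e($name) . '</h1>', PHP_EOL;
// <h1>Tom &amp; Jerry&#039;s &quot;Deluxe&quot; &lt;Set&gt;</h1>

// Escaped for a form field value.
echo '<input type="text" name="name" value="' . e($name) . '" />', PHP_EOL;
```

Rich-text fields such as product descriptions are a different case: they are meant to render as HTML, so they would be output unescaped and trusted (or sanitised) rather than passed through a helper like this.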

That would mean minor changes in the request class (getting rid of the clean() method), but would also mean changing all controller classes in both the catalog and admin applications (because OC does not use escaping functions in the tpl files, and the model classes are not the proper place to do that).

Anyway, it's a change that should be done along with reverting all tables to utf8_general_ci ....

As for fckeditor, I noticed that values are double-escaped, like & => &amp;

Maks Feltrin


Post by straightlight » Sat Jul 28, 2012 8:44 am

That would mean minor changes in the request class (getting rid of the clean() method), but would also mean changing all controllers classes in both catalog and admin applications (because OC does not use escaping functions in tpl and the model classes is not the proper place to do that).
Or simply use the JSON POST calls, which have already been improved since the v1.5x releases. One example is the commission value assigned to orders in the admin when adding or removing values.

Dedication and passion goes to those who are able to push and merge a project.

Regards,
Straightlight
Programmer / Opencart Tester



Post by musketeerz » Sat May 06, 2017 2:25 am

How do I remove these characters?

&amp;lt;div class=&amp;quot;cpt_product_description &amp;quot;&amp;gt;\r\n\t&amp;lt;div&amp;gt;\r\n\t\t&amp;lt;p&amp;gt;&amp;lt;span style=&amp;quot;color: rgb(153, 153, 153); font-family: Lato; font-size: 14px; font-weight: bold;&amp;quot;&amp;gt;
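
The sample above looks double-encoded (&amp;lt; instead of &lt;). One way to undo that, sketched under the assumption that no entity in the text is meant to stay literal, is to decode repeatedly until the string stops changing. The helper name decode_fully() and the round limit are made up for illustration:

```php
<?php
// Hypothetical helper: decode entities until the text is stable,
// with a small cap on rounds as a safety net.
function decode_fully(string $text, int $maxRounds = 5): string
{
    for ($i = 0; $i < $maxRounds; $i++) {
        $decoded = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
        if ($decoded === $text) {
            break; // nothing left to decode
        }
        $text = $decoded;
    }

    return $text;
}

$stored = '&amp;lt;div class=&amp;quot;cpt_product_description &amp;quot;&amp;gt;';
echo decode_fully($stored), PHP_EOL;
// <div class="cpt_product_description ">
```

It is worth running something like this on a copy of the data first, since it also decodes any entities that were typed on purpose.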
