HTML Encoding
Posted: Tue Feb 27, 2018 5:16 am
So I'm new to OpenCart and I'm really happy with the platform. It supports almost everything I need it to do, and anything it doesn't is easy to implement using custom PHP.
However, I've found something that surprised me. When I add products (or any data really) that contain certain characters (such as double quotes), these are HTML encoded when stored in the database. Now, I know why. Keeping them in their original state would break the Edit Form when you edit the product again: the unescaped value would break out of the `value` attribute of the `input` element, so only a portion of the original value would be shown. There are better ways of dealing with this though.
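To illustrate what I mean, here is a minimal sketch of encoding at output time instead. It assumes the value is stored raw in the database; `escape_attr` is a hypothetical helper of mine, not something OpenCart provides, and the product name is made up.

```php
<?php
// Hypothetical helper (not part of OpenCart): escape a raw value
// for safe use inside an HTML attribute at render time.
function escape_attr(string $raw): string
{
    // ENT_QUOTES encodes both double and single quotes.
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Value as it would sit in the database: unencoded.
$name = '24" Monitor';

// Encode only at the point where the value enters HTML.
$field = '<input type="text" name="name" value="' . escape_attr($name) . '">';

echo $field, PHP_EOL;
// <input type="text" name="name" value="24&quot; Monitor">
```

The edit form still renders correctly, and the stored value stays exactly as it was typed.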
I'm a developer myself, and one of the first lessons I was taught is never to encode data when it's being inserted. Always encode when it's being output instead. (This doesn't mean you shouldn't validate and clean data before it goes in, but that's a different matter.) The database should be the source of truth. By HTML encoding the data, the application assumes that the data will only ever be displayed in an HTML context, which isn't always true. At some point I want to develop an API for my OpenCart instance and supply the data as JSON strings, which could be used for anything. But if I have to HTML decode everything just to get it back to its original form, that leaves room for issues to occur and forces me to perform a task that really shouldn't be necessary.
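As a rough sketch of the JSON case, compare a row stored HTML encoded with the same row stored raw. Both arrays are hypothetical stand-ins for a product record, not OpenCart's actual schema.

```php
<?php
// Hypothetical product rows: one HTML encoded on insert, one stored raw.
$encodedRow = ['name' => '24&quot; Monitor'];
$rawRow     = ['name' => '24" Monitor'];

// With encoded storage, every non-HTML consumer must decode first.
$decoded = html_entity_decode($encodedRow['name'], ENT_QUOTES, 'UTF-8');

// With raw storage, json_encode applies the escaping JSON needs
// and nothing else.
$json = json_encode($rawRow);

echo $json, PHP_EOL;
// {"name":"24\" Monitor"}
```

The decode step in the middle is the extra task I'd rather not have to do, and it has to be remembered in every place the data leaves the database for something other than HTML.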
So, why was the decision made to encode data going into the database?