problem with foreign languages

Posted: Wed Dec 02, 2009 3:10 am
by loadaverage
hi there,

there is a problem with opencart when dealing with foreign languages.
try entering "ááááááááááá" as a category name for instance.
it will complain about a too long string.

the problem is basically in system/library/request.php:clean()

$data = htmlentities($data, ENT_QUOTES, 'UTF-8');
this converts all the accented characters into html entities,
each taking up several characters (for example, á becomes the
8-character &aacute;), which also defeats the purpose of using utf8.

the solution would be to replace htmlentities() with htmlspecialchars().
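
As a rough illustration of the difference (a minimal sketch, not code from OpenCart itself), the snippet below runs the category name from the example above through both functions. The variable names are made up for the illustration.

```php
<?php
// Eleven copies of "á", written as raw UTF-8 bytes so the file encoding does not matter.
$name = str_repeat("\xC3\xA1", 11);

// htmlentities() rewrites every accented character as a named entity.
$asEntities = htmlentities($name, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() only touches &, <, >, " and ' and leaves "á" alone.
$asSpecial = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo substr($asEntities, 0, 8), "\n";   // &aacute;
echo strlen($asEntities), "\n";         // 88 (11 entities of 8 characters each)
var_dump($asSpecial === $name);         // bool(true)
```

The 11-character name grows to 88 characters once it is entity-encoded, which is how a short name can trip a length check.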

i also posted this as
http://code.google.com/p/opencart/issues/detail?id=111

thanks for looking into this.

Re: problem with foreign languages

Posted: Mon Dec 07, 2009 4:57 am
by loadaverage
the htmlentities() is causing another problem: all the mail
that is sent out has these entities encoded into base64.
when the email client shows the message, it is showing
the entities instead of the rendered characters...

it is either necessary to do a html entities decoding everywhere
that text is touched, or (what i prefer) just replace htmlentities()
with htmlspecialchars().
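
For reference, the first option (decoding before the text goes into a mail) could look roughly like the sketch below. decode_for_mail() is a hypothetical helper, not an OpenCart function, and the sample string is invented.

```php
<?php
// Hypothetical helper: turn entity-encoded text back into plain UTF-8
// before it is placed in a plain-text mail body.
function decode_for_mail($text) {
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

// Invented sample of text as it would be stored after htmlentities().
$stored = 'Sm&ouml;rg&aring;sbord &amp; kaffe';
$plain = decode_for_mail($stored);

echo $plain, "\n"; // Smörgåsbord & kaffe
```

The drawback is the one described above: every place that sends or exports text would need the same decoding step.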

Daniel, could you please elaborate on your decision to use
html entities? i see no rational reason for it (especially
with non-english shops).

Re: problem with foreign languages

Posted: Mon Dec 07, 2009 5:19 am
by yaxo
This is a really big problem in Sweden too, thank you for this.

Re: problem with foreign languages

Posted: Mon Jan 04, 2010 10:16 pm
by axxies
I agree (I am from Sweden too). It appears that texts shown on the web pages need to be separated from those sent out in mails.

I am using 1.3.4.

Re: problem with foreign languages

Posted: Fri Jan 15, 2010 10:02 pm
by Planck
Hello,

Same problem with Greek language. Is there a workaround for this?

Re: problem with foreign languages

Posted: Fri Jan 15, 2010 10:52 pm
by i2Paq
Which version?

Re: problem with foreign languages

Posted: Sat Jan 16, 2010 10:02 pm
by Miguelito
I'm using version 1.4. Outgoing mail shows up fine in web-based email, but in Outlook all Scandinavian letters (ä, å, ö and so on) are turned into something else. I also noticed the same problem as the Swedish guys (I'm from Finland): when you use Scandinavian characters, for example in product names, and press Save, you get a warning that the product name should be max 32 characters. This is because each Scandinavian character (ä, ö, å and so on) is converted to an HTML entity that takes six or seven characters...

Re: problem with foreign languages

Posted: Sun Jan 17, 2010 6:25 am
by Daniel
the solution is to count the characters after they have been converted back.
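
A minimal sketch of that idea, assuming the stored value was produced by htmlentities(). decoded_length() is a hypothetical helper, and the 32-character limit is the product name limit mentioned earlier in the thread.

```php
<?php
// Hypothetical helper: count the characters a shopper would actually see.
function decoded_length($stored) {
    $decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
    // The /u modifier makes "." match one UTF-8 character, not one byte.
    return preg_match_all('/./us', $decoded);
}

// Eleven copies of "á" (raw UTF-8 bytes), entity-encoded as they would be stored.
$stored = htmlentities(str_repeat("\xC3\xA1", 11), ENT_QUOTES, 'UTF-8');

echo strlen($stored), "\n";          // 88, which would fail a 32-character limit
echo decoded_length($stored), "\n";  // 11, which passes it
```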

Re: problem with foreign languages

Posted: Sun Jan 17, 2010 6:09 pm
by Miguelito
So is there no way to use ISO-8859-15 rather than UTF-8?

Re: problem with foreign languages

Posted: Mon Jan 18, 2010 3:59 pm
by OSWorX
Daniel wrote: the solution is to count the characters after they have been converted back.
But does this solve the main problem?

Why not store the values as they are?
Now every output MUST be converted back.
Besides this, every length-limited field in the DB becomes useless (see the title, max. 32 chars).
And the DB fills up with characters that serve no purpose.
It will also grow faster than normal, because each value will be 2-4 times larger than it needs to be when a language other than English is used.

Re: problem with foreign languages

Posted: Mon Jan 18, 2010 8:14 pm
by Daniel
the problem is if hackers try to insert some html or javascript. you want to get rid of things like <script>.

Only certain fields require html such as product description fields.

Re: problem with foreign languages

Posted: Mon Jan 18, 2010 9:24 pm
by OSWorX
Daniel wrote: the problem is if hackers try to insert some html or javascript. you want to get rid of things like <script>.

Only certain fields require html such as product description fields.
But then there are better ways to solve this 'kiddie' problem ...

Re: problem with foreign languages

Posted: Tue Jan 19, 2010 3:49 am
by loadaverage
Daniel wrote: the problem is if hackers try to insert some html or javascript. you want to get rid of things like <script>.
it has been pointed out to you numerous times that

php.net/htmlspecialchars

is enough for this filtering.
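
A small sketch of what that looks like in practice (the input strings are invented for the illustration):

```php
<?php
// Markup submitted by a user is neutralised: the tag characters are escaped.
$input = '<script>alert("x")</script>';
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;

// An accented name (raw UTF-8 bytes for "Café") passes through unchanged.
$name = "Caf\xC3\xA9";
var_dump(htmlspecialchars($name, ENT_QUOTES, 'UTF-8') === $name); // bool(true)
```

This covers text that ends up in HTML element content or quoted attribute values; other output contexts, such as URLs or inline JavaScript, need their own escaping.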

Re: problem with foreign languages

Posted: Wed Jan 20, 2010 5:48 am
by Daniel
i'm not sure htmlspecialchars can stop hacking attempts.

So you just want me to do this:

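public function clean($data) {
    if (is_array($data)) {
        // Clean both the keys and the values of nested arrays.
        foreach ($data as $key => $value) {
            unset($data[$key]);

            $data[$this->clean($key)] = $this->clean($value);
        }
    } else {
        // Escape only &, <, >, " and '; accented characters stay as UTF-8.
        $data = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
    }

    return $data;
}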
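
To try the method above outside the request class, a standalone variant might look like the sketch below. clean_input() and the sample $post array are hypothetical; unlike the method, this version builds a new array instead of modifying $data inside the loop.

```php
<?php
// Hypothetical standalone version of the method above.
function clean_input($data) {
    if (is_array($data)) {
        $cleaned = array();
        foreach ($data as $key => $value) {
            // Keys and values are both escaped, recursing into nested arrays.
            $cleaned[clean_input($key)] = clean_input($value);
        }
        return $cleaned;
    }
    return htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
}

// Invented form data: "Café <b>" (raw UTF-8 bytes) plus a nested array.
$post = array('name' => "Caf\xC3\xA9 <b>", 'tags' => array('<i>', 'ok'));
print_r(clean_input($post));
```

The accented character in the name is left as is, while the angle brackets come back as &lt; and &gt;.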

Re: problem with foreign languages

Posted: Wed Jan 20, 2010 5:55 am
by Miguelito
Daniel, is your post the fix for this problem? I could go to testing mode if it is...

Re: problem with foreign languages

Posted: Wed Jan 20, 2010 6:35 am
by Daniel
Yes that is the fix. just find the request class and replace the clean method with this one.

tell me if the problem is still there.

Re: problem with foreign languages

Posted: Wed Jan 20, 2010 6:52 am
by Miguelito
Daniel, tested with a few product categories and it worked perfectly :D

Hopefully other users also contribute by testing this fix... so it could be implemented in the next release.

Re: problem with foreign languages

Posted: Sat Jan 23, 2010 5:45 am
by OSWorX
Daniel wrote: Yes that is the fix. just find the request class and replace the clean method with this one.

tell me if the problem is still there.
That problem is gone, but the next one is here: now, if special characters, accents or umlauts are used, you can save a maximum of only 20-26 characters per title, depending on the characters!
Definitely too few!

Re: problem with foreign languages

Posted: Tue Jan 26, 2010 4:38 am
by loadaverage
Daniel wrote: i'm not sure htmlspecialchars can stop hacking attempts.
then you don't understand what these functions do...

Re: problem with foreign languages

Posted: Wed Jan 27, 2010 7:21 pm
by loadaverage
Daniel wrote: Yes that is the fix. just find the request class and replace the clean method with this one.

tell me if the problem is still there.
it is not the complete fix.
htmlentities() is all over the place.
but it's a good start.