How to configure a TYPO3 website for the Arabic language

25 September 2012

With TYPO3 we can have multiple languages in the backend, and editors can use the backend modules and add frontend content in their language of choice. An international team of editors can edit and add content in many languages without problems. As the TYPO3 community grows, so does the number of language packs.

One of the available language packs is Arabic, which has a few interesting aspects worth discussing. The main challenge is that Arabic text runs right to left (RTL) instead of left to right (LTR).

Now let’s see how we can configure TYPO3 to allow this kind of content setup.

Online tutorials often suggest, as a solution for installing a multilanguage site, an extension called csh_XX, where XX is an abbreviation of the language name. These methods are now deprecated and there are better solutions.

TYPO3 has an advanced module for configuring backend languages, available in the “Extension Manager”. In TYPO3 4.5 and older the section is called “Translation Handling”; starting with TYPO3 4.6 it is named “Language Pack”. There are more than 50 language packs and the number is growing.

BACKEND

Installing and configuring TYPO3 with the desired language pack takes the following steps:

1. Go to “Extension Manager” → “Translation Handling” or “Language Pack”

2. Select your language pack

3. Press “Update from repository”

Each user can set their own backend language using the “User settings” module.

Installing the Arabic language pack, or any language pack, does not give a 100% translation of the backend text. The labels of the following modules are not translated completely or are translated only literally:

· Workspace

· News Admin – tt_news

· Extension Manager

· Reports

· Scheduler

· Typoscript Help

· About Modules

Also, there are modules whose content has been translated only partially, and some have no translation at all.

Modules with incomplete translation:

· Template

· News Admin

· User settings

· User Admin

· Extension Manager

· Configuration

· Log

· Indexing

· Reports

· Scheduler

Backend translations are available mostly for the modules and functionality used by editors.

FRONTEND

In order to create a website in Arabic (a non-Latin script), the encoding must be UTF-8 rather than a Latin (ANSI) encoding, because the pages will contain non-Latin characters.

How to configure UTF-8 in TYPO3

1. System files

To set the encoding to UTF-8, add the following lines of code to the localconf.php file, or set the same values using the TYPO3 Install Tool.

// For backend charset

$TYPO3_CONF_VARS['BE']['forceCharset'] = 'utf-8';

$TYPO3_CONF_VARS['SYS']['setDBinit'] = 'SET NAMES utf8;';

$TYPO3_CONF_VARS['SYS']['multiplyDBfieldSize'] = '3';

Where:

- Setting forceCharset to utf-8 makes config.renderCharset and config.metaCharset default to utf-8.

- multiplyDBfieldSize multiplies the size of the database fields when a UTF-8 encoding is used. A value of “3” is recommended for non-Latin languages such as Arabic or Chinese.

- SET NAMES utf8 is equivalent to three SQL statements:

SET character_set_client = utf8;

SET character_set_results = utf8;

SET character_set_connection = utf8;
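The reason for multiplyDBfieldSize can be illustrated with a small stand-alone check. This is only a sketch, not TYPO3 code: the sample word and variable names are made up for illustration, and it assumes the mbstring extension is available and the file is saved as UTF-8.

```php
<?php
// Illustrative only: compare the character count and the UTF-8 byte count
// of an Arabic word. Assumes mbstring is loaded and this file is UTF-8.
$word = 'مرحبا'; // hypothetical sample text, five Arabic letters

$characters = mb_strlen($word, 'UTF-8'); // number of characters
$bytes      = strlen($word);             // number of bytes

// Each Arabic letter takes two bytes in UTF-8, so a field sized for
// single-byte Latin text fills up faster than the character count suggests.
printf("%d characters, %d bytes\n", $characters, $bytes);
```

Characters from other scripts, such as Chinese, typically take three bytes each in UTF-8, which fits the recommended multiplier of 3.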

2. Apache

Apache can be configured to specify the charset that the browser should use when displaying a page. There are 2 places to set the directive shown below:

· .htaccess – safer.

· httpd.conf – faster.

AddDefaultCharset utf-8

3. PHP

In the php.ini file, add the following line so that all stand-alone scripts use the utf-8 charset.

default_charset = "utf-8"

4. TypoScript

Some TypoScript configuration has to be done in order to use a non-Latin language for pages by default.

config.language = ar

config.locale_all = ar_AR.utf-8

config.htmlTag_langKey = ar-AR

[browser = msie]

config.htmlTag_setParams = xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xml:lang="ar"

[global]

LIMITATIONS

There are some limitations when displaying text in a non-Latin language on a page using TYPO3.

1. Right to left direction

First of all, each paragraph of text needs its direction set from right to left using the HTML dir attribute (dir="rtl"), where RTL means Right To Left.

In TYPO3, the built-in RTE (rtehtmlarea) lets editors set the direction of text visually. This option is available only when the RTE is configured to display the corresponding buttons; in “Demo” mode, for example, they are displayed. The RTE can be set to “Demo” mode from the “Extension Manager”, in the extension configuration section.

Result: Text direction buttons (LTR and RTL)


Also, if you want to set the “dir” attribute on the <html> tag, you have to add an RTL configuration using TypoScript in the root page template.

config.htmlTag_dir = rtl

2. Text alignment

By default, text in HTML is aligned to the left and flows from left to right. Without any direction setting, the second line of a two-line text is not aligned properly for reading from right to left, which is why the rtl attribute needs to be set.

3. RealURL

Using a non-Latin language in TYPO3 brings some limitations when working with RealURL. By default, pages whose titles contain non-Latin text cannot be reached through speaking URLs, because the URL segments can contain only Latin characters. There are two ways to work around this.

a) enableAllUnicodeLetters

Usually, RealURL converts non-ASCII characters to their ASCII equivalents, allowing only A-Z, a-z, 0-9 and the minus sign in a URL segment. That causes problems for non-Latin languages. This option makes RealURL encode the incompatible characters as URL entities instead of ignoring them completely. It usually slows RealURL down, so using RealURL’s caching is recommended to improve performance.

Setting enableAllUnicodeLetters to true can be done in the RealURL autoconf file, in the “init” array:
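To show what encoding characters as URL entities looks like, here is a minimal sketch in plain PHP. It is not RealURL’s own code, and RealURL may process titles differently internally; the page title and variable names are hypothetical, and the file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative only: percent-encode a non-Latin title for use in a URL.
// This uses PHP's built-in rawurlencode(), not RealURL internals.
$title = 'مرحبا'; // hypothetical Arabic page title

$segment = rawurlencode($title);    // every UTF-8 byte becomes %XX
$decoded = rawurldecode($segment);  // decoding restores the original text

echo $segment, "\n";
echo ($decoded === $title) ? "round trip ok\n" : "round trip failed\n";
```

Each Arabic letter turns into two %XX groups, so encoded segments are considerably longer than the visible title.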

PATH : ~/typo3conf/realurl_autoconf.php

$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['realurl'] = array (
    '_DEFAULT' =>
    array (
        'init' =>
        array (

            'enableAllUnicodeLetters' => true,

            'doNotRawUrlEncodeParameterNames' => true,

……

b) Speaking URL path segment

A much easier method is to give each page a speaking path segment for RealURL that replaces the non-Latin title text. The disadvantage is that every page needs its own speaking path, and the URL segment is no longer generated automatically from the page title.

The page title stays the same in the header and in the menu (in the non-Latin language); only the URL is changed by the speaking path.

4. Powermail

Using the powermail extension with Arabic brings some errors when, for example, you want to create a Powermail form. When you install a new instance of TYPO3, the database is automatically created with a Latin charset, not utf-8. When you create a form and add a field in Arabic, the characters are not displayed or saved properly: each character is replaced with “?”, and hitting the save button returns an error like:

“These Fields are not properly updated in database: (title) Probaly language mismatch with field Types”

For all fields to be saved properly and non-Latin text to be displayed properly, the database and all powermail tables need their character set to be utf8 and their collation to be utf8_general_ci.

ALTER DATABASE `databaseName` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

ALTER TABLE `tableName` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

The tables in our case were:

· tx_powermail_domain_model_answers

· tx_powermail_domain_model_fields

· tx_powermail_domain_model_forms

· tx_powermail_domain_model_mails

· tx_powermail_domain_model_pages
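Rather than typing the ALTER TABLE statement once per table, the statements can be generated. The following is a minimal sketch; the helper function name is hypothetical, and it only prints the SQL so it can be reviewed before being run against a backed-up database.

```php
<?php
// Illustrative only: build one ALTER TABLE statement per powermail table.
// The table names are the ones listed above; nothing is executed here.
$tables = array(
    'tx_powermail_domain_model_answers',
    'tx_powermail_domain_model_fields',
    'tx_powermail_domain_model_forms',
    'tx_powermail_domain_model_mails',
    'tx_powermail_domain_model_pages',
);

// Hypothetical helper: returns the conversion statement for one table.
function buildConvertStatement($table)
{
    return sprintf(
        'ALTER TABLE `%s` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;',
        $table
    );
}

$statements = array_map('buildConvertStatement', $tables);

// Print the statements for review.
echo implode("\n", $statements), "\n";
```

The printed statements can then be pasted into a database client such as phpMyAdmin or the mysql command line.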

CONCLUSION

In conclusion, it is not so difficult to create a website in a non-Latin language (in this case Arabic) using TYPO3. Some configuration needs to be adjusted so that text is saved properly in the database and displayed properly in the frontend. As we have seen, we only have to add some settings to the TYPO3 configuration files and configure TypoScript to have Arabic for the frontend pages.

It is not mandatory to set Apache and PHP to utf-8. A developer will most likely not have access to the core configuration files of Apache and PHP, and it is enough to configure the database as utf-8 from the Install Tool (or by editing the config file) and the frontend from TypoScript.

Of course, there are some limitations and problems with some extensions, but these can be fixed as we described for RealURL and Powermail.

The conclusion is: yes, it is not hard to develop a website in Arabic or any other non-Latin language that requires right to left writing. There are resources and solutions for the problems that appear in these situations, and the TYPO3 Enterprise CMS can be used universally by people from all over the world.