Defining Awesome

Written by MM. Posted at 10:45 am on October 18th, 2010

I hit the first roadblock when porting. Apparently the wide character type (wchar_t) is 4 bytes on Linux instead of 2 like on Windows. Do these crazy Linux ppl really think there’s a language with 2.1 billion characters?
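The size difference is easy to check at compile time. Here is a minimal C++ sketch, assuming a C++11 compiler; the names `wchar_bits`, `kBitsForUnicode`, and `wchar_holds_any_code_point` are made up for illustration. It compares the width of wchar_t against the 21 bits needed for Unicode's highest code point, U+10FFFF.

```cpp
#include <climits>
#include <cstddef>

// Width of wchar_t in bits on the compiler building this file.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Unicode's highest code point is U+10FFFF, which needs 21 bits.
constexpr std::size_t kBitsForUnicode = 21;

// True when a single wchar_t is wide enough for any Unicode code point.
constexpr bool wchar_holds_any_code_point() {
    return wchar_bits() >= kBitsForUnicode;
}
```

With a 2-byte wchar_t the check comes out false, and with a 4-byte wchar_t it comes out true, so the extra width is probably about fitting all of Unicode into one unit rather than about 2.1 billion characters.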

Categories: Development log.

8 comments.

programmer.laik

October 18th, 2010

OFFTOPIC
MM, I’m waiting for your article(s) about meditation!
Alink

October 18th, 2010

The problem is that you need more than 65,536 characters (all that 2 bytes can address) to render all languages, especially the Asian ones. See Unicode planes.

I recently had the inverse problem when porting from Linux to Windows. The code was about cleanly handling word wrapping for CJK languages (where the separation of words/letters is different), and the special cases were about 4-byte characters, but the Windows wchar_t choked on them.
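A rough C++ sketch of why a 2-byte unit chokes, assuming the 4-byte characters are code points above U+FFFF: in UTF-16 each of those has to be split into a surrogate pair of two 16-bit units. The function name `to_utf16` is hypothetical, and the input is assumed to be a valid code point (no range or surrogate checks).

```cpp
#include <vector>

// Encode one Unicode code point as UTF-16 code units.
// Code points above U+FFFF become a surrogate pair (two 16-bit units).
// Assumes cp is a valid code point; nothing is validated here.
std::vector<char16_t> to_utf16(char32_t cp) {
    if (cp < 0x10000) {
        return {static_cast<char16_t>(cp)};
    }
    cp -= 0x10000;  // leaves a 20-bit value
    char16_t high = static_cast<char16_t>(0xD800 + (cp >> 10));   // top 10 bits
    char16_t low = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));  // bottom 10 bits
    return {high, low};
}
```

For example, `to_utf16(0x4E2D)` gives a single unit, while `to_utf16(0x20000)` gives the pair 0xD840, 0xDC00. Code that treats one wchar_t as one character breaks on the second case.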
Hacktank

October 18th, 2010

Let the “#define PORT_WCHAR_T unsigned short int” abstraction commence!
MM

October 19th, 2010

programmer.laik: Meditation? :) Could you remind me of the context?

Alink: So how many characters do these Asian languages have?
FinDude

October 19th, 2010

“The number of Chinese characters contained in the Kangxi dictionary is approximately 47,035, although a large number of these are rarely used variants accumulated throughout history.”

Yay for Wikipedia.

2.1 billion? Overkill, much?
m!nus

October 19th, 2010

So what happened to UTF-8?
programmer.laik

October 19th, 2010

http://mm.soldat.pl/inspirado/surviving-2008
“Meditation solves this problem, refreshes my mind and gives a thousand other benefits. Sleep isn’t so good for refreshing my mind.”
Pleeease, my masta;P
jan

October 19th, 2010

wchar_t is not portable:

“The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers.”
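One way around this, sketched here as an assumption rather than anything from the quoted source, is to keep text as UTF-8 bytes in a plain `std::string` and use the fixed-width `char32_t` (C++11) for single code points. The helper name `append_utf8` is made up, and it assumes the input is a valid code point.

```cpp
#include <string>

// Append one code point to a UTF-8 encoded string.
// Assumes cp is a valid Unicode code point (no surrogate or range checks).
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {            // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {    // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {  // 3 bytes: rest of the first plane
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                    // 4 bytes: everything above U+FFFF
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}
```

The byte layout is the same on every compiler, so the width of wchar_t stops mattering for storage. The trade-off is that one character can take anywhere from 1 to 4 bytes, so indexing by byte no longer means indexing by character.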
