Discussion:
keyboard layout - dead keys vs. combining diacritical marks
Cristian Secara
2004-02-08 14:16:32 UTC
There are many languages that use dead keys in their keyboard layouts.
Is there any language that uses combining diacritical marks instead?

For my language (Romanian) there are a few characters that, on rare
occasions, can carry two accents simultaneously. One of these has no
distinct Unicode code point, as far as I know: the letter i with circumflex
AND acute accent. It can be generated by using U+00EE followed by U+0301.

My question is whether this can be a practical alternative to dead keys,
by replacing all (working) dead keys with this style of combining
diacritical marks. Either typing style (first accent, then character vs.
first character, then accent) is new to the user and has to be learned
from scratch, so existing habits are not a factor.

Cristi
Michael (michka) Kaplan [MS]
2004-02-08 19:00:28 UTC
Generally, it is better not to mix multiple types of technologies, because
making people learn more than one way to do things in a single keyboard will
rarely be a good user experience. For any language that has made a
"decision" as to which they prefer, it is better to stick with that decision
consistently.
--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
Benjamin Riefenstahl
2004-02-08 19:02:35 UTC
Hi Cristian,
Post by Cristian Secara
There are many languages that use dead keys in their keyboard
layouts. Is there any language that uses combining diacritical
marks instead?
Those are two different concepts that in theory have no connection.
Dead-key processing is about keystrokes on the keyboard and how they
are interpreted. The Unicode concept of combining marks is about the
storage of text.

There is no problem with pressing first the <'> key and then the <a>
key on the keyboard to get <a+acute>, but then representing the result
of this in memory as the Unicode sequence <U+0061 U+0301> (a +
combining acute).
Post by Cristian Secara
For my language (Romanian) there are a few characters that, on rare
occasions, can carry two accents simultaneously. One of these
has no distinct Unicode code point, as far as I know: the letter i
with circumflex AND acute accent. It can be generated by using
U+00EE followed by U+0301.
There is also no theoretical problem about having to press <~> <'> <i>
in this order and storing the result as <U+00EE U+0301>. The question
is, does the Windows keyboard layout practically allow multiple
deadkeys and does it allow multiple Unicode characters to be produced
in one keyboard event. I don't know about the details here. But
AFAIK keyboard input processing on NT/W2K/XP is done in layout DLLs,
so it should be possible to do any kind of processing you need. Just
generate two keyboard events, if that is necessary.


benny
Michael (michka) Kaplan [MS]
2004-02-08 20:00:35 UTC
Post by Benjamin Riefenstahl
Those are two different concepts that in theory have no connection.
Dead-key processing is about keystrokes on the keyboard and how they
are interpreted. The Unicode concept of combining marks is about the
storage of text.
Actually, when one is looking at Windows keyboard layouts, they are
definitely connected: one is the "multiple keystrokes make a single code
point" solution and the other is the "one keystroke makes multiple code
points" solution. In terms of MSKLC and keyboard creation, they are two
possible solutions to a particular problem.
Post by Benjamin Riefenstahl
There is no problem with pressing first the <'> key and then the <a>
key on the keyboard to get <a+acute>, but then representing the result
of this in memory as the Unicode sequence <U+0061 U+0301> (a +
combining acute).
Actually, that is a problem -- because the keyboard layout will not expect
the dead key first in that case.
Post by Benjamin Riefenstahl
Post by Cristian Secara
For my language (Romanian) there are a few characters that, on rare
occasions, can carry two accents simultaneously. One of these
has no distinct Unicode code point, as far as I know: the letter i
with circumflex AND acute accent. It can be generated by using
U+00EE followed by U+0301.
There is also no theoretical problem about having to press <~> <'> <i>
in this order and storing the result as <U+00EE U+0301>.
Well, the theoretical problem is not so theoretical, since it is very
difficult to do in practice: it would require the chaining of dead keys,
and in the end one would still have to resolve to ONE code point.
Post by Benjamin Riefenstahl
The question
is, does the Windows keyboard layout practically allow multiple
deadkeys and does it allow multiple Unicode characters to be produced
in one keyboard event. I don't know about the details here.
This is in no way easy to do. If you look at the structures in question in
kbd.h from the DDK, it becomes obvious that the resulting character from a
dead key combination can only be a single UTF-16 code unit.
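As a rough illustration of that constraint, here is a toy Python model of the
two table shapes. It is only a sketch: the dictionaries, the key name and the
helper functions are made up for this example and are not the real kbd.h
structures. The point it shows is that a dead-key entry has room for one
16-bit value, while a ligature entry may hold several.

```python
# Toy model only: these tables are illustrative assumptions, not kbd.h.

# Dead-key table: (accent key, base key) -> exactly one 16-bit value.
DEAD_KEY_TABLE = {
    ("'", "a"): "\u00e1",  # a with acute
    ("^", "i"): "\u00ee",  # i with circumflex
}

# Ligature table: one key -> a sequence of 16-bit values.
LIGATURE_TABLE = {
    "i_circumflex_acute": "\u00ee\u0301",  # hypothetical key name
}


def utf16_units(text: str) -> int:
    """Count the UTF-16 code units needed to store the text."""
    return len(text.encode("utf-16-le")) // 2


def fits_dead_key_slot(composed: str) -> bool:
    """A dead-key result must fit in a single 16-bit slot."""
    return utf16_units(composed) == 1


for keys, composed in DEAD_KEY_TABLE.items():
    print(keys, "fits:", fits_dead_key_slot(composed))

for key, sequence in LIGATURE_TABLE.items():
    print(key, "fits a dead-key slot:", fits_dead_key_slot(sequence))
```

In this model the i with circumflex and acute cannot be a dead-key result,
because it needs two values; it only fits in the ligature table.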
Post by Benjamin Riefenstahl
But
AFAIK keyboard input processing on NT/W2K/XP is done in layout DLLs,
so it should be possible to do any kind of processing you need. Just
generate two keyboard events, if that is necessary.
This is not true, though -- they are DLLs, yes; but those DLLs export a
single function which returns a big struct containing the keyboard
information. It is not a binary that generates keyboard events.
--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
Benjamin Riefenstahl
2004-02-09 12:34:49 UTC
Hi Michael,
Post by Michael (michka) Kaplan [MS]
But AFAIK keyboard input processing on NT/W2K/XP is done in layout
DLLs, so it should be possible to do any kind of processing you
need. Just generate two keyboard events, if that is necessary.
This is not true, though -- they are DLLs, yes; but those DLLs
export a single function which returns a big struct containing the
keyboard information. It is not a binary that generates keyboard
events.
I understand. That's a pity.

Then the OP has no other means but to let the users insert additional
accents *after* the base character, literal Unicode, so to speak?

benny
Michael (michka) Kaplan [MS]
2004-02-09 13:49:13 UTC
Post by Benjamin Riefenstahl
Then the OP has no other means but to let the users insert additional
accents *after* the base character, literal Unicode, so to speak?
I am not sure I understand. They have many choices:

1) The dead key model -- Two keystrokes. To type, the user would first type
the accent (nothing will appear), then the base, and the single character
would then appear. This will create precomposed characters (in NFC).

2) The ligature model -- Two keystrokes. To type, the user would first type
the base character, then the accent. This would create composite characters
(in NFD). Note that text is visible as each key is typed.

3) The ligature model -- One keystroke. To type, the user simply hits one
key -- with both base and accent in a single keystroke (this would also
create composite characters, in NFD).

4) The precomposed character could also be assigned to a single keystroke.
This will create precomposed characters (in NFC).

There is really very little technical preference here and the choice should
mainly be based on user expectations (although the NFC/NFD choice is also a
good one to base it on if there are no expectations).
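To see what the NFC/NFD difference means for the stored text, here is a small
Python sketch using the standard unicodedata module. The two helper functions
are hypothetical stand-ins for what a layout would emit under the dead key
model and the two-keystroke ligature model; they are not part of any keyboard
API.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def dead_key_model(accent: str, base: str) -> str:
    """Accent typed first, then base: emits the precomposed form (NFC)."""
    return unicodedata.normalize("NFC", base + accent)


def ligature_model(base: str, accent: str) -> str:
    """Base typed first, then the combining mark: stored as typed (NFD)."""
    return base + accent


precomposed = dead_key_model(ACUTE, "a")
decomposed = ligature_model("a", ACUTE)

for label, text in (("dead key", precomposed), ("ligature", decomposed)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])

# The two results differ as code point sequences but are canonically
# equivalent: normalizing either one to NFD gives the same text.
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

Both strings should render identically; which one a layout produces matters
mostly to software that compares or searches text without normalizing it.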
--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
Cristian Secara
2004-02-09 15:19:50 UTC
Post by Michael (michka) Kaplan [MS]
1) The dead key model -- Two keystrokes. To type, the user would first
type the accent (nothing will appear), then the base, and the single
character would then appear. This will create precomposed characters
(in NFC).
True, but the problem is that the mentioned character does not exist as
a Unicode entity. Or so it seems to me.

Cristi
Michael (michka) Kaplan [MS]
2004-02-09 16:23:26 UTC
Post by Cristian Secara
Post by Michael (michka) Kaplan [MS]
1) The dead key model -- Two keystrokes. To type, the user would first
type the accent (nothing will appear), then the base, and the single
character would then appear. This will create precomposed characters
(in NFC).
True, but the problem is that the mentioned character does not exist as
a Unicode entity. Or so it seems to me.
It does not. In this case it would likely be better to put the entire
ligature in a single keystroke, in a rarely used location (since the
character is rarely used).
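One way to check this is to ask a Unicode library to compose the sequence. The
Python sketch below (the compose helper is just a name chosen for this
example) normalizes base + circumflex + acute to NFC for both a and i. The
result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT
ACUTE = "\u0301"       # COMBINING ACUTE ACCENT


def compose(base: str) -> str:
    """Return the NFC form of base + circumflex + acute."""
    return unicodedata.normalize("NFC", base + CIRCUMFLEX + ACUTE)


a_form = compose("a")
i_form = compose("i")

for label, text in (("a", a_form), ("i", i_form)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])
# a ['U+1EA5']
# i ['U+00EE', 'U+0301']
```

The a case collapses to one precomposed code point, while the i case stays as
U+00EE followed by U+0301, so a dead key table has nothing single to map it to.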
--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.
Mihai N.
2004-02-09 07:18:41 UTC
Post by Cristian Secara
For my language (Romanian) there are a few characters that, on rare
occasions, can carry two accents simultaneously. One of these has no
distinct Unicode code point, as far as I know: the letter i with circumflex
AND acute accent. It can be generated by using U+00EE followed by U+0301.
Sorry, where and when did you graduate?
Never heard of such a thing.
If you need this for something else, just say it.
--
Mihai
-------------------------
Replace _year_ with _ to get the real email
Cristian Secara
2004-02-09 15:11:49 UTC
Post by Mihai N.
[...] letter i with circumflex AND acute accent.
Sorry, where and when did you graduate?
Never heard of such a thing.
As I said, it is used "on rare occasions". That means in dictionaries or
other academic literature.
Here's one example: "Indreptar Ortografic, Ortoepic si de Punctuatie",
edited by the Romanian Academy, the National Linguistic Institute division.

Cristi
Michael (michka) Kaplan [MS]
2004-02-09 16:25:17 UTC
Truth be told, this is generally not a good thing to add to a typical
keyboard, as it makes the keyboard less inherently useful to people who are
native speakers in the language.

I would suggest seriously reconsidering whether this new keyboard standard
is attempting to solve problems for too many other people, rather than for
the people for whom it is intended.
--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies

This posting is provided "AS IS" with
no warranties, and confers no rights.