2012-02-10

special characters

Living in a post-ASCII world offers great opportunities, but brings some problems, too. After all, it's nice to be able to write Ångström or Καλημέρα or ☺☎☯, but it's not necessarily easy to enter those characters.

input methods

So - what to do? First, you can set the input-method, as explained in the emacs manual. This is the best solution if you're writing a non-Latin language – Russian, Thai, Japanese, …

If you only occasionally need some accented character, input methods like latin-postfix (e" -> ë), latin-prefix ("e -> ë) or TeX (\"e -> ë) are useful. They also tend to annoy me a bit, as they often assume I need an accented character, when all I want is to put a word in quotation marks…
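
If you keep coming back to the same input method, you can make it the default in your .emacs. This is only a minimal sketch; it assumes latin-postfix is the method you want, and any name listed by M-x list-input-methods should work in its place:

```elisp
;; Assumption: "latin-postfix" is just an example choice; substitute any
;; input method name shown by M-x list-input-methods.
(setq default-input-method "latin-postfix")

;; With a default set, C-\ (toggle-input-method) switches the method on
;; and off in the current buffer without asking for a name.
```

Because the method is off until you press C-\, this also helps with the quotation-mark annoyance: turn it on for the accented word, then off again.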

compose key

Another method is to use a special compose key. Under Gnome 3 you set it in the Region and Language applet, under Options...; in Gnome 2 it's in the Keyboard applet, under Layouts/Options…. This works for all programs, not just emacs (see this Ubuntu help page for some details). I've set my Compose-key to Right-Alt, so Right-Alt "e -> ë.

Using the compose key works pretty well for me; setting the input method may be more convenient when you need to write a lot of accented characters.

Now, this may be all well and good for accented characters and other variants of Latin characters, such as the German Eszett ligature "ß" (note, you can get that character with latin-prefix "s -> ß, latin-postfix s" -> ß or <compose-key> ss -> ß). But what about Greek characters? Mathematical symbols? Smileys?

ucs-insert

One way to add characters like α, ∞ or ☺ is to use ucs-insert, by default bound to C-x 8 RET. If you know the official Unicode name of a character, you can enter it there; note that there's auto-completion and you can use * wild-cards. For the characters mentioned, that would be GREEK SMALL LETTER ALPHA, INFINITY and WHITE SMILING FACE.

You can also use the Unicode code points; so C-x 8 RET 03b1 RET will insert α as well, since its code point is U+03B1. In case you don't know the code points of Unicode characters, a tool like the Character Map (gucharmap) in Gnome may be useful.
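
If you need one particular character often, the code point can also be used from Lisp. The following is a small sketch, not a recommendation: the command name my-insert-alpha and the key C-c a are made up for illustration.

```elisp
;; Hypothetical command name; #x03B1 is the code point U+03B1 (α).
(defun my-insert-alpha ()
  "Insert GREEK SMALL LETTER ALPHA (U+03B1) at point."
  (interactive)
  ;; `insert' accepts a character, and a character is just its code point.
  (insert #x03B1))

;; Assumption: C-c a is not already bound to something you use.
(global-set-key (kbd "C-c a") 'my-insert-alpha)
```

After evaluating this, C-c a inserts α directly, without the C-x 8 RET prompt.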

abbrev

Since ucs-insert may not be convenient in all cases, you may want to add shortcuts for oft-used special characters to your abbrev table. See the entry on Abbrevs in the emacs manual. I usually edit the entries by hand with M-x edit-abbrevs, and I have entries like:

(text-mode-abbrev-table)
"Delta0"       0    "Δ"
"^2"           0    "²"
"^3"           0    "³"
"almost0"      0    "≈"
"alpha0"       0    "α"
"any0"         0    "∀"
"beta0"        0    "β"
"c0"           0    "©"
"deg0"         0    "℃"
"delta0"       0    "δ"
"elm0"         0    "∈"
"epsilon0"     0    "ε"
"eta0"         0    "η"
"heart0"       0    "♥"
"inf0"         0    "∞"
"int0"         0    "∫"
"notis0"       0    "≠"

Now, alpha0 will be auto-replaced with α. I'm using the 0 suffix for most entries so I can easily remember them, without making it hard to use alpha as a normal word. Note, abbrevs are a bit picky when it comes to the characters in the shortcut; for example, setting != -> ≠ won't work.
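
If you'd rather keep such entries in your .emacs than edit them through M-x edit-abbrevs, something along these lines should work. It is a sketch that repeats two entries from the table above; enabling abbrev-mode through text-mode-hook is my assumption about how you'd want it switched on.

```elisp
;; The same "alpha0" and "inf0" entries as in the table above, defined
;; from Lisp instead of through M-x edit-abbrevs.
(define-abbrev text-mode-abbrev-table "alpha0" "α")
(define-abbrev text-mode-abbrev-table "inf0" "∞")

;; Abbrevs only expand while abbrev-mode is on; here it is enabled for
;; every text-mode buffer (an assumption, adjust to taste).
(add-hook 'text-mode-hook (lambda () (abbrev-mode 1)))
```

With this in place, typing alpha0 followed by a space or punctuation in a text-mode buffer expands it to α.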

inheriting abbrevs from other modes

If you have set up a nice set of abbreviations for text-mode, you may want to use them in other modes as well; you can accomplish this by including the text-mode abbreviations into the table for the current one, for example in your ERC setup:

;; inherit abbrevs from text-mode, once ERC is loaded
;; (erc-mode-abbrev-table does not exist before that)
(eval-after-load 'erc
  '(abbrev-table-put erc-mode-abbrev-table :parents (list text-mode-abbrev-table)))
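
Setting :parents this way replaces whatever parents the table already had. If that matters, a small helper can add text-mode-abbrev-table while keeping the existing ones. This is only a sketch: the function name my-inherit-text-abbrevs is hypothetical, and lisp-mode-abbrev-table is just an example of a table that is available at startup.

```elisp
;; Hypothetical helper; not part of emacs itself.
(defun my-inherit-text-abbrevs (table)
  "Make TABLE inherit from `text-mode-abbrev-table', keeping its other parents."
  (let ((parents (abbrev-table-get table :parents)))
    ;; Only add the text-mode table if it is not a parent already.
    (unless (memq text-mode-abbrev-table parents)
      (abbrev-table-put table :parents
                        (cons text-mode-abbrev-table parents)))))

;; Example use, assuming you want the text-mode abbrevs in lisp-mode.
(my-inherit-text-abbrevs lisp-mode-abbrev-table)
```

Don't call it on tables of modes derived from text-mode; those already inherit the text-mode abbrevs.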

5 comments:

Pavel Iosad said...

Input methods are great. I'm writing a thesis in linguistics with lots of phonetic symbols, and IPA-X-SAMPA has saved me unthinkable amounts of time, especially in combination with C-h.

But if you need weird symbols only occasionally, and they are moderately weird, switching to an input method like TeX is indeed overkill; C-x 8 gives lots of useful symbols without going the roundabout way via Unicode names. E.g. C-x 8 " a gives ä, and C-x 8 S gives §, etc.

Dave Sailer said...

Thanks. Unfortunately I can't devote my whole life to researching emacs. So it's nice to find that about every third post of yours there is some truly golden nugget that helps me immensely.

I know about EmacsWiki and the emacs manual but somehow can never make sense of either.

I've been using "Spanish Minor Mode for GNU Emacs" [http://www.1729.com/spanish/spanish-emacs.html] to help in learning the language. It's OK but can't do ü. And is outdated. Somehow or other I managed to modify it to defeat the deprecation warning.

But.

After reading your post I looked up the input methods. More complete, look better. Now I know.

It's taken about 15 years to find out (by accident, this week) that there is a routine to save a keyboard macro in the .emacs file and convert the syntax. I had to figure it out by trial and error. Wish I'd known then. Arrgh.

djcb said...

@Dave Sailer: I'm glad it's useful for you!

Shawnessy said...

I find the Agda input method is far better than ucs and latex for mathematical text. The completion of sequences of related symbols, e.g. arrows, is great.

http://wiki.portal.chalmers.se/agda/agda.php

http://wiki.portal.chalmers.se/agda/agda.php?n=Docs.UnicodeInput

Drew said...

See also ucs-cmds.el. You can easily define commands to insert individual Unicode chars. Bind them to keys to, in effect, add special chars to your keyboard.

http://www.emacswiki.org/emacs/download/ucs-cmds.el