greek-fontenc

Greek font encoding definition files

Author:

Günter Milde

Licence:

This work may be distributed and/or modified under the conditions of the LaTeX Project Public License, either version 1.3 of this license or any later version.

Abstract

LaTeX internal character representation (LICR) macros are a verbose but failsafe 7-bit ASCII encoding that works unaltered under both, 8-bit TeX and XeTeX/LuaTeX. Use cases are macro definitions and generated text

This bundle provides LICR macros for characters from the Greek script and encoding definition files for Greek text font encodings for use with fontenc (8-bit TeX) or fontspec (XeTeX/LuaTeX).

Included are also the LaTeX packages textalpha and alphabeta.

Changelog

0.9

2013-07-03

greek-fontenc.def “outsourced” from lgrxenc.def

experimental files xunicode-greek.sty and greek-euenc.def: LICRs for XeTeX/LuaTeX.

0.9.1

2013-07-18

Bugfix: wrong breathings psilioxia -> dasiaoxia.

0.9.2

2013-07-19

Bugfix: Disable composite defs starting with char macro,

fix “hiatus” handling.

0.9.3

2013-07-24

Fix “input” path in xunicode-greek and greek-euenc.def.

0.9.4

2013-09-10

greek-fontenc.sty: Greek text font encoding setup package,

remove xunicode-greek.sty.

0.10

2013-09-13

textalpha.sty and alphabeta.sty moved here from lgrx and updated to work with XeTeX/LuaTeX.

greek-fontenc.sty removed (obsoleted by textalpha.sty).

0.10.1

2013-10-01

Bugfix in greek-euenc.def and alphabeta-euenc.def.

0.11

2013-11-28

Compatibility with Xe/LuaTeX in 8-bit mode,

\greekscript TextCommand.

0.11.1

2013-12-01

Fix identification of greek-euenc.def.

0.11.2

2014-09-04

Documentation update, remove duplicate code.

0.12

2014-12-25

Fix auxiliary macro names in textalpha.

Conservative naming: move definition of \< and \> from greek-fontenc.def to textalpha.sty (Bugreport David Kastrup). Documentation update.

0.13

2015-09-04

Support for symbol variants,

keep-semicolon option in textalpha,

\lccode/\uccode corrections for Unicode (from Apostolos Syropoulos’ xgreek) in greek-euenc,

Do not convert \ypogegrammeni to \prosgegrammeni with \MakeUppercase.

0.13.1

2015-12-07

Fix rho with dasia bug in lgrenc.def (Linus Romer).

0.13.2

2016-02-05

Support for standard Unicode text font encoding “TU” (new in fontspec v2.5a).

0.13.3

2019-07-10

Drop error font declaration (cf. ltxbugs 4399).

0.13.4

2019-07-11

@uclclist entry for \prosgegrammeni.

Documentation update.

0.14

2020-02-28

Update test for Unicode fonts. Rename greek-euenc to tuenc-greek.

Use \UTFencoding instead of \LastDeclaredEncoding.

1.0

2020-09-25

Bugfix in textalpha: Let \greekscript set \encodingdefault.

\textKoppa as alias for \textkoppa in LGR.

Update URLs.

2.0

2020-10-30

Move common alias definitions to greek-fontenc.def.

textalpha loads TU with Xe/LuaTeX by default and provides \textmicro and LICR macros for archaic symbols from the Greek and Coptic Unicode block.

Use \UnicodeEncodingName (by the LaTeX kernel) instead of \UTFencname for the Unicode font encoding name.

Replace utf8 literals in tuenc-greek.def.

New file puenc-greek.def: setup for PU encoding (defined by hyperref for PDF strings).

Don’t use \textcompwordmark as base in accent commands.

Documentation update.

2.1

2022-06-14

Support the correct spelling \guillemet… for « and ». See https://github.com/latex3/latex2e/issues/65

TeX files

lgrenc.def

LGR Greek font encoding definitions.

This file is the successor of the basic LGR encoding definition file which comes with babel’s Greek support and the now obsolete lgrx bundle.

tuenc-greek.def

Font setup for Greek with XeTeX/LuaTeX.

puenc-greek.def

Greek LICR definitions for PDF strings.

greek-fontenc.def

Common Greek font encoding definitions.

greek-euenc.def

Backwards compatibility file loading tuenc-greek.

textalpha.sty

Greek symbols in text

Use \textalpha\textOmega independent of font encoding and TeX engine.

alphabeta.sty

Greek symbols in text and math.

Use \alpha\Omega independent of text/math mode, font encoding, and TeX engine.

alphabeta-lgr.def

Composite definitions for 8-bit TeX..

alphabeta-tuenc.def

Composite definitions with XeTeX/LuaTeX..

Literate source files were converted with PyLit to reStructuredText and with Docutils to the HTML documentation.

Documentation and test documents

Overview:

README, greek-fontenc.html

textalpha package documentation:

textalpha-doc.tex, textalpha-doc.pdf, textalpha-tu.pdf

alphabeta package documentation:

alphabeta-doc.tex, alphabeta-doc.pdf, alphabeta-tu.pdf

LGR test and usage example

test-lgrenc.tex, test-lgrenc.pdf

TU test and usage example

test-tuenc-greek.tex, test-tuenc-greek.pdf

Hyperref test and usage example

hyperref-with-greek.tex, hyperref-with-greek.pdf

Test with input encodings other than utf-8

test-inputenc.tex, test-inputenc.pdf

Greek diacritics with standard accent macros

diacritics.tex, diacritics.pdf

Experimental files

These files are still in development and will eventually be moved to/merged with other packages or removed in future versions:

lgr2licr.lua

LGR Transcription to Greek LICR transformation

Installation

If possible, get the bundle from your distribution using its installation manager.

Otherwise, make sure LaTeX can find the package and definition files:

Conflicts

The arabi package provides the Babel arabic option which loads arabicfnt.sty for font setup. This package overwrites the LICR macros \omega and \textomega with font selecting commands. See the report for Debian bug 858987 for details and the arabi workaround below.

Usage

There are several alternatives to set up the support for a Greek font encoding provided by this bundle, e.g.:

Babel:

Use the greek option with Babel:

\usepackage[greek]{babel}

This automatically loads lgrenc.def with 8-bit TeX and tuenc-greek.def with XeTeX/LuaTeX and provides localized auto-strings, hyphenation and other localizations (see babel-greek).

Babel can be used together with textalpha or alphabeta.

textalpha:

Ensure support for Greek characters in text mode:

\usepackage{textalpha}

eventually with the normalize-symbols option to handle symbol variants and/or the keep-semicolon option to use the semicolon as erotimatiko also in LGR

\usepackage[normalize-symbols,keep-semicolon]{textalpha}

This sets up LICR macros for Greek text charactes under both, 8-bit TeX and Xe-/LuaTeX. For details see textalpha-doc.tex and textalpha-doc.pdf (8-bit TeX) as well as test-tuenc-greek.tex and test-tuenc-greek.pdf (XeTeX/LuaTeX).

alphabeta:

To use the short macro names (\alpha\Omega) known from math mode in both, text and math mode, write

\usepackage{alphabeta}

For details see alphabeta-doc.tex and alphabeta-doc.pdf.

fontenc:

Declare LGR via fontenc. For example, specify T1 (8-bit Latin) as default font encoding and LGR for Greek with

\usepackage[LGR,T1]{fontenc}

Note that without textalpha or alphabeta, Greek text macros work only if the current font encoding supports Greek. See [fntguide] for details and test-lgrenc.tex for an example.

It is possible to use 8-bit Greek text fonts in the LGR TeX font encoding also with XeTeX/LuaTeX, if the fontenc package is loaded before Babel, textalpha, or alphabeta, e.g.

\usepackage[LGR]{fontenc}
\usepackage{fontspec}
\setmainfont{Linux Libertine O} % Latin Modern does not support Greek
\setsansfont{Linux Biolinum O}
\usepackage{textalpha}

See test-tuenc-greek.tex, test-tuenc-greek.pdf and test-lgrenc.tex, test-lgrenc.pdf.

To work around the conflict with arabi, it may suffice to ensure greek is loaded after arabic:

\usepackage[arabic,greek,english]{babel}

More secure is an explicit reverse-definition, e.g.

% save original \omega
\let\mathomega\omega

\usepackage[utf8]{inputenc}
\usepackage[LAE,LGR,T1]{fontenc}
\usepackage[arabic,greek,english]{babel}

% fix arabtex:
\DeclareTextSymbol{\textomega}{LGR}{119}
\renewcommand{\omega}{\mathomega}

Greek text font encodings

Greek TeX font encodings are the envisaged T7, LGR, and LGI. Greek letters and symbols are also defined in the Unicode-based font encodings TU, and PU.

T7

The [encguide] reserves the name T7 for a Greek standard font encoding. However, up to now, there is no agreement on an implementation because the restrictions for general text encodings are too severe for typesetting polytonic Greek.

LGR

The LGR font encoding is the de-facto standard for typesetting Greek with (8-bit) LaTeX. greek-fontenc provides a comprehensive LGR font encoding definition file.

Fonts in this encoding include the CB fonts (matching CM), grtimes (Greek Times), Kerkis (matching URW Bookman), DejaVu, Libertine GC, and the GFS fonts. Setup of these fonts as Greek variant to matching Latin fonts is facilitated by the substitutefont package.

The LGR font encoding allows to access Greek characters via an ASCII transliteration. This enables simple input with a Latin keyboard. Characters with diacritics can be selected by ligature definitions in the font (see [greek-usage], [teubner-doc], [cbfonts]).

A major drawback of the transliteration is, that you cannot access Latin letters if LGR is the active font encoding (e.g. in documents or parts of documents given the Babel language greek or polutionikogreek). This means that for every Latin-written word or acronym an explicit language-switch is required. This problem can only be solved via a font-encoding comprising Latin and Greek like the envisaged T7 or Unicode (with XeTeX or LuaTeX).

LGI

The ‘Ibycus’ fonts from the package ibygrk implement an alternative transliteration scheme (also explained in [babel-patch]). It is currently not supported by greek-fontenc.

The font encoding file lgienc.def from ibycus-babel provides a basic setup (without any LICR macros or composite definitions).

TU

Standard Unicode font encoding for XeTeX and LuaTeX loaded by fontspec (since v2.5a) rsp. the LaTeX kernel since 2017/01/01 [LaTeX2e News Issue 26]_. greek-fontenc adds support for the Greek script (see tuenc-greek).

Xe/LuaTeX works with any system-wide installed OpenType font. Suitable fonts supporting Greek include CM Unicode, Deja Vu, EB Garamond, the GFS fonts, Libertine OTF, Libertinus, Old Standard, Tempora, and UM Typewriter (all available on CTAN) but also many commercial fonts. Unfortunately, the fontspec default, Latin Modern misses most Greek characters.

XeTeX uses the Unicode NFC normalization, so that combining characters are merged with the base character if a pre-composed character exists. This results in better looking output for characters with multiple diacritics. Unfortunately, LuaTeX does not apply the NFC normalization. This leads to suboptimal placing of some diactritics, especially the sub-iota (becoming unintelligable in combination with small letter eta).

TODO: The lua-uni-algos package may be helpfull to implement a NTC normalization to Greek text in LuaTeX.

The legacy Unicode font encodings EU1 and EU2 for XeTeX and LuaTeX respectively were superseded by TU in the 2017 fontspec release.

PU

The package hyperref defines the PU font encoding for use in PDF strings (ToC, bookmarks) which supports monotonic Greek. greek-fontenc adds support for polytonic Greek and some archaic characters also supported in LGR and TU (see hyperref-with-greek.tex, hyperref-with-greek.pdf).

Selecting Greek LICR macro names

This bundle provides LaTeX internal character representations (LICR macros) for Greek letters and diacritics. Macro names were selected based on the following considerations:

letters and symbols

  • The fntguide (section 6.4 Naming conventions) recommends:

    Where possible, text symbols should be named as \text followed by the Adobe glyph name: for example \textonequarter or \textsterling. Similarly, math symbols should be named as \math followed by the glyph name, for example \mathonequarter or \mathsterling.

    Problem:

    The Adobe Glyph List For New Fonts has names for many glyphs in the Greek and Coptic Unicode block, but not for Greek extended. The Adobe Glyph List (for existing fonts) lists additional glyph names used in older fonts. However, these are not intended for active use.

  • If there exists a math-mode macro for a symbol, the corresponding text macro could be formed by prepending text.

    Example:

    The glyph name for the GREEK SMALL LETTER FINAL SIGMA is sigma1, the corresponding math-macro is \varsigma. The text symbol is made available as \textvarsigma.

    Problem:

    The math macros for the symbol variants \varepsilon and \varphi map to characters named “GREEK SMALL LETTER …”, while \vartheta, \varkappa, \varrho, and \varpi map to “GREEK … SYMBOL” Unicode characters. (See also section 5.5.3 of the unicode-math documentation.)

  • The Unicode names list provides standardized descriptive names for all Unicode characters that use only capital letters of the Latin alphabet. While not suited for direct use in LICR macros, they can be either

    1. used as inspiration for new LICR macro names or

    2. converted to LICR macro names via a defined set of transformation rules.

    Example:

    \textfinalsigma is a descriptive alias for GREEK SMALL LETTER FINAL SIGMA derived via the rules:

    • drop “LETTER” if the name remains unique,

    • drop “GREEK” if the name remains unique,

    • use capitalized name for capital letters, lowercase for “SMALL” letters and drop “SMALL”,

    • concatenate

  • Omit the “text” prefix for macros that do not have a math counterpart?

    Pro:
    • Simpler,

    • ease of use (less typing, better readability of source text),

    • many established text macro names without “text”,

    • text prefix does not mark a macro as encoding-specific or “inserting a glyph”. There are e.g. font-changing macros (\textbf, \textit) and encoding-changing macros (\textgreek, \textcyr).

    • There are examples of encoding-specific macros without the text-prefix, especially for letters, see encguide.

    Contra:
    • Less consistent,

    • possible name clashes

    • text prefix marks a macro as confined to text (as opposed to math) mode,

    The font encoding definition files use the text prefix for symbols. Aliases (short forms, compatibility defs, etc.) are defined in additional packages (e.g. alphabeta.sty, babel-greek, or teubner)

accent macros

  • standard accent macros (\DeclareTextAccent definitions in latex/base/...) are one-symbol macros (\' \" ... \u \v ...) .

  • tipa.sty, xunicode, and ucs use the “text” prefix also for accents.

    However, the Adobe Glyph List For New Fonts maps, e.g., “tonos” and “dieresistonos” to the spacing characters GREEK TONOS and GREEK DIALYTIKA TONOS, hence texttonos and textdiaresistonos should be spacing characters.

  • textcomp (ts1enc.def) defines \capital... accents (i.e. without text prefix).

Currently, greek-fontenc uses for diacritics:

  • Greek names like in Unicode, and ucsencs.def, and

  • the prefix \acc to distinguish the macros as TextAaccent and reduce the risc of name clashes (cf. \@tabacckludge).

For the end-user “symbol macros” (\~ \' \` \" \< \> \"' ...) are provided. (The non-standard macros \< and \> only with textalpha or alphabeta.)

symbol variants

See also http://en.wikipedia.org/wiki/Greek_alphabet#Glyph_variants

Mathematical notation distinguishes variant shapes for beta (β|ϐ), theta (θ|ϑ), phi (φ|ϕ), pi (π|ϖ), kappa (κ|ϰ), rho (ρ|ϱ), Theta (Θ|ϴ), and epsilon (ε|ϵ). The variations have no syntactic meaning in Greek text and Greek text fonts use the shape variants indiscriminately.

Unicode defines separate code points for the symbol variants for use in mathematical context. However, they are sometimes also used in place of the corresponding letter characters in Unicode-encoded text.

The variant shapes are not given separate code-points in the LGR font encoding.

In mathematical mode, TeX supports the distinction between θ|ϑ, π|ϖ, φ|ϕ, ρ|ϱ, and ε|ϵ with \var<lettername> macros. However, the mapping of letter/symbol in Unicode to “normal”/variant in TeX is inconsistent and variant macros for ϴ ϐ, and ϰ are not available without additional packages (e.g. amssymb provides ϰ as \varkappa).

greek-fontenc provides \text<lettername>symbol LICR macros for these characters:

  • With Unicode fonts, the macros select the GREEK <lettername> SYMBOL``.

  • With LGR encoded fonts, they report an error by default and are mapped to the corresponding letter with the normalize-symbols option of textalpha and alphabeta (loosing the distinction between the shape variants).

References

An alternative, more complete set of short mnemonic character names is the XML Entity Definitions for Characters W3C Recommendation from 01 April 2010.

For glyph names of the LGR encoding see, e.g., CB.enc by Apostolos Syropoulos and xl-lgr.enc from the libertine (legacy) package. lgr.cmap provides a mapping to Unicode characters.

A full set of \text* symbol macros is defined in ucsencs.def from the ucs package.

[fntguide]

LaTeX3 Project Team, LaTeX2ε font selection, 2005. http://mirror.ctan.org/macros/latex/base/fntguide.pdf

[encguide]

Frank Mittelbach, Robin Fairbairns, Werner Lemberg, LaTeX3 Project Team, LaTeX font encodings, 2006. http://mirror.ctan.org/macros/latex/base/encguide.pdf

[greek-usage]

Apostolos Syropoulos, Writing Greek with the greek option of the babel package, 1997. http://mirrors.ctan.org/language/babel/contrib/greek/usage.pdf

[cbfonts]

Claudio Beccari, The CB Greek fonts, Εὔτυπον, τεῦχος № 21, 2008. http://www.eutypon.gr/eutypon/pdf/e2008-21/e21-a01.pdf

[teubner-doc]

Claudio Beccari, teubner.sty An extension to the greek option of the babel package, 2011. http://mirror.ctan.org/macros/latex/contrib/teubner/teubner-doc.pdf

[babel-patch]

Werner Lemberg, Unicode support for the Greek LGR encoding Εὔτυπον, τεῦχος № 20, 2008. http://www.eutypon.gr/eutypon/pdf/e2008-20/e20-a03.pdf

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.