locale - describes a locale definition file
The locale
definition file contains all the information that the
localedef(1) command needs to
convert it into the binary locale database.
The definition files consist of sections, each of which describes a locale category in detail. See locale(7) for additional details on these categories.
The locale definition file starts with a header that may consist of the following keywords:
escape_char is followed by a character that should be used as the escape character for the rest of the file to mark characters that should be interpreted in a special way. It defaults to the backslash (\).
comment_char is followed by a character that will be used as the comment character for the rest of the file. It defaults to the number sign (#).
The locale definition has one part for each locale
category. Each part can be copied from another existing
locale or can be defined from scratch. If the category
should be copied, the only valid keyword in the definition
is copy followed
by the name of the locale in double quotes which should be
copied. The exceptions for this rule are LC_COLLATE and LC_CTYPE where a copy statement can be
followed by locale-specific rules and selected
overrides.
When defining a category from scratch, all field
descriptors and strings should be defined as Unicode code
points in angle brackets, unless otherwise stated below.
For example, "€" is to be presented as
"<U20AC>", "%a" as "<U0025><U0061>", and
"Sunday" as
"<U0053><U0075><U006E><U0064><U0061><U0079>".
Values defined as Unicode code points must be in double
quotes; plain number values are not quoted (but
LC_CTYPE and LC_COLLATE follow special formatting; see
the system-provided locale files for examples).
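The code point notation described above can be produced mechanically. The following Python sketch is only an illustration: the helper name to_ucs_notation is hypothetical, and the eight-digit form used for code points above U+FFFF is an assumption based on the system-provided files rather than something stated on this page.

```python
def to_ucs_notation(text):
    """Return text as a sequence of <Uxxxx> code point symbols."""
    parts = []
    for ch in text:
        cp = ord(ch)
        # Assumption: code points beyond the BMP use eight hex digits.
        width = 4 if cp <= 0xFFFF else 8
        parts.append("<U{:0{w}X}>".format(cp, w=width))
    return "".join(parts)


if __name__ == "__main__":
    for sample in ("€", "%a", "Sunday"):
        # String values go in double quotes in the definition file.
        print('"{}"'.format(to_ucs_notation(sample)))
```

The result still has to be wrapped in double quotes by the caller, as the example does when printing.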
The following category sections are defined by POSIX:
LC_CTYPE
LC_COLLATE
LC_MESSAGES
LC_MONETARY
LC_NUMERIC
LC_TIME
In addition, since version 2.2, the GNU C library supports the following nonstandard categories:
LC_ADDRESS
LC_IDENTIFICATION
LC_MEASUREMENT
LC_NAME
LC_PAPER
LC_TELEPHONE
See locale(7) for a more detailed description of each category.
The definition starts with the string LC_ADDRESS in the first column.
The following keywords are allowed:
postal_fmt followed by a string containing field descriptors that define the format used for postal addresses in the locale. The following field descriptors are recognized:
- %n
Person's name, possibly constructed with the LC_NAME name_fmt keyword (since glibc 2.24).
- %a
Care of person, or organization.
- %f
Firm name.
- %d
Department name.
- %b
Building name.
- %s
Street or block (e.g., Japanese) name.
- %h
House number or designation.
- %N
Insert an end-of-line if the previous descriptor's value was not an empty string; otherwise ignore.
- %t
Insert a space if the previous descriptor's value was not an empty string; otherwise ignore.
- %r
Room number, door designation.
- %e
Floor number.
- %C
Country designation, from the country_post keyword.
- %l
Local township within town or city (since glibc 2.24).
- %z
Zip number, postal code.
- %T
Town, city.
- %S
State, province, or prefecture.
- %c
Country, as taken from data record.
Each field descriptor may have an 'R' after the '%' to specify that the information is taken from a Romanized version string of the entity.
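To illustrate how the conditional descriptors %N and %t interact with the data descriptors, here is a minimal Python sketch of a postal_fmt expander. It is not how glibc or any application implements this: the function name format_postal, the sample format string, and the decision that %N and %t do not themselves count as "the previous descriptor" are all assumptions. The 'R' modifier is accepted but ignored.

```python
def format_postal(fmt, fields):
    """Expand a postal_fmt-style string.

    fields maps a descriptor letter (e.g. "n", "s", "z") to its value;
    missing letters are treated as empty strings.
    """
    out = []
    prev = ""  # value produced by the previous data descriptor
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%" or i + 1 >= len(fmt):
            out.append(ch)  # literal text is copied unchanged
            i += 1
            continue
        i += 1
        if fmt[i] == "R" and i + 1 < len(fmt):
            i += 1  # Romanized variant requested; ignored in this sketch
        code = fmt[i]
        i += 1
        if code == "N":
            if prev:
                out.append("\n")
        elif code == "t":
            if prev:
                out.append(" ")
        else:
            prev = fields.get(code, "")
            out.append(prev)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical format and data, for illustration only.
    fmt = "%n%N%f%N%s%t%h%N%z%t%T%N"
    data = {"n": "Jane Doe", "s": "Main St", "h": "12",
            "z": "12345", "T": "Springfield"}
    print(format_postal(fmt, data), end="")
```

Because the firm name (%f) is missing from the sample data, the %N that follows it produces no empty line.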
country_name followed by the country name in the language of
the current document (e.g., "Deutschland" for the
de_DE
locale).
country_post followed by the abbreviation of the country (see CERT_MAILCODES).
country_ab2 followed by the two-letter abbreviation of the country (ISO 3166).
country_ab3 followed by the three-letter abbreviation of the country (ISO 3166).
country_num followed by the numeric country code as plain numbers (ISO 3166).
country_car followed by the international licence plate country code.
country_isbn followed by the ISBN code (for books).
lang_name followed by the language name in the language of the current document.
lang_ab followed by the two-letter abbreviation of the language (ISO 639).
lang_term followed by the three-letter abbreviation of the language (ISO 639-2/T).
lang_lib followed by the three-letter abbreviation of the
language for library use (ISO 639-2/B). Applications
should in general prefer lang_term over
lang_lib.
The LC_ADDRESS definition
ends with the string END
LC_ADDRESS.
The definition starts with the string LC_CTYPE in the first column.
The following keywords are allowed:
upper followed by a list of uppercase letters. The
letters A through
Z are included
automatically. Characters also specified as
cntrl,
digit,
punct, or
space are
not allowed.
lower followed by a list of lowercase letters. The
letters a through
z are included
automatically. Characters also specified as
cntrl,
digit,
punct, or
space are
not allowed.
alpha followed by a list of letters. All characters
specified as either upper or lower are
automatically included. Characters also specified as
cntrl,
digit,
punct, or
space are
not allowed.
digit followed by the characters classified as numeric
digits. Only the digits 0 through 9 are allowed. They are included by
default in this class.
space followed by a list of characters defined as
white-space characters. Characters also specified as
upper,
lower,
alpha,
digit,
graph, or
xdigit are
not allowed. The characters <space>,
<form-feed>,
<newline>,
<carriage-return>,
<tab>, and
<vertical-tab>
are automatically included.
cntrl followed by a list of control characters.
Characters also specified as upper, lower, alpha, digit, punct, graph, print, or xdigit are not
allowed.
punct followed by a list of punctuation characters.
Characters also specified as upper, lower, alpha, digit, cntrl, xdigit, or the
<space>
character are not allowed.
graph followed by a list of printable characters, not
including the <space>
character. The characters defined as upper, lower, alpha, digit, xdigit, and
punct are
automatically included. Characters also specified as
cntrl are
not allowed.
print followed by a list of printable characters,
including the <space>
character. The characters defined as upper, lower, alpha, digit, xdigit, punct, and the
<space>
character are automatically included. Characters also
specified as cntrl are not
allowed.
xdigit followed by a list of characters classified as
hexadecimal digits. The decimal digits must be
included, followed by one or more sets of six
characters in ascending order. The following
characters are included by default: 0 through 9, a
through f, A through F.
blank followed by a list of characters classified as
blank. The
characters <space> and
<tab>
are automatically included.
charclass followed by a list of locale-specific character class names which are then to be defined in the locale.
toupper followed by a list of mappings from lowercase to
uppercase letters. Each mapping is a pair of a
lowercase and an uppercase letter separated by a
comma (,) and enclosed in
parentheses. The members of the list are separated
with semicolons.
tolower followed by a list of mappings from uppercase to lowercase letters. If the keyword tolower is not present, the reverse of the toupper list is used.
map totitle followed by a list of mapping pairs of characters and letters to be used in titles (headings).
class followed by a locale-specific character class definition, starting with the class name followed by the characters belonging to the class.
charconv followed by a list of locale-specific character mapping names which are then to be defined in the locale.
outdigit followed by a list of alternate output digits for the locale.
map to_inpunct followed by a list of mapping pairs of alternate digits and separators for input digits for the locale.
map to_outpunct followed by a list of mapping pairs of alternate separators for output for the locale.
translit_start marks the start of the transliteration rules
section. The section can contain the include keyword in
the beginning followed by locale-specific rules and
overrides. Any rule specified in the locale file will
override any rule copied or included from other
files. In case of duplicate rule definitions in the
locale file, only the first rule is used.
A transliteration rule consists of a character to
be transliterated followed by a list of
transliteration targets separated by semicolons. The
first target which can be presented in the target
character set is used; if none of them can be used,
the default_missing
character will be used instead.
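The target selection just described (first representable target wins, otherwise default_missing) can be sketched in a few lines of Python. This is an illustration of the rule only, not of the glibc implementation; the function names and the use of Python codec names to stand in for the target character set are assumptions.

```python
def can_encode(text, charset):
    """Return True if text is representable in the given Python codec."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def pick_target(targets, charset, default_missing):
    """Return the first target representable in charset.

    targets is the ordered list from one transliteration rule;
    default_missing is used when no target fits.
    """
    for target in targets:
        if can_encode(target, charset):
            return target
    return default_missing


if __name__ == "__main__":
    # Hypothetical rule: a left double quotation mark falls back to '"'.
    print(pick_target(["\u201c", '"'], "ascii", "?"))
```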
include in the transliteration rules section includes a transliteration rule file (and optionally a repertoire map file).
default_missing in the transliteration rules section defines the default character to be used for transliteration where none of the targets can be presented in the target character set.
translit_end marks the end of the transliteration rules.
The LC_CTYPE definition
ends with the string END
LC_CTYPE.
Note that glibc does not support all POSIX-defined options; only the options described below are supported (as of glibc 2.23).
The definition starts with the string LC_COLLATE in the first column.
The following keywords are allowed:
coll_weight_max followed by the number representing used collation levels. This keyword is recognized but ignored by glibc.
collating-element followed by the definition of a collating-element symbol representing a multicharacter collating element.
collating-symbol followed by the definition of a collating symbol that can be used in collation order statements.
define followed by a string to be
evaluated in an ifdef string / else / endif construct.
reorder-after followed by a redefinition of a collation rule.
reorder-end marks the end of the redefinition of a collation rule.
reorder-sections-after followed by a script name to reorder listed scripts after.
reorder-sections-end marks the end of the reordering of sections.
script followed by a declaration of a script.
symbol-equivalence followed by a collating-symbol to be equivalent to another defined collating-symbol.
The collation rule definition starts with a line:
order_start followed by a list of keywords chosen from
forward,
backward,
or position. The order
definition consists of lines that describe the
collation order and is terminated with the keyword
order_end.
The LC_COLLATE definition
ends with the string END
LC_COLLATE.
The definition starts with the string LC_IDENTIFICATION in the first
column.
The values in this category are defined as plain strings.
The following keywords are allowed:
title followed by the title of the locale document (e.g., "Maori language locale for New Zealand").
source followed by the name of the organization that maintains this document.
address followed by the address of the organization that maintains this document.
contact followed by the name of the contact person at the organization that maintains this document.
email followed by the email address of the person or organization that maintains this document.
tel followed by the telephone number (in international format) of the organization that maintains this document. As of glibc 2.24, this keyword is deprecated in favor of other contact methods.
fax followed by the fax number (in international format) of the organization that maintains this document. As of glibc 2.24, this keyword is deprecated in favor of other contact methods.
language followed by the name of the language to which this document applies.
territory followed by the name of the country/geographic extent to which this document applies.
audience followed by a description of the audience for which this document is intended.
application followed by a description of any special application for which this document is intended.
abbreviation followed by the short name for the provider of the source of this document.
revision followed by the revision number of this document.
date followed by the revision date of this document.
In addition, for each of the categories defined by the
document, there should be a line starting with the keyword
category,
followed by:
a string that identifies this locale category definition,
a semicolon, and
one of the LC_*
identifiers.
The LC_IDENTIFICATION
definition ends with the string END LC_IDENTIFICATION.
The definition starts with the string LC_MESSAGES in the first column.
The following keywords are allowed:
yesexpr followed by a regular expression that describes possible yes-responses.
noexpr followed by a regular expression that describes possible no-responses.
yesstr followed by the output string corresponding to "yes".
nostr followed by the output string corresponding to "no".
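As a rough illustration of how an application might use yesexpr and noexpr, the Python sketch below classifies a user response. The two expressions are hypothetical examples, not taken from any particular locale, and Python's re module is used only as an approximation of the regular expression syntax a real locale would be matched with.

```python
import re

# Hypothetical expressions, for illustration only.
YESEXPR = r"^[yY]"
NOEXPR = r"^[nN]"


def classify_response(response, yesexpr=YESEXPR, noexpr=NOEXPR):
    """Return "yes", "no", or None if neither expression matches."""
    if re.search(yesexpr, response):
        return "yes"
    if re.search(noexpr, response):
        return "no"
    return None


if __name__ == "__main__":
    for answer in ("Yes", "no", "maybe"):
        print(answer, "->", classify_response(answer))
```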
The LC_MESSAGES definition
ends with the string END
LC_MESSAGES.
The definition starts with the string LC_MEASUREMENT in the first column.
The following keywords are allowed:
measurement followed by a number identifying the standard used for measurement. The following values are recognized:
- 1
Metric.
- 2
US customary measurements.
The LC_MEASUREMENT
definition ends with the string END LC_MEASUREMENT.
The definition starts with the string LC_MONETARY in the first column.
Values for int_curr_symbol, currency_symbol, mon_decimal_point,
mon_thousands_sep,
positive_sign,
and negative_sign
are defined as Unicode code points, the others as plain
numbers.
The following keywords are allowed:
int_curr_symbol followed by the international currency symbol. This must be a 4-character string containing the international currency symbol as defined by the ISO 4217 standard (three characters) followed by a separator.
currency_symbol followed by the local currency symbol.
mon_decimal_point followed by the string that will be used as the decimal delimiter when formatting monetary quantities.
mon_thousands_sep followed by the string that will be used as a group separator when formatting monetary quantities.
mon_grouping followed by a sequence of integers separated by
semicolons that describe the formatting of monetary
quantities. See grouping below for
details.
positive_sign followed by a string that is used to indicate a positive sign for monetary quantities.
negative_sign followed by a string that is used to indicate a negative sign for monetary quantities.
int_frac_digits followed by the number of fractional digits that
should be used when formatting with the int_curr_symbol.
frac_digits followed by the number of fractional digits that
should be used when formatting with the currency_symbol.
p_cs_precedes followed by an integer that indicates the
placement of currency_symbol for a
nonnegative formatted monetary quantity:
- 0
The symbol succeeds the value.
- 1
The symbol precedes the value.
p_sep_by_space followed by an integer that indicates the
separation of currency_symbol, the
sign string, and the value for a nonnegative
formatted monetary quantity. The following values are
recognized:
- 0
No space separates the currency symbol and the value.
- 1
If the currency symbol and the sign string are adjacent, a space separates them from the value; otherwise a space separates the currency symbol and the value.
- 2
If the currency symbol and the sign string are adjacent, a space separates them from the value; otherwise a space separates the sign string and the value.
n_cs_precedes followed by an integer that indicates the
placement of currency_symbol for a
negative formatted monetary quantity. The same values
are recognized as for p_cs_precedes.
n_sep_by_space followed by an integer that indicates the
separation of currency_symbol, the
sign string, and the value for a negative formatted
monetary quantity. The same values are recognized as
for p_sep_by_space.
p_sign_posn followed by an integer that indicates where the
positive_sign should
be placed for a nonnegative monetary quantity:
- 0
Parentheses enclose the quantity and the currency_symbol or int_curr_symbol.
- 1
The sign string precedes the quantity and the currency_symbol or the int_curr_symbol.
- 2
The sign string succeeds the quantity and the currency_symbol or the int_curr_symbol.
- 3
The sign string precedes the currency_symbol or the int_curr_symbol.
- 4
The sign string succeeds the currency_symbol or the int_curr_symbol.
n_sign_posn followed by an integer that indicates where the
negative_sign should
be placed for a negative monetary quantity. The same
values are recognized as for p_sign_posn.
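The interaction between the cs_precedes and sign_posn values can be sketched as follows. This Python function is a simplified illustration, not the algorithm used by strfmon(3) or glibc: it assumes a sep_by_space value of 0 (no spaces are inserted), and the function and parameter names are hypothetical.

```python
def place_sign(value, symbol, sign, cs_precedes, sign_posn):
    """Combine a formatted value, a currency symbol and a sign string.

    Assumes sep_by_space is 0, so no spaces are inserted.
    """
    # Values 3 and 4 attach the sign directly to the currency symbol.
    if sign_posn == 3:
        sym = sign + symbol
    elif sign_posn == 4:
        sym = symbol + sign
    else:
        sym = symbol

    # cs_precedes: 1 puts the symbol before the value, 0 after it.
    body = sym + value if cs_precedes else value + sym

    if sign_posn == 0:
        return "(" + body + ")"
    if sign_posn == 1:
        return sign + body
    if sign_posn == 2:
        return body + sign
    return body


if __name__ == "__main__":
    for posn in range(5):
        print(posn, place_sign("1.25", "$", "-", 1, posn))
```

With the symbol preceding the value, the five sign positions give "($1.25)", "-$1.25", "$1.25-", "-$1.25" and "$-1.25" respectively.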
int_p_cs_precedes followed by an integer that indicates the
placement of int_curr_symbol for a
nonnegative internationally formatted monetary
quantity. The same values are recognized as for
p_cs_precedes.
int_n_cs_precedes followed by an integer that indicates the
placement of int_curr_symbol for a
negative internationally formatted monetary quantity.
The same values are recognized as for p_cs_precedes.
int_p_sep_by_space followed by an integer that indicates the
separation of int_curr_symbol, the
sign string, and the value for a nonnegative
internationally formatted monetary quantity. The same
values are recognized as for p_sep_by_space.
int_n_sep_by_space followed by an integer that indicates the
separation of int_curr_symbol, the
sign string, and the value for a negative
internationally formatted monetary quantity. The same
values are recognized as for p_sep_by_space.
int_p_sign_posn followed by an integer that indicates where the
positive_sign should
be placed for a nonnegative internationally formatted
monetary quantity. The same values are recognized as
for p_sign_posn.
int_n_sign_posn followed by an integer that indicates where the
negative_sign should
be placed for a negative internationally formatted
monetary quantity. The same values are recognized as
for p_sign_posn.
The LC_MONETARY definition
ends with the string END
LC_MONETARY.
The definition starts with the string LC_NAME in the first column.
Various keywords are allowed, but only name_fmt is mandatory.
Other keywords are needed only if there is common
convention to use the corresponding salutation in this
locale. The allowed keywords are as follows:
name_fmt followed by a string containing field descriptors that define the format used for names in the locale. The following field descriptors are recognized:
- %f
Family name(s).
- %F
Family names in uppercase.
- %g
First given name.
- %G
First given initial.
- %l
First given name with Latin letters.
- %o
Other shorter name.
- %m
Additional given name(s).
- %M
Initials for additional given name(s).
- %p
Profession.
- %s
Salutation, such as "Doctor".
- %S
Abbreviated salutation, such as "Mr." or "Dr.".
- %d
Salutation, using the FDCC-sets conventions.
- %t
If the preceding field descriptor resulted in an empty string, then the empty string, otherwise a space character.
name_gen followed by the general salutation for any gender.
name_mr followed by the salutation for men.
name_mrs followed by the salutation for married women.
name_miss followed by the salutation for unmarried women.
name_ms followed by the salutation valid for all women.
The LC_NAME definition
ends with the string END
LC_NAME.
The definition starts with the string LC_NUMERIC in the first column.
The following keywords are allowed:
decimal_point followed by the string that will be used as the decimal delimiter when formatting numeric quantities.
thousands_sep followed by the string that will be used as a group separator when formatting numeric quantities.
grouping followed by a sequence of integers as plain numbers separated by semicolons that describe the formatting of numeric quantities.
Each integer specifies the number of digits in a group. The first integer defines the size of the group immediately to the left of the decimal delimiter. Subsequent integers define succeeding groups to the left of the previous group. If the last integer is not -1, then the size of the previous group (if any) is repeatedly used for the remainder of the digits. If the last integer is -1, then no further grouping is performed.
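The grouping rule can be made concrete with a short Python sketch that applies a grouping list to the integer part of a number. It follows the description above and is not the glibc implementation; the function name group_digits and the sample values are hypothetical.

```python
def group_digits(digits, grouping, sep):
    """Insert sep into a string of digits according to a grouping list.

    grouping is the list of integers from the locale, e.g. [3] or [3, 2];
    a trailing -1 stops any further grouping.
    """
    groups = []       # collected right to left
    rest = digits
    last = None       # most recent real group size
    i = 0
    while rest:
        if i < len(grouping):
            size = grouping[i]
            i += 1
            if size == -1:
                groups.append(rest)  # no further grouping
                rest = ""
                break
            last = size
        else:
            size = last              # repeat the previous group size
        if size is None or size <= 0:
            groups.append(rest)      # no usable size: leave ungrouped
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    return sep.join(reversed(groups))


if __name__ == "__main__":
    print(group_digits("1234567", [3], ","))      # 1,234,567
    print(group_digits("1234567", [3, 2], ","))   # 12,34,567
    print(group_digits("1234567", [3, -1], ","))  # 1234,567
```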
The LC_NUMERIC definition
ends with the string END
LC_NUMERIC.
The definition starts with the string LC_PAPER in the first column.
Values in this category are defined as plain numbers.
The following keywords are allowed:
height followed by the height, in millimeters, of the standard paper format.
width followed by the width, in millimeters, of the standard paper format.
The LC_PAPER definition
ends with the string END
LC_PAPER.
The definition starts with the string LC_TELEPHONE in the first column.
The following keywords are allowed:
tel_int_fmt followed by a string that contains field descriptors that identify the format used to dial international numbers. The following field descriptors are recognized:
- %a
Area code without nationwide prefix (the prefix is often "00").
- %A
Area code including nationwide prefix.
- %l
Local number (within area code).
- %e
Extension (to local number).
- %c
Country code.
- %C
Alternate carrier service code used for dialing abroad.
- %t
If the preceding field descriptor resulted in an empty string, then the empty string, otherwise a space character.
tel_dom_fmt followed by a string that contains field
descriptors that identify the format used to dial
domestic numbers. The recognized field descriptors
are the same as for tel_int_fmt.
int_select followed by the prefix used to call international phone numbers.
int_prefix followed by the prefix used from other countries to dial this country.
The LC_TELEPHONE
definition ends with the string END LC_TELEPHONE.
The definition starts with the string LC_TIME in the first column.
The following keywords are allowed:
abday followed by a list of abbreviated names of the
days of the week. The list starts with the first day
of the week as specified by week (Sunday by
default). See NOTES.
day followed by a list of names of the days of the
week. The list starts with the first day of the week
as specified by week (Sunday by
default). See NOTES.
abmon followed by a list of abbreviated month names.
mon followed by a list of month names.
d_t_fmt followed by the appropriate date and time format (for syntax, see strftime(3)).
d_fmt followed by the appropriate date format (for syntax, see strftime(3)).
t_fmt followed by the appropriate time format (for syntax, see strftime(3)).
am_pm followed by the appropriate representation of the
am and
pm strings.
This should be left empty for locales not using AM/PM
convention.
t_fmt_ampm followed by the appropriate time format (for syntax, see strftime(3)) when using 12h clock format. This should be left empty for locales not using AM/PM convention.
era followed by semicolon-separated strings that define how years are counted and displayed for each era in the locale. Each string has the following format:
direction:offset:start_date:end_date:era_name:era_format
The fields are to be defined as follows:
- direction
Either + or -. + means the years closer to start_date have lower numbers than years closer to end_date. - means the opposite.
- offset
The number of the year closest to start_date in the era, corresponding to the %Ey descriptor (see strptime(3)).
- start_date
The start of the era in the form of yyyy/mm/dd. Years prior to AD 1 are represented as negative numbers.
- end_date
The end of the era in the form of yyyy/mm/dd, or one of the two special values of -* or +*. -* means the ending date is the beginning of time. +* means the ending date is the end of time.
- era_name
The name of the era corresponding to the %EC descriptor (see strptime(3)).
- era_format
The format of the year in the era corresponding to the %EY descriptor (see strptime(3)).
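The six colon-separated fields of an era string can be split apart as in the Python sketch below. It only separates and minimally validates the fields; it does not interpret the dates. The Era type, the parse_era name, and the sample string are hypothetical and not taken from any real locale.

```python
from typing import NamedTuple


class Era(NamedTuple):
    direction: str
    offset: int
    start_date: str
    end_date: str
    era_name: str
    era_format: str


def parse_era(spec):
    """Split one era string into its six fields."""
    # Split at most five times so era_format may itself contain colons.
    parts = spec.split(":", 5)
    if len(parts) != 6:
        raise ValueError("expected six colon-separated fields")
    direction, offset, start, end, name, fmt = parts
    if direction not in ("+", "-"):
        raise ValueError("direction must be + or -")
    return Era(direction, int(offset), start, end, name, fmt)


if __name__ == "__main__":
    # Hypothetical era that starts on 2000/01/01 and never ends.
    print(parse_era("+:1:2000/01/01:+*:Example:%EC %Ey"))
```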
era_d_fmt followed by the format of the date in alternative
era notation, corresponding to the %Ex descriptor (see
strptime(3)).
era_t_fmt followed by the format of the time in alternative
era notation, corresponding to the %EX descriptor (see
strptime(3)).
era_d_t_fmt followed by the format of the date and time in
alternative era notation, corresponding to the
%Ec
descriptor (see strptime(3)).
alt_digits followed by the alternative digits used for date and time in the locale.
week followed by a list of three values as plain
numbers: The number of days in a week (by default 7),
a date of beginning of the week (by default
corresponds to Sunday), and the minimal length of the
first week in year (by default 4). Regarding the
start of the week, 19971130 shall be used for Sunday
and 19971201 shall be
used for Monday. See NOTES.
first_weekday (since glibc 2.2) followed by the number of the first day from the
day list to
be shown in calendar applications. The default value
of 1 (plain number)
corresponds to either Sunday or Monday depending on
the value of the second week list item. See
NOTES.
first_workday (since glibc 2.2) followed by the number of the first working day
from the day list. The default
value is 2 (plain
number). See NOTES.
cal_direction followed by a plain number value that indicates the direction for the display of calendar dates, as follows:
- 1
Left-right from top.
- 2
Top-down from left.
- 3
Right-left from top.
date_fmt followed by the appropriate date representation for date(1) (for syntax, see strftime(3)).
The LC_TIME definition
ends with the string END
LC_TIME.
/usr/lib/locale/locale-archive
Usual default locale archive location.
/usr/share/i18n/locales
Usual default path for locale definition files.
The collective GNU C library community wisdom regarding
abday, day, week, first_weekday, and first_workday states at
https://sourceware.org/glibc/wiki/Locales the following:
The value of the second week list item
specifies the base of the abday and day lists.
first_weekday specifies
the offset of the first day-of-week in the abday and day lists.
For compatibility reasons, all glibc locales should
set the value of the second week list item to
19971130 (Sunday) and
base the abday and day lists
appropriately, and set first_weekday and
first_workday
to 1 or 2, depending on whether the week and
work week actually starts on Sunday or Monday for the
locale.
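The relationship between the second week list item, the day list, and first_weekday can be checked with a small Python sketch. It assumes a day list based on Sunday, as recommended above; the helper names and the sample day list are hypothetical.

```python
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def base_weekday(yyyymmdd):
    """Return the weekday name of a date given as a plain number."""
    year, rest = divmod(yyyymmdd, 10000)
    month, day = divmod(rest, 100)
    # date.weekday() counts from Monday (0) to Sunday (6).
    return WEEKDAY_NAMES[date(year, month, day).weekday()]


def first_day(day_list, first_weekday):
    """Return the entry of day_list selected by a 1-based first_weekday."""
    return day_list[first_weekday - 1]


if __name__ == "__main__":
    # Day list based on Sunday, matching a second week item of 19971130.
    days = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]
    print(base_weekday(19971130))  # Sunday
    print(first_day(days, 2))      # Monday
```

With a Sunday-based day list, first_weekday 1 selects Sunday and 2 selects Monday, which is the arrangement the note above recommends for glibc locales.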
iconv(1), locale(1), localedef(1), localeconv(3), newlocale(3), setlocale(3), strftime(3), strptime(3), uselocale(3), charmap(5), charsets(7), locale(7), unicode(7), utf-8(7)
This page is part of release 4.07 of the Linux man-pages project. A
description of the project, information about reporting bugs,
and the latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.
Copyright (C) 1994 Jochen Hein (HeinStudent.TU-Clausthal.de)
Copyright (C) 2008 Petr Baudis (paskysuse.cz)
Copyright (C) 2014 Michael Kerrisk <mtkmanpagesgmail.com>
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this manual; if not, see <http://www.gnu.org/licenses/>.
2008-06-17 Petr Baudis <paskysuse.cz> LC_TIME: Describe first_weekday and first_workday