文字に対するXML実体の定義(第3版)(XML Entity Definitions for Characters (3rd Edition) 日本語訳)

1 導入
Introduction

表記方法と記号は, 特に科学文書において, 人間のコミュニケーションに大変重要であることが分かっています. 数学が成長してきた理由の一つは, その表記法が簡潔で示唆に富むものへと絶えず変化してきたことにあります. 数学表記で使用するたくさんの新しい記号が開発され, 数学者達は, 他の分野で導入されたたくさんの記号の利用もためらいませんでした. その結果, 科学一般, とりわけ数学は, 非常に膨大な記号の集合を使用しています. これらの文字が使えなければ, 科学をすらすらと記述することは困難です. 対応する字形が特定の表示装置で表示できなければ, 科学文書を読むことは困難です. 大抵の場合, 文字は直接ユニコード文字のデータとして, または数値によるXML文字参照として保存する方が望ましいです.

Notation and symbols have proved very important for human communication, especially in scientific documents. Mathematics has grown in part because its notation continually changes toward being succinct and suggestive. There have been many new signs developed for use in mathematical notation, and mathematicians have not held back from making use of many symbols originally introduced elsewhere. The result is that science in general, and particularly mathematics, makes use of a very large collection of symbols. It is difficult to write science fluently if these characters are not available for use. It is difficult to read science if corresponding glyphs are not available for presentation on specific display devices. In the majority of cases it is preferable to store characters directly as Unicode character data or as XML numeric character references.

しかしながら, 環境によっては, XML実体参照として用意されたASCII文字による入力方法の方がより便利です. 多くの実体名は広く一般に使われています. この仕様書は, それら各々の実体名とユニコードとを結び付ける方法の標準化を目指しています. この仕様書は, 過去の仕様書で既に使われてきた実体名しか導入しません. これらの実体名は, XML実体参照のような入力方法のために設計された, 覚えやすい短いものであることに注意が必要です. ユニコード標準の一部である長めの正式名称ではありません.

However, in some environments it is more convenient to use the ASCII input mechanism provided by XML entity references. Many entity names are in common use, and this specification aims to provide standard mappings to Unicode for each of these names. It introduces no names that have not already been used in earlier specifications. Note that these names are short mnemonic names designed for input methods such as XML entity references, not the longer formal names that form part of the Unicode standard.

具体的には, "iso"という文字で始まる実体セットの実体名は, 最初にSGML ([SGML])で標準化され, [ISO9573-13-1991]で更新されています. W3C数学作業部会は, 大本の標準化委員会(ISO/IECJTC1 SC34)から, それらの実体セットの維持と開発を引き継ぐように促されてきました. "mml"で始まる実体セットは, 最初にMathML [MathML2]で標準化され, "xhtml"で始まるものはHTML [HTML4]で標準化されました.

Specifically, the entity names in the sets starting with the letters "iso" were first standardized in SGML ([SGML]) and updated in [ISO9573-13-1991]. The W3C Math Working Group has been invited to take over the maintenance and development of these sets by the original standards committee (ISO/IEC JTC1 SC34). The sets with names starting "mml" were first standardized in MathML [MathML2] and those starting with "xhtml" were first standardized in HTML [HTML4].

この文書は, ウェブで実体名を用いてきた長年の集大成です. HTMLでは, 特別な文字に対して使われる実体名が少しだけありましたが, 数学記号とともに新しい名前の氾濫が起こりました. そして, この文書は, MathML 2.0 [MathML2]勧告の第6章の拡張や最終の改正と見なされることになります. 今, この文書は, ユニコードへの割り当ての定義とともに, XMLやHTMLで利用されてきた文字実体参照を調和させた完全な一覧を示します.

This document is the result of years of employing entity names on the Web. There were always a few named entities used for special characters in HTML, and many more names used for MathML. This means that this document can be viewed as an extension and final revision of Chapter 6 of the MathML 2.0 [MathML2] recommendation. Now it presents a completed listing harmonizing the known uses of character entity names in XML and HTML, together with defined mappings to Unicode.

非常に多くの文字実体名があり, それらを定義するファイルは頻繁に参照される資源となりうるので, カタログファイルのひな形も提供されています. この仕様書とともに提供された一覧を変更する必要は当分生じないと思われるので, 関係する実体名の表をローカル環境にキャッシュするよう実装を設計することが強く推奨されます.

Since there are so many character entity names, and the files specifying them are resources that may be subject to frequent lookup, a template catalog file has also been provided. Users are strongly encouraged to design their implementations so that relevant entity name tables are cached locally, since it is not expected that the listings provided with this specification will need changing for some long time.

2 実体名セット
Sets of names

2.1 HTML MathML 実体セット
The HTML MathML Entity Set

歴史的に, 実体セットは関連する文字の比較的小さなグループに分けられてきました. しかし, 新しく定義する文書型では, それらを統合したhtmlmathmlセットを使用することが強く推奨されます. このセットは, HTMLパーサーに組み込まれている実体名(この文書と同じソースから派生したもの. D ソースファイル参照)と同一の実体名セットを定義しています.

Historically the entity sets have been split into relatively small groups of related characters; however, for any new document type that is being defined, it is strongly recommended that the combined htmlmathml set be used. This defines a set of names identical to those built in to the HTML parser (derived from the same source materials as this document, see D Source Files).

XML DTDにhtmlmathmlセットを組み込む場合, 次のように記載します.

To incorporate the htmlmathml set into an XML DTD, a typical construct is:

       <!ENTITY % htmlmathml-f PUBLIC
         "-//W3C//ENTITIES HTML MathML Set//EN//XML"
         "https://www.w3.org/2003/entities/2007/htmlmathml-f.ent"
       >
       %htmlmathml-f;

公開識別子は常にそのまま使用し, システム識別子はそれぞれの環境の要求に応じて変更してください.

The public identifier should always be used verbatim; the system identifier should be changed to suit local requirements.
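
As a rough illustration of what such declarations give a parser, the following Python sketch builds a small internal DTD subset as a stand-in for the external htmlmathml-f.ent file and lets the standard library XML parser expand the names. It assumes that Python's html.entities.html5 table (which follows the HTML entity list) can serve as a local copy of the name table; the helper names internal_subset and parse_with_entities are hypothetical and are not part of this specification.

```python
import xml.etree.ElementTree as ET
from html.entities import html5  # assumption: used as a local copy of the name table


def internal_subset(names):
    """Build <!ENTITY ...> declarations for the given entity names."""
    declarations = []
    for name in names:
        expansion = html5[name + ";"]
        # Spell the replacement text as numeric character references.
        # Markup characters such as & and < would need extra escaping,
        # so this sketch is only meant for non-markup characters.
        refs = "".join(f"&#x{ord(ch):X};" for ch in expansion)
        declarations.append(f'<!ENTITY {name} "{refs}">')
    return "\n".join(declarations)


def parse_with_entities(body, names, root="doc"):
    """Parse an XML fragment whose entity references are declared inline."""
    document = (
        f"<!DOCTYPE {root} [\n{internal_subset(names)}\n]>\n"
        f"<{root}>{body}</{root}>"
    )
    return ET.fromstring(document)


element = parse_with_entities("x &ne; y&ThickSpace;z", ["ne", "ThickSpace"])
print([f"U+{ord(ch):04X}" for ch in element.text])
```

In a real document type the declarations would normally come from the external entity file named by the system identifier, not from a generated internal subset.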

2つの形式の実体セットが利用できます.

The entity set is available in two forms:

htmlmathml-f HTMLとMathMLの実体定義を展開したセット
the expanded set of HTML and MathML entity definitions
htmlmathml 次の節で紹介する昔ながらの実体セットの定義への参照を通じてHTMLとMathMLの実体を定義したもの
the HTML and MathML entities defined via reference to the legacy entity set definitions as listed in the following section

この情報はJSON形式でも利用できます. そのJSON配列は, 実体名とユニコードへの割り当てに加えて, (XMLパーサーではなく)HTMLパーサーが末尾のセミコロンの省略を認めている実体参照の一覧を収めています. そのため, HTMLでは&amp;と同じように&ampも利用できます.

The information is also available in JSON format. The JSON arrays encode the entity names and mappings to Unicode and also a list of those entity references for which the HTML (but not XML) parser allows the trailing semicolon to be omitted. So &amp may be used as well as &amp; when using HTML.

htmlmathml.json
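
The semicolon rule can be observed with Python's standard library, whose html.entities.html5 table lists a name both with and without the trailing semicolon when HTML allows it to be omitted. This is only a sketch using that table as a stand-in; it does not read htmlmathml.json, and the function name semicolon_optional is hypothetical.

```python
import html
from html.entities import html5


def semicolon_optional(name):
    """True if the table lists the name both with and without ';'."""
    return name in html5 and name + ";" in html5


print(semicolon_optional("amp"))         # a long-standing HTML name
print(semicolon_optional("varepsilon"))  # a name that came in with MathML
print(html.unescape("a &amp b"), "|", html.unescape("a &amp; b"))
```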

ユニコード文字を実体参照に置き換える, 逆方向の割り当てを行うXSLT2スタイルシートが利用できます.

An XSLT2 stylesheet is available which performs the reverse mapping, replacing Unicode characters by entity references.

htmlmathml.xsl
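
The idea of a reverse mapping can be sketched in Python as follows. This is not the XSLT2 stylesheet itself: it assumes the html.entities.html5 table as the data source, only handles names that expand to a single non-ASCII character, and picks the shortest name when several names share a character (an arbitrary choice made for this example). The names build_reverse_map and to_entity_references are hypothetical.

```python
import html
from html.entities import html5


def build_reverse_map():
    """Map each single non-ASCII character to one entity name (with ';')."""
    reverse = {}
    for name, expansion in html5.items():
        if not name.endswith(";") or len(expansion) != 1 or ord(expansion) < 128:
            continue
        best = reverse.get(expansion)
        # Prefer the shortest name, then alphabetical order, for determinism.
        if best is None or (len(name), name) < (len(best), best):
            reverse[expansion] = name
    return reverse


REVERSE = build_reverse_map()


def to_entity_references(text):
    """Replace characters that have a name by the corresponding reference."""
    return "".join("&" + REVERSE[ch] if ch in REVERSE else ch for ch in text)


encoded = to_entity_references("x \u2260 y")
print(encoded)
print(html.unescape(encoded) == "x \u2260 y")
```

Literal ampersands in the input are left untouched here, so a production tool would also need to escape them.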

2.2 昔ながらの実体セット
Legacy Entity Sets

この仕様書は, 過去の仕様書で定義された, たくさんの実体名セットのユニコードへの割り当て方を定義します.

This specification defines mappings to Unicode of many sets of names that have been defined by earlier specifications.

全ての実体セットを統合して一覧にした2つの表を示します. 最初がユニコード順で, 次がアルファベット順です.

We present two tables listing all the sets combined, first in Unicode order and then in alphabetic order:

ユニコード順の全一覧表
All in Unicode order
アルファベット順の全一覧表
All in alphabetic order

実体セットそれぞれを記した表を示します. 実体セットの表はそれぞれ, 対応する実体セットのDTD実体宣言へのリンクを含んでおり, また, 文字から実体名への逆方向の割り当てを行うXSLT2スタイルシートへのリンクも含んでいます(もちろん, これは単一のユニコードのコードポイントに割り当てられた実体名についてのみ可能です).

Then there come tables documenting each of the entity sets. Each set has a link to the DTD entity declaration for the corresponding entity set, and also a link to an XSLT2 stylesheet that will implement a reverse mapping from characters to entity names (this is, of course, only possible for entity names that map to a single Unicode code point).

isobox 罫線
Box and Line Drawing
isocyr1 ロシアのキリル文字
Russian Cyrillic
isocyr2 ロシア以外のキリル文字
Non-Russian Cyrillic
isodia 発音記号
Diacritical Marks
isolat1 ラテン文字拡張1
Added Latin 1
isolat2 ラテン文字拡張2
Added Latin 2
isonum 数字と特殊記号
Numeric and Special Graphic
isopub 出版
Publishing
isoamsa 数学記号拡張:矢印関係記号
Added Math Symbols: Arrow Relations
isoamsb 数学記号拡張:二項演算子
Added Math Symbols: Binary Operators
isoamsc 数学記号拡張:区切り記号
Added Math Symbols: Delimiters
isoamsn 数学記号拡張:否定された関係記号
Added Math Symbols: Negated Relations
isoamso 数学記号拡張:通常の記号
Added Math Symbols: Ordinary
isoamsr 数学記号拡張:関係記号
Added Math Symbols: Relations
isogrk1 ギリシア文字 (MathML3 / HTML5 には含まれない)
Greek Letters (not in MathML3 / HTML5)
isogrk2 修飾されたギリシア文字 (MathML3 / HTML5 には含まれない)
Monotoniko Greek (not in MathML3 / HTML5)
isogrk3 ギリシア文字記号
Greek Symbols
isogrk4 代用のギリシア文字記号 (MathML3 / HTML5 には含まれない)
Alternative Greek Symbols (not in MathML3 / HTML5)
isomfrk 数学用アルファベット:フラクトゥール
Math Alphabets: Fraktur
isomopf 数学用アルファベット:オープンフェイス
Math Alphabets: Open Face
isomscr 数学用アルファベット:スクリプト
Math Alphabets: Script
isotech 一般技術記号
General Technical
mmlextra 追加のMathML記号
Additional MathML Symbols
mmlalias MathML別名
MathML Aliases
xhtml1-lat1 HTMLラテン文字
Latin for HTML
xhtml1-special HTML特殊文字
Special for HTML
xhtml1-symbol HTML記号
Symbol for HTML
html5-uppercase HTML別名(大文字)
uppercase aliases for HTML
predefined XML定義済実体
Predefined XML

実体セットそれぞれに対応するスタイルシートや実体定義ファイル(訳注:DTD形式で実体を定義したファイル, 一般に拡張子は"ent")に加えて, 統合されたスタイルシートが提供されています. また, 前節で述べたHTML MathMLセットと同様に, 統合された実体セットが2つのファイル形式で提供されています.

In addition to the stylesheets and entity files corresponding to each individual entity set, a combined stylesheet is provided, as well as a combined entity set, in two formats, as for the HTML MathML set described above.

w3centities 上記の全ての実体セットを参照しているW3C実体セット
W3C entities collection; referencing all entity sets listed above
w3centities-f 上記の全てについて, 重複を取り除いて単一のファイルにまとめた実体定義セット
the same set of entity definitions, expanded into a single file, with duplicates removed

3 科学文書におけるユニコード文字の範囲
Unicode Character Ranges for Scientific Documents

ある特定の文字は, 科学文書を作成することと関連があります. 下記の表では, 数学で最もよく利用される文字を含むユニコードの範囲を示します.

Certain characters are of particular relevance to scientific document production. The following tables display Unicode ranges containing the characters that are most used in mathematics.

この節でリンクしている表はそれぞれ256枚の画像を含んでおり, 画像がローカルの環境にキャッシュされていないと読み込みに時間がかかるであろうことに注意が必要です.

Note that each of the tables linked from this section contains 256 images and may take a while to load if the images have not been cached locally.

000 C0制御文字と基本ラテン文字, C1制御文字と追加ラテン1
C0 Controls and Basic Latin, C1 Controls and Latin-1 Supplement
001 ラテン文字拡張A, ラテン文字拡張B
Latin Extended-A, Latin Extended-B
002 IPA拡張, 前進を伴う修飾文字
IPA Extensions, Spacing Modifier Letters
003 合成用発音記号, ギリシア文字とコプト語の文字
Combining Diacritical Marks, Greek and Coptic
004 キリル文字
Cyrillic
006 アラビア文字
Arabic
020 一般的な句読点, 上付き文字, 下付き文字, 通貨記号, 合成用記号用発音記号
General Punctuation, Superscripts and Subscripts, Currency Symbols, Combining Diacritical Marks for Symbols
021 文字の様な記号, 数字に準じるもの, 矢印
Letterlike Symbols, Number Forms, Arrows
022 数学演算子
Mathematical Operators
023 その他技術記号
Miscellaneous Technical
024 制御用文字記号, 光学文字認識, 囲み文字
Control Pictures, Optical Character Recognition, Enclosed Alphanumerics
025 罫線記号, ブロック要素, 幾何学的図形
Box Drawing, Block Elements, Geometric Shapes
026 その他記号
Miscellaneous Symbols
027 飾り文字, その他数学記号A, 追加矢印A
Dingbats, Miscellaneous Mathematical Symbols-A, Supplemental Arrows-A
029 追加矢印B, その他数学記号B
Supplemental Arrows-B, Miscellaneous Mathematical Symbols-B
02A 追加数学演算子
Supplemental Mathematical Operators
02B その他記号/矢印
Miscellaneous Symbols and Arrows
0FB アルファベット表示形, アラビア文字表示形A
Alphabetic Presentation Forms, Arabic Presentation Forms-A
0FE 異体字選択用文字, 縦表示形, 合成用半記号, CJK互換形, 小字形, アラビア文字表示形B
Variation Selectors, Vertical Forms, Combining Half Marks, CJK Compatibility Forms, Small Form Variants, Arabic Presentation Forms-B
1D4 数学用英数字記号
Mathematical Alphanumeric Symbols
1D5 数学用英数字記号(つづき)
Mathematical Alphanumeric Symbols (continued)
1D6 数学用英数字記号(つづき)
Mathematical Alphanumeric Symbols (continued)
1D7 数学用英数字記号(つづき)
Mathematical Alphanumeric Symbols (continued)
1EE アラビア文字の数学用英数字記号
Arabic Mathematical Alphabetic Symbols
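
Each table label above appears to be the code point with its last two hexadecimal digits dropped, which matches the statement that each table holds 256 characters. Under that assumption, the table for a given character can be computed as in the sketch below; the function name table_prefix is hypothetical.

```python
def table_prefix(char):
    """Return the three-hex-digit label of the 256-character table holding char."""
    return f"{ord(char) >> 8:03X}"


for ch in ("\u2260", "\u03F5", "\U0001D49C"):
    print(f"U+{ord(ch):04X} -> table {table_prefix(ch)}")
```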

4 数学用英数字記号
Mathematical Alphanumeric Characters

この仕様書で定義されている実体の多くは, ユニコード第0面の文字様記号の区画, またはユニコード第1面の数学用英数字記号の区画に含まれている数学用英数字に関係しています. 下記の表はこれらの記号全てを一覧にしたもので, 第1面にないものを強調表示し, 該当する場合には実体名も示しています.

Many of the entities defined by this specification relate to the mathematical alphanumeric characters contained in the letter-like symbols block of Unicode Plane 0, or in the Mathematical Alphanumeric Symbols block in Unicode Plane 1. The following tables list all these symbols, highlighting those that are not in Plane 1, and giving entity names where appropriate.

太字(明朝体)
Bold (Serif)
イタリック体または斜体
Italic or Slanted
太字のイタリック体または斜体
Bold Italic or Slanted
スクリプト(またはカリグラフィー)
Script (or Calligraphic)
太字のスクリプト
Bold Script
フラクトゥール
Fraktur
太字のフラクトゥール
Bold Fraktur
ゴシック体
Sans Serif
太字のゴシック体
Bold Sans Serif
斜体のゴシック体
Slanted Sans Serif
斜体で太字のゴシック体
Slanted Bold Sans Serif
等幅フォント
Monospace
二重線(オープンフェイス, 黒板における太字)
Double Struck (Open Face, Blackboard Bold)
語頭形アラビア文字
Arabic Initial Form
テール付きアラビア文字
Arabic Tailed Form
ループ付きアラビア文字
Arabic Looped Form
縦長のアラビア文字
Arabic Stretched Form
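
The split between Plane 0 and Plane 1 can be made visible with a short Python sketch. It assumes the html.entities.html5 table as the source of the script capital names (Ascr to Zscr) and simply reports which of them expand to a character outside Plane 1; the helper name plane is hypothetical.

```python
import string
from html.entities import html5


def plane(char):
    """Unicode plane number of a character (0 for the BMP, 1 for Plane 1)."""
    return ord(char) >> 16


script_capitals = {letter: html5[f"{letter}scr;"] for letter in string.ascii_uppercase}
in_plane_0 = [letter for letter, ch in script_capitals.items() if plane(ch) == 0]

print("Script capitals outside Plane 1:", "".join(in_plane_0))
for letter in in_plane_0:
    print(f"  {letter}scr -> U+{ord(script_capitals[letter]):04X}")
```

The letters reported are those that already existed in the Letterlike Symbols block, which is why the corresponding positions in the Plane 1 alphabet are not used.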

5 打ち消された文字や異体字に対する実体
Entities for Negated and Variant Characters

仕様書の大部分において, 実体定義はそれぞれ単一のユニコード文字に適用されます. この節では, 2文字以上の文字列に適用される実体定義について述べます.

Each of the entity definitions in a majority of the specification expands to a single Unicode character. The definitions that expand to a sequence of two or more characters are outlined in this section.

5.1 打ち消された数学用文字
Negated Mathematical Characters

これまでに一覧にしたユニコード文字に加えて, 打ち消された形や抹消された形の文字を表示するのに, 文字U+0338 (/), U+20D2 (|), U+20E5 (\)を合成して使ってもよいです. 合成する文字は, "基となる"文字のすぐ後に, 間に記述記号や空白を挟まずに置かれるべきです. アクセント記号を合成する場合も同様です.

In addition to the Unicode Characters so far listed, one may use the combining characters U+0338 (/), U+20D2 (|) and U+20E5 (\) to produce negated or canceled forms of characters. A combining character should be placed immediately after its "base" character, with no intervening markup or space, just as is the case for combining accents.

原則として任意のユニコード文字に打ち消しの文字を合成できますが, 数学用にデザインされたフォントは大抵, いくつかの合成済の打ち消された字形を持っています. そのような場合, MathML表示ソフトウェアは, これらの合成済の字形を利用できるべきです. 合成した文字の文字コードは, U+2260に相当するU+003D U+0338のように既に存在するUCS文字を表すこともあれば, U+2202 U+0338のようにそうでないこともあります. 後者の種類の打ち消された文字のうち, 確認されている一般的なものを表に一覧で示します.

In principle, the negation characters may be applied to any Unicode character, although fonts designed for mathematics typically have some negated glyphs ready composed. A MathML renderer should be able to use these pre-composed glyphs in these cases. A compound character code either represents a UCS character that is already available, as in the case of U+003D U+0338 which amounts to U+2260, or it does not, as is the case for U+2202 U+0338. The common cases of negations, of the latter type, that have been identified are listed in the tables.

文字に合成する長い斜線
combining long solidus overlay
文字に合成する長い縦線
combining long vertical line overlay
文字に合成する長い逆の斜線
combining reverse solidus overlay

文字を合成することによって表せるものに対して, 既に単一の文字が定義されているなら, ばらばらの文字による表現ではなく単一の文字を用いるべきというのが, W3Cとユニコードの方針であることを強調しておきます. また, 既に存在する合成文字として表されるものについて, 新しい単一の文字が取り入れられることはないということです. このことについて, さらに詳しい情報は, Unicode Standard Annex 15, Unicode Normalization Forms [ユニコード15](訳注:"ユニコード標準付録15 ユニコード正規化形式"という意味), 特にNormalization Form C(訳注:"正規化形式C"という意味)の議論を見て下さい.

Note that it is the policy of the W3C and of Unicode that if a single character is already defined for what can be achieved with a combining character, that character must be used instead of the decomposed form. It is also intended that no new single characters representing what can be done with existing compositions will be introduced. For further information on these matters see the Unicode Standard Annex 15, Unicode Normalization Forms [Unicode15], especially the discussion of Normalization Form C.
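
The two cases mentioned above (a combination that amounts to an existing character, and one that does not) can be checked with Normalization Form C. The following is a minimal Python sketch using the standard unicodedata module; the helper name nfc_code_points is hypothetical.

```python
import unicodedata


def nfc_code_points(text):
    """Code points of the text after normalization to NFC."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# U+003D U+0338 composes to the existing character U+2260.
print(nfc_code_points("\u003D\u0338"))
# U+2202 U+0338 has no precomposed form and stays a two-character sequence.
print(nfc_code_points("\u2202\u0338"))
```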

5.2 数学用異体字
Variant Mathematical Characters

ユニコードは, 単純なフォントの異体字に複数の文字コードを割り振ることを避けようとしています. コードポイントが割り当てられるためには, 字形の微妙な違い以上の何かがあるべきです. 注目に値する異体字を記録するため, ユニコード3.2には, 後置修飾子としてふるまう特別な文字U+FE00 (異体字選択用文字1)があります. しかしながら, 正式に認められている異体字選択用文字との組み合わせは, ユニコードの一部として記録されている一覧表のものに制限されています. 異体字選択用文字1は, その一覧表の文字にのみ適用できます. ユニコードでは, 出来上がった組み合わせは, 別個の文字ではなく元の文字の異体字と見なされます. ユニコードに対応したシステムは, 利用可能なフォントが異体字の字形を提供していない場合, 組み合わせを元の文字として描いてもよいです.

Unicode attempts to avoid having several character codes for simple font variants. For a code point to be assigned there should be more than a nuance in glyphs to be recorded. To record variants worth noting there is a special character in Unicode 3.2, U+FE00 (VARIATION SELECTOR-1), which acts as a postfix modifier. However the legally allowed combinations with this variation selector are restricted to a list recorded as part of Unicode. The VARIATION SELECTOR-1 character may only be applied to the characters listed here. The resulting combination is not regarded by Unicode as a separate character, but a variation on the base character. Unicode aware systems may render the combination as the base if the available fonts do not support the variant glyph shape.

異体字選択用文字1
variation selector-1
異体字選択用文字2
variation selector-2
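
The fallback behaviour described above, rendering the combination as its base character, can be sketched in Python. The example assumes the html.entities.html5 table as the source of entity expansions and uses the caps entity, whose expansion ends in U+FE00, as a sample; the names vs1_entities and base_fallback are hypothetical, and a real renderer would make this decision per font.

```python
from html.entities import html5

VS1 = "\uFE00"  # VARIATION SELECTOR-1


def vs1_entities():
    """Entity names (with ';') whose expansion ends with VARIATION SELECTOR-1."""
    return {name: text for name, text in html5.items() if text.endswith(VS1)}


def base_fallback(text):
    """Drop VARIATION SELECTOR-1 so that only the base characters remain."""
    return text.replace(VS1, "")


variants = vs1_entities()
print(len(variants), "entities use U+FE00, for example:", sorted(variants)[:3])
print([f"U+{ord(ch):04X}" for ch in html5["caps;"]],
      "->", [f"U+{ord(ch):04X}" for ch in base_fallback(html5["caps;"])])
```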

A 特別に考慮すべき点
Special Considerations

A.1 イプシロン
Epsilon

歴史上, 小文字のイプシロンの異体字の形には, たくさんの混乱や同意不足がありました.

Historically there has been much confusion and lack of agreement over variant forms for lower case epsilon.

この仕様書は, 下記の定義を用いています. epsilonという名前は文章としてのギリシア語で用いる文字(U+03B5)に使われ, varepsilonは数学でより一般的に用いるイプシロン記号の文字(U+03F5)に使われることに注意が必要です. この使い方は, 似た文字の組(例えば, thetaとvartheta)の名前の付け方とは整合しますが, TeXやMathML2, およびISO実体セットのユニコードへの以前の割り当ての一部で使用されている名付け方の慣習とは互換性がないことに注意が必要です.

This specification uses the definitions below. Note that the name epsilon is used for the character used in textual Greek (U+03B5) and varepsilon used for the epsilon symbol character more commonly used in mathematics (U+03F5). Note that this usage is compatible with the naming of similar pairs of characters (for example theta, vartheta) but incompatible with the naming convention used in TeX, MathML2 and some earlier mappings of the ISO entity sets to Unicode.

実体名 Entity	セット名 Set	説明 Description	ユニコード文字 Unicode Character
eacgr	isogrk2	=小文字イプシロン, アクセント, ギリシア語 =small epsilon, accent, Greek	U+03AD	ギリシア文字のアクセント記号付き小文字イプシロン GREEK SMALL LETTER EPSILON WITH TONOS
egr	isogrk1	=小文字イプシロン, ギリシア語 =small epsilon, Greek	U+03B5	ギリシア文字の小文字イプシロン GREEK SMALL LETTER EPSILON
epsi	isogrk3	/epsilon /epsilon	U+03B5	ギリシア文字の小文字イプシロン GREEK SMALL LETTER EPSILON
epsilon	xhtml1-symbol		U+03B5	ギリシア文字の小文字イプシロン GREEK SMALL LETTER EPSILON
epsiv	isogrk3	/straightepsilon, 小文字イプシロン, ギリシア語 /straightepsilon, small epsilon, Greek	U+03F5	ギリシア文字の三日月状のイプシロン記号 GREEK LUNATE EPSILON SYMBOL
straightepsilon	mmlalias	ISOGRK3 epsivの別名 alias ISOGRK3 epsiv	U+03F5	ギリシア文字の三日月状のイプシロン記号 GREEK LUNATE EPSILON SYMBOL
varepsilon	mmlalias	ISOGRK3 epsivの別名 alias ISOGRK3 epsiv	U+03F5	ギリシア文字の三日月状のイプシロン記号 GREEK LUNATE EPSILON SYMBOL
bepsi	isoamsr	/backepsilon R: 例のようなもの /backepsilon R: such that	U+03F6	ギリシア文字の反転した三日月状のイプシロン記号 GREEK REVERSED LUNATE EPSILON SYMBOL
backepsilon	mmlalias	ISOAMSR bepsiの別名 alias ISOAMSR bepsi	U+03F6	ギリシア文字の反転した三日月状のイプシロン記号 GREEK REVERSED LUNATE EPSILON SYMBOL
b.epsi	isogrk4	小文字イプシロン, ギリシア語 small epsilon, Greek	U+1D6C6	数学用太字の小文字イプシロン MATHEMATICAL BOLD SMALL EPSILON
b.epsiv	isogrk4	イプシロンの異体字 variant epsilon	U+1D6DC	数学用太字のイプシロン記号 MATHEMATICAL BOLD EPSILON SYMBOL

A.2 ファイ
Phi

ファイの状況は, イプシロンととても似ています. ただし, ユニコードの以前のバージョンが, U+03C6とU+03D5に対する字形の例を取り違えていたというより複雑な事情があります. また, 古いフォントの中には, いまだに古い慣習に従って使われているものがあります. この仕様書で使われている定義は, 下記の一覧の通りです.

The situation for phi is very similar to that of epsilon, although with the further complication that early versions of Unicode had the sample glyphs for U+03C6 and U+03D5 swapped from the current usage, and some older fonts still in use follow that older convention. The definitions used in this specification are as listed below.

実体名 Entity	セット名 Set	説明 Description	ユニコード文字 Unicode Character
phi	isogrk3	/phi - 小文字ファイ, ギリシア語 /phi - small phi, Greek	U+03C6	ギリシア文字の小文字ファイ GREEK SMALL LETTER PHI
phi	xhtml1-symbol	ギリシア文字の小文字ファイ greek small letter phi	U+03C6	ギリシア文字の小文字ファイ GREEK SMALL LETTER PHI
phgr	isogrk1	=小文字ファイ, ギリシア語 =small phi, Greek	U+03C6	ギリシア文字の小文字ファイ GREEK SMALL LETTER PHI
straightphi	mmlalias	ISOGRK3 phivの別名 alias ISOGRK3 phiv	U+03D5	ギリシア文字のファイ記号 GREEK PHI SYMBOL
phiv	isogrk3	/varphi - 直立のファイ /varphi - straight phi	U+03D5	ギリシア文字のファイ記号 GREEK PHI SYMBOL
varphi	mmlalias	ISOGRK3 phivの別名 alias ISOGRK3 phiv	U+03D5	ギリシア文字のファイ記号 GREEK PHI SYMBOL
b.phi	isogrk4	小文字ファイ, ギリシア語 small phi, Greek	U+1D6D7	数学用太字の小文字ファイ MATHEMATICAL BOLD SMALL PHI
b.phiv	isogrk4	ファイの異体字 variant phi	U+1D6DF	数学用太字のファイ記号 MATHEMATICAL BOLD PHI SYMBOL
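
The epsilon and phi assignments can be cross-checked against the HTML entity table shipped with Python. The sketch below assumes html.entities.html5 as that table and prints the code point and Unicode character name for each name; names from isogrk1, isogrk2 and isogrk4 are left out because those sets are not part of HTML. The function name describe is hypothetical.

```python
import unicodedata
from html.entities import html5

NAMES = ["epsi", "epsilon", "epsiv", "varepsilon", "straightepsilon",
         "bepsi", "backepsilon", "phi", "phiv", "varphi", "straightphi"]


def describe(name):
    """Return (code point, Unicode character name) for a single-character entity."""
    char = html5[name + ";"]
    return f"U+{ord(char):04X}", unicodedata.name(char)


for name in NAMES:
    code, unicode_name = describe(name)
    print(f"{name:16} {code}  {unicode_name}")
```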

A.3 複数文字の実体
Multiple Character Entities

前節の一覧で示した合成文字や異体字の組み合わせ以外の, 1文字より長い文字列を置き換える, 残りの実体は下表の一覧のとおりです.

In addition to the combining and variant character combinations listed in the previous sections, the following table lists the remaining entity replacement texts that consist of more than one character.

実体名 Entity	セット名 Set	説明 Description	ユニコード文字 Unicode Character
fjlig	isopub	小文字fj連字 small fj ligature	U+0066 U+006A	(fj連字) fj ligature
ThickSpace	mmlextra	5/18em幅の空白 space of width 5/18 em	U+205F U+200A	(5/18em幅の空白) space of width 5/18 em
race	isoamsb	反転した相似, 下線 reverse most positive, line below	U+223D U+0331	(下線付きの)反転したチルダ REVERSED TILDE with underline
acE	isoamsb	相似, 二重下線 most positive, two lines below	U+223E U+0333	(二重下線付きの)ひっくり返ったゆったりしたS INVERTED LAZY S with double underline

ユニコードには, アルファベット表示系の区画に含まれるfi(U+FB01)のような一般のfの連字はあるにも関わらず, fjという文字はありません. fjlig実体は, "fj"という文字の組に当てはめられます. 現在の文字入力装置は, fjという連字をフォントが提供しているなら, fjという組み合わせに対し自動的にその連字を用いるべきです.

Unicode does not have an fj character, although the other common f ligatures such as fi (U+FB01) are contained in the Alphabetic Presentation Forms block. The fjlig entity is mapped to the pair of characters "fj"; modern typesetting engines should automatically use the fj ligature for this combination if the font supplies such a ligature.

ユニコードは, 様々な幅の空白文字(5/18emを除く, 6/18emまでの1/18emの全ての倍数を含む)を持っています. そのため, ThickSpace実体は, 空白文字の組に割り当てられています. 代わりにU+2005(1/4em)を使うこともできましたが, 1/4emは5/18emと等しくありません. そのため, 大抵の組版のフォントサイズではその差が目に見えて分かるものではないとはいえ, 上記の定義が選ばれました.

Unicode has a range of space characters (including all multiples of 1/18 em up to 6/18, except for 5/18 em) thus the ThickSpace entity is mapped to a pair of space characters. An alternative would have been to use U+2005 (1/4 em), but 1/4 em is not equal to 5/18 em, so the above definition was chosen, despite the fact that the difference is unlikely to be visibly noticeable at most typeset font sizes.

race実体とacE実体は, ユニコードがコードポイントを持っていない下線付きの文字です. そのため, 文字に下線を合成することは, 打ち消された演算子のために一画加えるのと類似した方法で行われています.

The entities race and acE denote underlined characters for which Unicode does not have codepoints, thus combining underline characters have been used, in a way analogous to the use of combining strokes for negated operators.
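
The multi-character expansions and the ThickSpace arithmetic can be checked with a short Python sketch. It assumes the html.entities.html5 table for the expansions and takes the widths 4/18 em (U+205F) and 1/18 em (U+200A) from the change note in B.5; the helper name code_points is hypothetical.

```python
from fractions import Fraction
from html.entities import html5


def code_points(name):
    """Code points of the replacement text of an entity."""
    return [f"U+{ord(ch):04X}" for ch in html5[name + ";"]]


for name in ("fjlig", "ThickSpace", "race", "acE"):
    print(f"{name:11}", " ".join(code_points(name)))

# Width of U+205F followed by U+200A, compared with U+2005 (1/4 em).
thick = Fraction(4, 18) + Fraction(1, 18)
print("ThickSpace width:", thick, "em; difference from 1/4 em:", thick - Fraction(1, 4), "em")
```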

A.4 合成文字として定義された実体
Entities Defined to be a Combining Character

合成文字から構成される文字列を置き換えた実体は下表の一覧のとおりです.

The following table lists the entity replacement texts that consist of a combining character.

実体名 Entity	セット名 Set	説明 Description	ユニコード文字 Unicode Character
DownBreve	mmlextra	(空白を伴わない)ひっくり返った短音記号 breve, inverted (non-spacing)	U+0311	合成用のひっくり返った短音記号 COMBINING INVERTED BREVE
tdot	isotech	文字の上の3つの点 three dots above	U+20DB	文字の上に合成する3つの点 COMBINING THREE DOTS ABOVE
TripleDot	mmlalias	ISOTECH tdotの別名 alias ISOTECH tdot	U+20DB	文字の上に合成する3つの点 COMBINING THREE DOTS ABOVE
DotDot	isotech	文字の上の4つの点 four dots above	U+20DC	文字の上に合成する4つの点 COMBINING FOUR DOTS ABOVE

[Charmod-norm]でより詳しく説明されている理由により, 実体の置換文字列を合成文字で始めることは勧められません. 実体の展開とユニコード正規化が行われる順番によって, 異なった結果となる可能性があるためです. この仕様書は可能な限り合成でない文字を使うようにしていますが, DownBreve, tdot, TripleDot, DotDotの場合, ユニコードにはアクセントの合成用の形しかありません.

For reasons explained further in [Charmod-norm], it is not advisable to start the replacement text of an entity with a combining character, as then potentially different results may be produced depending on the order in which entity expansion and Unicode normalisation are performed. As far as possible this specification uses non-combining characters, however, in the cases DownBreve, tdot, TripleDot and DotDot Unicode only has combining forms of the accents.

この仕様書の従来版は, 前の文字列と実体が合成されて展開する可能性を避けるために, これらの実体を空白で始まる文字列で置き換えるよう定義していました. しかしながら, 様々な理由からHTMLに組み入れられた実体はそこの箇所に空白を持っていません. そのため, 現在の定義は合成文字のみから構成されています. このことにより, HTMLやXHTMLは, これらの定義を用いるどの仕様書とも調和することになります.

Earlier versions of this specification defined these entities with the replacement text starting with a space, to avoid the possibility that the expansion of the entity combined with preceding text. However for various reasons the entities as incorporated in HTML do not have a space here, and so the definitions now consist just of the combining character so that HTML and XHTML are consistent with any specifications using these definitions.
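
The order dependence mentioned above can be reproduced with a small Python sketch. It assumes html.unescape as a stand-in for entity expansion and uses NFC from the unicodedata module; the sample text "a&DownBreve;" is chosen for illustration because a followed by U+0311 has a precomposed form.

```python
import html
import unicodedata

source = "a&DownBreve;"

# Expanding first lets the combining character meet its base, so NFC composes them.
expand_then_normalize = unicodedata.normalize("NFC", html.unescape(source))
# Normalizing first sees only ASCII text, so nothing is composed afterwards.
normalize_then_expand = html.unescape(unicodedata.normalize("NFC", source))

for label, text in (("expand, then normalize", expand_then_normalize),
                    ("normalize, then expand", normalize_then_expand)):
    print(f"{label}:", " ".join(f"U+{ord(ch):04X}" for ch in text))
```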

B 変更点
Changes

B.1 2014年4月10日(第2版勧告)以降の変更点
Changes since 2014-04-10 (Second Edition Recommendation)

ソースファイルをユニコード15.0に更新しました. 文字の一覧表には影響がありましたが, 生成された実体ファイルやスタイルシートには変更はありませんでした. U+FE01異体字選択用文字に対する新しい表が追加され, U+FE00の異体字セットが大きく拡張されました(それらの標準化された異体字のほとんどは, ユニコード14で加えられたものです). スクリプトのアルファベットの表は, 両方の異体字を示すように拡張されました.

Source files updated to Unicode 15.0, affecting the character tables, but with no changes to generated entity files or stylesheets. New table for the U+FE01 Variation selector and greatly extended set of variations in the U+FE00 table (most of these standardised variants were added at Unicode 14). The script alphabet table has been extended to show both variants.

2021年11月版のW3C手続き文書への参照を加えました.

Reference added to the November 2021 W3C Process Document.

最新のW3C公開手続きで必要とされているGitHubへのリンクを含むよう, 前書きを若干変更しました.

Some changes to the front matter including link to GitHub as required by the latest W3C publication process.

新しいW3C文書書式に合致するようCSS書式を調整しました.

Adjustments to CSS styling to match new W3C document style.

元データの保管場所をgithubに動かしました. そのため, ログ(過去の履歴)は現在公開されています.

The source repository has been moved to github so the log is now public.

A.4 合成文字として定義された実体で説明したように, DownBreve, tdot, TripleDot, DotDotは, もはや空白で始まりません.

As detailed in A.4 Entities Defined to be a Combining Character DownBreve, tdot, TripleDot and DotDot are no longer prefixed by a space.

B.2 2010年4月1日(第1版勧告)から2014年4月10日(第2版勧告)の間の変更点
Changes between 2010-04-01 and 2014-04-10 (First and Second Edition Recommendations)

ソースファイルをユニコード6.3に更新しました. 文字の一覧表には影響がありましたが、生成された実体ファイルやスタイルシートに変更はありませんでした.

Source files updated to Unicode 6.3, affecting the character tables, but with no changes to generated entity files or stylesheets.

ソースファイルをユニコード6.1のアラビア数学のアルファベット(U+1EE??)のデータに対応するよう更新しました. 第3節および第4節に表を追加しました.

Source files updated with Unicode 6.1 data on Arabic math alphabets (U+1EE??). Additional tables shown in Sections 3 and 4.

第2節 実体名セットを, MathMLやHTMLで利用されているhtmlmathmlセットを強調するよう再編成しました. また, HTML MathMLセットのXSL形式とJSON形式へのリンクを追加しました.

Section 2 Sets of names reorganized to highlight the htmlmathml set which is used in MathML and HTML. Also link to XSL and JSON formats for the HTML MathML set.

[MathML3], [HTML5], [ユニコード]についての参考文献を更新しました.

References updated: [MathML3], [HTML5] and [Unicode].

B.3 2010年2月11日から2010年4月1日の間の変更点
Changes between 2010-04-01 and 2010-02-11

例の画像を少し改良し, ユニコードが参照している画像により一致したものを用いるようにしました.

Several example images improved, bringing them more in line with the Unicode reference images.

B.4 2009年11月17日から2010年2月11日の間の変更点
Changes between 2010-02-11 and 2009-11-17

ユニコードを内部IDを用いたU01234という形で表示する代わりに, 一貫してU+1234という記述法を使ったこと等, 様々な編集の改良を行いました.

Various editorial improvements, including using Unicode U+1234 notation more consistently rather than displaying the internal IDs of the form U01234.

2009年11月17日の草案版で配布された結合された実体ファイルには, 2つの実体名が大文字と小文字の違いしかない場合に, 片方しか含まれないという間違いがありました. この間違いは修正されています.

The combined entities file distributed with the 2009-11-17 draft introduced an error that if two entity names differed only by case, only one was included. This has been corrected.

HTMLやMathMLで利用可能な実体に対応する結合された実体セットhtmlmathmlを適切に提供するようにしました. XMLで定義済の実体に対応する(以前内部的に使われていた)定義済のセットを文書化しました.

The combined entity set htmlmathml corresponding to the entities usable in HTML and MathML is now explicitly provided. The predefined set, corresponding to the entities predefined in XML is now documented (it was previously used internally).

xvee実体とxwedge実体は, 正しいユニコード(U+22C1とU+22C0)に割り当てられていましたが, 実体の説明が入れ違っていました. xveeは論理和で, xwedgeは論理積です. [ISO9573-13-1991]のこの間違いは, 1999年にProposed Technical Corrigendum(訳注:"提案された技術的な正誤表"という意味で, 以前の仕様書ではW3Cの正誤表へのリンクが貼ってありました.)で報告されましたが, これまで修正されていませんでした. 実体ファイルはこの変更の影響を受けません.

The entities xvee and xwedge had the correct Unicode assignments (U+22C1 and U+22C0) but the entity descriptions have been swapped; xvee is logical or and xwedge is logical and. This error in [ISO9573-13-1991] was reported in 1999, in a Proposed Technical Corrigendum, but not previously fixed. The entity files are unaffected by this change.

NotGreaterFullEqual実体が, 間違って打ち消された小なり記号(U+2266 U+0338)に割り当てられていたので, うち消された大なり記号(U+2267 U+0338)に訂正しました.

The entity NotGreaterFullEqual which had been erroneously assigned to a negated less than operator (U+2266 U+0338) has been corrected to be the negated greater than operator (U+2267 U+0338).

実体ファイルへの参照を, W3Cのサーバーではなくローカルな環境のコピーに向け直すための, カタログファイルの例catalogを提供するようにしました.

A sample catalog is now provided to redirect references to the entity files to copies on the local machine rather than the W3C server.

B.5 2008年7月21日から2009年11月17日の間の変更点
Changes between 2009-11-17 and 2008-07-21

html5-uppercaseセットを文書化しました

The html5-uppercase set is now documented.

正規化形式Cに合わせるため, ohm実体とangst実体をU+03A9とU+00C5に変更しました. W3C Bugzillaの記録を見て下さい.

The entities ohm and angst have changed to U+03A9 and U+00C5 to match NFC. See w3c bugzilla entry.

誤ってU+29DAが割り当てられていたrace実体に, U+223D U+0331の組を割り当てるようにしました. (大本のISO文書で示されている形は回転したチルダではなく回転したSであり, U+223Dはその形と完全には一致しませんが, ユニコード5.2の中では最も近い文字と思われます.)

The entity race, which had been erroneously assigned U+29DA, is now assigned the combination U+223D U+0331. (U+223D isn't quite the shape shown in the original ISO document which is a rotated S rather than a rotated tilde, but this appears to be the closest character in Unicode 5.2.)

bsolhsub実体とsuphsol実体は, 以前は2つの文字の合成U+005C U+2282とU+2283 U+002Fに割り当てられていました. しかしながら, これらの実体に対応するためにはっきりと文字コードが加えられたユニコード5のU+27C8とU+27C9に割り当てるようにしました.

The entities bsolhsub and suphsol which were previously mapped to two-character combinations U+005C U+2282 and U+2283 U+002F are now mapped to the Unicode 5 characters that were added specifically to support these entities, U+27C8 and U+27C9.

ユニコード5.2に対応するようにソースファイルを全て更新しました.

The source files have all been updated to match Unicode 5.2.

ThickSpace実体を, 3文字の組み合わせU+2009 U+200A U+200Aでなく, 2文字の組み合わせU+205F U+200Aに割り当てるようにしました. つまり, (3/18 + 1/18 + 1/18)emではなく, (4/18 + 1/18)emに割り当てるようにしました.

The entity ThickSpace now maps to the pair U+205F U+200A rather than the triple U+2009 U+200A U+200A, that is, (4/18 + 1/18) em rather than (3/18 + 1/18 + 1/18) em.

UnderBar実体を, 合成用の文字U+0332でなく, 間隔を取るための文字_に割り当てました.

The entity UnderBar maps to the spacing character _ rather than the combining character U+0332.

OverBar実体を, 長音記号U+00AFでなく, (XHTMLのoline実体に似た)間隔を取るための文字U+203Eに割り当てました.

The entity OverBar maps to the spacing character U+203E (like the XHTML entity oline) rather than the macron character U+00AF.

epsiv実体とvarepsilon実体を, epsilon実体(U+03B5)の別名ではなく, イプシロン記号U+03F5に割り当てるようにしました.

The entities epsiv and varepsilon are now mapped to the epsilon symbol U+03F5 rather than being aliases for the entity epsilon, U+03B5.

phiv実体とvarphi実体を, phi実体(U+03C6)の別名ではなく, ファイ記号U+03D5に割り当てるようにしました.

The entities phiv and varphi are now mapped to the phi symbol U+03D5 rather than being aliases for the entity phi, U+03C6.

B.6 2007年12月14日から2008年7月21日の間の変更点
Changes between 2008-07-21 and 2007-12-14

この文書の草案版の次の実体の定義を変更しました.

The following entity definitions have changed at this draft:

phi, lang, rang, OverParenthesis, UnderParenthesis, OverBrace, UnderBrace, lbbrk, rbbrk.

C この仕様書の実体と以前のW3CのDTDとの間の違い
Differences between these entities and earlier W3C DTDs

C.1 XHTML1.0との違い
Differences from XHTML 1.0

ここで述べられているXHTMLの実体定義とXHTML 1.0 DTDで述べられているXHTMLの実体セットとの間の違い.

Differences between the XHTML entity definitions described here and the entity set described in the XHTML 1.0 DTD.

langとrang
lang and rang: U+27E8とU+27E9. XHTML1.0では, (U+3008とU+3009への正当な分解のできる)U+2329とU+232Aが使われていました.
U+27E8 and U+27E9; XHTML 1.0 used U+2329 and U+232A (which have canonical decomposition to U+3008 and U+3009).

注意:
Note:

[HTML5]の最新の草案版は, この仕様書に由来する実体定義を用いています.

The current drafts of [HTML5] use entity definitions derived from this specification.

C.2 MathML2.0(第2版)との違い
Differences from MathML 2.0 (second edition)

MathML2と現在の実体定義との違いは次のとおりです.

The differences between MathML 2 and the current entity definitions are listed below.

fjlig
fjlig: ISOPUB(とMathML1)はfjの連字と定義していました. ユニコードは特定の文字を割り当てておらず, MathML2からこの実体は抜け落ちていました. [SGML]との互換性を最大限確保するため, 再度定義されました.
ISOPUB (and MathML 1) defined an fj ligature; Unicode does not have a specific character and the entity was dropped from MathML2. It is re-instated here for maximum compatibility with [SGML].
phi
phi: U+03C6 ギリシア語の小文字のファイ(HTML4で使われていた定義). MathML2では, U+03D5 ギリシア文字のファイ記号が使われていました.
U+03C6 GREEK SMALL LETTER PHI (the definition used in HTML4); MathML2 used U+03D5 GREEK PHI SYMBOL.
epsiv, varepsilon, phiv, varphi
epsiv, varepsilon, phiv, varphi: (varthetaのようなvarを頭に付ける他の表記方法と合わせるため) 記号文字に割り当てられるよう変更されました.
these have been changed to map to the symbol character (to match other uses of the var prefix such as vartheta).
jmath
jmath: U+0237. MathML2では, ユニコード4.1以前は点のないjがなかったため, U+006A(j)が使われていました.
U+0237; MathML 2 used U+006A (j) as there was no dotless j before Unicode 4.1.
trpezium, elinters
trpezium, elinters: U+23E2とU+23E7. これらの文字はこれらの実体に対応するためにユニコード5.0で追加されたものであるため, MathML2ではU+FFFD(置換用の文字)が使われていました.
U+23E2 and U+23E7; MathML 2 used U+FFFD (REPLACEMENT CHARACTER) as these characters were added at Unicode 5.0 specifically to support these entities.
ohm, angst
ohm, angst: 前に述べたとおり, これらの実体の定義は変更されたので, 正規化形式Cの正規形にある文字を定義として用いるようにしました.
As noted above, the definitions of these entities have been changed so that the definitions use characters that are in NFC normal form.
bsolhsubとsuphsol
bsolhsub and suphsol: U+27C8とU+27C9. MathML2では, U+005C U+2282とU+2283 U+002Fが使われていました.
U+27C8 and U+27C9; MathML2 used U+005C U+2282 and U+2283 U+002F.
NotGreaterFullEqual
NotGreaterFullEqual: U+2267 U+0338. MathML2では, 間違った定義U+2266 U+0338が使われていました.
U+2267 U+0338; MathML2 used the erroneous definition U+2266 U+0338.
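
The NFC requirement mentioned for ohm and angst in the list above can be checked mechanically. The following is a minimal sketch using Python's standard unicodedata module. It assumes that the earlier definitions were U+2126 OHM SIGN and U+212B ANGSTROM SIGN and that the current ones are U+03A9 and U+00C5; the dictionary and function names are illustrative.

```python
import unicodedata

# Assumed earlier and current definitions (see the lead-in for the assumption).
EARLIER = {"ohm": "\u2126", "angst": "\u212B"}
CURRENT = {"ohm": "\u03A9", "angst": "\u00C5"}


def is_nfc(text):
    """Return True if `text` is unchanged by normalization to NFC."""
    return unicodedata.normalize("NFC", text) == text


for name in ("ohm", "angst"):
    # The earlier characters normalize to the current ones, so only the
    # current definitions are stable under NFC.
    print(name, is_nfc(EARLIER[name]), is_nfc(CURRENT[name]))
```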

次のかっこ記号は, ユニコードのバージョン3.1から5.1までの間に数学記号の集まりに加えられました. MathML2では, CJK句読点を意図する, かっこに似た文字が使われていました.

The following bracket symbols have been added to the Mathematical symbols block in Unicode versions between 3.1 and 5.1. MathML2 used similar characters intended for CJK punctuation.

lang, langle, LeftAngleBracketとrang, rangle, RightAngleBracket
lang, langle, LeftAngleBracket and rang, rangle, RightAngleBracket: U+27E8とU+27E9. MathML2では, (U+3008とU+3009への正当な分解のできる)U+2329とU+232Aが使われていました.
U+27E8 and U+27E9; MathML2 used U+2329 and U+232A (which have canonical decomposition to U+3008 and U+3009).
LangとRang
Lang and Rang: U+27EAとU+27EB. MathML2では, U+300AとU+300Bが使われていました.
U+27EA and U+27EB; MathML2 used U+300A and U+300B.
lbbrkとrbbrk
lbbrk and rbbrk: U+2772とU+2773. MathML2では, U+3014とU+3015が使われていました.
U+2772 and U+2773; MathML2 used U+3014 and U+3015.
loangとroang
loang and roang: U+27ECとU+27ED. MathML2では, U+3018とU+3019が使われていました.
U+27EC and U+27ED; MathML2 used U+3018 and U+3019.
lobrkとrobrk
lobrk and robrk: U+27E6とU+27E7. MathML2では, U+301AとU+301Bが使われていました.
U+27E6 and U+27E7; MathML2 used U+301A and U+301B.
OverBraceとUnderBrace
OverBrace and UnderBrace: U+23DEとU+23DF. MathML2では, U+FE37とU+FE38が使われていました.
U+23DE and U+23DF; MathML2 used U+FE37 and U+FE38.
OverParenthesisとUnderParenthesis
OverParenthesis and UnderParenthesis: U+23DCとU+23DD. MathML2では, U+FE35とU+FE36が使われていました.
U+23DC and U+23DD; MathML2 used U+FE35 and U+FE36.
LeftDoubleBracketとRightDoubleBracket
LeftDoubleBracket and RightDoubleBracket: U+27E6とU+27E7. MathML2では, U+301AとU+301Bが使われていました.
U+27E6 and U+27E7; MathML2 used U+301A and U+301B.

注意:
Note:

[MathML3]は, この仕様書で定義された実体セットを使っています.

[MathML3] uses the entity sets defined by this specification.

D ソースファイル
Source Files

実体宣言を構築する全てのデータファイル, XSLTによる文字の割り当て, この文書から参照されるHTMLの表は, https://github.com/w3c/xml-entities/から利用可能です.

All data files used to construct the entity declarations, XSLT character maps, and HTML tables referenced from this document are available from https://github.com/w3c/xml-entities/.

unicode.xml 様々な実体セット, アプリケーションソフトウェアやTeXでの名前, その他のデータと, 全てのユニコード文字について詳細に記述したマスターファイル. このファイルは長年にわたって管理されており, 元々はSebastian Rahtz氏によりjadetexディストリビューションの一部として, 1999年ごろからはDavid Carlisle氏によりMathML仕様書のソースファイルの一部として管理されてきました. 現在の版はユニコード15の全ての文字をデータ化しています. 注意: unicode.xmlは5MBを超える大きさであり, ブラウザで直接見るにはあまり適していません. ブラウザでunicode.xmlの上記のリンクをたどるより, ファイルを保存した方がよいでしょう.
master file detailing all Unicode characters with names in various entity sets and applications, TeX equivalents and other data. This file has been maintained for many years, originally by Sebastian Rahtz as part of the jadetex distribution and since around 1999 as part of the MathML specification sources by David Carlisle. The current version encodes data for all characters in Unicode 15. Note: unicode.xml is over 5MB in size and may not really be suitable for direct viewing in a browser. You may prefer to save the file rather than follow the above link to unicode.xml in a browser.
charlist.rnc unicode.xmlに対するRELAX NG スキーマ.
RELAX NG schema for unicode.xml.
unicode.xsl unicode.xmlをHTMLの表として描くためのXSLTスタイルシート.
XSLT stylesheet that renders unicode.xml as an HTML table.
character-set.xml この文書のソースファイル.
the source file for this document.
xmlspec.xsl 標準のxmlspecスタイルシートのコピー.
a copy of the standard xmlspec stylesheet.
run この文書の実体セットを作るための小さなスクリプトファイル.
small script file that builds this collection.
xhtml1.xml XHTML1.0実体定義の一覧.
record of XHTML 1.0 entity definitions.
mml2.xml MathML2.0(第2版)実体定義の一覧.
record of MathML 2.0 (second edition) entity definitions.
unicodedata.xsl unicode.xmlの新しいコピーを作るためのスタイルシート. ユニコードのデータファイルからデータを取り込むもので, ユニコードの新しいバージョンが出たときにunicode.xmlを更新するのに利用されます.
stylesheet that generates a new copy of unicode.xml, incorporating data from the Unicode data file, used to update unicode.xml as new versions of Unicode are released.
entities.xsl 実体に対するDTD宣言を作成するスタイルシート.
stylesheet to generate the DTD declarations for the entities.
charmap.xsl XSLTによる文字の割り当てを作成するスタイルシート.
stylesheet to generate the XSLT character maps.
characters.xsl 参照されるHTMLの表を含め, この文書を作成するスタイルシート.
stylesheet to generate this document, including the referenced HTML tables.
schemas.xml XML文書と適切なRELAX NG スキーマを関連付けるファイル.
file associating XML documents with the appropriate RELAX NG schema.
catalog https://www.w3.org/2003/entities/2007/にある実体やスタイルシートを, /etc/xml/w3c-entitiesにあるローカルファイルシステム上のものへ切り替えるOASIS XML カタログファイルの例. このファイルは, ローカルにあるファイルのコピーの場所を参照するように編集する必要があります. たくさんのXML処理プログラムがこのカタログ形式を読み込むよう設定されているでしょうが, 特定の機能は使用している処理プログラムに依存します.
Sample OASIS XML catalog that redirects references to the entity or stylesheet files at https://www.w3.org/2003/entities/2007/ to the local file system at /etc/xml/w3c-entities. It should be edited to refer to the location of a local copy of the files. Many XML parsers may be configured to read this catalog format, but the specific options depend on the parser being used.
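
The redirection performed by such a catalog can be sketched as follows. This is a simplified illustration using only Python's standard library, not the catalog file distributed with the sources: it builds a catalog containing a single rewriteURI rule and applies it by plain prefix matching, whereas a conforming resolver selects the longest matching prefix among all rules. The function names and the file name isonum.ent in the usage lines are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"
REMOTE_PREFIX = "https://www.w3.org/2003/entities/2007/"
LOCAL_PREFIX = "file:///etc/xml/w3c-entities/"  # edit to match the local copy


def build_catalog(remote, local):
    """Build a catalog element with one rewriteURI rule (illustrative only)."""
    catalog = ET.Element(f"{{{CATALOG_NS}}}catalog")
    ET.SubElement(
        catalog,
        f"{{{CATALOG_NS}}}rewriteURI",
        {"uriStartString": remote, "rewritePrefix": local},
    )
    return catalog


def resolve(catalog, uri):
    """Apply the first matching rewriteURI rule; return `uri` unchanged otherwise."""
    for rule in catalog.iter(f"{{{CATALOG_NS}}}rewriteURI"):
        start = rule.get("uriStartString")
        if uri.startswith(start):
            return rule.get("rewritePrefix") + uri[len(start):]
    return uri


catalog = build_catalog(REMOTE_PREFIX, LOCAL_PREFIX)
print(resolve(catalog, REMOTE_PREFIX + "isonum.ent"))   # hypothetical file name
print(resolve(catalog, "https://example.org/other.ent"))  # no rule applies
```

How a real parser is told to load a catalog depends on the parser, as noted above, so the sketch deliberately avoids any parser-specific configuration.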

E 参考文献
References

SGML SGML: ISO/IEC 8879:1986, Information processing — Text and office systems — Standard Generalized Markup Language (SGML)(訳注:"情報処理—文字とオフィスシステム—文書記述言語(SGML)"という意味)
ISO/IEC 8879:1986, Information processing — Text and office systems — Standard Generalized Markup Language (SGML)
ISO9573-13-1991 ISO9573-13-1991: ISO/IEC TR 9573-13:1991, Information technology — SGML support facilities — Techniques for using SGML — Part 13: Public entity sets for mathematics and science(訳注:"情報技術—SGMLを補完する規格, SGMLを利用するための技術—パート13:数学と科学のための公式実体セット"という意味)
ISO/IEC TR 9573-13:1991, Information technology — SGML support facilities — Techniques for using SGML — Part 13: Public entity sets for mathematics and science
ユニコード Unicode: ユニコードコンソーシアム. The Unicode Standard, Version 5.2 (Mountain View, CA: The Unicode Consortium, 2009. ISBN 978-1-936213-00-9)により定義されたユニコード標準バージョン5.2.0 . ユニコード 6.3 に更新(https://www.unicode.org/versions/Unicode6.3.0/)
The Unicode Consortium. The Unicode Standard, Version 5.2.0, defined by: The Unicode Standard, Version 5.2 (Mountain View, CA: The Unicode Consortium, 2009. ISBN 978-1-936213-00-9). Unicode 6.3 update (https://www.unicode.org/versions/Unicode6.3.0/)
ユニコード15 Unicode15: Unicode Standard Annex 15(訳注:"ユニコード標準付録15"という意味), Version 6.3.0;Unicode Normalization Forms(訳注:"ユニコード正規化形式"という意味), ユニコードコンソーシアム, 2013-09-20. (https://www.unicode.org/reports/tr15/)
Unicode Standard Annex 15, Version 6.3.0; Unicode Normalization Forms, The Unicode Consortium, 2013-09-20. (https://www.unicode.org/reports/tr15/)
ユニコード25 Unicode25: Barbara Beeton, Asmus Freytag, Murray Sargent III 著 Unicode Support for Mathematics(訳注:"数学のためのユニコードの実装"という意味), Unicode Technical Report #25(訳注:"ユニコード技術レポート#25"という意味) 2012-04-02.(https://www.unicode.org/reports/tr25/)
Barbara Beeton, Asmus Freytag, Murray Sargent III, Unicode Support for Mathematics, Unicode Technical Report #25 2012-04-02. (https://www.unicode.org/reports/tr25/)
MathML2 MathML2: David Carlisle, Patrick Ion, Robert Miner, Nico Poppelier 著 数学用マークアップ言語(MathML)バージョン2.0(第2版)W3C勧告 2003年10月21日(https://www.w3.org/TR/2003/REC-MathML2-20031021/)
David Carlisle, Patrick Ion, Robert Miner, Nico Poppelier, Mathematical Markup Language (MathML) Version 2.0 (Second Edition) W3C Recommendation 21 October 2003 (https://www.w3.org/TR/2003/REC-MathML2-20031021/)
MathML3 MathML3: David Carlisle, Patrick Ion, Robert Miner 著 数学用マークアップ言語(MathML)バージョン3.0 第2版W3C勧告 2014年4月10日(https://www.w3.org/TR/2014/REC-MathML3-20140410/)
(訳注:日本語訳https://takamu.sakura.ne.jp/mathml3-ja/index.html)
David Carlisle, Patrick Ion, Robert Miner, Mathematical Markup Language (MathML) Version 3.0 2nd Edition W3C Recommendation 10 April 2014 (https://www.w3.org/TR/2014/REC-MathML3-20140410/)
HTML4 HTML4: Dave Raggett, Arnaud Le Hors, Ian Jacobs 著 HTML 4.01 仕様書W3C勧告 1999年12月24日, 置換済 2018年3月27日(https://www.w3.org/TR/1999/REC-html401-19991224)
(訳注:日本語訳http://www.asahi-net.or.jp/~sd5a-ucd/rec-html401j/cover.html.ja.sjis)
Dave Raggett, Arnaud Le Hors, Ian Jacobs, HTML 4.01 Specification W3C Recommendation 24 December 1999, superseded 27 March 2018 (https://www.w3.org/TR/2018/SPSD-html401-20180327/)
HTML5 HTML5: Ian Hickson (Google), Simon Pieters (Mozilla), Anne van Kesteren (Apple), Philip Jägenstedt (Google), Domenic Denicola (Google) 著 HTML 5 ‘Living Standard’, 2023年3月1日 (https://www.w3.org/TR/html5/)
Ian Hickson (Google), Simon Pieters (Mozilla), Anne van Kesteren (Apple), Philip Jägenstedt (Google), Domenic Denicola (Google) HTML 5 ‘Living Standard’, 1 March 2023 (https://www.w3.org/TR/html5/)
Charmod-norm Charmod-norm: François Yergeau, Martin J. Dürst, Richard Ishida, Addison Phillips, Misha Wolf, Tex Texin 著 Character Model for the World Wide Web 1.0: Normalization(訳注:"ワールドワイドウェブのための文字モデル 1.0 : 正規化"という意味)W3C草案 2012年5月1日(https://www.w3.org/TR/charmod-norm/)
François Yergeau, Martin J. Dürst, Richard Ishida, Addison Phillips, Misha Wolf, Tex Texin, Character Model for the World Wide Web 1.0: Normalization W3C Working Draft 1 May 2012 (https://www.w3.org/TR/charmod-norm/)
