Information Processing Society of Japan Trial Standard
|Publication of the Version 1.0E (this version)||2003-05-23|
|Publication of the Version 1.xE||--|
|Publication of the Version 1.yE||--|
|Errata 1 to the Version 1.0E||2003-11-02|
|Please send comments about this document to||TS desk, IPSJ/ITSCJ|
|Copyright||©2003 IPSJ/ITSCJ, All Rights Reserved.|
The normative version of this specification is the Japanese version, which is available on the ITSCJ website.
This Trial Standard has been reviewed and endorsed by the IPSJ/ITSCJ technical committee. The authors of this document are the members of Working Group Five of the IPSJ/ITSCJ TS (Trial Standard) Committee. They developed this document as a Working Group activity carried out in 2002 and 2003.
This Trial Standard specifies the Basic Subset of Coded Character Sets - Japanese Core Ideographs, which consists of the Kanji characters required in ordinary social life in Japan.
The characters in the Basic Subset were selected from published standards such as JIS X 0208 and IPSJ-TS 0005, and from reports on the occurrence of Kanji characters in certain newspapers and dictionaries, taking into account the functional importance of each character.
The following documents contain provisions which, through reference in this text, constitute provisions of this Trial Standard. For each reference, the latest edition of the normative document referred to applies.
IPSJ-TS 0005:2002, Basic Subset of Coded Character Sets (BUCS), 2002-03
JIS X 0208:1997, 7-bit and 8-bit double byte coded KANJI sets for information interchange, 1997-01
For the purpose of this Trial Standard, the definitions in IPSJ-TS 0005 apply.
The Kanji characters in the Basic Subset - Japanese Core Ideographs are defined referring to the following documents:
The Basic Subset, Japanese Core Ideographs, consists of 4,593 characters. It is configured by combining the subsets of 4.2.1 and 4.2.2 and then applying the adjustment of 4.2.3.
The subset consists of the following 4,567 characters defined in JIS X 0208:
a) 3,739 characters which are contained in any sources of ,  and 
b) 670 characters which are contained with high occurrence in any two sources of ,  and , excluding the characters of a)
c) 130 characters which are contained in  and , excluding the characters of a) and b)
d) 28 characters which are required to describe person and place names, excluding the characters of a), b) and c)
The subset consists of the following 28 characters that are not defined in JIS X 0208:
a) 15 characters that describe headings in 
b) 13 characters of , which are required to describe person and place names
The 4,595 characters of the subsets 4.2.1 and 4.2.2 are adjusted with respect to character shape, considering the source .
a) Five shapes are changed.
b) Two shapes are replaced with the corresponding shapes contained in the set of 4,595 characters. Accordingly, 2 characters are removed.
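The counts stated in 4.2.1 to 4.2.3 can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are hypothetical and chosen only for this illustration, while the numbers are those given in the clauses above.

```python
# Counts as stated in clauses 4.2.1, 4.2.2 and 4.2.3 of this Trial Standard.
# The dictionary keys mirror the list items a), b), c), d); the names are
# illustrative only.
jis_x_0208_parts = {"a": 3739, "b": 670, "c": 130, "d": 28}   # 4.2.1
non_jis_x_0208_parts = {"a": 15, "b": 13}                     # 4.2.2
removed_by_adjustment = 2                                     # 4.2.3 b)

subset_4_2_1 = sum(jis_x_0208_parts.values())
subset_4_2_2 = sum(non_jis_x_0208_parts.values())
before_adjustment = subset_4_2_1 + subset_4_2_2
basic_subset_size = before_adjustment - removed_by_adjustment

print(subset_4_2_1, subset_4_2_2, before_adjustment, basic_subset_size)
```

The five shape changes of 4.2.3 a) do not affect the count; only the two replacements of 4.2.3 b) reduce it.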
The elements of the Basic Subset are ordered according to the 康煕字典 (Kangxi Dictionary) and are assigned sequential numbers. Each element is represented as a [[sequential number], UCS code position, character shape] tuple.
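As an illustration of the element structure, the sketch below models one element as a three-field record. It rests on several assumptions that this Trial Standard does not state: the names `Entry` and `build_entries` are hypothetical, sequential numbers are assumed to start at 1, the UCS code position is written in the common `U+XXXX` notation, and the character itself stands in for its shape. The three sample characters are illustrative only and are not quoted from Table 6.1; the input list is assumed to be already in Kangxi Dictionary order.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One element of the subset (hypothetical model for illustration)."""
    seq: int     # sequential number, assumed here to start at 1
    ucs: str     # UCS code position, written as U+XXXX
    shape: str   # the character, standing in for its printed shape


def build_entries(ordered_chars: List[str]) -> List[Entry]:
    """Number characters that are already in the desired order."""
    return [
        Entry(seq=i, ucs=f"U+{ord(ch):04X}", shape=ch)
        for i, ch in enumerate(ordered_chars, start=1)
    ]


# Illustrative input only; not taken from Table 6.1.
sample = build_entries(["一", "丁", "七"])
for entry in sample:
    print(entry.seq, entry.ucs, entry.shape)
```

Keeping the ordering step outside `build_entries` reflects the fact that Kangxi Dictionary order cannot be derived from code positions alone; it has to be supplied as data.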
The Basic Subset is shown in Table 6.1.
Table 6.1 Basic Subset of Coded Character Sets - Japanese Core Ideographs