
Time: 2010-06-30 09:00 Source: 蓝天飞行翻译 Author: admin

}
/**
 * Computes a boolean[8] representation of a char. The bit ordering is from more
 * to less significant. Only the low eight bits of the char are represented.
 * e.g. <pre>string "4" = char 0x34 = 0011 0100 = [false, false, true, true,
 * false, true, false, false] </pre>
 */
private static boolean[] charToBoolean(char val) {
    boolean[] tB = new boolean[8];
    for (int i = 0; i < 8; i++) {
        if (bitIsSet(val, i)) {
            tB[7 - i] = true;
        }
    }
    return tB;
}
/**
 * Indicates whether the ith bit of the supplied char is set (== 1).
 */
private static boolean bitIsSet(char val, int i) {
    // An integer shift avoids the floating-point round trip through Math.pow.
    return (val & (1 << i)) != 0;
}
/**
 * Computes the hexadecimal representation of the array. The result will contain
 * no lower case letters. (Cf. AIXM definition of "Data Types for Cyclic
 * Redundancy Check Values (CRCV)".)
 */
private static String booleanToHex(boolean[] array) {
    // StringBuilder suffices here: the buffer never leaves this method.
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < array.length; i += 4) {
        sb.append(decodeBoolean(i, array));
    }
    return sb.toString().toUpperCase();
}
/**
 * Computes the hexadecimal representation of the 4-bit segment of the array
 * commencing at the given offset.
 */
private static String decodeBoolean(int offset, boolean[] array) {
    int nibble = 0;
    for (int j = 0; j < 4; j++) {
        // array[offset] is the most significant bit of the segment.
        if (array[offset + (3 - j)]) {
            nibble |= 1 << j;
        }
    }
    return Integer.toHexString(nibble);
}
}
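As a quick check of the Javadoc example above, the following self-contained sketch repeats the same bit logic in a hypothetical class named BitHexDemo (the name and the merged helper layout are assumptions for illustration, not part of the AIXM listing) and converts the character "4" to bits and back to hexadecimal.

```java
import java.util.Arrays;

final class BitHexDemo {
    /** Bits of the low byte of val, most significant bit first. */
    static boolean[] charToBoolean(char val) {
        boolean[] bits = new boolean[8];
        for (int i = 0; i < 8; i++) {
            bits[7 - i] = (val & (1 << i)) != 0;
        }
        return bits;
    }

    /** Upper-case hexadecimal string, one digit per 4-bit group. */
    static String booleanToHex(boolean[] bits) {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset < bits.length; offset += 4) {
            int nibble = 0;
            for (int j = 0; j < 4; j++) {
                if (bits[offset + (3 - j)]) {
                    nibble |= 1 << j;
                }
            }
            sb.append(Integer.toHexString(nibble));
        }
        return sb.toString().toUpperCase();
    }

    public static void main(String[] args) {
        boolean[] bits = charToBoolean('4');
        // [false, false, true, true, false, true, false, false]
        System.out.println(Arrays.toString(bits));
        // 34
        System.out.println(booleanToHex(bits));
    }
}
```

The round trip returns the code 0x34 of the input character, which matches the example given in the Javadoc of charToBoolean.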
Edition Number: 4.5
AIXM PRIMER
Appendix C. Character Encoding Issues
C.1. Introduction
This appendix is intended as an introduction to the concept of character encoding, explaining what it is
and why it is important, particularly with regard to XML documents.
C.2. What is a Character Encoding?
Simply put, a character encoding is a standard that maps characters to the numeric codes that computers
store and process. To appreciate what character encodings are, and how they can lead to problems, it helps
to know how they evolved. Before the personal computer became widespread, most computer
software processed character data using a standard called ASCII (American Standard Code for Information
Interchange). ASCII represents every letter of the English alphabet, plus digits, punctuation characters
etc., using the numbers between 32 and 126. Codes below 32, and code 127 (DEL), represent "unprintable"
characters such as line-feeds. However, ASCII is a 7-bit code and there are eight bits to a byte, which
meant that the codes from 128 to 255 were unallocated.
Table C.1. Printable ASCII Characters
Codes 32-47:   (space) ! " # $ % & ' ( ) * + , - . /
Codes 48-63:   0 1 2 3 4 5 6 7 8 9 : ; < = > ?
Codes 64-79:   @ A B C D E F G H I J K L M N O
Codes 80-95:   P Q R S T U V W X Y Z [ \ ] ^ _
Codes 96-111:  ` a b c d e f g h i j k l m n o
Codes 112-126: p q r s t u v w x y z { | } ~
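The table above can be regenerated from the code numbers alone. The following is a minimal sketch, assuming that codes 32 to 126 are the printable ones (code 127, DEL, is a control character); the class and method names are hypothetical.

```java
final class AsciiTableDemo {
    /** True for the printable ASCII codes, 32 (space) through 126 (tilde). */
    static boolean isPrintableAscii(int code) {
        return code >= 32 && code <= 126;
    }

    /** Printable ASCII characters, 16 per row, separated by single spaces. */
    static String printableAscii() {
        StringBuilder sb = new StringBuilder();
        for (int code = 32; code <= 126; code++) {
            if (code > 32) {
                // Start a new row at every multiple of 16 (48, 64, 80, ...).
                sb.append(code % 16 == 0 ? '\n' : ' ');
            }
            sb.append((char) code);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(printableAscii());
        System.out.println(isPrintableAscii(127)); // false: DEL is a control code
        System.out.println(isPrintableAscii(200)); // false: outside 7-bit ASCII
    }
}
```

Note that the first row begins with the space character itself (code 32), which is easy to overlook in printed output.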
Inevitably, the unallocated codes from 128 to 255 were used in different ways by different people. As computers
became globally available, non-English-speaking users assigned those codes to characters of their own alphabets.
This led to a number of different encoding standards (known as OEM code pages) being developed around the
world, mostly as extensions of ASCII.
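The practical consequence is that the same byte value can stand for different characters depending on the encoding assumed by the reader. The sketch below illustrates this with the byte 0xE9; the class name is hypothetical, and the charsets other than ISO-8859-1 and US-ASCII are probed with Charset.isSupported because a given Java runtime is not guaranteed to provide them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CodePageDemo {
    /** Decodes a single byte value (0..255) under the given charset. */
    static String decode(int byteValue, Charset charset) {
        return new String(new byte[] {(byte) byteValue}, charset);
    }

    public static void main(String[] args) {
        int b = 0xE9; // a code in the range that ASCII leaves unallocated

        // ISO-8859-1 maps 0xE9 to U+00E9 (e with acute accent).
        System.out.println("ISO-8859-1: " + decode(b, StandardCharsets.ISO_8859_1));

        // US-ASCII has no character for 0xE9, so the decoder substitutes U+FFFD.
        System.out.println("US-ASCII:   " + decode(b, StandardCharsets.US_ASCII));

        // Optional charsets: printed only where the runtime supports them.
        for (String name : new String[] {"ISO-8859-7", "windows-1251", "IBM437"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + decode(b, Charset.forName(name)));
            }
        }
    }
}
```

Where the optional charsets are available, each line shows a different character for the same byte, which is the kind of ambiguity an XML encoding declaration is meant to remove.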
C.3. Unicode
Unicode is a standard, published by the Unicode Consortium, that defines a character repertoire and character
code intended to be fully compatible with ISO 10646, together with an encoding for it. ISO 10646 is more general
(abstract) in nature, whereas Unicode "imposes additional constraints on implementations to ensure that
they treat characters uniformly across platforms and applications", as stated in the "Unicode &
ISO 10646" section of the Unicode FAQ. Unicode was originally designed as a 16-bit code, but it was extended
so that code positions are now expressed as integers in the hexadecimal range 0..10FFFF (decimal
0..1 114 111). That space is divided into 17 "planes" of 65 536 code positions each. Until recently, the use
of Unicode was mostly limited to the "Basic Multilingual Plane (BMP)", which covers the range 0..FFFF. The
ISO 10646 and Unicode character repertoire can be regarded as a superset of most character repertoires in use.
However, the code position of a given character varies from one character code to another.
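The code range and its division into planes can be checked with the standard java.lang.Character API. The following is a minimal sketch; the class and method names are hypothetical, and the plane number is simply taken as the bits above the low 16 bits of the code position.

```java
final class UnicodePlaneDemo {
    /** Plane number (0..16) of a code position; plane 0 is the BMP. */
    static int plane(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code position: " + codePoint);
        }
        return codePoint >>> 16;
    }

    public static void main(String[] args) {
        System.out.println(Character.MAX_CODE_POINT);          // 1114111, i.e. 0x10FFFF
        System.out.println(plane('A'));                        // 0
        System.out.println(plane(0xFFFF));                     // 0 (last BMP position)
        System.out.println(plane(0x10000));                    // 1 (first position beyond the BMP)
        System.out.println(plane(Character.MAX_CODE_POINT));   // 16
        System.out.println(Character.isBmpCodePoint(0x10000)); // false
    }
}
```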
Originally, before the code range was extended past 16 bits, the "native" Unicode encoding was UCS-2,
which presents each code number as two consecutive octets m and n such that the number equals 256m+n.
In computer jargon, this means that the code number is presented as a two-byte integer.
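The two-octet layout described above can be sketched as follows. This is an illustration under stated assumptions, not a full UCS-2 codec: the class and method names are hypothetical, and only code numbers in the 16-bit range 0..FFFF are accepted.

```java
final class Ucs2Demo {
    /** Splits a 16-bit code number into octets {m, n} with codeNumber == 256*m + n. */
    static int[] toOctets(int codeNumber) {
        if (codeNumber < 0 || codeNumber > 0xFFFF) {
            throw new IllegalArgumentException("outside the 16-bit range: " + codeNumber);
        }
        return new int[] {codeNumber >>> 8, codeNumber & 0xFF};
    }

    /** Recombines the two octets into the code number 256*m + n. */
    static int fromOctets(int m, int n) {
        return 256 * m + n;
    }

    public static void main(String[] args) {
        int[] octets = toOctets(0x20AC);                        // U+20AC, the euro sign
        System.out.println(octets[0] + " " + octets[1]);        // 32 172
        System.out.println(fromOctets(octets[0], octets[1]));   // 8364, i.e. 0x20AC
    }
}
```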
According to the Unicode Consortium, the term UCS-2 should now be avoided, as it is associated with
 