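When a document created on MS Windows is edited with vim in a terminal (konsole, gnome-terminal) on Linux or FreeBSD, all or part of the document may appear garbled and become impossible to edit. In such cases, the following specific characters are involved.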
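Name | Full-width character | CP932 code (hex) | UTF-8 code (hex) | UTF-16 code (hex) | UTF-8 bytes (octal) | HTML entity (decimal, hex, named) |
---|---|---|---|---|---|---|
FULLWIDTH HYPHEN-MINUS | － | 81af | efbc8d | ff0d | 357 274 215 | `&#65293;` `&#xff0d;` |
MINUS SIGN | − | 817c | e28892 | 2212 | 342 210 222 | `&#8722;` `&#x2212;` `&minus;` |
HORIZONTAL BAR | ― | 815c | e28095 | 2015 | 342 200 225 | `&#8213;` `&#x2015;` |
EN DASH | – | 829c | e28093 | 2013 | 342 200 223 | `&#8211;` `&#x2013;` `&ndash;` |
EM DASH | — | 815c | e28094 | 2014 | 342 200 224 | `&#8212;` `&#x2014;` `&mdash;` |
HYPHEN | ‐ | 815d | e28090 | 2010 | 342 200 220 | `&#8208;` `&#x2010;` |
LEFT DOUBLE QUOTATION MARK | “ | 8167 | e2809c | 201c | 342 200 234 | `&#8220;` `&#x201c;` `&ldquo;` |
RIGHT DOUBLE QUOTATION MARK | ” | 8168 | e2809d | 201d | 342 200 235 | `&#8221;` `&#x201d;` `&rdquo;` |
WAVE DASH | 〜 | 8160 | e3809c | 301c | 343 200 234 | `&#12316;` `&#x301c;` |
NO-BREAK SPACE | (not visible) | 8541 | c2a0 | 00a0 | 302 240 | `&#160;` `&#xa0;` `&nbsp;` |
Circled digit (丸数字) | ① | 8740 | e291a0 | 2460 | 342 221 240 | `&#9312;` `&#x2460;` |
Circled digit (丸数字) | ② | 8741 | e291a1 | 2461 | 342 221 241 | `&#9313;` `&#x2461;` |
Circled digit (丸数字) | ③ | 8742 | e291a2 | 2462 | 342 221 242 | `&#9314;` `&#x2462;` |
Circled digit (丸数字) | ④ | 8743 | e291a3 | 2463 | 342 221 243 | `&#9315;` `&#x2463;` |
Circled digit (丸数字) | ⑤ | 8744 | e291a4 | 2464 | 342 221 244 | `&#9316;` `&#x2464;` |
Circled digit (丸数字) | ⑥ | 8745 | e291a5 | 2465 | 342 221 245 | `&#9317;` `&#x2465;` |
Parenthesized "stock" (カッコ株) | ㈱ | 878a | e388b1 | 3231 | 343 210 261 | `&#12849;` `&#x3231;` |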
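Apart from the CP932 column, every column in the table above follows mechanically from the character's Unicode code point. The following Python sketch shows one way to derive them; the function name `describe` and the dictionary keys are made up for this illustration, and the named entity is looked up in Python's HTML 4 entity table, which may list fewer names than the original page did.

```python
from html.entities import codepoint2name


def describe(ch):
    """Derive the UTF-8, UTF-16, octal and HTML-entity columns for one character."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    # Decimal and hexadecimal numeric references always exist.
    entities = ["&#%d;" % cp, "&#x%x;" % cp]
    # A named entity exists only for some characters (HTML 4 names here).
    name = codepoint2name.get(cp)
    if name:
        entities.append("&%s;" % name)
    return {
        "utf8_hex": utf8.hex(),
        "utf16_hex": ch.encode("utf-16-be").hex(),
        "utf8_octal": " ".join("%o" % b for b in utf8),
        "html": entities,
    }


if __name__ == "__main__":
    # MINUS SIGN (U+2212) and CIRCLED DIGIT ONE (U+2460)
    for ch in ("\u2212", "\u2460"):
        print(describe(ch))
```

The CP932 column cannot be derived this way, because it depends on which Shift_JIS variant's mapping table is used.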
According to http://www.iana.org/assignments/character-sets/character-sets.xhtml, no distinction is made between upper- and lowercase letters in character set names.
To reopen the current file as Shift_JIS from within vim:
:e ++enc=sjis
Example ~/.vimrc entries:
set encoding=utf-8
set termencoding=utf-8
set fileencoding=utf-8
set fileencodings=utf-8,sjis,cp932,utf-16,utf-16le,euc-jp,iso-2022-jp,ucs-2le,ucs-2,iso8859-1
Note: mbyte.txt states that cp932, shift_jis, and sjis are similar but not exactly the same.
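When reading a file, vim tries the encodings listed in 'fileencodings' in order and keeps the first one that converts without error. The Python sketch below imitates that idea with Python's own codecs. The function name `first_decodable` and the shortened candidate list are assumptions for this illustration, and Python's `shift_jis` and `cp932` codecs are not guaranteed to behave the same as the converters vim uses on a given system.

```python
# Candidate encodings tried in order, loosely mirroring the start of the
# 'fileencodings' list (Python codec names, an assumption of this sketch).
CANDIDATES = ["utf-8", "shift_jis", "cp932"]


def first_decodable(data, candidates=CANDIDATES):
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for enc in candidates:
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue  # conversion error: fall through to the next candidate
    raise ValueError("no candidate encoding matched")


if __name__ == "__main__":
    # A Windows-style file containing CIRCLED DIGIT ONE (CP932 bytes 87 40).
    sample = "見積書①".encode("cp932")
    print(first_decodable(sample)[0])  # prints "cp932"
```

In this sketch the sample is rejected by Python's strict `shift_jis` codec, which has no mapping for the bytes 87 40, and is accepted only by `cp932`. This is one concrete way in which the two encodings are similar but not identical.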
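The affected bytes can also be replaced with ASCII substitutes before editing, for example with perl:
perl -p -e 's/\x81\x7c/-/g' inputfile > outputfile
perl -p -e 's/\x87\x40/(1)/g;s/\x87\x41/(2)/g;s/\x87\x42/(3)/g' inputfile > outputfile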
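A byte-level substitution like the perl commands above can match across character boundaries: in Shift_JIS the byte 81 may also be the second byte of a character, so the text "＝|" (bytes 81 81 7c) contains the sequence 81 7c even though it holds no full-width minus. The Python sketch below does the same replacement on decoded characters instead. It assumes the input is CP932; the names `SUBSTITUTES`, `replace_cp932_specials` and `convert_file` are hypothetical, and the output is written as UTF-8, which differs from the perl commands that leave the rest of the file in its original encoding.

```python
# Hypothetical substitution table (illustrative only, extend as needed).
SUBSTITUTES = {
    "\uff0d": "-",    # FULLWIDTH HYPHEN-MINUS: what CP932 bytes 81 7c decode to
    "\u2460": "(1)",  # CIRCLED DIGIT ONE,   CP932 87 40
    "\u2461": "(2)",  # CIRCLED DIGIT TWO,   CP932 87 41
    "\u2462": "(3)",  # CIRCLED DIGIT THREE, CP932 87 42
}


def replace_cp932_specials(raw):
    """Decode CP932 bytes, then substitute whole characters, not byte pairs."""
    text = raw.decode("cp932")
    return text.translate(str.maketrans(SUBSTITUTES))


def convert_file(src, dst):
    """Read src as CP932 and write the substituted text to dst as UTF-8."""
    with open(src, "rb") as f:
        converted = replace_cp932_specials(f.read())
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(converted)


if __name__ == "__main__":
    raw = "＝|".encode("cp932")               # bytes 81 81 7c
    naive = raw.replace(b"\x81\x7c", b"-")   # byte-level: damages the first character
    print(naive != raw, replace_cp932_specials(raw) == "＝|")
```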
Name | Full-width character | CP932 code (hex) | UTF-8 code (hex) | UTF-16 code (hex) |
---|---|---|---|---|
CIRCLED DIGIT ONE | ① | 8740 | e291a0 | 2460 |
CIRCLED DIGIT TWO | ② | 8741 | e291a1 | 2461 |
CIRCLED DIGIT THREE | ③ | 8742 | e291a2 | 2462 |
CIRCLED DIGIT FOUR | ④ | 8743 | e291a3 | 2463 |
CIRCLED DIGIT FIVE | ⑤ | 8744 | e291a4 | 2464 |
CIRCLED DIGIT SIX | ⑥ | 8745 | e291a5 | 2465 |
CIRCLED DIGIT SEVEN | ⑦ | 8746 | e291a6 | 2466 |
CIRCLED DIGIT EIGHT | ⑧ | 8747 | e291a7 | 2467 |
CIRCLED DIGIT NINE | ⑨ | 8748 | e291a8 | 2468 |
CIRCLED NUMBER TEN | ⑩ | 8749 | e291a9 | 2469 |
CIRCLED NUMBER ELEVEN | ⑪ | 874a | e291aa | 246a |
CIRCLED NUMBER TWELVE | ⑫ | 874b | e291ab | 246b |
CIRCLED NUMBER THIRTEEN | ⑬ | 874c | e291ac | 246c |
CIRCLED NUMBER FOURTEEN | ⑭ | 874d | e291ad | 246d |
CIRCLED NUMBER FIFTEEN | ⑮ | 874e | e291ae | 246e |
CIRCLED NUMBER SIXTEEN | ⑯ | 874f | e291af | 246f |
CIRCLED NUMBER SEVENTEEN | ⑰ | 8750 | e291b0 | 2470 |
CIRCLED NUMBER EIGHTEEN | ⑱ | 8751 | e291b1 | 2471 |
CIRCLED NUMBER NINETEEN | ⑲ | 8752 | e291b2 | 2472 |
CIRCLED NUMBER TWENTY | ⑳ | 8753 | e291b3 | 2473 |
ROMAN NUMERAL ONE | Ⅰ | 8754 | e285a0 | 2160 |
ROMAN NUMERAL TWO | Ⅱ | 8755 | e285a1 | 2161 |
ROMAN NUMERAL THREE | Ⅲ | 8756 | e285a2 | 2162 |
ROMAN NUMERAL FOUR | Ⅳ | 8757 | e285a3 | 2163 |
ROMAN NUMERAL FIVE | Ⅴ | 8758 | e285a4 | 2164 |
ROMAN NUMERAL SIX | Ⅵ | 8759 | e285a5 | 2165 |
ROMAN NUMERAL SEVEN | Ⅶ | 875a | e285a6 | 2166 |
ROMAN NUMERAL EIGHT | Ⅷ | 875b | e285a7 | 2167 |
ROMAN NUMERAL NINE | Ⅸ | 875c | e285a8 | 2168 |
ROMAN NUMERAL TEN | Ⅹ | 875d | e285a9 | 2169 |
SQUARE MIRI | ㍉ | 875f | e38d89 | 3349 |
SQUARE KIRO | ㌔ | 8760 | e38c94 | 3314 |
SQUARE SENTI | ㌢ | 8761 | e38ca2 | 3322 |
SQUARE MEETORU | ㍍ | 8762 | e38d8d | 334d |
SQUARE GURAMU | ㌘ | 8763 | e38c98 | 3318 |
SQUARE TON | ㌧ | 8764 | e38ca7 | 3327 |
SQUARE AARU | ㌃ | 8765 | e38c83 | 3303 |
SQUARE HEKUTAARU | ㌶ | 8766 | e38cb6 | 3336 |
SQUARE RITTORU | ㍑ | 8767 | e38d91 | 3351 |
SQUARE WATTO | ㍗ | 8768 | e38d97 | 3357 |
SQUARE KARORII | ㌍ | 8769 | e38c8d | 330d |
SQUARE DORU | ㌦ | 876a | e38ca6 | 3326 |
SQUARE SENTO | ㌣ | 876b | e38ca3 | 3323 |
SQUARE PAASENTO | ㌫ | 876c | e38cab | 332b |
SQUARE MIRIBAARU | ㍊ | 876d | e38d8a | 334a |
SQUARE PEEZI | ㌻ | 876e | e38cbb | 333b |
SQUARE MM | ㎜ | 876f | e38e9c | 339c |
SQUARE CM | ㎝ | 8770 | e38e9d | 339d |
SQUARE KM | ㎞ | 8771 | e38e9e | 339e |
SQUARE MG | ㎎ | 8772 | e38e8e | 338e |
SQUARE KG | ㎏ | 8773 | e38e8f | 338f |
SQUARE CC | ㏄ | 8774 | e38f84 | 33c4 |
SQUARE M SQUARED | ㎡ | 8775 | e38ea1 | 33a1 |
SQUARE ERA NAME HEISEI | ㍻ | 877e | e38dbb | 337b |
REVERSED DOUBLE PRIME QUOTATION MARK | 〝 | 8780 | e3809d | 301d |
LOW DOUBLE PRIME QUOTATION MARK | 〟 | 8781 | e3809f | 301f |
NUMERO SIGN | № | 8782 | e28496 | 2116 |
SQUARE KK | ㏍ | 8783 | e38f8d | 33cd |
TELEPHONE SIGN | ℡ | 8784 | e284a1 | 2121 |
CIRCLED IDEOGRAPH HIGH | ㊤ | 8785 | e38aa4 | 32a4 |
CIRCLED IDEOGRAPH CENTRE | ㊥ | 8786 | e38aa5 | 32a5 |
CIRCLED IDEOGRAPH LOW | ㊦ | 8787 | e38aa6 | 32a6 |
CIRCLED IDEOGRAPH LEFT | ㊧ | 8788 | e38aa7 | 32a7 |
CIRCLED IDEOGRAPH RIGHT | ㊨ | 8789 | e38aa8 | 32a8 |
PARENTHESIZED IDEOGRAPH STOCK | ㈱ | 878a | e388b1 | 3231 |
PARENTHESIZED IDEOGRAPH HAVE | ㈲ | 878b | e388b2 | 3232 |
PARENTHESIZED IDEOGRAPH REPRESENT | ㈹ | 878c | e388b9 | 3239 |
SQUARE ERA NAME MEIZI | ㍾ | 878d | e38dbe | 337e |
SQUARE ERA NAME TAISYOU | ㍽ | 878e | e38dbd | 337d |
SQUARE ERA NAME SYOUWA | ㍼ | 878f | e38dbc | 337c |
APPROXIMATELY EQUAL TO OR THE IMAGE OF | ≒ | 8790 | e28992 | 2252 |
IDENTICAL TO | ≡ | 8791 | e289a1 | 2261 |
INTEGRAL | ∫ | 8792 | e288ab | 222b |
CONTOUR INTEGRAL | ∮ | 8793 | e288ae | 222e |
N-ARY SUMMATION | ∑ | 8794 | e28891 | 2211 |
SQUARE ROOT | √ | 8795 | e2889a | 221a |
UP TACK | ⊥ | 8796 | e28aa5 | 22a5 |
ANGLE | ∠ | 8797 | e288a0 | 2220 |
RIGHT ANGLE | ∟ | 8798 | e2889f | 221f |
RIGHT TRIANGLE | ⊿ | 8799 | e28abf | 22bf |
BECAUSE | ∵ | 879a | e288b5 | 2235 |
INTERSECTION | ∩ | 879b | e288a9 | 2229 |
UNION | ∪ | 879c | e288aa | 222a |
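Rows of a table like the one above can be cross-checked mechanically, since the name, UTF-8 and UTF-16 columns all follow from the character itself, and the CP932 column can be tested by decoding it. The Python sketch below is one way to do this; the function name `check_row` and the column labels it returns are made up for this illustration, and it relies on Python's `cp932` codec and `unicodedata` database rather than on any tool the original page used.

```python
import unicodedata


def check_row(row):
    """Return the labels of the columns that disagree with the character."""
    name, char, cp932_hex, utf8_hex, utf16_hex = [
        cell.strip() for cell in row.strip().rstrip("|").split("|")
    ]
    problems = []
    # Compare against the official Unicode character name.
    if unicodedata.name(char, "") != name:
        problems.append("name")
    # The CP932 bytes should decode to exactly this character.
    try:
        if bytes.fromhex(cp932_hex).decode("cp932") != char:
            problems.append("cp932")
    except (ValueError, UnicodeDecodeError):
        problems.append("cp932")
    if char.encode("utf-8").hex() != utf8_hex:
        problems.append("utf8")
    if char.encode("utf-16-be").hex() != utf16_hex:
        problems.append("utf16")
    return problems


if __name__ == "__main__":
    print(check_row("CIRCLED DIGIT ONE | ① | 8740 | e291a0 | 2460 |"))
    # A deliberately wrong UTF-16 value, to show what a mismatch looks like.
    print(check_row("CIRCLED DIGIT ONE | ① | 8740 | e291a0 | 2461 |"))
```

The check decodes the CP932 column rather than encoding the character, because CP932 assigns some characters more than one byte sequence, so an encoder may not return the code listed in the table.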