
Printing Unicode Characters To Stdout In Python Prints Wrong Glyphs

I want to print a set of Unicode characters to my command prompt terminal. Even when I force the encoding to 'UTF-8', the terminal prints garbage. $python -c 'import sys;

Solution 1:

Python cannot control the encoding used by your terminal; you'll have to change that somewhere else.

In other words, forcing Python to write UTF-8 encoded text to the terminal does not make the terminal decode that output as UTF-8.
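To see why the mismatch produces garbage, here is a minimal sketch that decodes the same UTF-8 bytes two ways. The choice of cp437 as the terminal's legacy code page is only an assumption for illustration; your console may use a different one.

```python
# Illustrative sketch: the same bytes, decoded two different ways.
text = '\u2044'  # FRACTION SLASH, the character used in this thread

# What Python writes when its output encoding is UTF-8.
utf8_bytes = text.encode('utf-8')

# What appears on screen depends on how the terminal decodes those bytes.
as_utf8 = utf8_bytes.decode('utf-8')    # terminal configured for UTF-8
as_cp437 = utf8_bytes.decode('cp437')   # terminal on a legacy code page (assumed: cp437)

# ascii() keeps this demo printable on any terminal.
print(ascii(utf8_bytes))
print(ascii(as_utf8))
print(ascii(as_cp437))
```

The single character becomes three bytes, and a terminal that reads them with a single-byte code page shows three unrelated glyphs instead of one.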

The Mac OS X terminal has already been configured to work with UTF-8.

On Windows, you can switch the console codepage with the chcp command:

chcp 65001

where 65001 is the Windows codepage for UTF-8. See "Unicode characters in Windows command line - how?"
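To check from inside Python which code page the console is actually using, something like the following sketch may help. The helper name console_output_code_page is made up for this example; it calls the Win32 function GetConsoleOutputCP through ctypes and simply returns None on other platforms.

```python
import sys


def console_output_code_page():
    """Return the Windows console output code page, or None when not on Windows.

    Hypothetical helper for illustration only.
    """
    if sys.platform != 'win32':
        return None
    import ctypes
    # GetConsoleOutputCP is a Win32 API; it returns 0 when the call fails,
    # for example when no console is attached.
    return ctypes.windll.kernel32.GetConsoleOutputCP()


# The encoding Python uses for stdout and the console's code page are
# separate settings; comparing them shows whether they agree.
print('stdout encoding:', sys.stdout.encoding)
print('console code page:', console_output_code_page())
```

After running chcp 65001 in the same console, the second line should report 65001.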

Solution 2:

UTF-8 encoded text only displays correctly when the console uses a UTF-8 code page (cp65001).

Python 3.3 claims to support code page 65001 (UTF-8) on Windows.

C:\>chcp 65001
Active code page: 65001

C:\>python
Python 3.3.0rc1 (v3.3.0rc1:8bb5c7bc46ba, Aug 25 2012, 13:50:30) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print('\u2044')
⁄

Although it is buggy:

>>> print('\u2044'*8)
⁄⁄⁄⁄⁄⁄⁄⁄
��⁄⁄⁄⁄
⁄⁄
��

>>> print('1\u20442 2\u20443 4\u20445')
1⁄2 2⁄3 4⁄5
⁄5
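Since a character only displays if the console's code page can represent it, one defensive option is to test encodability before printing. This is a rough sketch, not a fix for the glitch shown above; the helper names can_encode and safe_for are invented for the example, and cp437 again stands in for an assumed legacy code page.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def safe_for(text, encoding):
    """Replace characters the encoding lacks with backslash escapes."""
    return text.encode(encoding, errors='backslashreplace').decode(encoding)


sample = '1\u20442'
print(can_encode(sample, 'utf-8'))   # UTF-8 can represent any code point
print(can_encode(sample, 'cp437'))   # cp437 has no FRACTION SLASH
print(safe_for(sample, 'cp437'))     # escapes instead of raising an error
```

In practice you would pass sys.stdout.encoding as the encoding, so the check reflects what Python will actually try to write.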
