Horje
Convert Unicode String to a Byte String in Python

Python is a versatile programming language known for its simplicity and readability. Unicode support is a crucial aspect of Python, allowing developers to handle characters from various scripts and languages. However, there are instances where you might need to convert a Unicode string to a regular string. In this article, we will explore five different methods to achieve this in Python.

Convert A Unicode String to a Byte String In Python

Below, are the ways to convert a Unicode String to a Byte String In Python.

  • Using encode() with UTF-8
  • Using encode() with a Different Encoding
  • Using bytes() Constructor
  • Using str.encode() Method

Convert A Unicode to a Byte String Using encode() with UTF-8

In this example, a Unicode string containing English and Chinese characters is encoded to a byte string using UTF-8 encoding. The resulting `bytes_representation` is printed, demonstrating the transformation of the mixed-language Unicode string into its byte representation suitable for storage or transmission in a UTF-8 encoded format.

Python3

unicode_string = "Hello, 你好"
  
bytes_representation = unicode_string.encode('utf-8')
  
print(bytes_representation)

Output

b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'

Unicode String To Byte String Using encode() with a Different Encoding

In this example, the Unicode string is encoded into a byte string using UTF-16 encoding, resulting in a sequence of bytes that represents the mixed-language string. The byte string is then printed to demonstrate the UTF-16 encoded representation of the Unicode characters.

Python3

unicode_string = "Hello, 你好"
  
byte_string_utf16 = unicode_string.encode('utf-16')
  
# Displaying the byte string
print(byte_string_utf16)

Output

b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00`O}Y'

Convert A Unicode String To Byte String Using bytes() Constructor

In this example, the Unicode string is converted to a byte string using the bytes() constructor with UTF-8 encoding. The resulting `byte_string_bytes` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.

Python3

unicode_string = "Hello, 你好"
  
byte_string_bytes = bytes(unicode_string, 'utf-8')
  
# Displaying the byte string
print(byte_string_bytes)

Output

b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'

Python Unicode String To Byte String Using str.encode() Method

In this example, the Unicode string is transformed into a byte string using the str.encode() method with UTF-8 encoding. The resulting `byte_string_str_encode` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.

Python3

unicode_string = "Hello, 你好"
  
byte_string_str_encode = str.encode(unicode_string, 'utf-8')
  
# Displaying the byte string
print(byte_string_str_encode)

Output

b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'




Reffered: https://www.geeksforgeeks.org


Geeks Premier League

Related
Square Root of 9 | How to Find Value of Square Root of 9? Square Root of 9 | How to Find Value of Square Root of 9?
Filter List of Dictionaries Based on Key Values in Python Filter List of Dictionaries Based on Key Values in Python
React Bootstrap Overlay Component React Bootstrap Overlay Component
How to bind event handlers in React? How to bind event handlers in React?
What Is Project Collaboration? What Is Project Collaboration?

Type:
Geek
Category:
Coding
Sub Category:
Tutorial
Uploaded by:
Admin
Views:
11