How to Remove Special Characters From a Python String: An Introduction

By | April 17, 2020

After reading text from a file, we often need to remove some special characters from it. In this tutorial, we will show Python beginners how to remove them.

Special Characters

Special characters are not a fixed set; they differ from one application to another.

For English text, the common characters are the printable ASCII characters. All other characters can be treated as special characters.

To learn which characters are printable, you can read the tutorial below:

An Introduction to ASCII (0 – 255) for Beginners
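As a rough illustration of the idea above, the sketch below treats a character as printable when its code point lies between the space (0x20) and the tilde (0x7E), and lists everything else as special. The function name `is_printable_ascii` is only an illustrative choice, and the range is an assumption that suits plain English text.

```python
def is_printable_ascii(ch):
    """Return True if ch is a printable ASCII character (space through tilde)."""
    return 0x20 <= ord(ch) <= 0x7E

text = "© is a blog site."

# Collect every character that falls outside the printable ASCII range
special = [ch for ch in text if not is_printable_ascii(ch)]
print(special)  # ['©']
```

Note that under this definition newline and tab also count as special, which may or may not be what your application wants.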

How to remove special characters?

If you only want to keep the printable English characters, you can do it like this:

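import re

text = "© is a blog site."
# Match anything outside printable ASCII (space 0x20 through tilde 0x7E);
# 0x7F is the DEL control character, so it is excluded from the kept range
pattern = re.compile(r'[^\x20-\x7E]')
text = re.sub(pattern, '', text)
print(text)  # prints " is a blog site." without the quotes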

Here text contains the special character ©, and we remove it.

However, if you already know which special characters you want to remove, you can do it like this:
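If the text spans several lines, stripping everything outside the printable range would also delete line breaks and tabs. The sketch below is one possible variation that keeps them; the names `NON_PRINTABLE` and `keep_printable` are hypothetical, and whether newlines and tabs should survive is an assumption that depends on your application.

```python
import re

# Hypothetical helper: keep printable ASCII (0x20-0x7E) plus newline and tab
NON_PRINTABLE = re.compile(r'[^\x20-\x7E\n\t]')

def keep_printable(text):
    """Remove every character that is not printable ASCII, newline, or tab."""
    return NON_PRINTABLE.sub('', text)

cleaned = keep_printable("© is a blog site.\nSecond line\x00")
print(repr(cleaned))  # ' is a blog site.\nSecond line'
```

Compiling the pattern once and reusing it is convenient when the same cleaning is applied to many lines of a file.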

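text = "© is a blog site."

# Characters we want to remove
sp = ['©', 'a']

# Keep every character that is not in sp, then join the result back into a string
text = ''.join(t for t in text if t not in sp)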

In this example, '©' and 'a' are the special characters we remove. You can replace them with your own special characters.
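As an alternative sketch for a known list of characters, Python's built-in `str.maketrans` and `str.translate` can delete them in one pass. This is not the method used above, just another option; it assumes every entry in `sp` is a single character.

```python
text = "© is a blog site."
sp = ['©', 'a']

# The third argument of str.maketrans lists characters to delete
table = str.maketrans('', '', ''.join(sp))
cleaned = text.translate(table)
print(repr(cleaned))  # ' is  blog site.'
```

The result contains two consecutive spaces where the standalone 'a' used to be, so you may want to normalize whitespace afterwards.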
