Fix TypeError: cannot use a string pattern on a bytes-like object – Python Tutorial

July 10, 2019

“TypeError: cannot use a string pattern on a bytes-like object” occurs when you apply a regular expression compiled from a string pattern to a bytes object in Python. In this tutorial, we will show you how to fix this error.

Here is an example.

This example opens a URL and reads the HTML content of the web page.

import re
import urllib.request
with urllib.request.urlopen('http://www.python.org/') as f:
    html = f.read()
    print(type(html))

We will get:

<class 'bytes'>

This means the html variable is of type bytes.

Next, we use a regular expression to extract the links from it.

    webpage_regex = re.compile('<a[^>]+href=["\'](.*?)["\']',re.IGNORECASE)
    links = webpage_regex.findall(html)
    print (links)

We will get an error:

TypeError: cannot use a string pattern on a bytes-like object
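The same error can be reproduced without a network request, which makes it easier to experiment with. The sketch below is only an illustration: the sample markup is made up and stands in for the bytes returned by f.read().

```python
import re

# A str pattern, like the one used in this tutorial.
pattern = re.compile('<a[^>]+href=["\'](.*?)["\']', re.IGNORECASE)

# Stand-in for the bytes returned by f.read(); this sample markup is made up.
sample = b'<a class="nav" href="/about">About</a>'

try:
    pattern.findall(sample)
    error_message = None
except TypeError as exc:
    # Mixing a str pattern with bytes input raises TypeError.
    error_message = str(exc)

print(error_message)
```

Catching the exception here is only for demonstration; in real code the better approach is to make the pattern and the input the same type.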

This error occurs because the pattern is a str while the html variable is bytes, and the re module does not allow mixing the two. To fix it, we can decode html into a string.

    html = html.decode('utf-8')
    print (type(html))

Now the type of html is:

<class 'str'>
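Decoding with a hard-coded 'utf-8' works for python.org, but it may fail on pages that use another encoding. Here is a minimal sketch of a more defensive decode step. The helper name decode_body and its fallback behaviour are our own assumptions for illustration, not part of the original example.

```python
def decode_body(raw, charset=None):
    """Decode response bytes to str (hypothetical helper for illustration)."""
    # Fall back to UTF-8 when no charset was declared.
    encoding = charset or 'utf-8'
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Wrong or unknown charset: decode as UTF-8 and replace bad bytes.
        return raw.decode('utf-8', errors='replace')

# Made-up sample: the bytes for "caf\u00e9" encoded as Latin-1.
text = decode_body(b'caf\xe9', 'latin-1')
print(type(text), text)
```

Inside the with block from the first example, the declared charset could be read with f.headers.get_content_charset(), which returns None when the server did not send one, and then passed as the second argument.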

Now we can use a Python regular expression to parse it.

    webpage_regex = re.compile('<a[^>]+href=["\'](.*?)["\']',re.IGNORECASE)
    links = webpage_regex.findall(html)
    print (links)

The result is a list of links, which begins:

['http://browsehappy.com/', '#content', '#python-network', '/'
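Decoding is not the only way to avoid this error. The pattern and the input only need to be the same type, so another option is to keep html as bytes and compile a bytes pattern instead. The sketch below assumes a made-up sample in place of the downloaded page.

```python
import re

# Bytes pattern: note the b prefix on the pattern literal.
webpage_regex = re.compile(b'<a[^>]+href=["\'](.*?)["\']', re.IGNORECASE)

# Stand-in for the undecoded result of f.read(); this sample markup is made up.
html = b'<a class="nav" href="/about">About</a> <A id="top" HREF=\'#content\'>Top</A>'

links = webpage_regex.findall(html)
print(links)  # the matches are bytes objects, not str
```

With a bytes pattern, every match is also bytes, so each link still needs to be decoded before it can be used as a string. For that reason, decoding the whole page once, as shown in this tutorial, is usually simpler.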
