Python OpenerDirector Ignore 301 or 302 Redirection in Python 3.x – Python Web Crawl Tutorial

By | July 24, 2019

When we use a Python OpenerDirector object to open a URL, there are two important things to consider:

1. Whether to ignore SSL verification.

To make an OpenerDirector object ignore SSL verification, you can read this tutorial.

Best Practice to OpenerDirector Ignore SSL Verification in Python 3.x – Python Web Crawler Tutorial

2. Whether to ignore 301 or 302 redirection.


In this tutorial, we will introduce how to ignore 301 or 302 redirection with OpenerDirector.

You can do that with the steps below.

Create your own HTTPRedirectHandler class

    import urllib.request

    class CustomHTTPRedirectHandler(urllib.request.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, hdrs, newurl):
            # Returning None tells urllib not to follow the redirect.
            return None

Override the redirect_request() method. urllib calls it whenever the server answers with a redirect such as 301 or 302, and returning None means the redirect is not followed.
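Note that the handler above returns None for every redirect code that reaches redirect_request(), including others such as 303 or 307. If you only want to stop 301 and 302, one possible variation is sketched below. The class name Ignore301And302Handler and the example.com URLs are made up for illustration; the method is called directly here only to show what it returns, with no network access involved.

```python
import urllib.request


class Ignore301And302Handler(urllib.request.HTTPRedirectHandler):
    """Hypothetical variant: refuse 301/302 only, let other redirects through."""

    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        if code in (301, 302):
            # Decline the redirect, same as the handler in this tutorial.
            return None
        # Fall back to the standard behaviour for any other redirect code.
        return super().redirect_request(req, fp, code, msg, hdrs, newurl)


handler = Ignore301And302Handler()
original = urllib.request.Request("http://example.com/old")  # placeholder URL

# Call the method by hand to inspect its result for two different codes.
declined = handler.redirect_request(
    original, None, 302, "Found", {}, "http://example.com/new"
)
followed = handler.redirect_request(
    original, None, 303, "See Other", {}, "http://example.com/new"
)
```

Here declined is None, while followed is a new Request object pointing at the redirect target, which urllib would go on to open.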

Create an OpenerDirector object

    redirectHandler = CustomHTTPRedirectHandler()
    opener = urllib.request.build_opener(redirectHandler)

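Then you can use the opener object to open a URL. It will not follow 301 or 302 redirections. Because no other handler takes over the redirect, opener.open() raises urllib.error.HTTPError carrying the redirect status code, so wrap the call in try/except if you want to inspect it.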

    crawl_response = opener.open(crawl_url, timeout=30)
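To see what happens when the opener meets a redirect, here is a minimal sketch that can be run without any external site. It starts a small local test server (an assumption for the demo, not part of the original steps) that answers /old with a 302, then opens that URL with the custom handler. The helper names make_opener, fetch_without_redirect and demo are hypothetical, and the empty ProxyHandler is only there so that the local request bypasses any configured proxy.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CustomHTTPRedirectHandler(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        return None  # never follow a redirect


class LocalRedirectServer(BaseHTTPRequestHandler):
    """Test server for the demo: /old redirects to /new with a 302."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def make_opener(*extra_handlers):
    return urllib.request.build_opener(CustomHTTPRedirectHandler(), *extra_handlers)


def fetch_without_redirect(opener, url, timeout=30):
    """Return (status_code, location); location is None when there is no redirect."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            location = err.headers.get("Location")
            err.close()  # release the underlying connection
            return err.code, location
        raise  # a real error such as 404 or 500


def demo():
    server = HTTPServer(("127.0.0.1", 0), LocalRedirectServer)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        opener = make_opener(urllib.request.ProxyHandler({}))
        return fetch_without_redirect(opener, f"http://127.0.0.1:{port}/old", timeout=5)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status code and the Location header of the redirect that was not followed. In a crawler you could store that pair and decide later whether to visit the target URL.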
