Started writing a simple web crawler in Python for downloading a website


I have been having trouble looking things up in the C standard library, so I decided to download the C library reference from cplusplus.com. I just wrote a simple script that downloads the front page. The saved copy is currently missing the site's styling (the script only fetches the HTML itself, not any stylesheets the page links to), but it has the basic contents. I'll need to do some work before it gives the best result.

I still need to add code for recursively downloading everything, but it's a start.

Here’s my Python script.

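#! python3
import urllib.request

URL = 'http://www.cplusplus.com/reference/clibrary/'

# Fetch the page and save the raw bytes unchanged.
# The with statement closes both the response and the file, even on errors.
with urllib.request.urlopen(URL) as response, open('test.html', 'wb') as f:
    for line in response:
        f.write(line)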
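
For the recursive part, one possible approach is to pull the links out of each downloaded page and keep following the ones that stay under the reference URL. The sketch below is only a rough idea of how that might look, not finished crawler code. The names LinkCollector, extract_links and crawl are made up for this example, and crawl takes a fetch function as a parameter (an assumption, so the logic can be tried without touching the network). It uses only html.parser, urllib.parse and collections from the standard library.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def extract_links(html, base_url, prefix):
    """Return the absolute, de-duplicated links in html that start with prefix."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url.startswith(prefix) and url not in links:
            links.append(url)
    return links


def crawl(start_url, fetch, limit=50):
    """Breadth-first walk from start_url.

    fetch(url) is assumed to return the page's HTML as a string.
    Stops after limit pages so a mistake cannot hammer the site.
    """
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        for link in extract_links(html, url, start_url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

To use it for real, fetch could be a small function that calls urllib.request.urlopen and decodes the bytes, and each entry in the returned dictionary could then be written to its own file. The page limit and the prefix check are there to keep the crawl on the reference pages and polite to the server.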