urlopen
urlopen(url, filename=None, save=None, headers=None, params=None, data=None, prefix='http', convert=True, die=False, response='text', verbose=False)
Download a single URL.
Alias to urllib.request.urlopen(url).read(). See also sc.download() for downloading multiple URLs. Note: sc.urlopen() and sc.wget() are aliases.
Parameters:
url (str) – the URL to open, either as GET or POST
filename (str) – if supplied, save to file instead of returning output
save (bool) – if supplied instead of filename, then use the default filename
headers (dict) – a dictionary of headers to pass
params (dict) – a dictionary of parameters to pass to the GET request
data (dict) – a dictionary of data to pass to the POST request
prefix (str) – the string to ensure the URL starts with (else, add it)
convert (bool) – whether to convert from bytes to string
die (bool) – whether to raise an exception if converting to text failed
response (str) – what to return: ‘text’ (default), ‘json’ (dictionary version of the data), ‘status’ (the HTTP status), or ‘full’ (the full response object)
verbose (bool) – whether to print progress
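The request-related parameters above can be pictured in terms of the standard library. The following is a minimal sketch, not the library's actual implementation: `build_request` is a hypothetical helper, and the exact way `prefix`, `params`, and `data` are applied is an assumption. It only constructs the request object and never touches the network.

```python
import urllib.parse
import urllib.request

def build_request(url, headers=None, params=None, data=None, prefix='http'):
    """Hypothetical helper: assemble a urllib Request without sending it."""
    # Assumption: a URL that does not start with the prefix gets it prepended as a scheme
    if prefix and not url.startswith(prefix):
        url = f'{prefix}://{url}'
    # GET parameters are URL-encoded and appended as a query string
    if params:
        sep = '&' if '?' in url else '?'
        url = url + sep + urllib.parse.urlencode(params)
    # Supplying a body makes urllib issue a POST instead of a GET
    body = urllib.parse.urlencode(data).encode() if data else None
    return urllib.request.Request(url, data=body, headers=headers or {})

req = build_request('wikipedia.org',
                    params={'search': 'python'},
                    headers={'User-Agent': 'Custom agent'})
print(req.full_url)      # http://wikipedia.org?search=python
print(req.get_method())  # GET
```

Passing the resulting object to `urllib.request.urlopen(req).read()` would then perform the download, which is the step the alias described above refers to.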
Examples:
html = sc.urlopen('wikipedia.org') # Retrieve into variable html
sc.urlopen('http://wikipedia.org', filename='wikipedia.html') # Save to file wikipedia.html
sc.urlopen('https://wikipedia.org', save=True, headers={'User-Agent':'Custom agent'}) # Save to the default filename (here, wikipedia.org), with headers
sc.urlopen('wikipedia.org', response='status') # Only return the HTTP status of the site
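To illustrate how the `response`, `convert`, and `die` options might interact, here is a rough sketch under stated assumptions. `interpret` is a hypothetical helper that works on raw bytes and a status code rather than a live connection; the fallback to returning bytes when decoding fails and `die` is false is an assumption, and the `'full'` option is omitted because no response object exists in this offline example.

```python
import json

def interpret(raw, status, response='text', convert=True, die=False):
    """Hypothetical helper: choose what to return from raw bytes and a status code."""
    if response == 'status':
        return status              # Only the HTTP status
    if response == 'json':
        return json.loads(raw)     # Dictionary version of the data
    if response != 'text':
        raise ValueError(f'Unsupported response type in this sketch: {response!r}')
    if not convert:
        return raw                 # Leave the payload as bytes
    try:
        return raw.decode()        # Convert from bytes to string
    except UnicodeDecodeError:
        if die:
            raise                  # die=True: surface the conversion failure
        return raw                 # Assumption: otherwise fall back to the raw bytes

print(interpret(b'{"a": 1}', 200, response='json'))
print(interpret(b'<html></html>', 200))
print(interpret(b'', 404, response='status'))
```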
New in version 2.0.0: renamed from wget to urlopen; new arguments
New in version 2.0.1: creates folders by default if they do not exist
New in version 2.0.4: “prefix” argument, e.g. prepend “http://” if not present
New in version 3.1.4: renamed “return_response” to “response”; additional options