downloader#

This module provides several functions for downloading files from given URLs.

Function	Description
`download_data()`	Download a single file from a given URL
`download_datas()`	Sequentially download multiple files from given URLs
`async_download_datas()`	Asynchronously download multiple files from given URLs
`mp_download_datas()`	Download multiple files from given URLs using multiprocessing

Functions#

downloader.download_data(url, folder=None, file_name=None, client=None, engine='requests', follow_redirects=True, retry=0, authorize_from_browser=False)#

Download a single file.

Parameters:#

url: str: URL of the web file.
folder: str: the folder in which to store output files. Defaults to the current folder.
file_name: str: the file name. If None, it is parsed from the web response or the URL. file_name can be an absolute path if folder is None.
client: requests.Session() for the requests engine or httpx.Client() for the httpx engine: client that maintains the connection. Default is None.
engine: one of ["requests", "httpx"]: engine used for downloading.
follow_redirects: bool: enables or disables HTTP redirects.
retry: int: number of reconnection attempts when the status code is 503.
authorize_from_browser: bool: whether to load the cookies used by your web browser for authorization. This lets you download data with Python after logging in to the website via a browser (browsers supported so far: Chrome, Firefox, Opera, Edge, Chromium). This is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
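The retry parameter can be pictured as a loop that repeats a request while the server answers 503 (Service Unavailable). The following is a minimal sketch of that idea using only the standard library, not the module's actual implementation: `fetch_with_retry`, `send`, and `wait` are hypothetical names introduced for illustration, and the server is simulated.

```python
import time


def fetch_with_retry(send, retry=0, wait=0.0):
    """Call send() and repeat it up to `retry` more times while it returns 503.

    `send` is a hypothetical zero-argument callable that returns an HTTP
    status code; `wait` is an illustrative pause between attempts (seconds).
    """
    status = send()
    attempts_left = retry
    while status == 503 and attempts_left > 0:
        time.sleep(wait)
        status = send()
        attempts_left -= 1
    return status


# Simulated server: unavailable twice, then OK.
responses = iter([503, 503, 200])
final_status = fetch_with_retry(lambda: next(responses), retry=3)
print(final_status)
```

With retry=0 the request is sent once and a 503 is returned to the caller unchanged, which matches the default described above.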

downloader.download_datas(urls, folder=None, file_names=None, engine='requests', authorize_from_browser=False, desc='')#

Download data from a list-like object containing URLs. This function downloads the files one by one.

Parameters:#

urls: iterator: iterator containing the URLs.
folder: str: the folder in which to store output files. Defaults to the current folder.
engine: one of ["requests", "httpx"]: engine used for downloading.
file_names: iterator: iterator containing the file names. Leave it as None if you want the program to parse them from the website. file_names can contain absolute paths if folder is None.
authorize_from_browser: bool: whether to load the cookies used by your web browser for authorization. This lets you download data with Python after logging in to the website via a browser (browsers supported so far: Chrome, Firefox, Opera, Edge, Chromium). This is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
desc: str: description of the data being downloaded.
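When file_names is None, the names are parsed from the website. One plausible way to do this, sketched below with the standard library only, is to prefer the filename in a Content-Disposition response header and fall back to the last segment of the URL path. This is an assumption about the general technique, not the module's own code; `guess_file_name` and the example.com URLs are hypothetical.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit


def guess_file_name(url, content_disposition=None):
    """Return a file name from a Content-Disposition value, else from the URL path."""
    if content_disposition:
        # Reuse the stdlib header parser to extract the filename parameter.
        header = Message()
        header["Content-Disposition"] = content_disposition
        name = header.get_filename()
        if name:
            return posixpath.basename(name)
    # Fall back to the last path segment, ignoring any query string.
    path = unquote(urlsplit(url).path)
    return posixpath.basename(path)


name_from_url = guess_file_name("http://example.com/a/b/data.tif?token=1")
name_from_header = guess_file_name(
    "http://example.com/download?id=1",
    'attachment; filename="report.tif"',
)
print(name_from_url, name_from_header)
```

Passing explicit file_names avoids this guessing entirely, which is useful when the URL path does not end in a meaningful name.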

Examples:#

>>> from data_downloader import downloader

Specify the URLs and the folder:

>>> urls = ['http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20141211/20141117_20141211.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150221/20141024_20150221.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141211_20150128/20141211_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150317/20141117_20150317.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150221/20141117_20150221.geo.cc.tif']
>>> folder = r'D:\data'

Download the data from the URLs and store it in the folder:

>>> downloader.download_datas(urls, folder)

downloader.async_download_datas(urls, folder=None, file_names=None, limit=30, desc='', follow_redirects=True, retry=0, authorize_from_browser=False)#

Download multiple files simultaneously.

Parameters:#

urls: iterator: iterator containing the URLs.
folder: str: the folder in which to store output files. Defaults to the current folder.
authorize_from_browser: bool: whether to load the cookies used by your web browser for authorization. This lets you download data with Python after logging in to the website via a browser (browsers supported so far: Chrome, Firefox, Opera, Edge, Chromium). This is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
file_names: iterator: iterator containing the file names. Leave it as None if you want the program to parse them from the website. file_names can contain absolute paths if folder is None.
limit: int: the number of files downloaded simultaneously.
desc: str: description of the data being downloaded.
follow_redirects: bool: enables or disables HTTP redirects.
retry: int: number of reconnection attempts when the status code is 503.
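The limit parameter caps how many downloads are in flight at once. A common way to get this behavior with asyncio is a semaphore; the sketch below illustrates that technique under the assumption that each download is a coroutine. It is not taken from the module's source: `run_limited` and `fake_download` are hypothetical, and a short sleep stands in for network I/O.

```python
import asyncio


async def run_limited(jobs, limit=30):
    """Run the coroutine functions in `jobs`, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job must hold a semaphore slot while it runs.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo():
    active = 0
    peak = 0

    async def fake_download():
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stands in for network I/O
        active -= 1

    await run_limited([fake_download] * 10, limit=3)
    return peak


peak_concurrency = asyncio.run(demo())
print(peak_concurrency)
```

A lower limit is gentler on the remote server; a higher one mainly helps when individual files are small and latency dominates.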

Example:#

>>> from data_downloader import downloader

Specify the URLs and the folder:

>>> urls = ['http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20141211/20141117_20141211.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150221/20141024_20150221.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141211_20150128/20141211_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150317/20141117_20150317.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150221/20141117_20150221.geo.cc.tif']
>>> folder = r'D:\data'

Download the data from the URLs and store it in the folder:

>>> downloader.async_download_datas(urls, folder, None, desc='interferograms')

downloader.mp_download_datas(urls, folder=None, file_names=None, ncore=None, desc='', follow_redirects=True, retry=0, engine='requests', authorize_from_browser=False)#

Download data from a list-like object containing URLs. This function downloads multiple files simultaneously using multiprocessing.

Parameters:#

urls: iterator: iterator containing the URLs.
folder: str: the folder in which to store output files. Defaults to the current folder.
engine: one of ["requests", "httpx"]: engine used for downloading.
file_names: iterator: iterator containing the file names. Leave it as None if you want the program to parse them from the website. file_names can contain absolute paths if folder is None.
ncore: int: number of cores for parallel processing. If ncore is None, the number returned by os.cpu_count() is used. Default is None.
desc: str: description of the data being downloaded.
authorize_from_browser: bool: whether to load the cookies used by your web browser for authorization. This lets you download data with Python after logging in to the website via a browser (browsers supported so far: Chrome, Firefox, Opera, Edge, Chromium). This is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
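The ncore fallback described above can be sketched as follows. This is an illustrative outline built on the standard multiprocessing.Pool, not the module's implementation: `resolve_ncore` and `parallel_map` are hypothetical helpers, and the `or 1` guard is an added assumption for platforms where os.cpu_count() returns None.

```python
import os
from multiprocessing import Pool


def resolve_ncore(ncore=None):
    """Return ncore, falling back to os.cpu_count() (or 1 if that is unknown)."""
    if ncore is None:
        return os.cpu_count() or 1
    return ncore


def parallel_map(worker, items, ncore=None):
    """Apply a picklable, module-level `worker` to each item in a process pool."""
    with Pool(processes=resolve_ncore(ncore)) as pool:
        return pool.map(worker, items)
```

On Windows, code that starts worker processes should be called from under an `if __name__ == "__main__":` guard, since child processes re-import the main module.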

Examples:#

>>> from data_downloader import downloader

Specify the URLs and the folder:

>>> urls = ['http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20141211/20141117_20141211.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150221/20141024_20150221.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141024_20150128/20141024_20150128.geo.unw.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141211_20150128/20141211_20150128.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150317/20141117_20150317.geo.cc.tif',
...         'http://gws-access.ceda.ac.uk/public/nceo_geohazards/LiCSAR_products/106/106D_05049_131313/interferograms/20141117_20150221/20141117_20150221.geo.cc.tif']
>>> folder = r'D:\data'

Download the data from the URLs and store it in the folder:

>>> downloader.mp_download_datas(urls, folder)
