Downloader¶
- class parfive.Downloader(max_conn=5, progress=True, file_progress=True, loop=None, notebook=None, overwrite=False, headers=None, use_aiofiles=False)[source]¶
Bases: object
Download files in parallel.
- Parameters
max_conn (int, optional) – The number of parallel download slots.
progress (bool, optional) – If True, show a main progress bar showing how many of the total files have been downloaded. If False, no progress bars will be shown at all.
file_progress (bool, optional) – If True and progress is true, show max_conn progress bars detailing the progress of each individual file being downloaded.
loop (asyncio.AbstractEventLoop, optional) – No longer used, and will be removed in a future release.
notebook (bool, optional) – If True, tqdm will be used in notebook mode. If None, an attempt will be made to detect the notebook and guess which progress bar to use.
overwrite (bool or str, optional) – Determines how to handle downloading if a file with the same name already exists. If False, the download is skipped and the path to the existing file is returned; if True, the file is downloaded and the existing file is overwritten; if 'unique', the filename is modified to be unique.
headers (dict) – Request headers to be passed to the server. Adds User-Agent information about parfive, aiohttp and python if not passed explicitly.
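A minimal sketch of typical use, assuming parfive is installed: it constructs a Downloader with the parameters described above, queues each URL and then runs the queue. The helper name download_all and its downloader_cls argument are hypothetical; the latter exists only so the sketch can be exercised with a stand-in class.

```python
def download_all(urls, dest=".", max_conn=5, downloader_cls=None):
    """Queue every URL, then download them in parallel (illustrative sketch)."""
    if downloader_cls is None:
        # Imported lazily so the sketch can be exercised without parfive installed.
        from parfive import Downloader as downloader_cls

    # One overall progress bar, no per-file bars; existing files are skipped.
    dl = downloader_cls(
        max_conn=max_conn,
        progress=True,
        file_progress=False,
        overwrite=False,
    )
    for url in urls:
        dl.enqueue_file(url, path=dest)

    # download() blocks until the queue has been processed.
    return dl.download()
```

Called as download_all(list_of_urls, dest="./data"), this would return the parfive.Results produced by download().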
Attributes Summary
default_chunk_size – aiofiles requires a different default chunk size
queued_downloads – The total number of files already queued for download.
use_aiofiles – aiofiles will be used if installed and must be explicitly enabled
Methods Summary
download([timeouts]) – Download all files in the queue.
enqueue_file(url[, path, filename, overwrite]) – Add a file to the download queue.
retry(results) – Retry any failed downloads in a results object.
run_download([timeouts]) – Download all files in the queue.
simple_download(urls, *[, path, overwrite]) – Download a series of URLs to a single destination.
Attributes Documentation
- default_chunk_size¶
aiofiles requires a different default chunk size
- queued_downloads¶
The total number of files already queued for download.
- use_aiofiles¶
aiofiles is used only if it is installed and explicitly enabled; by default it will not be used.
The PARFIVE_OVERWRITE_ENABLE_AIOFILES environment variable takes precedence if present;
finally, the Downloader’s constructor argument is considered.
Methods Documentation
- download(timeouts=None)[source]¶
Download all files in the queue.
- Parameters
timeouts (dict, optional) – Overrides for the default timeouts for HTTP downloads. Supported keys are any accepted by the aiohttp.ClientTimeout class. Defaults to no timeout for the total session timeout (overriding the aiohttp 5 minute default) and 90 seconds for the socket read timeout.
- Returns
parfive.Results – A list of files downloaded.
Notes
This is a synchronous version of run_download; an asyncio event loop will be created to run the download (in its own thread if a loop is already running).
The defaults for the 'total' and 'sock_read' timeouts can be overridden by two environment variables, PARFIVE_TOTAL_TIMEOUT and PARFIVE_SOCK_READ_TIMEOUT.
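The timeouts argument can be sketched as follows. The helper name download_with_timeouts and the values 600 and 30 are illustrative assumptions; the only requirement taken from the text above is that the dictionary keys are ones accepted by aiohttp.ClientTimeout.

```python
def download_with_timeouts(downloader, total=600, sock_read=30):
    """Run the queued downloads with explicit timeout overrides, in seconds."""
    # The keys are field names of aiohttp.ClientTimeout.
    timeouts = {"total": total, "sock_read": sock_read}
    return downloader.download(timeouts=timeouts)
```

Here downloader is expected to be a Downloader with files already queued. Setting PARFIVE_TOTAL_TIMEOUT or PARFIVE_SOCK_READ_TIMEOUT in the environment changes the defaults instead, without touching the call site.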
- enqueue_file(url, path=None, filename=None, overwrite=None, **kwargs)[source]¶
Add a file to the download queue.
- Parameters
url (str) – The URL to retrieve.
path (str, optional) – The directory to retrieve the file into; if None, defaults to the current directory.
filename (str or callable, optional) – The filename to save the file as. Can also be a callable which takes two arguments, the url and the response object from opening that URL, and returns the filename. (Note: for FTP downloads the response will be None.) If None, the HTTP headers will be read for the filename, or the last segment of the URL will be used.
overwrite (bool or str, optional) – Determines how to handle downloading if a file with the same name already exists. If False, the download is skipped and the path to the existing file is returned; if True, the file is downloaded and the existing file is overwritten; if 'unique', the filename is modified to be unique. If None, the value set when constructing the Downloader object will be used.
kwargs (dict) – Extra keyword arguments are passed to aiohttp.ClientSession.get or aioftp.Client.context depending on the protocol.
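A callable filename can be sketched like this. The function name prefixed_filename and the "download_" prefix are hypothetical. Because the positional order of the two arguments is worth confirming against the installed version, the sketch identifies the URL by type rather than by position, and it never relies on the response, which is None for FTP downloads.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def prefixed_filename(first, second):
    """Return 'download_<last URL segment>' for the file being fetched."""
    # Pick out whichever argument is the URL string; the other is the
    # response object (or None for FTP downloads) and is not used here.
    url = first if isinstance(first, str) else second
    # Take the last path segment, ignoring any query string.
    last_segment = PurePosixPath(urlparse(url).path).name or "index"
    return f"download_{last_segment}"
```

It would be passed as enqueue_file(url, filename=prefixed_filename).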
Notes
The proxy URL is read from the environment variables HTTP_PROXY or HTTPS_PROXY, depending on the protocol of the url passed. Proxy authentication, proxy_auth, should be passed as an aiohttp.BasicAuth object. Proxy headers, proxy_headers, should be passed as a dict object.
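The proxy options travel through the extra keyword arguments of enqueue_file. A small sketch, in which the helper name enqueue_with_proxy is hypothetical and the proxy URL itself is assumed to be set already through HTTP_PROXY or HTTPS_PROXY:

```python
def enqueue_with_proxy(downloader, url, proxy_auth, proxy_headers=None):
    """Queue url, forwarding proxy credentials and headers as keyword arguments."""
    downloader.enqueue_file(
        url,
        proxy_auth=proxy_auth,  # expected to be an aiohttp.BasicAuth instance
        proxy_headers=proxy_headers or {},  # plain dict of header names to values
    )
```

With aiohttp available, proxy_auth would be built as aiohttp.BasicAuth("user", "password"), where the credentials shown are placeholders.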
- retry(results)[source]¶
Retry any failed downloads in a results object.
Note
This will start a new event loop.
- Parameters
results (parfive.Results) – A previous results object; the .errors property will be read and the downloads retried.
- Returns
parfive.Results – A modified version of the input results with all the errors from this download attempt and any new files appended to the list of file paths.
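One way to combine download and retry is a bounded loop over the .errors property. This is a sketch only: the helper name download_with_retries and the max_retries limit are assumptions, not part of parfive. Because retry starts a new event loop, the helper is meant for synchronous code.

```python
def download_with_retries(downloader, max_retries=3):
    """Download the queue, retrying failed files up to max_retries times."""
    results = downloader.download()
    attempts = 0
    # results.errors is non-empty while some downloads have failed.
    while results.errors and attempts < max_retries:
        results = downloader.retry(results)
        attempts += 1
    return results
```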
- async run_download(timeouts=None)[source]¶
Download all files in the queue.
- Parameters
timeouts (dict, optional) – Overrides for the default timeouts for HTTP downloads. Supported keys are any accepted by the aiohttp.ClientTimeout class. Defaults to no timeout for the total session timeout (overriding the aiohttp 5 minute default) and 90 seconds for the socket read timeout.
- Returns
parfive.Results – A list of files downloaded.
Notes
The defaults for the 'total' and 'sock_read' timeouts can be overridden by two environment variables, PARFIVE_TOTAL_TIMEOUT and PARFIVE_SOCK_READ_TIMEOUT.
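Since run_download is a coroutine, it is awaited from code that already runs inside an event loop. The sketch below uses two hypothetical helpers: download_queued shows the await, and run_blocking drives it with asyncio.run purely for demonstration.

```python
import asyncio


async def download_queued(downloader, timeouts=None):
    """Await the queued downloads from inside a running event loop."""
    return await downloader.run_download(timeouts=timeouts)


def run_blocking(downloader, timeouts=None):
    """Drive the coroutine from a script that has no running event loop."""
    return asyncio.run(download_queued(downloader, timeouts))
```

In ordinary synchronous code the download method already covers the second case; run_blocking is included only to show the coroutine being driven end to end.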
- classmethod simple_download(urls, *, path='./', overwrite=None)[source]¶
Download a series of URLs to a single destination.
- Parameters
urls (iterable) – A sequence of URLs to download.
path (pathlib.Path, optional) – The destination directory for the downloaded files. Defaults to the current directory.
overwrite (bool, optional) – Overwrite the files at the destination directory. If False, the URL will not be downloaded if a file with the corresponding filename already exists.
- Returns
parfive.Results– A list of files downloaded.
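For a one-off batch, the classmethod can be called without building a Downloader first. A hedged sketch, assuming parfive is installed: the helper name fetch_to_directory and its downloader_cls argument are hypothetical, the latter only allowing a stand-in class to be substituted.

```python
def fetch_to_directory(urls, dest="./", overwrite=False, downloader_cls=None):
    """Download every URL into dest using the simple_download classmethod."""
    if downloader_cls is None:
        # Imported lazily so the sketch can be exercised without parfive installed.
        from parfive import Downloader as downloader_cls

    # path and overwrite are keyword-only in the documented signature.
    return downloader_cls.simple_download(urls, path=dest, overwrite=overwrite)
```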