Download Module
- farmnet.data.download.checksum_ok(filename: str | Path, remote_checksum: str)
Verifies whether a local file’s MD5 checksum matches a given remote checksum.
This function reads the contents of a file, computes its MD5 checksum, and compares it to the provided checksum string. If the checksums do not match, a warning message is printed to the console.
- Parameters:
filename (str | pathlib.Path) – Path to the file to verify. Can be a string or a Path object.
remote_checksum (str) – The expected checksum string to compare against.
- Returns:
None
- Return type:
None
Example
>>> from pathlib import Path
>>> from farmnet.data.download import checksum_ok
>>> from farmnet.utils import getenv
Correct checksum:
>>> file_path = Path(getenv("DOWNLOAD_PATH", "./data")) / 'Kelmarsh_WT_static.csv'
>>> remote_checksum = 'af3a038f0f7fddfc1608ad0bfc8cf5ba'
>>> checksum_ok(file_path, remote_checksum)
Incorrect checksum:
>>> file_path = Path(getenv("DOWNLOAD_PATH", "./data")) / 'Kelmarsh_WT_static.csv'
>>> remote_checksum = 'bf3a038f0f7fddfc1608ad0bfc8cf5ba'
>>> checksum_ok(file_path, remote_checksum)
=================================================
Checksum wrong for Kelmarsh_WT_static.csv af3a038f0f7fddfc1608ad0bfc8cf5ba != bf3a038f0f7fddfc1608ad0bfc8cf5ba
=================================================
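The comparison described above can be sketched with the standard library alone. This is a minimal illustration, not the farmnet implementation: the name `md5_matches` is hypothetical, the file is read in chunks (an assumption made to keep memory use flat), and the sketch returns a boolean for convenience, whereas `checksum_ok` is documented to return None.

```python
import hashlib
from pathlib import Path


def md5_matches(filename, remote_checksum, chunk_size=1 << 20):
    """Hypothetical helper: compare a file's MD5 hex digest to an expected value."""
    md5 = hashlib.md5()
    with open(Path(filename), "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    local_checksum = md5.hexdigest()
    if local_checksum != remote_checksum:
        # Warn on the console, similar in spirit to the documented behaviour.
        print(f"Checksum wrong for {Path(filename).name}: "
              f"{local_checksum} != {remote_checksum}")
        return False
    return True
```

A caller that needs to react to a mismatch can branch on the returned boolean instead of parsing console output.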
- farmnet.data.download.download_from_url(url: str, download_path: str | Path)
Downloads a file from a given URL and saves it to the specified path with a progress bar.
This function performs a streaming HTTP GET request to download a file from the provided URL. It saves the file to the given local path while displaying a progress bar using tqdm.
- Parameters:
url (str) – The URL of the file to download.
download_path (str | pathlib.Path) – The local path where the downloaded file will be saved. Can be a string or a pathlib.Path.
- Returns:
None
- Return type:
None
- Raises:
HTTPError – If the HTTP request returned an unsuccessful status code.
OSError – If the file cannot be written to the specified path.
Example
>>> from pathlib import Path
>>> from farmnet.data.download import download_from_url
>>> import os
>>> import requests
>>> path = Path("Kelmarsh_WT_static.csv")
>>> download_from_url(
...     "https://zenodo.org/api/records/8252025/files/Kelmarsh_WT_static.csv/content",
...     path
... )
>>> path.exists()
True
>>> os.remove(path)
>>> path = Path("Kelmarsh_WT_static.csv")
>>> try:
...     download_from_url(
...         "https://zenodo.ch/api/records/8252025/files/Kelmarsh_WT_static.csv/content",
...         path
...     )
... except requests.exceptions.ConnectionError:
...     pass
>>> path.exists()
False
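The streaming pattern behind this function can be illustrated without third-party packages. The sketch below deliberately swaps in `urllib.request` for the HTTP GET request and a plain byte counter on stderr for the tqdm progress bar, so it is not the farmnet implementation; `stream_to_file` and its `chunk_size` parameter are hypothetical names.

```python
import sys
import urllib.request
from pathlib import Path


def stream_to_file(url, download_path, chunk_size=8192):
    """Hypothetical stand-in: copy a URL to disk chunk by chunk."""
    download_path = Path(download_path)
    # urlopen raises before the output file is opened, so a failed
    # request leaves no partial file behind.
    with urllib.request.urlopen(url) as response:
        total = int(response.headers.get("Content-Length", 0))
        written = 0
        with open(download_path, "wb") as fh:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                fh.write(chunk)
                written += len(chunk)
                if total:
                    # Minimal progress report in place of a progress bar.
                    print(f"\r{written}/{total} bytes", end="", file=sys.stderr)
    return written
```

Opening the output file only after the connection succeeds is one way to get the behaviour shown in the second example, where the target path does not exist after a failed request.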
- farmnet.data.download.get_data_home_folder(data_home_folder: str | Path = None) → Path
Returns the absolute path to the data home folder.
If no path is provided, the function uses the WEID_DATA_PATH environment variable. If that is not set, it defaults to ~/.weid_data. The directory is created if it does not exist.
- Parameters:
data_home_folder (str or Path, optional) – Optional custom path to the data folder.
- Returns:
Absolute path to the data home folder.
- Return type:
Path
Example
>>> from farmnet.data.download import get_data_home_folder
>>> from pathlib import Path
>>> import pathlib
>>> import os
>>> path = get_data_home_folder()
>>> isinstance(path, Path)
True
>>> path.exists()
True
>>> test_path = pathlib.Path().resolve() / Path("test_folder")
>>> path = get_data_home_folder(test_path)
>>> path == test_path
True
>>> path.exists()
True
>>> os.rmdir(test_path)
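The lookup order described above (explicit argument, then WEID_DATA_PATH, then ~/.weid_data) might be implemented roughly as follows. This is a sketch under that reading of the description; `resolve_data_home` is a hypothetical name, and details such as `expanduser()` and `parents=True` are assumptions.

```python
import os
from pathlib import Path


def resolve_data_home(data_home_folder=None):
    """Hypothetical sketch of the documented fallback chain."""
    if data_home_folder is None:
        # Fall back to the environment variable, then to the default folder.
        data_home_folder = os.environ.get(
            "WEID_DATA_PATH", Path.home() / ".weid_data"
        )
    path = Path(data_home_folder).expanduser().resolve()
    # Create the directory (and any missing parents) if needed.
    path.mkdir(parents=True, exist_ok=True)
    return path
```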
- farmnet.data.download.substring_match(include: list[str], string: str) → str | None
Checks whether any of the substrings in a list are present in a given string.
Iterates over the list of substrings and returns the input string if any substring is found within it. Returns None if no matches are found.
- Parameters:
include (list[str]) – List of substrings to check for.
string (str) – The target string to search within.
- Returns:
The original string if a match is found; otherwise, None.
- Return type:
str | None
Example
>>> from farmnet.data.download import substring_match
>>> include = ['Kelmarsh_WT_static.csv', 'Kelmarsh_SCADA_2022_4457.zip']
>>> substring_match(include, 'Kelmarsh_SCADA_2022_4457.zip')
'Kelmarsh_SCADA_2022_4457.zip'
>>> substring_match(include, 'Test.zip')
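Because the check is a substring test rather than an exact comparison, a short fragment is enough to select a file. The sketch below restates the described behaviour in a few lines; `substring_match_sketch` is a hypothetical name and not the library function.

```python
def substring_match_sketch(include, string):
    """Hypothetical restatement: return `string` if any fragment occurs in it."""
    for fragment in include:
        if fragment in string:
            return string
    return None


# A partial fragment matches the full file name.
print(substring_match_sketch(["SCADA"], "Kelmarsh_SCADA_2022_4457.zip"))
# No fragment occurs in the name, so the result is None.
print(substring_match_sketch(["SCADA"], "Test.zip"))
```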
- farmnet.data.download.zenodo_download(id: str | int, download_folder: str | Path, force_download: bool = False, include: list[str] | None = None, exclude: list[str] | None = None)
Downloads files from a Zenodo record.
Retrieves the files associated with a Zenodo record by ID and downloads them to the specified folder. Files can be filtered using inclusion or exclusion patterns. A file that already exists locally is skipped unless force_download is set to True.
- Parameters:
id (str or int) – The Zenodo record ID.
download_folder (str or Path) – Path to the folder where files will be downloaded.
force_download (bool, optional) – If True, re-download files even if they already exist.
include (list[str], optional) – List of substrings; only files that match one of them will be downloaded.
exclude (list[str], optional) – List of substrings; files that match one of them will be skipped.
- Raises:
requests.HTTPError – If the Zenodo API request fails.
ValueError – If the checksum of a downloaded file does not match the remote checksum.
Example
>>> import os
>>> from pathlib import Path
>>> from farmnet.utils import getenv
>>> from farmnet.data.download import zenodo_download
>>> download_path = Path(getenv("DOWNLOAD_PATH", "./data"))
>>> file_name = Path("Kelmarsh_WT_static.csv")
>>> zenodo_download(
...     8252025,
...     download_path,
...     force_download=False,
...     include=["Kelmarsh_WT_static.csv"],
... )
>>> file_path = download_path / file_name
>>> file_path.exists()
True
>>> os.remove(file_path)
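The selection rules (include, exclude, and skipping existing files) can be illustrated offline, without contacting Zenodo. The sketch below is one plausible reading of the description, not the farmnet implementation: `select_files` is a hypothetical helper, and applying `exclude` after `include` is an assumption.

```python
from pathlib import Path


def select_files(names, download_folder, force_download=False,
                 include=None, exclude=None):
    """Hypothetical helper: decide which record files would be fetched."""
    folder = Path(download_folder)
    selected = []
    for name in names:
        # Keep only names containing at least one include fragment.
        if include is not None and not any(s in name for s in include):
            continue
        # Drop names containing any exclude fragment.
        if exclude is not None and any(s in name for s in exclude):
            continue
        # Skip files already on disk unless a re-download is forced.
        if (folder / name).exists() and not force_download:
            continue
        selected.append(name)
    return selected
```

Under these assumptions, calling the helper with `include=["static"]` on a folder that already holds Kelmarsh_WT_static.csv returns an empty list, and the same call with `force_download=True` returns that file name again.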