
Hugging Face provides a number of packages for machine learning applications. The full list is provided in the Hugging Face Docs. To use these packages, install them into a Python virtual environment or a conda environment. Instructions for two example packages, Transformers and Datasets, are provided below.
Installation
Python Virtual Environment
- Create and activate your virtualenv as explained on the Python Installs page.
- Set the following path to specify a location (other than your home directory) to store cached datasets and models. The path should be in the /projectnb or /rprojectnb directory for your project, preferably in a subfolder that you’ve created:
(my_newenv) [rcs@scc1 ~] export HF_HOME=/projectnb/YOUR_PROJECT/YOUR_FOLDER/hf_cache
For further details on dataset cache setup please see the Hugging Face Docs.
- Install Transformers and Datasets with pip:
(my_newenv) [rcs@scc1 ~] pip install transformers datasets
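To confirm that the cache location set above is visible to Python, a short check such as the following may help. It is a minimal sketch using only the standard library. The function name `check_hf_home` is hypothetical and not part of any Hugging Face package. It assumes the goal stated above, which is to keep the cache out of your home directory.

```python
import os
from pathlib import Path


def check_hf_home(env=None, home=None):
    """Return (ok, message) saying whether HF_HOME points outside the home directory.

    `env` and `home` are optional overrides (hypothetical parameters, for testing);
    by default the real environment and the real home directory are used.
    """
    env = os.environ if env is None else env
    home = (Path.home() if home is None else Path(home)).resolve()

    value = env.get("HF_HOME")
    if not value:
        return False, "HF_HOME is not set in this session"

    cache = Path(value).expanduser().resolve()
    # The cache is "inside home" if it is the home directory or any descendant of it.
    if cache == home or home in cache.parents:
        return False, f"HF_HOME ({cache}) is inside your home directory"
    return True, f"HF_HOME is set to {cache}"


if __name__ == "__main__":
    ok, message = check_hf_home()
    print(message)
```

Run it in the same shell session where the `export` command was issued, since an exported variable only applies to that session and the processes started from it.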
Conda Environment
- Create and activate your conda env as explained on the Miniconda Installs page.
- Set the following path to specify a location (other than your home directory) to cache downloads as described above:
(my_newenv) [rcs@scc1 ~] export HF_HOME=/projectnb/YOUR_PROJECT/YOUR_FOLDER/hf_cache
For further details on dataset cache setup please see the Hugging Face Docs.
- Install Transformers and Datasets with conda install:
(my_newenv) [rcs@scc1 ~] conda install -c huggingface -c conda-forge transformers datasets
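After installing with either pip or conda, it can be useful to confirm that both packages are present in the active environment. The sketch below is one possible way to do that with the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical, and the package names are the two installed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "datasets")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the environment running this script.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed in this environment'}")
```

If a package is reported as missing, the most likely cause is that the script was run outside the environment in which the install command was issued.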
Setting HF_HOME in Python
The cache directory can also be set from within a Python script. To do so, insert the following lines of code into your script before any Hugging Face libraries are imported:
import os
os.environ['HF_HOME'] = '/projectnb/YOUR_PROJECT/YOUR_FOLDER/hf_cache'
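A variation on the lines above is to set `HF_HOME` only when it has not already been exported in the shell, so that a value chosen at the command line takes precedence. This is a minimal sketch of that design choice, not a required pattern. The helper name `set_hf_home` is hypothetical, and the path is the same placeholder used above.

```python
import os

# Placeholder path from the instructions above; replace with your own project folder.
CACHE_DIR = "/projectnb/YOUR_PROJECT/YOUR_FOLDER/hf_cache"


def set_hf_home(path, env=None):
    """Set HF_HOME to `path` unless it is already defined; return the active value."""
    env = os.environ if env is None else env
    # setdefault leaves an existing value (e.g. one exported in the shell) untouched.
    return env.setdefault("HF_HOME", path)


active = set_hf_home(CACHE_DIR)
print(f"HF_HOME = {active}")

# Import Hugging Face libraries only after this point, for example:
# import transformers
# import datasets
```

Use the unconditional assignment shown earlier instead if the script should always override whatever the shell has set.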