When fine-tuning large language models (LLaMA, Qwen, etc.), the model is often loaded as follows to reduce GPU memory usage:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    use_cache=False,
    device_map="cuda:0",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config)

I tried a number of fixes found online, but the errors persisted. The messages looked roughly like this:

RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. ...
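Before reinstalling anything, it can help to confirm whether the CUDA runtime is even visible to the dynamic loader, since that is what the `python -m bitsandbytes` diagnostic ultimately checks. A minimal, hypothetical helper (the library names are common CUDA runtime names, not an exhaustive list):

```python
from ctypes.util import find_library

def locate_cuda_runtime():
    """Return the first CUDA runtime library the loader can see, or None.

    On Linux, bitsandbytes resolves libcudart via LD_LIBRARY_PATH;
    on Windows, the DLL is named like cudart64_*.dll and found via PATH.
    """
    for name in ("cudart", "cudart64_110", "cudart64_12"):
        path = find_library(name)
        if path:
            return path
    return None

hint = locate_cuda_runtime()
print(hint or "CUDA runtime not on loader path; check LD_LIBRARY_PATH (Linux) or PATH (Windows)")
```

If this prints the fallback message, the failure is an environment problem rather than a bitsandbytes bug, and adjusting the library path may be enough.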


AttributeError: module 'bitsandbytes.nn' has no attribute 'Linear4bit'

The `load_in_4bit` and `load_in_8bit` arguments are deprecated
and will be removed in the future versions.
Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
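That deprecation warning is separate from the crash: it only means the quantization flags should no longer be passed directly to `from_pretrained`, but wrapped in a `BitsAndBytesConfig`, as the loading snippet at the top already does. A minimal sketch of the migration (`model_dir` is a placeholder path):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Deprecated style: flag passed straight to from_pretrained
# model = AutoModelForCausalLM.from_pretrained(model_dir, load_in_4bit=True)

# Current style: all quantization options live in a config object
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    quantization_config=quantization_config,
)
```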

AttributeError: 'NoneType' object has no attribute 'cquantize_blockwise_bf16_nf4'

The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.

The fix that finally worked (on Windows):

pip uninstall bitsandbytes
pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.0-py3-none-win_amd64.whl

Other versions are available for download here:

https://github.com/jllllll/bitsandbytes-windows-webui/releases/
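After reinstalling, a quick sanity check confirms the new wheel actually shipped the GPU kernels. This sketch only probes for the `Linear4bit` attribute from the error message above; attribute names may differ across bitsandbytes versions:

```python
def bnb_gpu_ok():
    """Return True if bitsandbytes imports cleanly with the 4-bit layers present."""
    try:
        import bitsandbytes as bnb
    except Exception:
        return False  # not installed, or CUDA setup failed at import time
    # Linear4bit is the attribute the AttributeError above complained about
    return hasattr(bnb.nn, "Linear4bit")

print("bitsandbytes GPU build OK" if bnb_gpu_ok() else "still broken")
```

Running `python -m bitsandbytes` afterwards gives a fuller diagnostic if this still reports a problem.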

Finally:

I post related design content from time to time, including but not limited to: signal processing, communication simulation, algorithm design, MATLAB App Designer GUI design, Simulink simulation... Hope this helps!
