
Large Language Models: Fine-tuning ChatGLM-6B

Fine-tuning ChatGLM-6B with LoRA

https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/simple_thu_chatglm6b

The repository is by the same author as the document-search project covered in the previous post.

thuglm

The LLM code has three parts: the model configuration, the model itself, and model quantization.

The configuration mainly holds data- and model-related parameters:


````python
from transformers.configuration_utils import PretrainedConfig


class ChatGLMConfig(PretrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`~ChatGLMModel`].
    It is used to instantiate a ChatGLM model according to the specified arguments, defining the model
    architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of
    the ChatGLM-6B [THUDM/ChatGLM-6B](https://huggingface.co/THUDM/chatglm-6b) architecture.

    Configuration objects inherit from [`PretrainedConfig`] and can be used
    to control the model outputs. Read the documentation from [`PretrainedConfig`]
    for more information.

    Args:
        vocab_size (`int`, *optional*, defaults to 150528):
            Vocabulary size of the ChatGLM-6B model. Defines the number of different tokens that can be
            represented by the `inputs_ids` passed when calling [`~ChatGLMModel`] or [`~TFChatGLMModel`].
        hidden_size (`int`, *optional*, defaults to 4096):
            Dimension of the encoder layers and the pooler layer.
        num_hidden_layers (`int`, *optional*, defaults to 28):
            Number of hidden layers in the Transformer encoder.
        num_attention_heads (`int`, *optional*, defaults to 32):
            Number of attention heads for each attention layer in the Transformer encoder.
        inner_hidden_size (`int`, *optional*, defaults to 16384):
            Dimension of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        max_sequence_length (`int`, *optional*, defaults to 512):
            The maximum sequence length that this model might ever be used with.
            Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
        layernorm_epsilon (`float`, *optional*, defaults to 1e-5):
            The epsilon used by the layer normalization layers.
        use_cache (`bool`, *optional*, defaults to `True`):
            Whether the model should return the last key/values attentions (not used by all models).

    Example:

    ```python
    >>> from configuration_chatglm import ChatGLMConfig
    >>> from modeling_chatglm import ChatGLMModel

    >>> # Initializing a ChatGLM-6B THUDM/ChatGLM-6B style configuration
    >>> configuration = ChatGLMConfig()

    >>> # Initializing a model from the THUDM/ChatGLM-6B style configuration
    >>> model = ChatGLMModel(configuration)

    >>> # Accessing the model configuration
    >>> configuration = model.config
    ```
    """

    model_type = "chatglm"

    def __init__(
        self,
        vocab_size=150528,
        hidden_size=4096,
        num_layers=28,
        num_attention_heads=32,
        layernorm_epsilon=1e-5,
        use_cache=False,
        bos_token_id=150004,
        eos_token_id=150005,
        mask_token_id=150000,
        gmask_token_id=150001,
        pad_token_id=0,
        max_sequence_length=2048,
        inner_hidden_size=16384,
        position_encoding_2d=True,
        quantization_bit=0,
        pre_seq_len=None,
        prefix_projection=False,
        **kwargs,
    ):
        self.num_layers = num_layers
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        self.max_sequence_length = max_sequence_length
        self.layernorm_epsilon = layernorm_epsilon
        self.inner_hidden_size = inner_hidden_size
        self.use_cache = use_cache
        self.bos_token_id = bos_token_id
        self.eos_token_id = eos_token_id
        self.pad_token_id = pad_token_id
        self.mask_token_id = mask_token_id
        self.gmask_token_id = gmask_token_id
        self.position_encoding_2d = position_encoding_2d
        self.quantization_bit = quantization_bit
        self.pre_seq_len = pre_seq_len
        self.prefix_projection = prefix_projection

        super().__init__(
            pad_token_id=pad_token_id,
            bos_token_id=bos_token_id,
            eos_token_id=eos_token_id,
            **kwargs,
        )
````

The model body is largely similar to Hugging Face's open-source GPT-2 implementation.
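To get a concrete sense of that structure, the checkpoint can be loaded through the standard transformers auto classes and printed. This is a minimal sketch that assumes the upstream THUDM/chatglm-6b checkpoint (the repo instead points at its local `thuglm` copy of the same files); the block names and layer count reflect the ChatGLM-6B release.

```python
from transformers import AutoModel

# Minimal sketch: load ChatGLM-6B and inspect its decoder-only layout, which
# mirrors Hugging Face's GPT-2 (token embedding -> stacked transformer blocks
# with self-attention + MLP -> final layer norm -> LM head).
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half()
print(model)  # prints the stack of 28 GLMBlock layers set by num_layers above
```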

The quantization code mainly lowers the hardware requirements for inference so that more people can try the model; it can be applied as the last step.
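As a rough sketch of how that looks in practice (following the loading pattern from the THUDM/ChatGLM-6B README rather than this repo's own scripts), the remote model code exposes a `quantize(bits)` method that is called after loading in fp16:

```python
from transformers import AutoModel

# Sketch: load in fp16, then quantize the weights to 4 bits so inference fits
# on smaller GPUs; quantize(8) is the 8-bit variant.
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()
    .quantize(4)
    .cuda()
)
```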

Fine-tuning

  • Data: the Chinese Alpaca instruction dataset
  • peft to reduce resource usage (a minimal LoRA sketch follows this list)
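The sketch below shows the general peft/LoRA pattern; the rank, alpha, and dropout values are placeholders, not necessarily what the repo uses. Because ChatGLM-6B fuses the Q/K/V projections into a single `query_key_value` linear layer, that is the usual `target_modules` choice.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModel

# Load the base model in fp16 and wrap it with LoRA adapters.
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half()

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # low-rank dimension (placeholder value)
    lora_alpha=32,                       # scaling factor (placeholder value)
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # fused Q/K/V projection in ChatGLM-6B
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights require gradients
```

The wrapped model can then be trained as usual on the Chinese Alpaca data; only the small adapter matrices are updated, which is what keeps memory use low.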

Inference
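A hedged sketch of inference after LoRA fine-tuning: attach the saved adapter to the base model with peft, then call the `chat` helper provided by the ChatGLM remote code (peft forwards the attribute to the wrapped model). The adapter path `output/lora_adapter` is hypothetical.

```python
import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

# Load the fp16 base model, attach the trained LoRA adapter, and chat.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
base = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = PeftModel.from_pretrained(base, "output/lora_adapter").eval()  # hypothetical adapter path

with torch.no_grad():
    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)
```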

Model parallelism

  • https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/Chatglm6b_ModelParallel
  • https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/Chatglm6b_ModelParallel_ptuning
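For comparison with the hand-written model-parallel code in those two repos, a simpler (but less controllable) way to split the model across several GPUs is accelerate's automatic device map. This is only a sketch, not how those repos do it, and depending on the transformers/accelerate versions the custom ChatGLM code may need a hand-crafted device map instead.

```python
from transformers import AutoModel, AutoTokenizer

# Sketch: let accelerate place ChatGLM-6B's layers across all visible GPUs.
# Requires the `accelerate` package to be installed.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    trust_remote_code=True,
    device_map="auto",   # spread layers over available GPUs
    torch_dtype="auto",
)
```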