# Model for testing RM scripts

This model is just GPT-2 base (~100M parameters) with a value head appended, untrained. Use it for debugging RLHF setups (a smaller variant could be made as well). Its predictions should be essentially random.

Load the model as follows:

```
from transformers import AutoModelForSequenceClassification

rm = AutoModelForSequenceClassification.from_pretrained("natolambert/gpt2-dummy-rm")
```

or as a pipeline:

```
from transformers import pipeline

reward_pipe = pipeline(
    "text-classification",
    model="natolambert/gpt2-dummy-rm",
    # revision=args.model_revision,
    # model_kwargs={"load_in_8bit": True, "device_map": {"": current_device}, "torch_dtype": torch.float16},
)
reward_pipeline_kwargs = {}
# texts: a list of strings to score
pipe_outputs = reward_pipe(texts, **reward_pipeline_kwargs)
```
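
For a quick sanity check without the pipeline, the raw logits can be read directly from the loaded model. The snippet below is a minimal sketch: the prompt text and the tokenizer usage are illustrative, not part of the model card, and the printed value should be essentially random since the head is untrained.

```
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("natolambert/gpt2-dummy-rm")
rm = AutoModelForSequenceClassification.from_pretrained("natolambert/gpt2-dummy-rm")

# Score a single (illustrative) completion; no padding needed for batch size 1.
inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    reward = rm(**inputs).logits[0]
print(reward)  # arbitrary value, since the value head is untrained
```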