
Jlonge4/outputs

This model is a fine-tuned version of microsoft/Phi-3.5-mini-instruct on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6456

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding configuration follows the list):

  • learning_rate: 0.0009
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 15
  • training_steps: 150
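
As a minimal sketch, here is how these settings map onto the transformers TrainingArguments API. Only the values come from the list above; the output_dir name is a placeholder, and optim="adamw_torch" is an assumption consistent with Adam using betas=(0.9, 0.999) and epsilon=1e-08 (those are its defaults):

```python
from transformers import TrainingArguments

# Values copied from the hyperparameter list above; output_dir is a
# placeholder and optim="adamw_torch" is an assumption matching
# Adam with betas=(0.9, 0.999) and epsilon=1e-08.
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=9e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_steps=15,
    max_steps=150,
)
```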

Training results

Training Loss   Epoch     Step   Validation Loss
1.8177          1.1111    5      1.4564
1.1218          2.2222    10     0.9293
0.8806          3.3333    15     0.8302
0.6797          4.4444    20     0.8546
0.4134          5.5556    25     0.9876
0.1811          6.6667    30     1.2165
0.1620          7.7778    35     1.3668
0.1100          8.8889    40     1.5960
0.0843          10.0000   45     1.4322
0.0495          11.1111   50     1.4248
0.0338          12.2222   55     1.4805
0.0240          13.3333   60     1.6548
0.0365          14.4444   65     1.6456

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • PyTorch 2.4.0+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1
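
Loading the adapter

Since this is a PEFT adapter rather than a full model, it must be loaded on top of the base model. A minimal loading sketch, assuming the adapter weights are published under the repo id Jlonge4/outputs shown on this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/Phi-3.5-mini-instruct"

# Load the base model and tokenizer, then attach the fine-tuned adapter.
base = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "Jlonge4/outputs")

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```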
