Training Large Language Model
*****************************

.. versionadded:: 2.8.8

Oracle Cloud Infrastructure (OCI) `Data Science Jobs (Jobs) `_
provides fully managed infrastructure for training large language models at scale.
This page shows an example of fine-tuning the `Llama 2 `_ model.
For more details on the APIs, see :doc:`../jobs/run_pytorch_ddp`.

.. include:: ../_template/distributed_training_policies.rst

The `llama-recipes `_ repository contains example code to fine-tune the Llama 2 model.
The example `fine-tuning script `_ supports both full-parameter fine-tuning
and `Parameter-Efficient Fine-Tuning (PEFT) `_.
With ADS, you can start the training job by taking the source code directly from GitHub, with no code changes.

Access the Pre-Trained Model
============================

To fine-tune the model, you first need access to the pre-trained model.
The pre-trained model can be obtained from `Meta `_ or `HuggingFace `_.
In this example, we use an `access token `_ to download the pre-trained model from HuggingFace
(by setting the ``HUGGING_FACE_HUB_TOKEN`` environment variable).

Fine-Tuning the Model
=====================

You can define the training job with the ADS Python APIs or YAML.
Here are examples of full-parameter fine-tuning of the `7B model `_ using `FSDP `_.

.. include:: ../jobs/tabs/llama2_full.rst

You can create and start the job run with an API call or the ADS CLI.

.. include:: ../jobs/tabs/run_job.rst

The job run will:

* Set up the PyTorch conda environment and install additional dependencies.
* Fetch the source code from GitHub and check out the specified commit.
* Run the training script with the specified arguments, which includes downloading the model and dataset.
* Save the outputs to OCI Object Storage once the training finishes.

Note that the training command does not need to specify the number of nodes or the number of GPUs.
ADS automatically configures them based on the ``replica`` and ``shape`` you specified.

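
To make the relationship between ``replica``, ``shape`` and the launcher settings concrete,
here is a minimal sketch of how a replica count and a per-node GPU count could map onto
``torchrun`` options. This is not the ADS implementation. The helper names
``torchrun_topology_args`` and ``world_size`` are hypothetical, and the number of GPUs per node
is passed in as an assumption rather than looked up from the shape.

.. code-block:: python

    from typing import List


    def torchrun_topology_args(replica: int, gpus_per_node: int) -> List[str]:
        """Return torchrun options describing the cluster layout (illustration only)."""
        if replica < 1 or gpus_per_node < 1:
            raise ValueError("replica and gpus_per_node must be positive integers")
        # --nnodes is the number of nodes; --nproc_per_node is the number of
        # worker processes per node, typically one per GPU.
        return [f"--nnodes={replica}", f"--nproc_per_node={gpus_per_node}"]


    def world_size(replica: int, gpus_per_node: int) -> int:
        """Total number of training processes across all nodes."""
        return replica * gpus_per_node


    # Assumed example values: 2 replicas, each on a shape with 2 GPUs.
    print(" ".join(torchrun_topology_args(2, 2)))
    print(world_size(2, 2))

Because these values follow from the infrastructure settings, the training command itself can stay
identical when you scale the job up or down.
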
The fine-tuning runs on the `samsum `_ dataset by default. You can also `add your custom datasets `_.

Once the fine-tuning is finished, the checkpoints are saved to the OCI Object Storage bucket you specified.
You can `load the FSDP checkpoints for inferencing `_.

The same training script also supports Parameter-Efficient Fine-Tuning (PEFT).
You can change the ``command`` to the following for PEFT with `LoRA `_.
Note that for PEFT, the fine-tuned weights are stored in the location specified by ``--output_dir``,
while for full-parameter fine-tuning, the checkpoints are stored in the location specified by
``--dist_checkpoint_root_folder`` and ``--dist_checkpoint_folder``.

.. code-block:: bash

    torchrun llama_finetuning.py --enable_fsdp --use_peft --peft_method lora \
    --pure_bf16 --batch_size_training 1 \
    --model_name meta-llama/Llama-2-7b-hf --output_dir /home/datascience/outputs
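
The difference between the two modes comes down to which output flags are passed. The following
sketch assembles both command variants so the distinction is easy to see. It is illustrative only:
the helper ``build_finetune_command`` is hypothetical, the checkpoint folder name ``fine-tuned`` is
an assumed placeholder, and the full-parameter variant assumes the same base flags as the PEFT
command shown above.

.. code-block:: python

    import shlex


    def build_finetune_command(
        use_peft: bool,
        model_name: str = "meta-llama/Llama-2-7b-hf",
        output_root: str = "/home/datascience/outputs",
        checkpoint_folder: str = "fine-tuned",  # assumed placeholder name
    ) -> str:
        """Assemble the torchrun command for PEFT or full-parameter fine-tuning."""
        args = ["torchrun", "llama_finetuning.py", "--enable_fsdp"]
        if use_peft:
            args += ["--use_peft", "--peft_method", "lora"]
        args += ["--pure_bf16", "--batch_size_training", "1", "--model_name", model_name]
        if use_peft:
            # PEFT: fine-tuned weights go to --output_dir.
            args += ["--output_dir", output_root]
        else:
            # Full-parameter: checkpoints go to the root folder plus sub-folder.
            args += [
                "--dist_checkpoint_root_folder", output_root,
                "--dist_checkpoint_folder", checkpoint_folder,
            ]
        return shlex.join(args)


    print(build_finetune_command(use_peft=True))
    print(build_finetune_command(use_peft=False))

In both cases the output location should sit under the directory that the job copies to
OCI Object Storage, otherwise the results would not be saved when the run finishes.
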