TensorBoard
TensorBoard helps you visualize your experiments. You bring up a TensorBoard
session on your workstation and point it to the directory that contains the TensorBoard logs.
Prerequisite
Object storage bucket
Access to Object Storage bucket from your workstation
ocifs version 1.1.0 and above
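The version prerequisite can be checked from Python before going further. The following is a minimal sketch that uses only the standard library; the helper names `parse_version`, `meets_minimum`, and `check_package` are hypothetical and not part of ocifs, and the comparison only looks at the leading numeric parts of a version string.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.1.0' into (1, 1, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.1' and '1.1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name, minimum):
    """Return (installed_version_or_None, ok) for a distribution name."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, meets_minimum(installed, minimum)
```

Calling `check_package("ocifs", "1.1.0")` in the environment you plan to use returns the installed version and whether it satisfies the prerequisite, or `(None, False)` if ocifs is not installed.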
Setting up local environment
tensorboard must be installed in a dedicated conda environment or virtual environment. Prepare an environment yaml file for creating the conda environment with the following command -
cat <<EOF > tensorboard-dep.yaml
dependencies:
  - python=3.8
  - pip
  - pip:
    - ocifs
    - tensorboard
name: tensorboard
EOF
Create the conda environment from the yaml file generated in the preceding step -
conda env create -f tensorboard-dep.yaml
This will create a conda environment called tensorboard. Activate the conda environment by running -
conda activate tensorboard
Viewing logs from your experiments
To launch a TensorBoard session on your local workstation, run -
export OCIFS_IAM_TYPE=api_key # If you are using resource principal, set resource_principal
tensorboard --logdir oci://my-bucket@my-namespace/path/to/logs
This brings up the TensorBoard app on your workstation. Access TensorBoard at http://localhost:6006/
Note: The logs can take a few minutes to appear on the TensorBoard dashboard.
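If you prefer to start TensorBoard from a script, the launch can be described as an environment plus an argument list. This is a sketch under assumptions: `tensorboard_launch` and `KNOWN_IAM_TYPES` are hypothetical names, the environment variable is the `OCIFS_IAM_TYPE` used by the launch commands in this document, and only the two authentication values mentioned here are accepted.

```python
import os

# Authentication values named in this document; ocifs may accept others.
KNOWN_IAM_TYPES = ("api_key", "resource_principal")


def tensorboard_launch(logdir, iam_type="api_key"):
    """Return (env, argv) for launching TensorBoard against a log directory."""
    if iam_type not in KNOWN_IAM_TYPES:
        raise ValueError(f"unexpected iam_type: {iam_type!r}")
    # Copy the current environment and add the ocifs authentication type.
    env = dict(os.environ)
    env["OCIFS_IAM_TYPE"] = iam_type
    argv = ["tensorboard", "--logdir", logdir]
    return env, argv
```

The returned pair can be handed to `subprocess.run(argv, env=env)` from inside the tensorboard conda environment. Keeping the function free of side effects makes it easy to inspect the command before anything is started.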
Writing TensorBoard logs to Object Storage
Prerequisite
tensorboard is installed.
ocifs version is 1.1.0 and above.
oracle-ads version is 2.6.0 and above.
PyTorch
You can write the logs from your PyTorch experiments directly to object storage and view them in real time on TensorBoard running on your local workstation. Here is an example of running a PyTorch experiment and writing TensorBoard logs from an OCI Data Science Notebook
Create or open an existing OCI Data Science Notebook session
Run odsc conda install -s pytorch110_p37_cpu_v1 on terminal inside the notebook session
Activate conda environment -
conda activate /home/datascience/conda/pytorch110_p37_cpu_v1
Install TensorBoard -
python3 -m pip install tensorboard
Upgrade to latest ocifs -
python3 -m pip install ocifs --upgrade
Create a notebook and select pytorch110_p37_cpu_v1 kernel
Copy the following code into a cell and update the object storage path in the code snippet
# Reference: https://github.com/pytorch/tutorials/blob/master/recipes_source/recipes/tensorboard_with_pytorch.py
import torch
from torch.utils.tensorboard import SummaryWriter

# Write event files directly to object storage (update the path for your bucket).
writer = SummaryWriter("oci://my-bucket@my-namespace/path/to/logs")

# Synthetic data for a one-variable linear regression.
x = torch.arange(-5, 5, 0.1).view(-1, 1)
y = -5 * x + 0.1 * torch.randn(x.size())

model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def train_model(epochs):
    for epoch in range(epochs):
        y1 = model(x)
        loss = criterion(y1, y)
        # Log the training loss for this epoch.
        writer.add_scalar("Loss/train", loss, epoch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

train_model(10)
writer.flush()
writer.close()
Run the cell
View the logs from your workstation while the experiment is in progress by launching TensorBoard with the following command -
OCIFS_IAM_TYPE=api_key tensorboard --logdir "oci://my-bucket@my-namespace/path/to/logs"
For more possibilities with TensorBoard and PyTorch check this link
TensorFlow
Currently TensorFlow cannot write directly to object storage. However, you can create the logs in a local directory and then copy them over to object storage, where they can be viewed from TensorBoard running on your local workstation.
When you run an OCI Data Science Job with ads.jobs.NotebookRuntime or ads.jobs.GitRuntime, all the output is automatically copied over to the configured object storage bucket.
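Before copying a local log directory to object storage, it can help to confirm that it actually contains TensorBoard event files, whose names begin with `events.out.tfevents.`. The sketch below uses only the standard library; `find_event_files` is a hypothetical helper name.

```python
from pathlib import Path


def find_event_files(log_dir):
    """Return the sorted paths of TensorBoard event files below log_dir."""
    root = Path(log_dir)
    # Event files may sit in nested run folders, so search recursively.
    return sorted(
        p for p in root.rglob("events.out.tfevents.*") if p.is_file()
    )
```

An empty result for a directory such as `tflogs` suggests that the experiment did not write any logs there, so there would be nothing for TensorBoard to display after the copy.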
OCI Data Science Notebook
Here is an example of running a TensorFlow experiment in an OCI Data Science Notebook and then viewing the logs from TensorBoard
Create or open an existing notebook session.
Download notebook - https://raw.githubusercontent.com/mayoor/stats-ml-exps/master/tensorboard_tf.ipynb
!wget https://raw.githubusercontent.com/mayoor/stats-ml-exps/master/tensorboard_tf.ipynb
Run odsc conda install -s tensorflow27_p37_cpu_v1 on terminal to install the TensorFlow 2.7 environment.
Open the downloaded notebook - tensorboard_tf.ipynb
Select tensorflow27_p37_cpu_v1 kernel.
Run all cells.
Copy the TensorBoard logs folder tflogs to object storage using oci-cli -
oci os object bulk-upload -bn "<my-bucket>" -ns "<my-namespace>" --src-dir tflogs --prefix myexperiment/tflogs/
View the logs from your workstation once the logs are uploaded by launching TensorBoard with the following command -
OCIFS_IAM_TYPE=api_key tensorboard --logdir "oci://my-bucket@my-namespace/myexperiment/tflogs/"
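The copy step in this example places each file from the local log folder under a prefix in the bucket. The following sketch shows that mapping without uploading anything. It assumes that each object name is the prefix followed by the file's path relative to the source folder; `plan_uploads` is a hypothetical helper, not part of oci-cli or ocifs.

```python
from pathlib import Path


def plan_uploads(src_dir, bucket, namespace, prefix):
    """Map each file under src_dir to the oci:// URI it would be stored at."""
    root = Path(src_dir)
    clean_prefix = prefix.strip("/")
    plan = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Object names use forward slashes regardless of the local platform.
        relative = path.relative_to(root).as_posix()
        object_name = f"{clean_prefix}/{relative}" if clean_prefix else relative
        plan.append((path, f"oci://{bucket}@{namespace}/{object_name}"))
    return plan
```

Printing the result of `plan_uploads("tflogs", "my-bucket", "my-namespace", "myexperiment/tflogs/")` is a quick way to check that the uploaded files will sit below the directory passed to `--logdir`.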
OCI Data Science Jobs
Here is an example of running a TensorFlow experiment in OCI Data Science Jobs and then viewing the logs from TensorBoard
Run the following code to submit a notebook to OCI Data Science Job. You can run this code snippet from your local workstation or an OCI Data Science Notebook session. You need oracle-ads version >= 2.6.0.
from ads.jobs import Job, DataScienceJob, NotebookRuntime

# Define an OCI Data Science job to run a jupyter Python notebook
job = (
    Job(name="<job_name>")
    .with_infrastructure(
        # The same configurations as the OCI notebook session will be used.
        DataScienceJob()
        .with_log_group_id("oci.xxxx.<log_group_ocid>")
        .with_log_id("oci.xxx.<log_ocid>")
        .with_project_id("oci.xxxx.<project_ocid>")
        .with_shape_name("VM.Standard2.1")
        .with_subnet_id("oci.xxxx.<subnet-ocid>")
        .with_block_storage_size(50)
        .with_compartment_id("oci.xxxx.<compartment_ocid>")
    )
    .with_runtime(
        NotebookRuntime()
        .with_notebook("https://raw.githubusercontent.com/mayoor/stats-ml-exps/master/tensorboard_tf.ipynb")
        .with_service_conda("tensorflow27_p37_cpu_v1")
        # Saves the notebook with outputs to OCI object storage.
        .with_output("oci://my-bucket@my-namespace/myexperiment/jobs/")
    )
).create()

# Run and monitor the job
run = job.run().watch()
View the logs from your workstation once the job is complete by launching TensorBoard with the following command -
OCIFS_IAM_TYPE=api_key tensorboard --logdir "oci://my-bucket@my-namespace/myexperiment/jobs/tflogs/"
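The log directory in the command above is the job's output location with the tflogs folder appended. A small helper can build such a path without accidentally doubling a slash. This is a sketch only; `join_oci_path` is a hypothetical name, and the location of tflogs under the output path is taken from the command in this example.

```python
def join_oci_path(base_uri, *parts):
    """Append path segments to an oci:// URI without doubling slashes."""
    scheme = "oci://"
    if not base_uri.startswith(scheme):
        raise ValueError("expected an oci:// URI")
    # Split 'bucket@namespace' from the object path that follows it.
    location, _, path = base_uri[len(scheme):].partition("/")
    segments = [s for s in path.split("/") if s]
    for part in parts:
        segments.extend(s for s in part.split("/") if s)
    if not segments:
        return scheme + location + "/"
    return scheme + location + "/" + "/".join(segments) + "/"
```

For example, `join_oci_path("oci://my-bucket@my-namespace/myexperiment/jobs/", "tflogs")` yields a value suitable for `--logdir`, with a single slash between every segment.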