In the Colab project, Google is providing some pretty impressive free compute resources (compute, storage, and networking). Anyone with a Google identity (say, a Gmail address) can be provisioned with up to two simultaneous VMs. Google kills off machines after at most 12 hours (that's to prevent cryptomining squatters from infesting the system). (Google's FAQ is noncommittal about QoS, but then it is free.)
Each VM can have a GPU (or even a TPU, subject to availability). Google Drive starts at 15 GB of free persistent storage. The VMs can have as much as 350 GB of transient storage. And all that comes with no admin costs beyond deciding whom to share documents with.
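For example, that persistent Drive storage can be mounted straight into a notebook's file system; here is a minimal sketch using Colab's drive helper (the mount point /content/drive is conventional, not required):
from google.colab import drive
# Make the 15 GB (and up) of persistent Drive storage visible to the VM
drive.mount('/content/drive')
!ls "/content/drive/My Drive"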
Think of Colab as the newest member of the Google office apps suite: Gmail, Sheets, Docs, Slides, etc. Colab is Google bringing Jupyter into its stable. Whereas in Sheets Google runs arbitrary user code in JavaScript over spreadsheets, in Colab Google runs arbitrary user code in Python over Jupyter notebooks.
This project is a thought experiment that takes Colab to a logical conclusion: a fully static website, with Colab as an optional execution backend that will crunch hard for 12 hours on, say, micrography object recognition or transcriptomic visualization.
Integrating Google Colaboratory into Your Machine Learning Workflow is an example of how folks are adopting Colab into their development environment.
!pip install humanize
!pip install gputil
!pip install psutil
import humanize
import platform
import psutil
import os
# Request TF 2.x, not 1.x
try:
    # %tensorflow_version is a Colab-only thing
    %tensorflow_version 2.x
except Exception:
    print("TensorFlow 2.x does not seem to be available")
a_message = "Python runtime version: " + platform.python_version()
print(a_message)
What packages are installed for the detected running version of Python?
python_major_version = int(platform.python_version_tuple()[0])
print(python_major_version)
if python_major_version == 3:
    # NB: the path hardcodes the minor version (3.6 on Colab as of this writing)
    print("Python 3.6 dist-packages")
    !ls /usr/local/lib/python3.6/dist-packages
else:
    # Python 2 it is...
    print("Python 2.7 dist-packages")
    !ls /usr/local/lib/python2.7/dist-packages
Colab comes pre-configured with many packages. (Oddly, though, not conda.) This proves to be a real convenience for running random notebooks without lengthy installs. On the other hand, pretty much anything can be installed via pip, conda, etc. But a long install really kills the enjoyment flow.
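One way to keep the flow going is to install only when a package is actually missing; a small sketch (plotnine here is just a stand-in package name):
import importlib.util
# Skip the install when the package is already present
if importlib.util.find_spec("plotnine") is None:
    !pip install -q plotnine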
pip freeze will list the pre-installed Python packages:
!pip freeze
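Which Jupyter kernels are installed?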
!jupyter-kernelspec list
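How many CPUs does the VM have?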
!lscpu | grep "^CPU(s):"
More details on the CPU(s):
!cat /proc/cpuinfo
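And the raw memory numbers: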
!cat /proc/meminfo | head -n3
RAM info humanized:
process = psutil.Process(os.getpid())
print("RAM Free: " + humanize.naturalsize(psutil.virtual_memory().available) +
      " | Proc size: " + humanize.naturalsize(process.memory_info().rss))
File system
As with AWS Lambda, there are intentionally few switches for selecting VM options. Memory and CPU are provided as matched bundles, not independently configurable.
So, depending on what you ask for in terms of compute (CPU, GPU, TPU), you get more or less file system storage [*]. Note: the OS files already consume a noticeable chunk of the file system (about 25 GB on the CPU VM) before you are dropped into the kernel.
On 2019-05-19, the following tests gave these results:
| Processor | FS Free (GB) | FS Total (GB) |
|---|---|---|
| CPU | 24 | 49 |
| GPU | 318 | 359 |
| TPU | 26 | 49 |
The low FS size for the TPU option is probably because TPUs (the actual TPU boards) are separate machines, while GPUs are part of the machine the notebook is running on. So, for the CPU and TPU options, Google is probably providing the same VM; ergo the file systems are essentially the same size.
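One corroborating detail: on a TPU runtime, Colab exposes the TPU as a separate network endpoint via an environment variable (a quick check; COLAB_TPU_ADDR is only set when the runtime type is TPU):
import os
# COLAB_TPU_ADDR holds the gRPC address of the remote TPU worker, if any
tpu_address = os.environ.get("COLAB_TPU_ADDR")
print("TPU gRPC address:", tpu_address or "none (not a TPU runtime)")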
!df -h .
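The same figures can be read programmatically; a minimal sketch using only the standard library:
import shutil
# Size up the notebook VM's root file system, in GB
total, used, free = shutil.disk_usage("/")
print("FS Total: {:.0f} GB | Used: {:.0f} GB | Free: {:.0f} GB".format(
    total / 2**30, used / 2**30, free / 2**30))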
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    print("GPU detected: NONE")
else:
    print('GPU detected: {}'.format(device_name))
Assuming we have a GPU, what kind? The answer is in the final line of the following cell's output:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
# GPUtil shells out to nvidia-smi; on Colab it lives in /opt/bin, so link it onto the PATH
!ln -sf /opt/bin/nvidia-smi /usr/bin/nvidia-smi
import GPUtil as GPU
gpus = GPU.getGPUs()
if len(gpus) > 0:
    gpu = gpus[0]
    print("GPU RAM:\n Free: {0:.0f}MB | Used: {1:.0f}MB | Util {2:3.0f}% | Total {3:.0f}MB".format(
        gpu.memoryFree, gpu.memoryUsed, gpu.memoryUtil*100, gpu.memoryTotal))
else:
    print("GPU detected: NONE")
!nvcc --version
# NB: conda is not pre-installed on Colab (see above), so this assumes conda has already been set up
!conda install tsnecuda cuda100 -c cannylab