Latest Posts
-
Pax Layer Basics
In this blog we are going to use a new library called praxis to create neural networks.
-
Building a classifier using JAX
In this blog, we will create a realistic dataset for binary classification. Then, we will use the JAX library along with other JAX-ecosystem libraries like Flax and Optax to train a logistic regression model.
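Below is a minimal sketch of the kind of Flax/Optax training step such a post builds toward; the model definition, input shapes, and hyperparameters here are illustrative assumptions, not the post's exact code.
# Sketch only: Flax/Optax logistic regression training step
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax

class LogisticRegression(nn.Module):
    @nn.compact
    def __call__(self, x):
        return nn.Dense(features=1)(x)   # a single dense layer gives the logit

model = LogisticRegression()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 2)))   # assume 2 input features
optimizer = optax.sgd(learning_rate=0.1)
opt_state = optimizer.init(params)

def loss_fn(params, x, y):
    logits = model.apply(params, x).squeeze(-1)
    return optax.sigmoid_binary_cross_entropy(logits, y).mean()

@jax.jit
def train_step(params, opt_state, x, y):
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    updates, opt_state = optimizer.update(grads, opt_state)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss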
-
Using TPU runtimes in Colab for JAX
In Colab, the TPU runtime can be selected from the menu:
Runtime -> Change runtime type -> Hardware accelerator
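A quick sanity check after switching the runtime, as a sketch (device names and core counts vary by Colab runtime):
# Verify that JAX can see the TPU (illustrative check)
import jax
print(jax.devices())        # should list TPU devices when the TPU runtime is active
print(jax.device_count())   # a Colab TPU runtime typically exposes 8 cores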
-
functools partial
The functools module is for higher-order functions: functions that act on or return other functions. In general, any callable object can be treated as a function for the purposes of this module.
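A small illustration of functools.partial, the function the post focuses on (the example values are made up):
# functools.partial fixes some arguments of a function and returns a new callable
from functools import partial

def power(base, exponent):
    return base ** exponent

square = partial(power, exponent=2)
print(square(5))  # 25
-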
Building a simple java app with external dependencies
This blog describes how we can build a simple Java application using a couple of external dependencies. This can help in decoupling certain components from a big corporate application and testing them in isolation.
-
Installing tensorflow on M1 Macs
This blog provides step-by-step instructions for installing TensorFlow on M1 MacBooks with Apple Silicon.
-
File transfer from blob storage using azure cli
Downloading and uploading files from blob storage should be simple, but I often come across a lot of errors while doing so. This blog documents the steps that have worked for me.
-
Running ML Training code on a VM
Krishan Subudhi 12/08/2020
-
Train a Covid19 Tweet sentiment classifier using Bert
Setup
-
Automatically activate conda environment in PowerShell for VSCode
VSCode automatically links conda environments in the integrated terminal through the Python extension.
-
Host python code documentation using azure app service CI CD pipeline
This blog gives step-by-step guidance on:
- Creating a web app for Python Flask (a minimal app sketch follows this list).
- Deploying to Azure.
- Adding AAD authentication.
- Creating a CI/CD pipeline using Azure DevOps.
- Creating documentation using mkdocs.
- Uploading the mkdocs documentation to a separate static HTML web app.
- Setting up CI/CD for mkdocs.
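For reference, a minimal Flask app of the kind such a web app starts from (the module name and route are hypothetical, not the post's exact code):
# app.py -- hypothetical minimal Flask app
from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Azure App Service!"

if __name__ == "__main__":
    # Azure App Service typically serves the app through gunicorn; the built-in server is for local testing only
    app.run(host="0.0.0.0", port=8000)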
-
Visualizing Bert Embeddings
Set up TensorBoard for PyTorch by following this blog.
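As a sketch of what gets visualized, here is one way to push a few BERT word embeddings into the TensorBoard projector (the model name, word list, and indexing are illustrative assumptions):
# Illustrative only: write BERT embeddings for a few words to TensorBoard
import torch
from transformers import BertTokenizer, BertModel
from torch.utils.tensorboard import SummaryWriter

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

words = ["king", "queen", "apple", "orange", "paris", "london"]
inputs = tokenizer(words, return_tensors="pt", padding=True)
with torch.no_grad():
    embeddings = model(**inputs).last_hidden_state[:, 1, :]  # take the first word-piece of each word

writer = SummaryWriter("runs/bert_embeddings")
writer.add_embedding(embeddings, metadata=words)
writer.close()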
-
Using PyTorch 1.6 native AMP
This tutorial provides step-by-step instructions for using native AMP introduced in PyTorch 1.6. It is often good to try things out with simple examples, especially when they relate to gradient updates. Scientists need to be careful while using mixed precision and should write proper test cases; a single misstep can result in model divergence or unexpected errors. This tutorial uses a simple 1x1 linear layer and converts an FP32 model training to mixed precision model training. Weights and gradients are printed at every stage to ensure correctness.
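A minimal native-AMP loop, as a sketch (the data, learning rate, and printout are illustrative, not the tutorial's exact code):
# Sketch: FP32 -> mixed precision training with torch.cuda.amp (PyTorch >= 1.6)
import torch

model = torch.nn.Linear(1, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 1, device="cuda")
y = 3 * x

for step in range(5):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():            # forward pass runs in mixed precision
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()              # scale the loss to avoid FP16 gradient underflow
    scaler.step(optimizer)                     # unscales gradients, then calls optimizer.step()
    scaler.update()
    print(step, loss.item(), model.weight.grad)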
-
Zero shot NER using RoBERTa
import torch
import transformers
-
Type faster using RoBERTa
The goal of the experiment is to detect and correct the mistakes made during fast typing on a phone while using the swipe feature. Fast gestures in swipe currently produce some wrong results, and there is no flagging or correction done after a sentence is typed. The user has to go back and check correctness or reduce the swiping speed. Using language models, we can detect the mistakes and improve the typing speed.
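The underlying idea can be sketched with a masked-language-model fill-in (the model name and sentence are assumptions, not the post's code):
# Sketch: let RoBERTa suggest a replacement for a suspicious word
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")
for candidate in fill_mask("I will <mask> you at the airport tomorrow."):
    print(candidate["token_str"], round(candidate["score"], 3))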
-
Using Tensorboard efficiently in AzureML
Begin logging stats to tensorboard from your training scripts by following this AzureML documentation.
-
Using Tensorboard in Pytorch
Clear everything first
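For context, a minimal TensorBoard logging loop in PyTorch looks roughly like this (the log directory and values are illustrative):
# Sketch: log a scalar to TensorBoard from PyTorch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/demo")
for step in range(100):
    writer.add_scalar("loss", 1.0 / (step + 1), step)
writer.close()
# then view with: tensorboard --logdir runs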
-
Using Tensorboard in Tensorflow-Keras (windows version)
# https://www.tensorflow.org/install/pip
# !pip install tensorboard
# !pip install tensorflow-cpu
-
PowerShell bashrc equivalent
Linux has a file called .bashrc which gets executed whenever a new terminal starts. This .bashrc file is generally used for
-
Resize image using Python
Resize an image in Python using the Pillow library.
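The core of it, as a sketch (file names and target size are made up):
# Sketch: resize an image with Pillow
from PIL import Image

img = Image.open("input.jpg")
resized = img.resize((224, 224))   # (width, height) in pixels
resized.save("output.jpg")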
-
Issue with Gradient accumulation while using apex (Fix included)
- Nvidia Apex is used for mixed precision training. Mixed precision training provides faster computation using tensor cores and a lower memory footprint.
- Gradient accumulation is used to accommodate a bigger batch size than what the GPU memory supports. If my gradient accumulation is 2, I will be doing optimizer.step() once every 2 steps. For steps where the optimizer is not stepping, only the gradients are accumulated (see the sketch after this list).
- In distributed training, gradients are averaged across all the processes at every loss.backward step, which is also called the all-reduce step.
- Apex mixed precision training does the communication in floating point 16.
- Even with floating point 16, doing the reduction at every step can be costly. To avoid reduction at every step, an obvious optimization is to skip the reduction when the optimizer is not stepping.
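Gradient accumulation itself, stripped of Apex and distributed training, looks roughly like this (shapes and values are illustrative):
# Sketch: plain-PyTorch gradient accumulation with accumulation factor 2
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
accumulation_steps = 2

for step in range(8):
    x, y = torch.randn(4, 10), torch.randn(4, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accumulation_steps).backward()       # gradients accumulate in .grad across steps
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                         # weights update only every N steps
        optimizer.zero_grad()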
-
NVIDIA mixed precision training
Amp: Automatic Mixed Precision
-
Undo a git rebase
Suppose you did a git rebase in your local branch but mistakenly rebased onto an older branch and pushed the changes to remote. Here is how to revert your changes and go back to the previous state.
-
Challenges of using HDInsight for pyspark
The goal was to do analysis on the following dataset using Spark without downloading large files to the local machine.
-
Insertion transformer summary
-
Spark Quickstart on Windows 10 Machine
Apache Spark™ is a unified analytics engine for large-scale data processing.
-
PyTorch distributed communication - Multi node
Writing Distributed Applications with PyTorch
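As a taste of the topic, a minimal all-reduce across processes can be sketched like this (assumes the rendezvous environment variables are set by a launcher such as torchrun):
# Sketch: torch.distributed all-reduce across ranks
import torch
import torch.distributed as dist

def run():
    dist.init_process_group(backend="gloo")        # use "nccl" for multi-GPU nodes
    rank = dist.get_rank()
    tensor = torch.ones(1) * rank
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)  # every rank ends up with the sum over all ranks
    print(f"rank {rank}: {tensor.item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    run()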
-
Bert Attention Visualization
#!pip install pytorch_transformers
#!pip install seaborn
import torch
from pytorch_transformers import BertConfig, BertTokenizer, BertModel
-
How to create a new docker image
Steps to create, test and push a docker image
-
LAMB paper summary
-
Bert Memory Consumption
This document analyses the memory usage of Bert Base and Bert Large for different sequence lengths. Additionally, the document provides memory usage without grad and finds that gradients consume most of the GPU memory for one Bert forward pass. It also analyses the maximum batch size that can be accommodated for both Bert base and large. All the tests were conducted on Azure NC24sv3 machines.
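The kind of measurement involved can be sketched as follows (the model name, batch size, and sequence length here are placeholders):
# Sketch: peak GPU memory for one BERT forward pass
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased").cuda()
inputs = torch.randint(0, 30000, (8, 512), device="cuda")   # batch of 8, sequence length 512

torch.cuda.reset_peak_memory_stats()
outputs = model(inputs)                                      # keeps grad buffers; wrap in torch.no_grad() to compare
print(f"peak memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")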
-
Introduction to Transformers
What are Transformers?
-
Contingency table and Chi-squared distribution
Contingency table
Contingency tables and chi-squared distributions are used to determine whether two categorical variables are independent or not.
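A quick worked example of the test (the counts below are invented for illustration):
# Sketch: chi-squared test of independence on a 2x2 contingency table
from scipy.stats import chi2_contingency

table = [[30, 10],    # e.g. group A: category 1 vs category 2
         [20, 40]]    # e.g. group B: category 1 vs category 2

chi2, p_value, dof, expected = chi2_contingency(table)
print(chi2, p_value, dof)
# a small p-value suggests the two categorical variables are not independent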
-
Azure Machine Learning Tutorial
Original Documentation Video link: https://channel9.msdn.com/Events/Connect/Microsoft-Connect–2018/D240/
-
PyTorch IsRead Predictor on my email
This is a GRU-based RNN classifier that predicts the probability of a user reading an email, trained on his/her email data.
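A bare-bones version of such a model might look like this (the vocabulary size, dimensions, and inputs are hypothetical, not the post's model):
# Sketch: GRU classifier producing a read probability per email
import torch
import torch.nn as nn

class IsReadClassifier(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)
        _, hidden = self.gru(embedded)              # hidden: (num_layers, batch, hidden_dim)
        return torch.sigmoid(self.fc(hidden[-1]))   # read probability per email

model = IsReadClassifier()
print(model(torch.randint(0, 10000, (2, 30))))      # two emails of 30 tokens each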
-
PyTorch BERT
#! pip install pytorch-pretrained-bert
Using BERT
-
PyTorch RNN
A recurrent neural network (RNN) is a class of artificial neural network where connections between units form a directed cycle.
-
Word analogy using Glove Embeddings
Word Embeddings
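The classic analogy query can be sketched with pre-trained GloVe vectors via gensim (the model name and download are assumptions, not the post's code):
# Sketch: king - man + woman ~= queen with GloVe vectors
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-100")   # downloads the vectors on first use
result = glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)   # "queen" is expected near the top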
-
Convolution Explained
Convolution
Convolution is the building block of Convolutional Neural Networks (CNNs). CNNs are used both for image and text processing. Online diagrams do a great job explaining CNNs; I, however, failed to find a good diagram explaining the convolution operation itself. This diagram aims to explain the details of the convolution operation in a neural network. I have also provided Python scripts explaining the details of the convolution operation inside PyTorch.
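For a concrete feel of the operation, here is a tiny 2D convolution in PyTorch (the input and kernel values are illustrative, not the post's script):
# Sketch: a single 2x2 kernel convolved over a 4x4 input
import torch
import torch.nn.functional as F

image = torch.arange(16.0).reshape(1, 1, 4, 4)    # batch=1, channels=1, 4x4 input
kernel = torch.tensor([[[[0.0, 1.0],
                         [2.0, 3.0]]]])           # one 2x2 filter
output = F.conv2d(image, kernel, stride=1, padding=0)
print(output.shape)   # torch.Size([1, 1, 3, 3]): (4 - 2)/1 + 1 = 3 per spatial dimension
print(output)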
-
Activation and Loss function implementations
Deep learning Functions
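As an example of the kind of functions covered, a few standard activations implemented with NumPy (textbook formulas, not the post's exact code):
# Sketch: common activation functions in NumPy
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    shifted = x - np.max(x)        # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

print(sigmoid(np.array([-1.0, 0.0, 1.0])))
print(softmax(np.array([1.0, 2.0, 3.0])))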
-
How to use Jekyll!
-
Quickstart
- Install a full Ruby development environment
-
Install Jekyll and bundler gems
gem install jekyll bundler
bundle add jekyll-sitemap
bundle install
-
Create a new Jekyll site at ./myblog
jekyll new myblog
-
Change into your new directory
cd myblog
-
Build the site and make it available on a local server
bundle exec jekyll serve
-
Python Flask web application in azure linux
Even though the tutorial involves Azure, the instructions will work on any Ubuntu-based Linux machine.