Beta Distribution


In this post, I am going to talk about the Beta distribution and some intuitive interpretations of it.

An Example
Suppose we have two coins (A and B), and we run a statistical experiment to identify whether these coins are biased. We tossed coin A 5 times and the results are: 1,0,0,0,0 (1 indicates Head and 0 indicates Tail). We tossed coin B 10 times and the results are: 1,1,0,0,0,0,0,0,0,0. The observed frequency of Head is identical for the two coins: 0.2 (equivalently, 0.8 for Tail). Is it safe to say that both coins equally favour Tail?
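One way to make the question concrete is to summarize each coin with a Beta distribution over its probability of Head. The sketch below assumes a uniform Beta(1, 1) prior, which is an assumption of this example and not something stated above; the helper name beta_posterior is likewise illustrative. Both coins have the same observed frequency, but the coin with more tosses ends up with a narrower distribution.

```python
def beta_posterior(heads, tails, prior_a=1.0, prior_b=1.0):
    """Return (a, b, mean, variance) of the Beta posterior over P(Head).

    Assumes a Beta(prior_a, prior_b) prior; the default Beta(1, 1) is uniform.
    """
    a = prior_a + heads
    b = prior_b + tails
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, variance


coin_a = [1, 0, 0, 0, 0]
coin_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

for name, tosses in (("A", coin_a), ("B", coin_b)):
    heads = sum(tosses)
    tails = len(tosses) - heads
    a, b, mean, var = beta_posterior(heads, tails)
    print(f"coin {name}: Beta({a:g}, {b:g}), mean={mean:.3f}, variance={var:.4f}")
```

Under this assumed prior, coin B's posterior has the smaller variance, which reflects that ten tosses carry more evidence than five.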


Solved: not able to connect to Google Compute Engine on port 22

Google offers its infrastructure to developers through Google Cloud Platform, as an alternative to Amazon Web Services. Setting up a new instance is simple and intuitive. However, it is frustrating not to be able to connect to the instance via ssh. After some research, I found out how to resolve the problem, and I hope it helps others who are having the same issue.

The Problem
Connection Failed: We are not able to connect to the VM on port 22.
I could not even telnet to the instance's IP on port 22.

The Solution
1. Activate Google Cloud Shell
2. Add a rule to the firewall to allow incoming traffic on port 22:
gcloud compute firewall-rules create allowssh --allow tcp:22 --source-ranges
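To check whether the firewall rule took effect, the telnet test can be replaced with a small script. The following is a minimal sketch using only the Python standard library; the function name can_connect and the 3-second timeout are illustrative choices, and the host must be replaced with the external IP of your own instance.

```python
import socket


def can_connect(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as can_connect(instance_ip), where instance_ip is a placeholder for your VM's address, returning False before the rule is added and True afterwards suggests that the firewall was what blocked the connection.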

How to read a paper

The first principle is that you must not fool yourself — and you are the easiest person to fool.  – Richard Feynman

It is often helpful to read with questions in mind. This post summarizes a list of questions worth asking while reading a paper. I would like to make this post a living document about how to read a paper, updating it as I read more material and gain more understanding of scientific research. The content of this post is largely drawn from the references listed at the end, and I gratefully acknowledge their contributions.

Understanding the prefix-function of the Knuth-Morris-Pratt algorithm

Intuition is nothing but the outcome of earlier intellectual experience.    – Albert Einstein

There are plenty of tutorials on the ideas behind the KMP algorithm, leaving us in wonder and awe at its magnificent simplicity and effectiveness. But one important part is frequently missing: why can the prefix function be computed in that particular way, and what is the rationale behind the computation? This post aims to demystify the prefix function and elaborates on the proofs to show why it works.

Let’s first review the computation of the prefix function, as shown below.
[Screenshot: computation of the prefix function]
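For reference, the standard prefix-function computation can be sketched in Python as follows. This is a 0-indexed sketch, so its indexing may differ from the pseudocode in the figure; the name prefix_function is an illustrative choice. Here pi[i] is the length of the longest proper prefix of pattern[:i + 1] that is also a suffix of it.

```python
def prefix_function(pattern):
    """Compute the KMP prefix function of pattern (0-indexed)."""
    pi = [0] * len(pattern)
    k = 0  # length of the current longest prefix that is also a suffix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate border.
        while k > 0 and pattern[k] != pattern[i]:
            k = pi[k - 1]
        if pattern[k] == pattern[i]:
            k += 1
        pi[i] = k
    return pi


print(prefix_function("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```

The fallback step k = pi[k - 1] is the part that the proofs need to justify: it relies on the fact that every border of a border is itself a border of the original string.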

How to do research

One of the biggest questions that graduate students have is “how to do research”. This post summarizes resources online about how to do research, and I hope it is helpful to the audience.

[1] How to do research at MIT AI lab link
reading, making connections, learning other fields, notebooks, writing, talks,  programming, advisors, thesis, research methodology, emotional factors.

[2] How to do research (advice) link
thinking of the question, answering the question, communicating the answer

[3] How to do graduate-level research: some advice link
Personal and professional principles: motivations and goals, tracking progress, time management, dealing with people, collaboration and mentoring, quality, attitude, dealing with failure, taking advantage of opportunities
The craft of research: keeping a notebook, reading, listening, talking, writing, programming, mathematical analysis, background subject knowledge
The art of research: identifying a problem, formulating a well-defined problem, thinking about a research problem, your advisor, the thesis


A Tutorial on Restricted Boltzmann Machines

Restricted Boltzmann Machines (RBMs) are a popular unsupervised building block of Deep Learning architectures. Despite their popularity, it takes effort to grasp the concept. This post aims to provide an introduction to RBMs from a somewhat mathematical point of view. Most of the formulas here are from [1].

Boltzmann Machines are energy-based models [2], in which the joint probability distribution is characterized by a scalar energy assigned to each configuration of the variables. In energy-based models, inference consists of clamping the values of a set of variables and finding configurations of the remaining variables that minimize the energy function; learning consists of finding an energy function that assigns low energy to the observed configurations of the variables. Boltzmann machines are also probabilistic graphical models, which use a graph-based representation as the basis for encoding the distribution. Restricted Boltzmann Machines are a type of Boltzmann machine with special constraints: only certain types of connections are allowed.
This post starts by introducing energy-based models, including the graphical representation of the model and its learning with gradient descent of log-likelihood. This post then discusses Boltzmann machines by placing a specific energy function in energy-based models. Restricted Boltzmann Machines are further discussed with the introduction of restrictions in Boltzmann Machines.
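To make the link between energy and probability concrete, the sketch below turns a list of energies into probabilities using the usual Boltzmann form p(x) = exp(-E(x)) / Z, where Z sums exp(-E(x)) over all configurations. The formula is the standard one for energy-based models and is assumed here, not quoted from this excerpt; the function name and the toy energies are illustrative.

```python
import math


def boltzmann_probabilities(energies):
    """Map energies E(x) to probabilities p(x) = exp(-E(x)) / Z."""
    # Subtracting the minimum energy leaves the ratios unchanged
    # and avoids overflow for very negative energies.
    min_e = min(energies)
    weights = [math.exp(-(e - min_e)) for e in energies]
    z = sum(weights)  # partition function of the shifted energies
    return [w / z for w in weights]


# Three hypothetical configurations with increasing energy.
probs = boltzmann_probabilities([0.0, 1.0, 2.0])
print(probs)
```

Configurations with lower energy receive higher probability, which is why learning aims to lower the energy of observed configurations.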

It is worthwhile to mention that there are three important notations in this post: x, h and y. x represents a list of input variables taking the form x=\left \{ x_{1}, x_{2},\dots ,x_{N} \right \}, where x_{i} denotes the i-th input variable. h represents a list of hidden variables taking the form h=\left \{ h_{1}, h_{2},\dots ,h_{N} \right \}, where h_{i} denotes the i-th hidden variable. y represents the label of a given input.
As an example, for the problem of image recognition, x are the images of interest, where x_{i} are the individual pixels of an image x. h are the hidden features/descriptors that serve as a high-level representation of the image. Finally, y are the labels of the images.

Histograms of Oriented Gradients Tutorial with Matlab

Histograms of Oriented Gradients is a feature extraction method that can generate descriptors from images. This blog post aims at providing illustrative examples of HOG with Matlab, as well as discussing its interesting characteristics.

HOG Person Detector Tutorial
I have found a very nice and intuitive tutorial here. In this post, I will focus on the illustrative examples with Matlab.

Key Idea
The key idea of HOG is that “local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients or edge directions, even without precise knowledge of the corresponding gradient or edge positions” [1], meaning that the distribution of gradients can represent the appearance of an image to some extent. The transformation from images to HOG achieves invariance to local geometric and photometric transformations by ignoring specific image details while retaining the distribution of gradients.
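The idea of a "distribution of gradients" can be illustrated with a histogram for a single cell. The sketch below is written in plain Python, not Matlab, and is deliberately simplified: it uses centered differences, unsigned orientations in [0, 180) degrees, 9 bins and magnitude-weighted votes, and it omits the bin interpolation and block normalization of a full HOG descriptor. The function name and these parameter choices are assumptions made for illustration.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    cell is a 2-D list of pixel intensities; border pixels are skipped
    because centered differences need both neighbours.
    """
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A horizontal intensity ramp: every gradient points along the x axis.
ramp = [[c for c in range(4)] for _ in range(4)]
brighter = [[value + 10 for value in row] for row in ramp]
print(cell_orientation_histogram(ramp))
print(cell_orientation_histogram(brighter))
```

Both calls produce the same histogram, because adding a constant brightness does not change the gradients. This is a small instance of the photometric invariance described above.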