Learning to be a code alchemist, one experiment at a time.

What is A Sobel Operator?


A Sobel Operator, or Sobel Filter, is a function that can be applied to an image to detect edges.


How does it work?

A Sobel Operator is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function.

It has 2 masks/kernels: one to detect horizontal edges, and one to detect vertical edges.

Essentially, you take a point in the initial image, place the 3x3 mask over the 3x3 neighborhood centered on that point, multiply the mask and neighborhood element-wise, and sum the results (note this is an element-wise multiply-and-sum, not a matrix multiplication). The output is a crude approximation of the gradient of the image intensity function at that point.

An image gradient is a directional change in the intensity or color in an image. The image intensity function is a two-variable function (in this case over the x and y of the image) mapping each position to a brightness value, and the Sobel operator is a relatively simple, computationally inexpensive, albeit rough approximation of its gradient.
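The multiply-and-sum described above can be sketched in a few lines of Python. This is a naive, unoptimized toy version using plain numpy; real implementations use fast convolution routines:

```python
import numpy as np

# Sobel kernels: GX responds to vertical edges, GY to horizontal edges
GX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]])
GY = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]])

def sobel_magnitude(img):
    """Approximate gradient magnitude at each interior pixel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            patch = img[y:y+3, x:x+3]
            gx = np.sum(GX * patch)  # element-wise multiply and sum,
            gy = np.sum(GY * patch)  # not a matrix multiplication
            out[y, x] = np.hypot(gx, gy)
    return out

# A vertical step edge: left half dark, right half bright
img = np.zeros((5, 6))
img[:, 3:] = 255.0
mag = sobel_magnitude(img)
```

Running this on the toy image gives large responses only in the columns straddling the brightness step, which is exactly the "edge" the operator is built to find.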

Here is an example of a Sobel edge detector that worked well:

My helpful screenshot

My helpful screenshot

An example of one that didn't work well will follow shortly!

3rd Place overall, Boilermake 4: DDR-VR


The weekend of January 21st, from Friday night to Sunday afternoon, I, along with 500 others, participated in Boilermake, Purdue University’s annual Hackathon. I was in a team with 3 other University of Illinois students: 2 Computer Engineers and 1 Materials Scientist.

We won 3rd place overall with our project DDR-VR. Most people have seen the arcade game Dance Dance Revolution; the player steps onto a mat with various arrows on it, and tries to step on them as arrows fly down the screen. We implemented the game as a 360 degree experience in VR; there were 4 screens, and every so often players would have to turn to face a new screen.

Here are some of my thoughts, observations, and experiences.

We arrived at the Hackathon on Friday night; I came on the University bus and reserved us some prime tables and chairs, and checked out an Oculus Rift. The rest of my team came in a car; this allowed them to bring along the DDR pad ($35 on craigslist) and the gaming computer we needed to power the VR experience.

My helpful screenshot

We immediately got to work stripping down, cleaning, and “penny modding” the DDR board, but it was simply too unreliable to play a game with. Thankfully, one of my teammates reached out to the Purdue University Dance Game club, and we were able to borrow a much more advanced (and shiny and expensive and new) game board.

My helpful screenshot

In the end, the hardware comprised a Precision Dance Pad and some 2-by-4s we screwed together with a borrowed drill. The rest of the team actually made a 7am trip to the hardware store on Saturday morning while I attended a workshop.

My helpful screenshot

The game has four playfields (north, south, east, west) and works by periodically disabling and re-enabling them. The player must turn to face the currently-active playfield.

We stumbled our way through Unity (first time using it!) over the weekend, starting with code from a rhythm game built by Josh Leland. We had to turn this into 4 different screens, make it work with the weird inputs from the dance pad, make it work in VR, and not look terrible. The software can load any simfile (the file format that holds DDR charts) after running it through a parser.
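For a rough idea of what parsing a chart involves, here's a toy Python sketch of the note-data portion of the .sm simfile format. Real simfiles also carry header fields, BPM changes, and multiple difficulties, all of which this sketch ignores:

```python
def parse_measures(note_data):
    """Split the note data of a .sm simfile into measures of step rows.

    Measures are separated by ','; each row is a 4-character string,
    one column per arrow (Left, Down, Up, Right).
    '0' = no note, '1' = tap, '2' = hold start, '3' = hold end.
    """
    measures = []
    for raw_measure in note_data.strip().rstrip(';').split(','):
        rows = [line.strip() for line in raw_measure.splitlines() if line.strip()]
        measures.append(rows)
    return measures

# A two-measure chart: measure 1 has 4 quarter notes, measure 2 is empty
chart = """
0001
0010
0100
1000
,
0000
0000
0000
0000
;
"""
measures = parse_measures(chart)
```

The number of rows in a measure sets the note resolution (4 rows = quarter notes, 8 = eighths, and so on), which is why the parser keeps measures as lists rather than flattening them.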

I don’t have a picture from the actual award ceremony, but we won 3rd place overall! We demo’d our system (wood crate and all) on the main stage of the award ceremony; we had a Purdue dance club member play the game!

My helpful screenshot

Research Paper Review: Clothing Parsing by Image Segmentation


Today at work, I was doing some research for my project when someone on the team shared this paper with me and asked me to read and understand it. It’s so far above my current understanding and skillset that I decided to write a blog post about it to help me understand what exactly I was learning.


Problem statement:

How to jointly segment clothing images into regions of individual garments while simultaneously transferring image-level semantic tags to those regions. There are several challenges in this problem:

  • There is near-infinite variation in human poses
  • There is a wide variety of clothing categories
  • Clothes and garments have a huge variety of styles and textures compared with other common objects.

Abstract:

This paper wants to create a system which takes in an image of a person (dressed complexly and against busy city backgrounds, as street style is wont to do) and outputs an image that identifies each item of clothing and labels what it is. It's a two-step process: the first phase uses an exemplar-SVM (E-SVM) to extract regions from images and refine them.

The second step constructs a multi-image graphical model by taking the segmented regions as vertices and, based on the contexts in which the clothes are being worn (i.e., item location and mutual interactions), labels them using the Graph Cuts algorithm.


My helpful screenshot

More In-depth Introduction:

Step 1: Iteratively refine regions grouped over all images by employing exemplar-SVM (E-SVM)

First, extract superpixels and group them into regions. Most of these regions are cluttered and meaningless because of variations in clothing and humans. (A superpixel is an image patch which is better aligned with intensity edges than a rectangular patch)

Next, choose some coherent regions based on size and location.

Third, train an E-SVM classifier for each of the selected regions, one classifier per region, using the HOG feature. (An E-SVM is essentially a math function that takes in training inputs, in this case one type of region from a dataset, and outputs a "yes, this is the same thing" or a "no, this isn't the same thing.") Together these produce a set of region-based detectors, which are applied as top-down templates to localize similar regions over all images.

This allows for segmentations to be refined jointly because the trained E-SVM classifiers generate more coherent regions. The authors came up with this by realizing that clothing items of the same specific category often share shapes and structures.
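To make the E-SVM idea concrete, here's a toy Python sketch (my own illustration, not the paper's implementation): a linear SVM trained on a single positive exemplar against a pool of negatives, using plain subgradient descent on the hinge loss instead of a real solver:

```python
import numpy as np

def train_exemplar_svm(exemplar, negatives, lam=0.01, lr=0.1, epochs=200):
    """Train a linear SVM with a single positive example (the exemplar)
    against many negatives. This is the core of an E-SVM: one classifier
    per region, scoring how "same" a candidate region is."""
    rng = np.random.default_rng(0)
    w = np.zeros(exemplar.shape[0])
    b = 0.0
    X = np.vstack([exemplar[None, :], negatives])
    y = np.array([1.0] + [-1.0] * len(negatives))
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                       # hinge-loss subgradient step
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                                # only regularize
                w -= lr * lam * w
    return w, b

# Toy features: the exemplar sits far from the negative cluster
exemplar = np.array([3.0, 3.0])
negatives = np.array([[-1.0, -1.0], [-2.0, 0.0], [0.0, -2.0]])
w, b = train_exemplar_svm(exemplar, negatives)
```

In the paper the features are HOG descriptors of regions rather than 2-D toy points, but the one-positive-versus-many-negatives setup is the same.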

Given all of this, however, it is very difficult to recognize them only by using supervised learning, because of the large number of very specific categories and the large variations within classes.

Step 2: Co-labeling in a data-driven manner

First, they constructed a multi-image graphical model by taking the regions as vertices of a graph.

Next, they link adjacent regions within each image, as well as regions across different images that share similar appearance and latent semantic tags (tags coming from the dataset on the fashion website).

Thus they can borrow statistical strength from similar regions in different images and assign labels jointly. The optimization of co-labeling is solved by the efficient Graph Cuts algorithm, which incorporates several constraints defined upon the clothing contexts.
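The "assign labels jointly" part can be illustrated with a toy energy function of the kind Graph Cuts minimizes. This is my own simplified sketch, not the paper's actual formulation: a unary cost per region (how well a tag fits its appearance) plus a Potts penalty when linked regions disagree:

```python
def labeling_energy(labels, unary, edges, pairwise_weight=1.0):
    """Energy of a labeling over a region graph:
    sum of unary costs plus a Potts pairwise term that
    penalizes different labels on linked regions.
    Graph Cuts minimizes exactly this kind of energy."""
    e = sum(unary[i][l] for i, l in enumerate(labels))
    e += sum(pairwise_weight for u, v in edges if labels[u] != labels[v])
    return e

# 3 regions, 2 candidate tags; regions 0 and 1 are linked (similar
# appearance across images), as are regions 1 and 2
unary = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
edges = [(0, 1), (1, 2)]
e_good = labeling_energy([0, 0, 1], unary, edges)  # consistent labeling
e_bad = labeling_energy([0, 1, 1], unary, edges)   # breaks the 0-1 link
```

The consistent labeling scores a lower energy, which is the mechanism by which linked regions in different images end up with the same tag.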


Probabilistic Formulation

Uhhh yeah I’m not a genius; I took AP Statistics in high school, and Probability with Engineering Applications (Computer Engineering version) at the University of Illinois. I don’t understand this right now, but a high-level understanding I gleaned is:

Use the probability that different superpixels belong to a given region to identify regions in an image, and group these to train E-SVMs. Then look at the chance that similarly shaped regions in different images share the same latent tags; if a lot of pictures with a roughly jacket-like shape carry the tag "jacket", and similar pictures without that shape don't, it's probably a jacket.

PLEASE take the above as simply my understanding and not what the paper actually says, because it was really confusing.


Detailed Explanation of Step 1: Unsupervised Image Co-Segmentation

The first phase iterates between 3 different steps: the optimization of R (the set of regions in the image) while holding W (the set of E-SVM weights) and C (the segmentation propagation) constant, and vice versa.

i) Superpixel Grouping

The standard pipeline for superpixel grouping is a Markov random field, or a set of variables that have the Markov property (memorylessness in a stochastic process) described by an undirected graph.

What needs to be specified is the number of regions. To automatically define this number, we replace the superpixel indicator with a set of binary variables defined on the edges between neighboring superpixels. If two superpixels whose edges meet at e belong to the same region, the indicator at e is 1. A super dumbed-down version of the math involved is "we want the indicators at as many e's as possible to be 1".
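To see why edge indicators remove the need to pre-specify the number of regions, here's a small Python sketch (my own illustration, not the paper's optimization): given binary "same region" indicators on the edges, the regions, and their count, fall out of a union-find pass:

```python
def regions_from_edge_indicators(n_superpixels, edges, indicators):
    """Given binary indicators on edges between neighboring superpixels
    (1 = same region), recover the implied regions with union-find.
    The number of regions is determined automatically."""
    parent = list(range(n_superpixels))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (u, v), same in zip(edges, indicators):
        if same:
            parent[find(u)] = find(v)      # merge the two regions

    labels = [find(i) for i in range(n_superpixels)]
    remap = {}                             # relabel to consecutive ids
    return [remap.setdefault(l, len(remap)) for l in labels]

# 4 superpixels in a row; merge 0-1 and 2-3, but keep 1-2 separate
edges = [(0, 1), (1, 2), (2, 3)]
regions = regions_from_edge_indicators(4, edges, [1, 0, 1])
```

With indicators [1, 0, 1], two regions emerge without anyone having said "there are two regions" up front.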

ii) Training E-SVMs

Train an E-SVM classifier for each of the selected regions: Each selected region is considered as a positive example (exemplar), and a number of patches outside the selected region are cropped as negative examples.

In the implementation, they use HOG as the feature for each region (histogram of oriented gradients, or the idea that local object appearance and shape within an image can be described by the distribution of intensity gradients or edge directions).
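Here's a heavily simplified Python sketch of the idea behind HOG (my own illustration): a single orientation histogram over one patch, weighted by gradient magnitude. Real HOG uses a grid of cells with block normalization and interpolation:

```python
import numpy as np

def orientation_histogram(img, bins=9):
    """A stripped-down HOG cell: histogram of gradient orientations
    over a whole patch, weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)  # unsigned orientation
    hist = np.zeros(bins)
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    for b in range(bins):
        hist[b] = mag[bin_idx == b].sum()
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A vertical step edge produces horizontal gradients (orientation ~0 degrees),
# so all the mass lands in the first orientation bin
img = np.zeros((8, 8))
img[:, 4:] = 1.0
hist = orientation_histogram(img)
```

The point is that regions with similar shape produce similar histograms regardless of exact pixel values, which is what lets the E-SVMs match garment regions across images.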

The region selection indicator is determined by an automatic "noticeable" detector. To be efficient, they only train the E-SVMs for high-confidence foreground regions, i.e., those containing garment items.

iii) Segmentation Propagation

Search for possible propagations using the sliding window method. Since the E-SVM’s are trained independently, their responses may not be compatible; in this case they need to be calibrated by fitting a logistic distribution on the training set.
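The calibration step can be sketched as Platt scaling (my own toy version using gradient descent; real implementations solve this more carefully): fit a logistic mapping from raw SVM scores to probabilities so that independently trained classifiers become comparable:

```python
import numpy as np

def fit_platt(scores, labels, lr=0.1, epochs=500):
    """Calibrate raw SVM scores to probabilities by fitting a logistic
    p = sigmoid(a * score + c) with gradient descent on log-loss."""
    a, c = 1.0, 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(a * scores + c)))
        grad = p - labels                 # d(log-loss)/d(logit)
        a -= lr * np.mean(grad * scores)
        c -= lr * np.mean(grad)
    return a, c

# Raw scores from one E-SVM, with ground-truth 0/1 labels
scores = np.array([-2.0, -1.0, 1.0, 2.0])
labels = np.array([0.0, 0.0, 1.0, 1.0])
a, c = fit_platt(scores, labels)
prob_hi = 1.0 / (1.0 + np.exp(-(a * 2.0 + c)))
prob_lo = 1.0 / (1.0 + np.exp(-(a * -2.0 + c)))
```

Once each E-SVM's scores pass through its own fitted logistic, the resulting probabilities live on a common [0, 1] scale, so responses from different detectors can be compared.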

Detailed Explanation of Step 2: Co-Labeling

I definitely cannot explain this much better than I did above (the math is difficult to take a stab at), but I can attempt to explain it with words.

The first step is to get information about each region we identified in the step above: its appearance, and its location with respect to other regions.

The next step is to identify things that look similar and interact mutually. This works because some items, e.g. a jacket and pants, will be next to each other, whereas socks and a scarf will not.

The final step is to look across different images; regions in different images sharing similar appearance and locations are probably the same garment.

My helpful screenshot


Experiments

After being trained on 2 different data sets, this model is about 90 percent accurate at matching pixels to the right garments, and about 65 percent accurate at recognizing which garments appear.

Introducing the new ministrike3.github.io


I just rebuilt my personal website

In the Fall of 2016, I hacked together a website from several different Bootstrap templates and open-source JavaScript projects I found interesting. This left me with a relatively hacky, semi-functional website that was not ideal, and my blog on its own Blogspot. After I began blogging with gusto at the start of the summer, I realized that I wanted to develop my own blog because of the limitations I faced with Blogger. This led me to discover Jekyll (which builds natively on GitHub Pages, so woohoo). It’s also super fast (because it serves everything statically once built) and easy to use. I spent June 23-24 editing, customizing, and developing my perfect Jekyll site.

There are still a few changes to be made that will be added over the next few weeks:

  • Editing the landing page; I already added my most recent blog post, but I’d like to add various other things to make the front page more exciting. I was thinking a 3 panel page with my most recent blogpost, my most recent event attended, and a link to whatever project I was currently working on.

This site is built on the shoulders of several projects:

  • Jekyll (Obviously)
  • Poole, The Jekyll Butler
  • Lanyon, the enhanced Poole

The rest of my blog posts from 2014-2017 should be imported over the next couple of weeks!

What's Jekyll?


Jekyll is a static site generator, an open-source tool for creating simple yet powerful websites of all shapes and sizes. From the project’s readme:

Jekyll is a simple, blog aware, static site generator. It takes a template directory […] and spits out a complete, static website suitable for serving with Apache or your favorite web server.

It is also the engine behind GitHub Pages, which is where I’m hosting my site. Find out more by visiting the project on GitHub.