
Kiya-Shiota Laboratory

Rinko Seminar (2021)

Table of Contents

Chapter 1: Basics

The following 5 questions are just to warm you up with programming in Python.

Q.1: FizzBuzz

Write a program that prints the numbers from 1 to 100, but for multiples of three print “Fizz” instead of the number, and for multiples of five print “Buzz”. For numbers which are multiples of both three and five, print “FizzBuzz”.
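
A minimal sketch of one possible solution (a plain loop; nothing here is prescribed by the question):

```python
for i in range(1, 101):
    if i % 15 == 0:        # multiple of both three and five
        print("FizzBuzz")
    elif i % 3 == 0:       # multiple of three
        print("Fizz")
    elif i % 5 == 0:       # multiple of five
        print("Buzz")
    else:
        print(i)
```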

Q.2: Quick Sort

Implement a quick sort (divide-and-conquer) algorithm. You can read about it here.

Sample Input: [9, 8, 7, 5, 6, 3, 1, 2, 4]

Sample Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
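
A minimal sketch of a recursive quick sort; the middle-element pivot and the list comprehensions are just one of many possible choices:

```python
def quick_sort(arr):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]                  # middle element as pivot
    left = [x for x in arr if x < pivot]        # smaller than the pivot
    mid = [x for x in arr if x == pivot]        # equal to the pivot
    right = [x for x in arr if x > pivot]       # larger than the pivot
    return quick_sort(left) + mid + quick_sort(right)

print(quick_sort([9, 8, 7, 5, 6, 3, 1, 2, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```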

Q.3: Missing Element

You are given two arrays. One is a shuffled version of the other, but with one element missing. Write a program to find the missing element.

Sample Input: [2, 3, 4, 5, 6, 7, 5, 8], [6, 8, 7, 4, 5, 2, 3]

Sample Output: 5
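
A minimal sketch using collections.Counter, which also handles duplicate values (as in the sample input) correctly:

```python
from collections import Counter

def find_missing(full, short):
    # Subtracting the counts leaves exactly the one element that was removed.
    diff = Counter(full) - Counter(short)
    return next(iter(diff))

print(find_missing([2, 3, 4, 5, 6, 7, 5, 8], [6, 8, 7, 4, 5, 2, 3]))  # 5
```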

Q.4: Pair Sum

You are given an array. Write a program to output all possible pairs that sum to a specific value k.

Sample Input: [1, 3, 2, 2], k = 4

Sample Output: (1, 3) (2, 2)
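
A minimal sketch that reports each unique pair once (the exact output formatting is left to you):

```python
from collections import Counter

def pair_sums(arr, k):
    counts = Counter(arr)
    pairs = []
    for x in sorted(counts):
        y = k - x
        if x < y and counts[y] > 0:
            pairs.append((x, y))
        elif x == y and counts[x] >= 2:   # e.g. (2, 2) needs two copies of 2
            pairs.append((x, y))
    return pairs

print(pair_sums([1, 3, 2, 2], k=4))  # [(1, 3), (2, 2)]
```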

Q.5: Multiplication Table

Write a program that outputs a multiplication table like the following picture.

(Figure: multiplication table)
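
A minimal sketch, assuming the picture shows the standard 9 x 9 table (the size n is an assumption; adjust it to match the picture):

```python
n = 9  # assumed table size
for i in range(1, n + 1):
    # One row of products, right-aligned so the columns line up.
    print(" ".join(f"{i * j:3d}" for j in range(1, n + 1)))
```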

The following 5 questions are for NumPy. If you have used MATLAB, you will be just fine. The questions are extracted from numpy-100, and you can practise more if you have time.

Q.6

Create an 8x8 checkerboard matrix using the tile function.
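
One possible solution, in the spirit of the numpy-100 answers:

```python
import numpy as np

# Tile a 2x2 pattern four times along each axis to get an 8x8 checkerboard.
board = np.tile(np.array([[0, 1], [1, 0]]), (4, 4))
print(board)
```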

Q.7

Normalize a 5x5 random matrix.
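
A minimal sketch; "normalize" is read here as zero mean and unit standard deviation (min-max scaling to [0, 1] would be another valid reading):

```python
import numpy as np

Z = np.random.random((5, 5))
Z_norm = (Z - Z.mean()) / Z.std()   # zero mean, unit standard deviation
print(Z_norm)
```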

Q.8

Multiply a 5x3 matrix by a 3x2 matrix (real matrix product).
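
A minimal sketch using the @ operator:

```python
import numpy as np

A = np.random.rand(5, 3)
B = np.random.rand(3, 2)
C = A @ B            # real matrix product, shape (5, 2); np.dot(A, B) is equivalent
print(C.shape)
```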

Q.9

Given a 1D array, negate all elements which are between 3 and 8, in place.
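
A minimal sketch; whether the endpoints 3 and 8 are included depends on how "between" is read (they are excluded here):

```python
import numpy as np

Z = np.arange(11)
Z[(Z > 3) & (Z < 8)] *= -1   # boolean mask selects the elements to negate in place
print(Z)
```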

Q.10

Consider two random arrays A and B, and check if they are equal.
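
A minimal sketch; np.array_equal tests exact equality, while np.allclose allows a floating-point tolerance:

```python
import numpy as np

A = np.random.randint(0, 2, 5)
B = np.random.randint(0, 2, 5)
print(np.array_equal(A, B))   # exact element-wise equality
print(np.allclose(A, B))      # equality up to a tolerance
```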

Chapter 2: Data Processing

In this chapter, we will do the first 10 image processing knocks. If anything is unclear, please read the original image processing knocks, which are written in Japanese.

Q.1: Channel Swapping

Change the channel order from RGB -> BGR.

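A minimal sketch, assuming the image is an H x W x 3 array loaded with OpenCV (the file names are placeholders):

```python
import cv2

img = cv2.imread("input.jpg")    # placeholder file name; OpenCV loads images as BGR
swapped = img[..., ::-1]         # reverse the channel axis to swap RGB <-> BGR
cv2.imwrite("output.jpg", swapped)
```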

Q.2: Grayscale

Convert a color image to a grayscale one. The linear formula is

Y = 0.2126 R + 0.7152 G + 0.0722 B

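A minimal sketch applying the formula above channel-wise (OpenCV is used only for I/O here, and the file names are placeholders):

```python
import numpy as np
import cv2

img = cv2.imread("input.jpg")[..., ::-1].astype(np.float64)  # BGR -> RGB, as floats
R, G, B = img[..., 0], img[..., 1], img[..., 2]
gray = 0.2126 * R + 0.7152 * G + 0.0722 * B   # luminance formula from the question
cv2.imwrite("output_gray.jpg", gray.astype(np.uint8))
```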

Q.3: Binarization

Binarize an image using a threshold of 128.

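A minimal sketch, assuming the input is already a grayscale uint8 array (e.g. the result of Q.2):

```python
import numpy as np

def binarize(gray, threshold=128):
    # Pixels below the threshold become 0, the rest become 255.
    out = np.zeros_like(gray, dtype=np.uint8)
    out[gray >= threshold] = 255
    return out
```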

Q.4: Otsu's Binarization

This is an automatic thresholding algorithm that selects the threshold by minimizing the intra-class intensity variance, or equivalently maximizing the inter-class variance.

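A minimal sketch of the exhaustive-search version: try every threshold and keep the one with the largest inter-class variance (the input is assumed to be a grayscale uint8 array):

```python
import numpy as np

def otsu_threshold(gray):
    pixels = gray.ravel().astype(np.float64)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        lower, upper = pixels[pixels < t], pixels[pixels >= t]
        if lower.size == 0 or upper.size == 0:
            continue
        w0, w1 = lower.size / pixels.size, upper.size / pixels.size
        # Inter-class variance: w0 * w1 * (difference of class means)^2
        var_between = w0 * w1 * (lower.mean() - upper.mean()) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```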

Q.5: HSV Conversion

RGB -> HSV and HSV -> RGB

Convert the image to HSV, invert the hue H (add 180), convert it back to RGB, and display the image.

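The original knock expects a manual RGB/HSV implementation; as a reference for checking results, here is a minimal sketch that leans on matplotlib's color conversions (hue is represented in [0, 1] there, so adding 180 degrees becomes adding 0.5 modulo 1):

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def invert_hue(rgb_uint8):
    hsv = rgb_to_hsv(rgb_uint8 / 255.0)       # RGB in [0, 1] -> HSV in [0, 1]
    hsv[..., 0] = (hsv[..., 0] + 0.5) % 1.0   # add 180 degrees to the hue
    return (hsv_to_rgb(hsv) * 255).astype(np.uint8)
```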

Q.6: Discretization of Color

Quantize the image as follows.

val = {  32  (0 <= val < 63)
         96  (63 <= val < 127)
        160  (127 <= val < 191)
        224  (191 <= val < 256)
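
A minimal sketch that applies the table above with np.digitize, using the thresholds exactly as stated in the question:

```python
import numpy as np

def quantize(img):
    thresholds = np.array([63, 127, 191])
    values = np.array([32, 96, 160, 224], dtype=np.uint8)
    # np.digitize maps each pixel to its bin index, which then selects the output value.
    return values[np.digitize(img, thresholds)]
```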

Q.7: Average Pooling

Perform average pooling on a 128x128 image with an 8x8 kernel.

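A minimal sketch using a reshape trick, assuming an H x W x 3 uint8 image whose sides are multiples of the kernel size:

```python
import numpy as np

def average_pool(img, k=8):
    h, w, c = img.shape
    # Split each spatial axis into (blocks, k) and average over the k x k blocks.
    out = img.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))
    return out.astype(np.uint8)
```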

Q.8: Max Pooling

Perform max pooling on a 128x128 image with an 8x8 kernel.

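Max pooling is the same reshape trick with max in place of mean; a minimal sketch:

```python
import numpy as np

def max_pool(img, k=8):
    h, w, c = img.shape
    # Take the maximum over each k x k block instead of the average.
    return img.reshape(h // k, k, w // k, k, c).max(axis=(1, 3))
```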

Q.9: Gaussian Filter

Implement the Gaussian filter (3x3, standard deviation 1.3) and use it to remove the noise from a noisy image.

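A minimal sketch for a single-channel image (a color image would be filtered channel by channel); the edge-replication padding is an arbitrary choice:

```python
import numpy as np

def gaussian_filter(gray, ksize=3, sigma=1.3):
    pad = ksize // 2
    # Build the Gaussian kernel and normalize it so the weights sum to 1.
    ax = np.arange(-pad, pad + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()

    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float64)
    h, w = gray.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + ksize, x:x + ksize] * kernel)
    return np.clip(out, 0, 255).astype(np.uint8)
```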

Q.10: Median Filter

Implement the median filter (3x3) and use it to remove the noise from a noisy image.

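A minimal sketch, again for a single-channel image with edge-replication padding:

```python
import numpy as np

def median_filter(gray, ksize=3):
    pad = ksize // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros_like(gray)
    h, w = gray.shape
    for y in range(h):
        for x in range(w):
            # Replace each pixel by the median of its ksize x ksize neighborhood.
            out[y, x] = np.median(padded[y:y + ksize, x:x + ksize])
    return out
```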

Chapter 3: Neural Networks

Q.1: Linear Regression

Q.2: Softmax Regression

Q.3: Multilayer Perceptrons

Q.4: Regularization

We do not want the model to memorize the training data, so we will use regularization techniques to improve the generalization of the model. Implement the following regularization techniques using a synthetic dataset or any dataset you like, and compare training with and without regularization.
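
The techniques meant here are presumably neural-network ones such as weight decay and dropout. As a self-contained illustration of the with/without comparison, here is a minimal sketch of L2 regularization (weight decay) on a synthetic linear regression problem; the data sizes and the value of lambda are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: few noisy samples, many features -> easy to overfit.
n_train, n_features = 20, 100
w_true = np.zeros(n_features)
w_true[:5] = 1.0
X_train = rng.normal(size=(n_train, n_features))
y_train = X_train @ w_true + 0.1 * rng.normal(size=n_train)
X_test = rng.normal(size=(1000, n_features))
y_test = X_test @ w_true + 0.1 * rng.normal(size=1000)

def fit_l2(X, y, lam):
    # L2-penalized least squares via an augmented system; lam = 0 means no regularization.
    d = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(d)])
    y_aug = np.concatenate([y, np.zeros(d)])
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w

for lam in [0.0, 1.0]:
    w = fit_l2(X_train, y_train, lam)
    test_mse = np.mean((X_test @ w - y_test) ** 2)
    print(f"lambda = {lam}: test MSE = {test_mse:.3f}")
```

The regularized fit typically achieves a clearly lower test error in this setup, which is the kind of comparison the question asks for.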

Chapter 4: Convolutional Neural Networks

Q.1: 2D Convolution

Q.2: Edge detection

Q.3: Padding

Q.4: Stride

Q.5: Pooling

Q.6: LeNet

Q.7: AlexNet

Comparison of LeNet and AlexNet: Reference

Q.8: VGG

VGG introduces the idea of building a network from repeated blocks. Reference

Q.9: NiN Blocks

Q.10: GoogLeNet (Inception)

Q.11: Batch Normalization

Q.12: ResNet

Q.13: DenseNet

Chapter 5: Recurrent Neural Networks

Q.1 Text Preprocessing

Q.2 Reading Sequence Data

Q.3 Character-level Language Model by RNN

Q.4 Gated Recurrent Units (GRU)

Q.5 Long Short-Term Memory (LSTM)

Chapter 6: Generative Models

Q.1 Autoencoders

Q.2 Variational Autoencoders (VAE)

Q.3 Generative Adversarial Networks (GAN)

Q.4 Deep Convolutional GAN (DCGAN)

Main features of DCGAN are:

- Pooling layers are replaced with strided convolutions (discriminator) and transposed convolutions (generator).
- Batch normalization is used in both the generator and the discriminator.
- Fully connected hidden layers are removed.
- The generator uses ReLU activations (Tanh for the output layer), and the discriminator uses LeakyReLU.

Q.5 Wasserstein GAN (WGAN)

Reference Paper: WGAN

Q.6 Conditional GAN

Reference Paper: Conditional GAN

In the following task, you need to train a GAN on the CelebA dataset. Then use that GAN for controllable generation as follows. Here is an example of controllable generation with a pre-trained classifier.

Q.7 Cycle GAN

Reference Paper: Cycle GAN

Q.8 GAN Evaluation

Explain how to evaluate GANs.

Explore the following metrics for evaluating GANs.

Explain how they work, and evaluate your previously trained GAN on CIFAR-10 using the above metrics.

Chapter 7: Attention Mechanisms

Q.1 Attention Pooling

Reference code and example

Q.2 Attention Scoring Functions

Reference code and example

Q.3 Bahdanau Attention

To understand this attention mechanism, we need to consider a language translation problem (encoder-decoder architecture).

Reference code and example

Q.4 Multi-Head Attention

Reference code and example

Q.5 Self Attention and Positional Encoding

Reference code and example

Q.6 Transformer

Reference code and example

Q.7 Vision Transformer

Reference Paper: ViT