
Boostcamp Week 4 Study Log - Computer Vision Basics

2022. 10. 11. 23:48

Tech stacks used:

10/11 Tue

Things learned:

AI consists of cognition & perception, memory & inference, decision making, and reasoning

 

Using multi-modal association for perception

 

Vision is important because roughly 75% of the data we perceive comes through vision

 

Computer vision is the inverse of computer rendering

 

Using both the strengths and weaknesses of human visual perception to build a CV model that compensates for those imperfections

 

Classical machine learning relied on features extracted by hand, whereas deep learning does not require human feature engineering

 

CVPR is a top-5 publication across all of STEM, and it is getting a lot of attention from companies

 

Since we cannot memorize all the data in the world, we cannot rely on simple tools like k-nearest neighbors

 

A network with a single fully connected layer cannot generalize to new data

 

CNN: extracts features by looking at local parts of an image

- because parameters are shared, it is robust to changes in object location

 

Datasets are almost always biased

- there is always a gap between training datasets and real-world data

 

To fill this gap, a technique called data augmentation is used to increase the effective amount of training data

 

Applying various image transformations to the dataset: crop, brightness adjustment, rotate, flip, affine transformation, CutMix

 

A technique called RandAugment randomly samples transformations and searches for the combination that works best
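
A rough sketch of such a pipeline using torchvision transforms (the specific operations and magnitudes are arbitrary choices, not the course's):

```python
from torchvision import transforms

# Illustrative augmentation pipeline: crop, flip, brightness jitter, plus RandAugment,
# which applies `num_ops` randomly sampled transformations per image at a given magnitude.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2),
    transforms.RandAugment(num_ops=2, magnitude=9),
    transforms.ToTensor(),
])
```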

 

Annotating data is very expensive, so using a pretrained model can mitigate the problem

- Knowledge from one dataset can be used for another dataset

 

Approaches to Transfer Learning:

  1. Freeze the other layers and train only the last fully connected layer
    1. preserves the knowledge from the pretrained data
  2. Set a low learning rate for the other layers and a high learning rate for the last fully connected layer (see the sketch after this list)
  3. Teacher-student learning: use a pretrained model as a teacher to train an untrained student model; this can also be done without labels (unsupervised)
    1. For labeled data, use soft labels (probabilities) from the teacher together with the ground-truth labels to form the student's loss
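
A minimal PyTorch sketch of approaches 1 and 2, assuming a torchvision ResNet-18 and a hypothetical 10-class target task:

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone with a new head for the (hypothetical) 10-class task
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 10)

# Approach 1: freeze everything except the last fully connected layer
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")

# Approach 2 (alternative): train all layers, but give the backbone a much
# smaller learning rate than the new head
optimizer = torch.optim.SGD(
    [
        {"params": [p for n, p in model.named_parameters() if not n.startswith("fc")], "lr": 1e-4},
        {"params": model.fc.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
)
```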

 

Softmax with temperature: allows for less extreme (softer) outputs

  • Normal softmax: $\frac{\exp(z_i)}{\sum_{j} \exp(z_j)}$
  • Softmax with temperature: $\frac{\exp(z_i/T)}{\sum_{j} \exp(z_j/T)}$
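
A tiny numeric illustration (the logits and T = 4 are arbitrary):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([2.0, 1.0, 0.1])
print(F.softmax(logits, dim=0))        # normal softmax: sharp, close to one-hot
print(F.softmax(logits / 4.0, dim=0))  # T = 4: softer, less extreme probabilities
```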

 

Retrospective:

I'll study Computer Vision hard and put it to good use later.


10/12 Wed

Things learned:

Deeper networks learn more powerful features

 

However, as networks get deeper, gradients vanish or explode, computation becomes more expensive, and performance degrades

 

GoogLeNet: introduces the Inception module, which applies convolutions with different filter sizes in parallel

- a 1x1 convolution is used to change the channel size
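
For example, a 1x1 convolution keeps the spatial size and only remaps the channel dimension (the channel counts here are arbitrary):

```python
import torch
import torch.nn as nn

reduce = nn.Conv2d(256, 64, kernel_size=1)         # 256 -> 64 channels
print(reduce(torch.randn(1, 256, 28, 28)).shape)   # torch.Size([1, 64, 28, 28])
```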

 

Auxiliary classifier: a classifier attached in the middle of the network to mitigate the vanishing/exploding gradient problem

 

Degradation problem: as network depth increases, accuracy gets saturated

 

The solution to the degradation problem: add the input x to the target function so that the identity mapping is preserved

- this is called a residual block, which uses a shortcut connection

- a stack of such blocks has $O(2^n)$ implicit paths
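
A minimal residual block sketch (the exact conv/BN layout is illustrative, not the exact ResNet block):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Output = F(x) + x: the identity shortcut lets gradients flow directly."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # shortcut connection: add the input back
```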

 

In the dense block, every output of each layer is concatenated along the channel axis to account for the vanishing gradient problem.
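
A minimal sketch of that dense connectivity (the 3x3 conv and growth rate of 12 are illustrative choices):

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_channels, growth=12):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, growth, kernel_size=3, padding=1)

    def forward(self, x):
        # Concatenate the new features to everything seen so far along the channel axis
        return torch.cat([x, torch.relu(self.conv(x))], dim=1)
```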

 

Retrospective:

I learned so much that I should review it all step by step.


10/13 Thu

Things learned:

Semantic segmentation: classifying each pixel of an image into a category

- does not distinguish individual objects, only the semantic category

 

Fully Convolutional Networks (FCN): no fully connected layers

 

A fully connected layer outputs a fixed-dimensional vector and discards spatial coordinates

A fully convolutional layer outputs a classification map that retains spatial coordinates

 

A 1x1 convolution layer classifies every feature vector of the convolutional feature map

- to solve the problem of the predicted score map being low-resolution, it is upsampled to the size of the input image

 

Methods of upsampling (see the sketch after this list):

  1. Transposed convolution: roughly the inverse of the convolution operation
    1. produces checkerboard artifacts due to uneven overlapping
  2. Upsampling + convolution: interpolation followed by a convolution
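
A minimal PyTorch sketch of both options (channel counts are arbitrary):

```python
import torch.nn as nn

# 1. Transposed convolution: can leave checkerboard artifacts when kernel size
#    and stride overlap unevenly
up_transposed = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)

# 2. Upsampling + convolution: interpolate first, then apply a normal convolution
up_interp = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
)
```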

Adding a skip connection to the convolutional network can preserve higher spatial resolution

 

U-Net: an FCN that predicts a dense map by concatenating feature maps from the contracting path

- produces more precise segmentations

- repeatedly applies 2x2 up-convolutions

- as the expanding path grows, the corresponding feature map from the contracting path is concatenated to it

 

Conditional Random Fields post-process a segmentation so that it is refined to follow image boundaries

 

Dilated convolution: inflates the kernel by inserting spaces between the kernel elements

- enables exponential expansion of the receptive field
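
For instance (a minimal sketch; sizes are arbitrary), a 3x3 kernel with dilation=2 covers a 5x5 area using the same nine weights:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, dilation=2, padding=2)  # effective 5x5 receptive field
print(conv(torch.randn(1, 3, 32, 32)).shape)                   # torch.Size([1, 16, 32, 32])
```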

 

Depthwise separable convolution: depthwise convolution + pointwise convolution
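
A minimal sketch of that two-step factorization, using PyTorch's groups argument for the depthwise step (channel counts are placeholders):

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),  # depthwise: one filter per channel
        nn.Conv2d(in_ch, out_ch, kernel_size=1),                          # pointwise: 1x1 conv to mix channels
    )
```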

 

Retrospective:

I learned about segmentation and also gathered teammates. Somehow I have a good feeling about this.

 


10/14 Fri

Things learned:

Instance segmentation: even if objects belong to the same class, each one is classified as a separate instance

 

Panoptic segmentation: semantic segmentation + instance segmentation

 

Object detection: classification + box localization

- useful for autonomous driving, OCR

 

Traditional method - Selective search: over-segmentation, iteratively merging similar regions, extracting candidate boxes 

 

Two-stage detector: region proposal + image classification

 

R-CNN: region proposal, warp each region, CNN, classify regions

 

Fast R-CNN: recycles a pre-computed feature map for detecting multiple objects

- compute a convolutional feature map from the original image, extract a feature map per region of interest, then predict a class and box for each RoI

 

Faster R-CNN: end-to-end object detection with a neural region proposal network

- uses a metric called Intersection over Union (IoU): area of overlap / area of union

- pipeline: feature map, neural region proposal, classification; overlapping boxes with IoU >= 0.5 are removed
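
A small illustrative implementation of IoU for axis-aligned boxes in (x1, y1, x2, y2) format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ≈ 0.143
```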

 

One-stage detector: no RoI pooling

 

You Only Look Once (YOLO): an S x S grid on the input, predicting a class probability map plus bounding boxes & confidences

 

Single Shot MultiBox Detector (SSD): uses multiple feature maps to model a diverse space of box shapes

 

Class imbalance problem: there are far more negative bounding boxes than positive ones

- solution: Focal loss, an improved cross-entropy loss

- gives less weight to easy examples
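
A minimal binary focal-loss sketch (gamma = 2 is a common choice; the class-weighting alpha term is omitted here):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Cross-entropy scaled by (1 - p_t)^gamma, so easy (confident) examples contribute less."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)  # probability assigned to the true class
    return ((1 - p_t) ** gamma * ce).mean()
```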

 

RetinaNet: feature pyramid network + class/box classification networks

 

DETR: transformer for object detection

 

Retrospective:

Another week is over. Let's keep pushing hard next week too!
