10/24 Mon
Things learned:
In a competition, it is good to know early on where the task is heading.
- read the overview carefully
Problem definition is important.
- What is the problem that I need to solve?
- What is the I/O of the problem?
- Where is this solution being applied?
Focus on solving the problem, not on raising your rank.
Pipeline: Domain Understanding → Data Mining → Data Analysis → Data Processing → Modeling → Training → Deploy
EDA (Exploratory Data Analysis): the effort to understand the data
- do the data analysis that fits the given task
Data science is 80% pre-processing, because real data is messy.
- not all images come in the clean form we want
- varying image sizes require resizing (see the sketch below)
- good pre-processing can noticeably improve model performance
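A minimal pre-processing sketch with torchvision transforms; the 224x224 size and the ImageNet normalization statistics are illustrative assumptions, not values from the course.

```python
from torchvision import transforms

# unify varying image sizes, convert to tensor, then normalize
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                    # resize every image to one size
    transforms.ToTensor(),                            # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics (assumed)
                         std=[0.229, 0.224, 0.225]),
])
```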
High bias: underfitting
High variance: overfitting
There are no absolutes in pre-processing; choose techniques experimentally, according to the problem being solved.
Look for bottlenecks in training to maximize throughput
- data loading is often what holds the speed back (see the DataLoader sketch below)
- more data augmentation/transforms = more time to load each batch
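A sketch of moving data loading into worker processes to relieve that bottleneck; the num_workers and batch_size values are assumptions to tune, and train_dataset is assumed to be an existing Dataset.

```python
from torch.utils.data import DataLoader

loader = DataLoader(
    train_dataset,      # assumed to already exist
    batch_size=64,
    shuffle=True,
    num_workers=4,      # parallel worker processes do loading/augmentation
    pin_memory=True,    # faster host-to-GPU transfer
)
```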
PyTorch is good because it is low-level.
Every layer in PyTorch is built on the nn.Module class
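A minimal sketch of how a layer/model is defined in PyTorch: everything subclasses nn.Module and implements forward(). MyBlock is a hypothetical example, not a layer from the course.

```python
import torch.nn as nn

class MyBlock(nn.Module):                     # hypothetical example block
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.fc(x))
```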
10/25 Tue
Things learned:
Since every model is composed of nn.Module submodules, calling forward once on the top-level module runs forward through all of them.
parameters(): the parameter tensors, without names
state_dict(): the same tensors as a name → tensor mapping
Every parameter has data, grad, and requires_grad attributes
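A sketch of inspecting those tensors; `model` is assumed to be any existing nn.Module.

```python
for p in model.parameters():                  # tensors only, no names
    print(p.data.shape, p.requires_grad, p.grad is None)

for name, t in model.state_dict().items():    # name -> tensor mapping
    print(name, t.shape)
```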
The ImageNet dataset contributed to the increase in model performance.
High similarity to the pretraining data and a small dataset: freeze the backbone (sketch below)
Low similarity to the pretraining data: do not freeze anything, fine-tune everything
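A sketch of the "high similarity, small dataset" case: freeze the pretrained backbone and train only a new classification head. The torchvision resnet18 and the 18-class head are illustrative assumptions.

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
for p in model.parameters():
    p.requires_grad = False                          # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 18)       # new head, trained from scratch
```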
loss function = cost function = error function
loss.backward() computes and accumulates each parameter's grad, and optimizer.step() applies those gradients
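A minimal sketch of one training step with these calls; model, loader, criterion, and optimizer are assumed to already exist.

```python
for x, y in loader:
    optimizer.zero_grad()             # clear grads from the previous step
    loss = criterion(model(x), y)
    loss.backward()                   # fill each parameter's .grad
    optimizer.step()                  # apply the gradients
```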
Focal loss: under class imbalance, give less loss to the frequent, already well-classified classes and a high loss to the rare, hard ones
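A sketch of focal loss: the per-sample cross-entropy is scaled down by (1 - p)^gamma, so confident (easy) predictions contribute little. gamma = 2.0 is a common but assumed default.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample -log p_true
    p = torch.exp(-ce)                                        # probability of the true class
    return ((1 - p) ** gamma * ce).mean()
```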
Label smoothing loss: instead of a pure one-hot target, give a little probability mass to the other classes too
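In recent PyTorch versions (1.10+) label smoothing is exposed directly on CrossEntropyLoss; the 0.1 value here is an assumed typical setting.

```python
import torch.nn as nn

criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # 0.1 of the target mass spread over other classes
```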
StepLR: lowers the LR by a fixed factor every few epochs/steps
CosineAnnealingLR: varies the LR sharply following a cosine curve
ReduceLROnPlateau: lower the LR when there is no improvement in performance
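A sketch of the three schedulers mentioned above; the step size, gamma, T_max, and patience values are illustrative assumptions, and train_one_epoch()/validate() are hypothetical helpers.

```python
from torch.optim import lr_scheduler

scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)              # decay every 10 epochs
# scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)                # cosine-shaped LR curve
# scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, mode="max", patience=3)  # no improvement -> lower LR

for epoch in range(num_epochs):
    train_one_epoch()                 # hypothetical helper
    val_metric = validate()           # hypothetical helper
    scheduler.step()                  # ReduceLROnPlateau takes scheduler.step(val_metric) instead
```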
Do not be fooled by accuracy alone; it can be misleading
use the F1-score when classes are imbalanced and accuracy when they are balanced
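A sketch of checking F1 alongside accuracy with scikit-learn; y_true and y_pred are assumed to be arrays of class indices.

```python
from sklearn.metrics import accuracy_score, f1_score

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")   # macro F1 weighs every class equally
```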
Gradients do not have to be applied every iteration: by skipping optimizer.step() and optimizer.zero_grad() for a few iterations, gradients accumulate (gradient accumulation)
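A sketch of gradient accumulation: gradients keep summing into .grad and the optimizer is only stepped every accum_steps iterations. accum_steps = 4 is an assumed value; model, loader, criterion, and optimizer are assumed to exist.

```python
accum_steps = 4
optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    loss = criterion(model(x), y) / accum_steps   # scale so the sum matches one big batch
    loss.backward()                               # accumulates into .grad
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```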
Ensemble: use multiple models for better performance
- model averaging (voting): hard voting (majority over predicted classes), soft voting (average of predicted probabilities)
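A sketch of hard vs soft voting over a few trained models; the `models` list and the input batch `x` are assumed to exist.

```python
import torch

with torch.no_grad():
    probs = torch.stack([m(x).softmax(dim=1) for m in models])  # (n_models, batch, classes)

soft_pred = probs.mean(dim=0).argmax(dim=1)          # soft voting: average the probabilities
hard_pred = probs.argmax(dim=2).mode(dim=0).values   # hard voting: majority of the class votes
```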
To also use the validation data for training, a technique called cross-validation is used.
Stratified K-Fold Cross-Validation: through cross-validation every training sample is used as validation data exactly once, and each fold preserves the class distribution
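A sketch with scikit-learn's StratifiedKFold; 5 splits is an assumed choice, and X (samples) and y (labels) are assumed arrays.

```python
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # each fold: train on train_idx, validate on val_idx, class ratios preserved
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val samples")
```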
Test-Time Augmentation (TTA): apply data augmentation to the test set so the model has to figure out the class under noise
- ensemble the predictions over those image variations
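A sketch of TTA: predict on a few augmented views of the same image and average the probabilities. Using only a horizontal flip is an assumed, minimal choice; `model` and the image tensor `img` (C, H, W) are assumed to exist.

```python
import torch

views = [img, torch.flip(img, dims=[-1])]            # original + horizontal flip
with torch.no_grad():
    probs = torch.stack([model(v.unsqueeze(0)).softmax(dim=1) for v in views])
pred = probs.mean(dim=0).argmax(dim=1)               # ensemble over the augmented views
```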
Optuna makes hyperparameter optimization easy
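A minimal Optuna sketch; the search space (learning rate, batch size) and the train_and_validate helper are assumptions for illustration.

```python
import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    return train_and_validate(lr, batch_size)   # hypothetical: returns a validation score

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)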
Weights & Biases is like a GitHub for deep learning experiments
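A minimal Weights & Biases sketch: log the config and per-epoch metrics so runs can be compared later. The project name, config values, and train_one_epoch helper are assumptions.

```python
import wandb

wandb.init(project="boostcamp-cv", config={"lr": 1e-3, "batch_size": 64})
for epoch in range(num_epochs):
    train_loss, val_f1 = train_one_epoch()   # hypothetical helper
    wandb.log({"epoch": epoch, "train_loss": train_loss, "val_f1": val_f1})
wandb.finish()
```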