11/7 Mon
What I learned:
MLOps: ML + Ops (operations)
- Workflow automation
- Machine learning engineering + data engineering + cloud + infrastructure
- Reduce technical friction so a project can move from the idea stage to production quickly and with as little risk as possible
Research ML vs Production ML:
- static data vs dynamic data
- good performance vs fast inference with good performance
- SOTA vs stable
- offline vs online
MLOps components:
- Model
- Data, feature
- CPU, GPU, Memory
- scalability
- cloud, server
- Batch serving/online serving
- experiment, model management
- feature store
- data validation
- continuous training
- monitoring
- AutoML
Serving: the process of making a machine learning model usable from an app or the web
Online serving: handles incoming requests through an API over HTTP
- servers need to be able to handle the request load
- the API exposes the request interface to clients
Single data point: each request carries a single input and gets a single prediction back
There can be one server for preprocessing and a separate one for the ML model
Ways to implement online serving:
- build your own API
- use a cloud service
- use a serving library
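As a rough illustration of the "build your own API" option, here is a minimal sketch that uses only the Python standard library. The `predict` function is a hypothetical stand-in for a real model, and the `/predict` path and JSON shape are assumptions made for this example, not something from the lecture.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only one endpoint is exposed in this sketch.
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = {"prediction": predict(payload["features"])}
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            self.send_error(400, "expected JSON with a non-empty 'features' list")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(host="127.0.0.1", port=8000):
    """Create the HTTP server; call .serve_forever() on the result to start it."""
    return HTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the server, and a client can then POST `{"features": [1, 2, 3]}` to `/predict`. A real service would more likely use a web framework or a serving library, but the request/response flow is the same.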
Batch serving: run inference at a fixed time interval or each time a set amount of data has accumulated
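A minimal sketch of the "set amount of data" variant: accumulated rows are split into fixed-size batches and each batch is scored in one call. `predict_batch` is a hypothetical model stub, and in practice the job would be triggered by a scheduler rather than called by hand.

```python
from itertools import islice


def predict_batch(rows):
    """Hypothetical model stub: scores each row by the sum of its features."""
    return [sum(row) for row in rows]


def iter_batches(rows, batch_size):
    """Yield consecutive lists of at most batch_size items."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_batch_job(rows, batch_size=2):
    """Run inference over all accumulated rows, one batch at a time."""
    results = []
    for batch in iter_batches(rows, batch_size):
        results.extend(predict_batch(batch))
    return results
```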
Machine learning project flow:
- define problem
- need to know what we are solving exactly
- design the product
- check the validity of the value of the project
- look for existing data or model
- choose a good loss function
- machine learning is a good fit when there is a pattern and the problem is complex and repetitive
- gather data first if none exists
- set the right goal and new objective
- the goal needs to be ethical
- there can be multiple objectives, so they need careful attention
- identify the constraints and risks of the project
- make a baseline and prototype to start off
- set the right metric to evaluate
- monitor after deployment
- performance
- which part went wrong
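To make the "baseline" and "metric" steps above concrete, here is a small sketch: a majority-class baseline evaluated with accuracy. The labels are made up for illustration, and accuracy is only one possible metric; the right one depends on the problem.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    majority_label, _count = Counter(train_labels).most_common(1)[0]

    def predict(_features):
        return majority_label

    return predict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up data, for illustration only.
train_labels = ["pos", "pos", "neg", "pos"]
test_features = [[0.1], [0.9], [0.4]]
test_labels = ["pos", "neg", "pos"]

baseline = fit_majority_baseline(train_labels)
predictions = [baseline(x) for x in test_features]
baseline_accuracy = accuracy(test_labels, predictions)
```

Any later model should beat `baseline_accuracy` on the same test split; if it does not, the extra complexity is not paying for itself.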
Business model: look for where the output can be used in the company
11/8 Tue
What I learned:
Voila: turns a notebook into a web app prototype
- originally made for building dashboards
- easy to experiment with
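ipywidgets: adds interactive controls to a notebook
- slider allows for interactive control of a value
- text allows for integer and string input
- checkbox
- dropdown
- file upload
- on_click: registers a callback that runs when a button is clicked
- observe: registers a callback that runs when a widget's value changes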
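The widgets and callbacks listed above can be wired together roughly as follows. This is a sketch that assumes `ipywidgets` is installed; the `events` list and the widget names are hypothetical and exist only to make the callbacks' effects visible.

```python
import ipywidgets as widgets

# A slider, a text box, a button, and a label to show the result.
threshold = widgets.IntSlider(value=50, min=0, max=100, description="threshold")
name = widgets.Text(value="", description="name")
button = widgets.Button(description="Run")
status = widgets.Label(value="")

events = []  # hypothetical log so the callbacks' effects are easy to inspect


def on_threshold_change(change):
    # `observe` passes a change object; change["new"] holds the updated value.
    events.append(("threshold", change["new"]))


def on_button_click(clicked_button):
    # `on_click` passes the button instance that was clicked.
    events.append(("clicked", threshold.value, name.value))
    status.value = f"threshold={threshold.value}, name={name.value}"


threshold.observe(on_threshold_change, names="value")
button.on_click(on_button_click)

# In a notebook, evaluating `ui` as the last expression renders the controls.
ui = widgets.VBox([threshold, name, button, status])
ui
```

Passing `names="value"` limits the `observe` callback to changes of the slider's value, so it does not fire for other trait changes.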
Streamlit: turns Python code into a web service with only minor modifications
- no frontend needed
- many easy-to-use components
- the whole script reruns every time something changes (a widget value or the code)
11/9 Wed
What I learned:
virtualization: packages an environment as a template that can be reused for both research and production
- solves the problem of the environment differing between local, test, and production
virtual machine: uses an image to create an environment that includes its own OS
- running an OS on top of another OS is heavy
docker: uses containers as a lighter alternative to VMs
- a docker image is read-only; running it creates a container, which adds a writable layer on top of the image
- lets you run other people's software right away with the same settings and environment
container registry: like GitHub for container images
docker commands:
- pull: download an image
- images: list local images
- run: create and start a container from an image
- ps: list running containers
- exec: run a command inside a running container
- stop: stop a running container
- rm: remove a container
- volume mount (-v): connects files on the host with files in the container
- build: build an image
MLflow: manages the lifecycle of ML experiments and makes experiment results reproducible
- experiment management & tracking
- model registry/versioning
- model serving
- project code versioning
MLflow commands:
- experiments create: creates an experiment, which groups runs under a project theme
- run: executes the code once; each run logs source, version, start & end time, parameters, metrics, tags, and artifacts
- server: starts the tracking server and sets where the tracking data is stored
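In a real company setting, usually only the desired service is specified, so you first have to decide what data needs to be collected.
Because the quality of the service has to be good, the system should be designed so that online test results come out better than offline test results.
Creative Commons License: types of copyright license
- BY: attribution
- ND: no derivatives
- NC: non-commercial
- SA: share-alike (derivatives must carry the same CC license terms)
Bias in the data is reflected in the results as well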
Proxies: unintentional discrimination
Masking: intentional discrimination
11/10 Thu
What I learned:
I turned the mask classification model into a web service with streamlit and deployed it on streamlit cloud.
https://mask-classification.streamlit.app/