[paper]Two-stage Model for Automatic Playlist Continuation at Scale

keywords : playlist continuation, collaborative filtering, CNN, gradient boosting

two-stage model
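- 1st stage: optimized for fast retrieval
  - Shrinks the search space from the full 2.2M-song catalog down to a much smaller candidate set.
  - Uses collaborative filtering (CF) and deep learning models to retrieve 20K candidates (that is, roughly 2 million songs are narrowed down to 20 thousand) → 90% recall
  - The top-1K songs alone reach close to 60% recall.
  - High recall ensures that most of the relevant songs are inside the retrieved set.
- 2nd stage: re-ranks the candidates picked in the 1st stage to maximize accuracy at the very top of the recommended list
  - Builds a pairwise model that maps each (playlist, song) pair to a relevance score
  - The playlist features and song features are joined and used as the input, so the model can capture pairwise relationships between them.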
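
To make the retrieve-then-re-rank idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: `two_stage_recommend`, `recall_at_k`, the popularity-based `cheap` scorer, and the hand-written `precise` scorer are all hypothetical stand-ins, and the catalog is a toy list of 10 song ids instead of 2.2M.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant songs that appear in the top-k of `ranked`."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(ranked[:k])) / len(relevant)


def two_stage_recommend(playlist, catalog, cheap_score, pairwise_score,
                        n_candidates, n_final):
    """Retrieve with a cheap scorer, then re-rank the survivors with a costlier one."""
    seen = set(playlist)
    unseen = [s for s in catalog if s not in seen]
    # Stage 1: keep the best n_candidates under the cheap score (recall-oriented).
    candidates = sorted(unseen, key=lambda s: cheap_score(playlist, s),
                        reverse=True)[:n_candidates]
    # Stage 2: re-order only those candidates with the pairwise (playlist, song) score.
    reranked = sorted(candidates, key=lambda s: pairwise_score(playlist, s),
                      reverse=True)
    return candidates, reranked[:n_final]


# Toy data: every value below is made up for illustration.
catalog = list(range(10))
playlist = [0, 1]
held_out = [2, 3]                                   # songs hidden from the playlist

popularity = {s: 10 - s for s in catalog}           # hypothetical play counts
cheap = lambda p, s: popularity[s]                  # stage 1 stand-in: popularity only
precise = lambda p, s: {3: 0.9, 2: 0.8}.get(s, 0.1)  # stage 2 stand-in

candidates, top = two_stage_recommend(playlist, catalog, cheap, precise,
                                      n_candidates=4, n_final=2)
print(candidates)                             # [2, 3, 4, 5]
print(top)                                    # [3, 2]
print(recall_at_k(candidates, held_out, 4))   # 1.0
```

The point of the split is that the expensive scorer only ever sees `n_candidates` songs, so stage 1 has to be judged by recall and stage 2 by the order at the top of the list.
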
data partitioning
training
inference procedures
Model architecture
- 1st stage
  - latent model: Weighted Regularized Matrix Factorization (WRMF)
  - neighbor-based model: user-user, item-item
    - user-user: estimates relevance by computing similarity between rows of R (the playlist-song matrix, one row per playlist)
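    - tends to be biased toward popular songs, so extra processing is needed to reduce that bias
  - temporal pattern: songs are embedded with a CNN to compute the model score (oh, so they use a CNN instead of an RNN because a CNN is fast and handles parallel computation well. Right, it has far fewer weights and parameters to learn, so it is bound to be faster, and the parallelism makes it a good fit for GPUs)
    - analogy to text: documents = playlists and words = songs
  - blend model: the scores from all models are merged with a linear weighted combination and used as input to the second stage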
- 2nd stage
  - a gradient boosting model re-ranks the candidates
  - feature extraction
    - input from first stage
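    - playlist features: summarize the playlist's content, i.e. what kind of songs it holds (playlist name and length, the average popularity of its songs, artists, and albums, and song homogeneity, estimated with WRMF and the CNN). A playlist made up of songs from one specific genre is an easier task, and song homogeneity is a good score for judging how diverse the songs are.
    - song features: the song's context information, artist/album/title information, song duration, and playlist statistics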
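
The neighbor-based scoring and the linear blend from the 1st stage can be sketched as follows. This is a simplified illustration, not the paper's code: playlists are plain sets of song ids, similarity is cosine over binary vectors, the popularity debiasing mentioned above is left out, and `other_model` plus the blend weights are made-up numbers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two binary playlists given as sets of song ids."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def user_user_scores(query, playlists):
    """Score each unseen song by the summed similarity of the playlists containing it."""
    scores = {}
    for other in playlists:
        sim = cosine(query, other)
        if sim == 0.0:
            continue  # playlists with no overlap contribute nothing
        for song in other - query:
            scores[song] = scores.get(song, 0.0) + sim
    return scores


def blend(score_dicts, weights):
    """Linear weighted combination of several per-song score dictionaries."""
    blended = {}
    for scores, w in zip(score_dicts, weights):
        for song, value in scores.items():
            blended[song] = blended.get(song, 0.0) + w * value
    return blended


# Toy training playlists and a query playlist (all hypothetical).
train = [{"a", "b", "c"}, {"a", "d"}, {"e", "f"}]
query = {"a", "b"}

neighbor = user_user_scores(query, train)
other_model = {"c": 0.2, "d": 0.9}   # stand-in for a second model's scores
blended = blend([neighbor, other_model], weights=[0.7, 0.3])
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)   # ['c', 'd']
```

Song "d" wins under `other_model` alone, but "c" comes first after blending because the query shares two songs with the playlist that contains it.
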
output: the final ranking of the songs
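
As a rough illustration of the 2nd-stage input, the sketch below builds one feature row for a (playlist, song) pair. The notes do not say how song homogeneity is computed, so the definition used here (mean pairwise cosine similarity of the playlist's song embeddings) is my assumption, and `song_homogeneity`, `feature_row`, and all numbers are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def song_homogeneity(embeddings):
    """Assumed definition: mean pairwise cosine similarity of a playlist's song embeddings."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def feature_row(stage1_scores, playlist_feats, song_feats):
    """Join stage-1 scores, playlist features, and song features into one input row."""
    return list(stage1_scores) + list(playlist_feats) + list(song_feats)


# Made-up 2-d song embeddings for two playlists.
focused = [[1.0, 0.0], [1.0, 0.0], [0.9, 0.1]]   # songs that sit close together
mixed = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]    # songs pointing in different directions

print(song_homogeneity(focused) > song_homogeneity(mixed))   # True

row = feature_row(
    stage1_scores=[0.63],                                   # e.g. a blended stage-1 score
    playlist_feats=[len(focused), song_homogeneity(focused)],
    song_feats=[215.0],                                     # e.g. song duration in seconds
)
print(len(row))   # 4
```

One such row per candidate song is what a gradient boosting model would consume; the model itself is left out here on purpose.
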
test playlists
- from 0 songs (cold start) to 100 songs
evaluation
- r-precision
- NDCG
- clicks
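
The three metrics can be sketched in a few lines each. These are simplified versions and may differ from the official challenge definitions: R-precision here counts exact track matches only, NDCG uses the common `1 / log2(rank + 1)` discount with binary relevance, and the `page_size=10` and `max_clicks=51` defaults in `clicks` are assumed values that should be checked against the challenge rules.

```python
import math


def r_precision(ranked, relevant):
    """Share of the relevant songs found in the top-|relevant| positions."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(ranked[:len(relevant)])) / len(relevant)


def ndcg(ranked, relevant):
    """Binary-relevance NDCG with the 1 / log2(rank + 1) discount."""
    relevant = set(relevant)
    # enumerate() is 0-based, so rank + 1 becomes i + 2.
    dcg = sum(1.0 / math.log2(i + 2) for i, s in enumerate(ranked) if s in relevant)
    ideal_hits = min(len(relevant), len(ranked))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


def clicks(ranked, relevant, page_size=10, max_clicks=51):
    """Number of page refreshes needed before the first relevant song shows up."""
    relevant = set(relevant)
    for i, song in enumerate(ranked):
        if song in relevant:
            return i // page_size
    return max_clicks  # nothing relevant in the whole list


ranked = ["x", "a", "y", "b"]
relevant = {"a", "b"}
print(r_precision(ranked, relevant))   # 0.5
print(ndcg(ranked, relevant))
print(clicks(ranked, relevant))        # 0
```

R-precision and NDCG reward putting relevant songs high in the list, while clicks only cares about how far down the first hit is.
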
baseline: SVD → performs quite well in the cold-start case, which shows that the playlist name is very important

Further papers

Large-scale user modeling with recurrent neural networks for music discovery on multiple time scales.
Automatic playlist generation based on tracking user’s listening habits.
