
AI-LAB26

[1] Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts  Contributions: They propose a novel Multi-gate Mixture-of-Experts model which explicitly models task relationships. They conduct controlled experiments on synthetic data, reporting how task relatedness affects training dynamics in multi-task learning and how MMoE improves both model expressiveness and trainability. They conduct experiments on real benchmark data and a large-scale production recommenda.. 2025. 3. 23.
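Below is a minimal sketch of the MMoE idea described in this summary, written in PyTorch as an assumption (it is not the paper's or the post's code): several shared experts plus one softmax gate per task that mixes the expert outputs before a per-task tower.

```python
import torch
import torch.nn as nn


class MMoE(nn.Module):
    def __init__(self, input_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        # Shared experts: each is a small feed-forward block over the input features.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(input_dim, expert_dim), nn.ReLU())
             for _ in range(num_experts)]
        )
        # One gating network per task: a softmax weighting over the experts.
        self.gates = nn.ModuleList(
            [nn.Linear(input_dim, num_experts) for _ in range(num_tasks)]
        )
        # One small tower (head) per task on top of its gated expert mixture.
        self.towers = nn.ModuleList(
            [nn.Linear(expert_dim, 1) for _ in range(num_tasks)]
        )

    def forward(self, x):
        # expert_out: (batch, num_experts, expert_dim)
        expert_out = torch.stack([expert(x) for expert in self.experts], dim=1)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            weights = torch.softmax(gate(x), dim=-1)                 # (batch, num_experts)
            mixed = (weights.unsqueeze(-1) * expert_out).sum(dim=1)  # (batch, expert_dim)
            outputs.append(tower(mixed))                             # per-task prediction
        return outputs


# Example: two tasks sharing four experts over 16-dimensional features.
model = MMoE(input_dim=16, expert_dim=32, num_experts=4, num_tasks=2)
task1_out, task2_out = model(torch.randn(8, 16))
```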
[3] Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity  GShard: MoE Transformer encoder with device placement. Limitations: top-k experts lead to a load-imbalance problem during training; network routing is expensive; the gating function should not keep choosing the same few experts. Switch Transformer contributions: Switch Transformer simplifies and improves over MoE; scaling properties and a benchmark against the T5 model, with 7x+ pre-training speedups at the same FLOPS per token. Succe.. 2025. 3. 12.
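Below is a minimal sketch of Switch-style top-1 routing as described in this summary, written in PyTorch as an assumption (not the official implementation): each token is routed to a single expert, and an auxiliary load-balancing term discourages the router from always picking the same few experts.

```python
import torch
import torch.nn as nn


class SwitchFFN(nn.Module):
    def __init__(self, d_model, d_ff, num_experts):
        super().__init__()
        self.num_experts = num_experts
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
             for _ in range(num_experts)]
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        probs = torch.softmax(self.router(x), dim=-1)  # (num_tokens, num_experts)
        gate, expert_idx = probs.max(dim=-1)           # top-1 expert per token
        out = torch.zeros_like(x)
        for e in range(self.num_experts):
            mask = expert_idx == e
            if mask.any():
                # Scale each expert's output by its router probability (the gate value).
                out[mask] = gate[mask].unsqueeze(-1) * self.experts[e](x[mask])
        # Auxiliary load-balancing loss: fraction of tokens per expert times the mean
        # router probability per expert, summed and scaled by the number of experts.
        frac_tokens = torch.bincount(expert_idx, minlength=self.num_experts).float() / x.shape[0]
        mean_probs = probs.mean(dim=0)
        aux_loss = self.num_experts * (frac_tokens * mean_probs).sum()
        return out, aux_loss


# Example: route 10 tokens of width 16 through 4 experts.
layer = SwitchFFN(d_model=16, d_ff=64, num_experts=4)
y, balance_loss = layer(torch.randn(10, 16))
```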
A pretty decent paper-summarization prompt, Ver. 1: 1) summarize the paper's abstract, 2) the problem the paper identifies, 3) the solution and core ideas the paper proposes, 4) the experimental design process and methods used to realize the paper's proposal, 5) the data or model architecture used in the paper, 6) the paper's novelty, 7) future research directions, 8) the paper's conclusions; describe each item precisely, step by step. 2025. 3. 8.
[2] Recommending What Video to Watch Next: A Multitask Ranking System (2019 RecSys)  GitHub: https://github.com/JsuccessJ/youtube_ranking_system  1. Introduction: The paper addresses the problem of recommending "what video to watch next" in a large-scale video recommendation system such as YouTube. Existing large-scale recommendation systems.. 2025. 3. 4.
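Below is a minimal, self-contained sketch of the multitask-ranking idea behind this post, written in PyTorch as an assumption (not the paper's or the repository's code): a shared bottom network feeds separate engagement and satisfaction heads, and the final ranking score is a weighted combination of the two predictions; all names and weights here are illustrative.

```python
import torch
import torch.nn as nn


class MultitaskRanker(nn.Module):
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        # Shared bottom network over the candidate/user features.
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.engagement_head = nn.Linear(hidden_dim, 1)    # e.g. a click / watch proxy
        self.satisfaction_head = nn.Linear(hidden_dim, 1)  # e.g. a like / rating proxy

    def forward(self, x):
        h = self.shared(x)
        p_engage = torch.sigmoid(self.engagement_head(h))
        p_satisfy = torch.sigmoid(self.satisfaction_head(h))
        # Final ranking score: a manually weighted combination of the objectives
        # (the combination weights would be tuned for the product in practice).
        score = 0.7 * p_engage + 0.3 * p_satisfy
        return score, p_engage, p_satisfy


# Example: score 4 candidate videos described by 32-dimensional features.
ranker = MultitaskRanker(input_dim=32)
score, p_engage, p_satisfy = ranker(torch.randn(4, 32))
```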