Nick Sorros Presents:
Extreme Multilabel Classification in the Biomedical NLP Domain
Extreme multilabel classification refers to cases where the prediction space of a multilabel classifier is in the thousands or millions of labels, which is an order of magnitude more than in typical problems. The scale of such problems brings some unique challenges that one has to work around, such as memory, model size, and training and inference time. This talk will discuss 1) how you can overcome those challenges, 2) relevant state-of-the-art architectures for this problem, and 3) lessons learned from the development of a transformer-based NLP model to tag biomedical grants with 29K MeSH tags.
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome!
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...