ChatGPT and other large language models (LLMs) have shown impressive generalization abilities, but their training and inference costs are often prohibitive. Meanwhile, new entity types continually emerge, and supervised NER models generalize poorly to new domains and entity types because they are trained on a pre-specified set of both. The authors assemble the largest and most diverse NER benchmark to date, the UniversalNER benchmark, consisting of 43 datasets across 9 domains, including medicine, programming, social media, law, and finance. On this benchmark, UniversalNER outperforms Vicuna by over 30 absolute points in average F1 and achieves state-of-the-art NER accuracy across tens of thousands of entity types.
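To make the contrast with pre-specified entity types concrete, here is a minimal sketch of how an instruction-tuned, open-type NER model can be queried for an arbitrary entity type via a text prompt. The prompt wording and the `build_ner_prompt` helper are illustrative assumptions, not the exact template used by UniversalNER:

```python
def build_ner_prompt(text: str, entity_type: str) -> str:
    """Build a prompt asking a model to extract spans of an arbitrary
    entity type from the given text. The entity type is a free-form
    string, so no fixed label set is baked into the model's training.
    (Hypothetical template for illustration only.)"""
    return (
        f"Text: {text}\n"
        f"What describes {entity_type} in the text above? "
        "Answer with a list of exact spans from the text."
    )

# The same function covers entity types a supervised NER model
# would never have seen at training time.
prompt = build_ner_prompt("Aspirin reduces fever in most patients.", "drug")
print(prompt)
```

Because the entity type is just another input string, the same model can be asked about "drug", "programming language", or "legal statute" without retraining, which is what lets a single model cover the tens of thousands of entity types in the benchmark.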