Testing the Generalization Power of Neural Network Models Across NLI Benchmarks




Permanent address

http://hdl.handle.net/10138/304485

Citation

Talman, A. J. & Chatzikyriakidis, S. 2019, 'Testing the Generalization Power of Neural Network Models Across NLI Benchmarks', in T. Linzen, G. Chrupała, Y. Belinkov & D. Hupkes (eds), The Workshop BlackboxNLP on Analyzing and Interpreting Neural Networks for NLP at ACL 2019: Proceedings of the Second Workshop, The Association for Computational Linguistics, Stroudsburg, pp. 85-94, 2019 ACL Workshop BlackboxNLP, Florence, Italy, 01/08/2019.

Title: Testing the Generalization Power of Neural Network Models Across NLI Benchmarks
Author: Talman, Aarne Johannes; Chatzikyriakidis, Stergios
Contributor: Linzen, Tal
Chrupała, Grzegorz
Belinkov, Yonatan
Hupkes, Dieuwke
Author's organization: Department of Digital Humanities
Language Technology
Publisher: The Association for Computational Linguistics
Date: 2019-08-01
Language: eng
Number of pages: 10
Belongs to series: The Workshop BlackboxNLP on Analyzing and Interpreting Neural Networks for NLP at ACL 2019
ISBN: 978-1-950737-30-7
URI: http://hdl.handle.net/10138/304485
Abstract: Neural network models have been very successful at natural language inference, with the best models reaching 90% accuracy on some benchmarks. However, the success of these models turns out to be largely benchmark-specific. We show that models trained on a natural language inference dataset drawn from one benchmark fail to perform well on others, even when the notion of inference assumed in these benchmarks is the same or similar. We train six high-performing neural network models on different datasets and show that each of them has problems generalizing when we replace the original test set with a test set taken from another corpus designed for the same task. In light of these results, we argue that most of the current neural network models are not able to generalize well in the task of natural language inference. We find that using large pre-trained language models helps with transfer learning when the datasets are similar enough. Our results also highlight that the current NLI datasets do not cover the different nuances of inference extensively enough.
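The cross-benchmark protocol the abstract describes can be illustrated with a minimal sketch: train a classifier on one NLI dataset and score it both on its own test set and on another benchmark's evaluation data. The paper evaluates six high-performing neural models; the TF-IDF + logistic-regression baseline below is a hypothetical stand-in for illustration only, and the Hugging Face dataset identifiers ("snli", "multi_nli") are assumptions about how the benchmarks are obtained, not part of the paper's setup.

# Hypothetical sketch of cross-benchmark NLI evaluation (not the paper's models).
from datasets import load_dataset  # assumed: Hugging Face `datasets` is installed
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline


def pairs_and_labels(split, limit=None):
    # Join premise and hypothesis into one string; drop items without a gold label (-1).
    rows = [r for r in split if r["label"] != -1]
    if limit:
        rows = rows[:limit]
    texts = [r["premise"] + " [SEP] " + r["hypothesis"] for r in rows]
    labels = [r["label"] for r in rows]
    return texts, labels


snli = load_dataset("snli")        # training benchmark
mnli = load_dataset("multi_nli")   # transfer benchmark with the same 3-way label scheme

# Subsample the training data so the baseline fits quickly.
train_x, train_y = pairs_and_labels(snli["train"], limit=50_000)
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_x, train_y)

# In-benchmark accuracy vs. accuracy on a test set from another corpus.
for name, split in [("SNLI test", snli["test"]),
                    ("MNLI matched dev", mnli["validation_matched"])]:
    x, y = pairs_and_labels(split)
    print(name, "accuracy:", accuracy_score(y, model.predict(x)))

The gap between the two printed accuracies is the kind of generalization failure the paper measures, here only with a toy baseline rather than the six neural models studied.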
Keywords: 113 Computer and information sciences
6121 Languages
Peer reviewed: Yes
Copyright information: cc_by
Access restrictions: openAccess
Self-archived version: publishedVersion


Files


File(s) Size Format
W19_4810.pdf 335.9KB PDF
