Next-Generation Academic Search
SEER combines semantic retrieval, task-aware parsing, and explainable AI to revolutionize how you discover research.
Search Results
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova • 2019
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
Why this matches your search
This paper matches your search for "transformer models for NLP tasks" because it introduces BERT, a transformer-based architecture that achieves state-of-the-art results on multiple NLP benchmarks. The paper specifically addresses the question answering and text classification tasks you mentioned.
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, et al. • 2017
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
Why this matches your search
This foundational paper matches your interest in "transformer architectures" as it introduces the original Transformer model. Although the paper focuses on machine translation, its architecture has become fundamental to NLP tasks like those you're researching.
SEER's Unique Capabilities
Semantic Understanding
SEER goes beyond keyword matching to understand research concepts, methodologies, and findings.
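To make this concrete, here is a minimal sketch of embedding-based retrieval of the kind described above, using the open-source sentence-transformers library. The model name and toy corpus are illustrative assumptions, not SEER's actual pipeline.

```python
# Minimal sketch of embedding-based semantic retrieval.
# The model choice and example corpus are illustrative, not SEER's internals.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding",
    "Attention Is All You Need",
    "ImageNet Classification with Deep Convolutional Neural Networks",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query = "transformer models for NLP tasks"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank papers by cosine similarity between query and paper embeddings.
for hit in util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]:
    print(f"{corpus[hit['corpus_id']]}  (score={hit['score']:.3f})")
```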
Task-Aware Filtering
Find papers relevant to your specific research task, whether it's classification, generation, or QA.
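As an illustration, a task filter can be as simple as matching task tags extracted at indexing time. The Paper record and its tasks field below are hypothetical, not SEER's schema.

```python
# Hypothetical sketch: filtering candidate papers by extracted task tags.
from dataclasses import dataclass, field

@dataclass
class Paper:
    title: str
    tasks: set[str] = field(default_factory=set)

papers = [
    Paper("BERT", {"question-answering", "classification"}),
    Paper("Attention Is All You Need", {"machine-translation", "generation"}),
]

def filter_by_task(papers: list[Paper], task: str) -> list[Paper]:
    """Keep only papers whose extracted task tags include the requested task."""
    return [p for p in papers if task in p.tasks]

print([p.title for p in filter_by_task(papers, "question-answering")])
```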
Dataset/Model Search
Search for papers that use specific datasets or model architectures like BERT or GPT-3.
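One common way to support this kind of query is an inverted index from extracted entity mentions to paper IDs. The sketch below uses invented paper IDs and mentions and is not SEER's implementation.

```python
# Illustrative sketch of dataset/model search: an inverted index from
# entity mentions (extracted at parse time) to paper IDs. Data is invented.
from collections import defaultdict

mentions = {
    "paper-001": ["BERT", "SQuAD", "GLUE"],
    "paper-002": ["GPT-3", "LAMBADA"],
    "paper-003": ["BERT", "GPT-3"],
}

index: defaultdict[str, set[str]] = defaultdict(set)
for paper_id, entities in mentions.items():
    for entity in entities:
        index[entity].add(paper_id)

def papers_using(*entities: str) -> set[str]:
    """Papers that mention every requested dataset or model."""
    sets = [index[e] for e in entities]
    return set.intersection(*sets) if sets else set()

print(papers_using("BERT", "GPT-3"))  # {'paper-003'}
```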
Explainable Matches
Understand why each paper was matched to your query with clear, interpretable explanations.
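A simple form of such an explanation surfaces the query concepts that also appear in a paper's extracted concepts. The concept sets below are invented for illustration; SEER's explanation mechanism is not shown here.

```python
# Hypothetical sketch of a match explanation: report the query concepts
# that overlap with the paper's extracted concepts.
def explain_match(query_concepts: set[str], paper_concepts: set[str],
                  title: str) -> str:
    shared = sorted(query_concepts & paper_concepts)
    if not shared:
        return f'"{title}" matched on overall semantic similarity.'
    return f'"{title}" matches because it covers: ' + ", ".join(shared) + "."

print(explain_match(
    {"transformers", "question answering", "text classification"},
    {"transformers", "question answering", "pre-training"},
    "BERT",
))
```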
Dynamic Reranking
Results improve as you provide feedback: the reranker learns your preferences over time.
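One classic mechanism for this kind of feedback loop is Rocchio-style relevance feedback, which nudges the query embedding toward liked papers and away from disliked ones. The sketch below assumes queries and papers share an embedding space; the weights are conventional textbook defaults, not SEER's.

```python
# Sketch of Rocchio-style relevance feedback, one standard way a reranker
# can adapt to thumbs-up/thumbs-down signals. Weights are illustrative.
import numpy as np

def update_query(query_vec: np.ndarray,
                 liked: list[np.ndarray],
                 disliked: list[np.ndarray],
                 alpha: float = 1.0,
                 beta: float = 0.75,
                 gamma: float = 0.15) -> np.ndarray:
    """Move the query embedding toward liked papers, away from disliked ones."""
    q = alpha * query_vec
    if liked:
        q = q + beta * np.mean(liked, axis=0)
    if disliked:
        q = q - gamma * np.mean(disliked, axis=0)
    return q / np.linalg.norm(q)
```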
Research Graph
Visualize connections between papers, datasets, and methodologies in your field.
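Under the hood, such a graph is naturally modeled as papers, datasets, and methods connected by typed edges. Here is a minimal sketch of that data model using the networkx library, with invented example data rather than SEER's actual graph.

```python
# Illustrative research graph: papers, datasets, and methods as nodes,
# typed edges for relations like "uses" and "cites". All data is invented.
import networkx as nx

g = nx.DiGraph()
g.add_node("BERT", kind="paper")
g.add_node("Attention Is All You Need", kind="paper")
g.add_node("SQuAD", kind="dataset")
g.add_node("self-attention", kind="method")

g.add_edge("BERT", "Attention Is All You Need", relation="cites")
g.add_edge("BERT", "SQuAD", relation="uses")
g.add_edge("BERT", "self-attention", relation="uses")
g.add_edge("Attention Is All You Need", "self-attention", relation="introduces")

# Everything one hop from a paper: its datasets, methods, and citations.
print(list(g.successors("BERT")))
```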