What are Decoders or autoregressive models in transformers?

This recipe explains what decoders, or autoregressive models, are in transformers.

Recipe Objective - What are Decoders or autoregressive models in transformers?

Decoders, also known as autoregressive models, are pretrained on the classic language modelling task: predicting the next token given the preceding ones. They correspond to the decoder of the original transformer model. An attention mask is applied over the whole sequence so that, at each position, the attention heads can see only the tokens that come before it, not those that come after. Although these models can be fine-tuned to achieve strong results on many tasks, their most natural application is text generation. GPT is a typical example of this type of model.
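The masking described above can be illustrated with a small sketch. It is not taken from any particular library: the helper names `causal_mask` and `masked_softmax` are hypothetical, and the attention scores are made-up equal values chosen only so the effect of the mask is easy to see.

```python
import math

def causal_mask(n):
    """Return an n x n mask: True where position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores; masked-out positions receive zero weight."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        # Disallowed (future) positions are set to -inf so exp() maps them to 0.
        masked = [s if allowed else -math.inf
                  for s, allowed in zip(row_scores, row_mask)]
        peak = max(masked)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy attention scores for a 4-token sequence (all equal, illustration only).
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row is one token's attention distribution. The first token can attend only to itself, while the last token spreads its attention over all four positions; no row places any weight on a later position.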

Examples of decoder (autoregressive) models:

* Original GPT
* GPT-2
* CTRL
* Transformer-XL
* Reformer
* XLNet
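
When used for text generation, an autoregressive model produces one token at a time and feeds each new token back in as context for the next step. The loop below is a minimal sketch of that idea under loud assumptions: `next_token` is a hypothetical stand-in backed by a tiny lookup table, not a trained model, and it looks only at the last token, whereas a real decoder conditions on the whole preceding sequence. The `<bos>` and `<eos>` markers are likewise illustrative.

```python
def next_token(context):
    """Hypothetical stand-in for a trained decoder.

    A real model would score every vocabulary item given the full context;
    this toy version just looks up the last token in a fixed table.
    """
    table = {"<bos>": "the", "the": "cat", "cat": "sat", "sat": "<eos>"}
    return table.get(context[-1], "<eos>")

def generate(prompt, max_new_tokens=10):
    """Greedy autoregressive loop: predict a token, append it, repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == "<eos>":  # stop when the end-of-sequence marker is predicted
            break
        tokens.append(token)  # the new token becomes part of the next context
    return tokens

result = generate(["<bos>"])
print(result)
```

The point of the sketch is the feedback loop in `generate`: every prediction depends only on tokens already produced, which is the property that the attention mask enforces during training.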
