Add Vision Transformer to `keras.applications`

Vision Transformer [1] is a model that uses a Transformer-like architecture. An image is divided into fixed-size patches, each patch is linearly embedded, position embeddings are added, and the resulting sequence of vectors is fed to a standard Transformer encoder.
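The patching and embedding steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation proposed in this issue: the helper names `image_to_patches` and `embed_patches`, the 32x32 input, the patch size of 8, and the embedding width of 16 are all hypothetical, the weights are random rather than learned, and the [CLS] token and the Transformer encoder itself are left out.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (H, W, C) -> (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # Flatten every patch into one vector: (num_patches, patch * patch * C)
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


def embed_patches(patches, projection, position_embeddings):
    """Linearly project each patch, then add one embedding per position."""
    return patches @ projection + position_embeddings


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))    # hypothetical 32x32 RGB input
patch_size, embed_dim = 8, 16      # illustrative sizes, not taken from the issue

patches = image_to_patches(image, patch_size)
# Random stand-ins for weights that would be learned in a real model.
projection = rng.normal(size=(patches.shape[1], embed_dim))
position_embeddings = rng.normal(size=(patches.shape[0], embed_dim))

tokens = embed_patches(patches, projection, position_embeddings)
print(patches.shape, tokens.shape)
```

With these sizes the image yields 16 patches of 192 values each, and `tokens` has shape (16, 16): one embedding vector per patch, ready to be passed to an encoder.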

Who will benefit from this feature?

Keras users using keras.applications in their projects.

Will this change the current API? How?

It would add new entries: keras.applications.ViTBase, keras.applications.ViTLarge, ...

Contributing

Do you want to contribute a PR? (yes/no): Yes
If yes, please read this page for instructions
Briefly describe your candidate solution (if contributing): I have implemented a base Vision Transformer model from scratch here

References

[1] https://arxiv.org/abs/2010.11929

@fchollet @LukeWood

Top GitHub Comments

1 reaction

innat commented, Dec 11, 2022

Check. https://github.com/keras-team/keras-cv/issues/668

0 reactions

IMvision12 commented, Dec 12, 2022

Closing this issue…

Top Results From Across the Web

Image classification with Vision Transformer

This example implements the Vision Transformer (ViT) model by Alexey Dosovitskiy et al. for image classification, and demonstrates it on the ...

Vision Transformer - Keras Code Examples!! - YouTube

This video walks through the Keras Code Example implementation of Vision Transformers !! I see this as a huge opportunity for graduate ...

Vision Transformer with TensorFlow

This post is a deep dive and step by step implementation of Vision Transformer (ViT) using TensorFlow 2.0. What you can expect to...

Vision Transformer in TensorFlow | notebooks

The publication of the Vision Transformer (or simply ViT) architecture in An Image is Worth 16x16 Words: Transformers for Image Recognition ...

Vision Transformer (ViT)

A [CLS] token is added to serve as representation of an entire image, which can be used for classification. The authors also add...
