Add Vision Transformer to `keras.applications`



Vision Transformer [1] is a model that uses a Transformer-like architecture. An image is divided into fixed-size patches, each patch is linearly embedded, position embeddings are added, and the resulting sequence of vectors is fed to a standard Transformer encoder.
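The patching and embedding steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation proposed in this issue: the helper names `image_to_patches` and `embed_patches` are hypothetical, the weights are random stand-ins for parameters that would be learned, and the class token and the Transformer encoder itself are left out. The patch size of 16 and embedding width of 768 follow the ViT-Base/16 configuration in [1]; the 224x224 input is just an example size.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (rows, P, cols, P, C) -> (rows, cols, P, P, C) -> (rows * cols, P * P * C)
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


def embed_patches(patches, projection, position_embeddings):
    """Linearly project each patch and add one position embedding per patch."""
    return patches @ projection + position_embeddings


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))   # example input size (assumption)
patch_size, embed_dim = 16, 768     # ViT-Base/16 values from [1]

patches = image_to_patches(image, patch_size)

# Random stand-ins for weights that a real model would learn.
projection = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
position_embeddings = rng.normal(scale=0.02, size=(patches.shape[0], embed_dim))

tokens = embed_patches(patches, projection, position_embeddings)
print(patches.shape, tokens.shape)  # (196, 768) (196, 768)
```

A 224x224 image with 16x16 patches gives 14 x 14 = 196 patches, each flattened to 16 x 16 x 3 = 768 values, so the encoder would receive a sequence of 196 vectors (197 once a class token is prepended).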

Who will benefit from this feature?

Keras users who rely on `keras.applications` in their projects.

Will this change the current API? How?

It would add new models to the API: `keras.applications.ViTBase`, `keras.applications.ViTLarge`, ...

Contributing

  • Do you want to contribute a PR? (yes/no): Yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution (if contributing): I have implemented the base Vision Transformer model from scratch here

References

[1] https://arxiv.org/abs/2010.11929

@fchollet @LukeWood

Issue Analytics

  • State: closed
  • Created 9 months ago
  • Comments: 5

Top GitHub Comments

innat commented, Dec 11, 2022 (1 reaction)

IMvision12 commented, Dec 12, 2022 (0 reactions)

closing this issue…


Top Results From Across the Web

Image classification with Vision Transformer
This example implements the Vision Transformer (ViT) model by Alexey Dosovitskiy et al. for image classification, and demonstrates it on the ...

Vision Transformer - Keras Code Examples!! - YouTube
This video walks through the Keras Code Example implementation of Vision Transformers !! I see this as a huge opportunity for graduate ...

Vision Transformer with TensorFlow
This post is a deep dive and step by step implementation of Vision Transformer (ViT) using TensorFlow 2.0. What you can expect to...

Vision Transformer in TensorFlow | notebooks
The publication of the Vision Transformer (or simply ViT) architecture in An Image is Worth 16x16 Words: Transformers for Image Recognition ...

Vision Transformer (ViT)
A [CLS] token is added to serve as representation of an entire image, which can be used for classification. The authors also add...
