What's the difference from DARTS?
Thanks for sharing the code.
I have a question about an implementation difference from DARTS. The training code looks very similar to DARTS (https://github.com/quark0/darts).
As you mentioned in the paper, “2. Instead of using the whole DAG, GDAS samples one sub-graph at one training iteration, accelerating the searching procedure. Besides, the sampling in GDAS is learnable and contributes to finding a better cell.”
But in the forward function of MixedOp, the output is just the weighted sum of all ops, same as DARTS.
```python
def forward(self, x, weights):
    return sum(w * op(x) for w, op in zip(weights, self._ops))
```
So, can you point out the code that “samples one sub-graph at one training iteration”? Thanks.
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 1
- Comments: 8 (3 by maintainers)
Top GitHub Comments
Sorry for the confusion. That file is DARTS, not our algorithm; we did not release the search code of GDAS. The main difference between GDAS and DARTS is that we use Gumbel-softmax with an acceleration trick so that only one candidate is used during the forward pass, while gradients can still back-propagate to the architecture parameters.
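This is not the unreleased GDAS search code; the sketch below only illustrates, in NumPy, the trick the maintainer describes: sample relaxed Gumbel-softmax weights, then evaluate only the argmax op, so unselected candidates contribute just their scalar weight (through which gradients would flow in an autograd framework). The function names are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_sample(logits, tau=1.0):
    # Sample Gumbel(0, 1) noise and form relaxed (softmax) weights.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    z = (logits + g) / tau
    z = z - z.max()  # for numerical stability
    return np.exp(z) / np.exp(z).sum()

def mixed_op_forward(x, logits, ops, tau=1.0):
    """Forward only the op selected by argmax of the Gumbel-softmax sample.

    Unselected ops are never evaluated; their scalar soft weights still
    appear in the sum, which is what lets gradients reach every
    architecture parameter in an autograd framework.
    """
    probs = gumbel_softmax_sample(logits, tau)
    k = int(np.argmax(probs))
    out = sum(probs[i] * ops[i](x) if i == k else probs[i]
              for i in range(len(ops)))
    return out, k

# Usage: count which candidate ops actually run.
calls = []
ops = [lambda x, j=j: (calls.append(j), x * (j + 1))[1] for j in range(3)]
out, k = mixed_op_forward(np.ones(2), np.zeros(3), ops)
assert calls == [k]  # exactly one candidate op was evaluated
```

Compare this with the DARTS `MixedOp.forward` quoted above, which always evaluates every op in `self._ops`.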
@D-X-Y
Regarding the quoted text above and the GDAS paper, I have a few questions:
1. Isn't the argmax operation non-differentiable in PyTorch?
2. Do you split the data into training (for W training) and validation (for A training) datasets?
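On question 1: Gumbel-softmax implementations typically sidestep the non-differentiable argmax with the straight-through trick, where the hard one-hot is used in the forward pass while gradients flow through the soft probabilities. A minimal PyTorch sketch (not the authors' code; variable names are mine):

```python
import torch

logits = torch.zeros(3, requires_grad=True)
probs = torch.softmax(logits, dim=-1)

# Hard one-hot selection via argmax (not differentiable by itself).
one_hot = torch.zeros_like(probs)
one_hot[probs.argmax()] = 1.0

# Straight-through: forward value equals the hard one-hot, but the
# gradient of `hard` w.r.t. `probs` is the identity, so backprop
# never has to differentiate through argmax.
hard = one_hot - probs.detach() + probs

loss = (hard * torch.tensor([1.0, 2.0, 3.0])).sum()
loss.backward()
# logits.grad is nonzero even though argmax has no gradient.
```

This is also what `torch.nn.functional.gumbel_softmax(..., hard=True)` does internally.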