A different way of doing the similarity/comparison task?
Hey! Thanks for the awesome work. I was wondering if I could use and update finetune to do the following:
Instead of using (Start, Text1, Delim, Text2, Extract) and (Start, Text2, Delim, Text1, Extract) as in the paper, can we use (Start, Text1, Extract) and (Start, Text2, Extract) separately through the transformer?
This could be thought of as obtaining sentence/document embeddings for Text1 and Text2 separately. Upon doing that, I would like to compare their similarity using a distance metric such as cosine distance. (i.e. train the transformer as a siamese network.)
Would you suggest I build such a model on top of a fork of finetune?
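To make the proposal concrete, here is a minimal sketch of the siamese comparison described above: each text goes through the same encoder on its own, and the two resulting vectors are compared with cosine similarity (cosine distance is 1 minus this value). The `toy_encode` function is a hypothetical stand-in for the transformer's Extract-token output, not part of finetune, and none of these names come from the library.

```python
import math
from typing import Callable, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def toy_encode(text: str, dim: int = 16) -> List[float]:
    """Hypothetical encoder: a hashed bag-of-words vector.

    In the setup from the issue, this would be the transformer applied
    to (Start, Text, Extract), returning the Extract-token features.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket (the built-in hash() is salted per process).
        bucket = sum(ord(ch) for ch in token) % dim
        vec[bucket] += 1.0
    return vec


def siamese_score(encode: Callable[[str], Sequence[float]],
                  text1: str, text2: str) -> float:
    """Encode both texts with the SAME encoder, then compare."""
    return cosine_similarity(encode(text1), encode(text2))


if __name__ == "__main__":
    print(siamese_score(toy_encode, "how do I learn python",
                        "what is the best way to learn python"))
```

Because both inputs share one encoder and the metric is symmetric, the score does not depend on the order of Text1 and Text2, so the two orderings used in the paper's setup are not needed here.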
Issue Analytics
- State:
- Created 5 years ago
- Comments:11 (6 by maintainers)

Hey @chaitjo, glad you’ve been hacking away over the weekend! I’ve opened up a PR into development for you. Don’t worry if things don’t run yet; I’ll just use it as a space to leave code comments for now.
At a high level things look good; it seems like there are just a few things to clean up. Perhaps most importantly, I think we can find a way around overriding many of the BaseModel methods by structuring things more like the comparison class (and maybe inheriting from that instead of the BaseModel class).
As far as data goes, a sampled version of the Quora similarity dataset might be a good place to start. We’ve got some scripts in there for training already, but note that you’ll probably have to modify those scripts to convert the pandas Series passed as inputs to numpy arrays. That’s on my backlog to patch up. E.g.:
Thanks for the response! Will keep you posted on the progress.