Random Deletion Layer - Data Augmentation


I’ve created this issue specifically to discuss the Random Deletion Layer while we figure out how to incorporate WordNet for Synonym Replacement.

I’ve adapted the design mentioned by @mattdangerw for this layer:

from tensorflow import keras


class RandomDeletion(keras.layers.Layer):
    """Augments input by randomly deleting words

    Args:
        probability: probability of a word being chosen for deletion
        max_replacements: The maximum number of words to replace
        stop_word_only: Only deletes stopwords

    Examples:

    Basic usage.
    >>> augmenter = keras_nlp.layers.RandomDeletion(
    ...     probability=0.3,
    ... )
    >>> augmenter(["dog dog dog dog dog"])
    <tf.Tensor: shape=(), dtype=string, numpy=b'dog dog dog dog'>
    """
    pass
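
To make the intended semantics of the arguments concrete, here is a minimal plain-Python sketch of per-word random deletion. It is not the proposed layer and does not use TensorFlow. The function name `random_word_deletion`, the `max_deletions`, `stop_words`, and `seed` parameters, and the left-to-right handling of the cap are all assumptions made for illustration.

```python
import random


def random_word_deletion(text, probability, max_deletions=None,
                         stop_words=None, seed=None):
    """Drop each whitespace-separated word with the given probability.

    Hypothetical reference sketch, not the keras_nlp API.
    """
    rng = random.Random(seed)
    kept = []
    deleted = 0
    for word in text.split():
        # If a stop-word set is given, only those words may be deleted.
        eligible = stop_words is None or word.lower() in stop_words
        # Stop deleting once the (optional) cap has been reached.
        under_cap = max_deletions is None or deleted < max_deletions
        if eligible and under_cap and rng.random() < probability:
            deleted += 1
            continue
        kept.append(word)
    return " ".join(kept)


# Results vary with the seed; nothing is deleted at probability 0.0
# and every eligible word is deleted at probability 1.0.
print(random_word_deletion("the dog chased the cat", probability=0.3, seed=0))
```

Because the cap is applied while scanning left to right, this sketch favors deleting earlier words once the cap is in play. A real layer might instead sample which positions to delete.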

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions
mattdangerw commented, Jun 1, 2022

@aflah02 that makes sense to me. In that case, let’s call this RandomWordDeletion. It’s more specific to what we are doing here, and it leaves us room to grow our offering.

2 reactions
mattdangerw commented, May 27, 2022

@aflah02 yeah, we definitely need to support the batched 2D case at a minimum. Potentially this could be done with RaggedTensor and no map function? Something like…

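import tensorflow as tf

inputs = tf.constant(["this is a test", "this is another test"])
# Split each sentence on whitespace into a RaggedTensor of words.
ragged_words = tf.strings.split(inputs)
# Draw one keep/drop decision per word; each word is kept with probability 0.75.
mask = tf.random.uniform(ragged_words.flat_values.shape) > 0.25
mask = ragged_words.with_flat_values(mask)
# Drop the masked-out words, then join each row back into a single string.
deleted = tf.ragged.boolean_mask(ragged_words, mask)
deleted = tf.strings.reduce_join(deleted, axis=-1, separator=" ")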

Would that work?
