LSTM for Hate Speech

I am trying to train a classifier that detects hate speech, mainly toxic comments, in any submitted data. Once I integrate it into the system (a blog), I will use it to classify all toxic text and deny those posts. Seems fair enough 😄

The statement: I am still learning the system, and as far as I know an LSTM is best suited to training on text data (suggestions welcome). I am using the Toxic training data set to train the network. The CSV file contains the text plus tags such as Toxic, Insult, Severe… After cleaning the data and converting it to JSON for the API, the final training data looks like:

# Removed the harsh language in the toxic example
# The processed training set has almost 150,000 entries

 {
        "input": "You, sir, are my hero. Any chance you remember what page that's on?",
        "output": "safe"
    },
    {
        "input": "Congratulations from me as well, use the tools well",
        "output": "safe"
    },
    {
        "input": "DONT PISS AROUND ON MY WORK",
        "output": "toxic"
    },
    {
        "input": "Your vandalism to the Matt Shirvington article has been reverted.  Please don't do it again, or you will be banned.",
        "output": "safe"
    }
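
The conversion from CSV rows to the records shown above could look roughly like the following. This is a minimal sketch, not the poster's actual script: the column names (`text`, `toxic`, `severe`, `insult`) and the rule "any flag set means toxic" are assumptions made for illustration, and the rows are assumed to be already parsed into objects.

```javascript
// Hypothetical column names; adjust them to match the real CSV header.
const LABEL_COLUMNS = ['toxic', 'severe', 'insult'];

// Turn one parsed CSV row into the { input, output } shape used for training.
// Assumption: a comment counts as "toxic" if any label column is set to 1.
function toTrainingRecord(row) {
    const flagged = LABEL_COLUMNS.some((name) => Number(row[name]) === 1);
    return {
        input: String(row.text).trim(),
        output: flagged ? 'toxic' : 'safe'
    };
}

// Two made-up rows standing in for the parsed CSV.
const rows = [
    { text: ' Thanks for the help! ', toxic: '0', severe: '0', insult: '0' },
    { text: 'DONT PISS AROUND ON MY WORK', toxic: '1', severe: '0', insult: '0' }
];

const records = rows.map(toTrainingRecord);
console.log(JSON.stringify(records, null, 4));
```

Keeping the label rule in one small function makes it easy to change later, for example to keep the individual tags instead of a single safe/toxic flag.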

CODE:

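// Dependencies (assumes brain.js is installed from npm)
const fs = require('fs');
const brain = require('brain.js');

// Choose the network type
const net = new brain.recurrent.LSTM();

// Load the training set: an array of { input, output } records
function readJson(datafile) {
    return JSON.parse(fs.readFileSync(datafile, 'utf8'));
}
const readdata = readJson('dataset-01.json');

// Train
net.train(readdata, {
    iterations: 1000,     // maximum number of training iterations
    errorThresh: 0.005,   // stop once the training error drops below this
    log: true,            // print training progress
    logPeriod: 10,        // iterations between log lines
    learningRate: 0.3,
    momentum: 0.1,
    callback: null,
    callbackPeriod: 10,
    timeout: Infinity
});
console.log(net.run('I Hate you'));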
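
Because the network is trained on string outputs, `net.run` hands back a generated string rather than a guaranteed class name (the problem described below reports values such as " /H. Kaks"). A small guard like the following sketch can keep such strings from being treated as a verdict. `normalizeLabel` and `KNOWN_LABELS` are hypothetical names, not part of brain.js.

```javascript
// Labels used in the training data; anything else is treated as unrecognized.
const KNOWN_LABELS = ['safe', 'toxic'];

// Map the raw string returned by net.run() to a known label,
// or null when the network produced something else.
function normalizeLabel(rawOutput) {
    const cleaned = String(rawOutput).trim().toLowerCase();
    return KNOWN_LABELS.includes(cleaned) ? cleaned : null;
}

console.log(normalizeLabel(' Toxic '));   // recognized label
console.log(normalizeLabel('/H. Kaks'));  // unrecognized output
```

A caller could then decide what a `null` result means for the blog, for example holding the post for manual review instead of accepting or rejecting it.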

Problem

It takes a very long time to train. That would not matter much to me, since the trained net can be saved to JSON or to a function, but the main issue is that instead of returning one of the two outputs, 'safe' or 'toxic', it returns garbage values, for example " /H. Kaks". Any help? I also tried another data format in which the output is an array and each index indicates a data point, but again it does not produce the desired output:

[
    {
        "input": "D'aww! He matches this background colour I'm seemingly stuck with. Thanks.  (talk) 21:51, January 11, 2016 (UTC)",
        "output": [
            0,
            0,
            0,
            0,
            0
        ]
    },
    {
        "input": "Hey man, I'm really not trying to edit war. It's just that this guy is constantly removing relevant information and talking to me through edits instead of my talk page. He seems to care more about the formatting than the actual info.",
        "output": [
            0,
            0,
            0,
            0,
            0
        ]
    },
    {
        "input": "Dude, I hate your face",
        "output": [
            0,
            0,
            1,
            0,
            0
        ]
    }
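
For comparison between the two data formats, the array form can be collapsed back into the single safe/toxic label used earlier. The sketch below assumes that any non-zero entry marks the comment as toxic; the issue does not say what each index stands for, so no per-index names are used. `collapseLabels` is a hypothetical helper.

```javascript
// Assumption: any non-zero flag in the label vector means the comment is toxic.
function collapseLabels(vector) {
    return vector.some((flag) => flag !== 0) ? 'toxic' : 'safe';
}

// Two records in the array format, shortened from the examples above.
const samples = [
    { input: 'Dude, I hate your face', output: [0, 0, 1, 0, 0] },
    { input: 'Thanks.', output: [0, 0, 0, 0, 0] }
];

// Rewrite each record so its output is a single label string.
const relabeled = samples.map(({ input, output }) => ({
    input,
    output: collapseLabels(output)
}));

console.log(relabeled);
```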

Issue Analytics

State:
Created 4 years ago
Comments: 17 (1 by maintainers)

Top GitHub Comments

7 reactions

SpeedyCraftah commented, Jul 19, 2019

@equan4647 dont be fat share it with us

3 reactions

CamK06 commented, Oct 19, 2019

Hmm, even after a lot of training I am still getting random output. But now that I have read over the code again, I think I see your issue: the network interprets the data as text you want it to replicate, and so it does just that. I have to go now so I cannot help directly, but take a look at this GitHub page, which is where I learned how to do what you’re doing: https://github.com/bradtraversy/brainjs_examples/blob/master/02_hardware-software.js

If you’re curious, here’s the output after 25000 iterations: libe MY WORKs aellatinnone MY WORKs aellatinnone MY WORKs aellatinnone MY WORKs aellatinnone
