Everything here was tested on Python 3.9.6, however it should work with older and newer versions.
Model trained on Open AI's Moderation datasets (https://github.com/openai/moderation-api-release)
- Runs faster hosted locally than OpenAI's endpoint
- Does not send data to OpenAI
- Moderation dataset and models are not changing unlike OpenAI's, so constant tuning is not needed.
- Using OpenAI's Moderation endpoint without using their services technically breaks their usage policies.
- More reliable!! If OpenAI goes down, this does not as it is separate from openai!!
- Pretrained Model
- Dataset used (Dataset from OpenAI)
- Pretrained Vectorizer
train.py
to modify how the model is trained- An API to easily integrate into your applications
test.py
to check if the moderation model is working and view sample resultsdemobot.py
A demonstration of the API through a Discord Bot
git clone ericpandev/Moderation && cd Moderation
python -m pip install -r requirements.txt
python server.py
POST
http://127.0.0.1:9235/run
Sample Request:
{
"text": "sample text goes here"
}
Sample Response:
{
"harassment":0.0,
"hate":0.0,
"hate/threatening":0.0,
"self-harm":0.02,
"sexual":0.0,
"sexual/minors":0.0,
"violence":0.01,
"violence/graphic":0.01
}
The floats range from 0-1, being how confident the Model is that the text matches the label. In your application, you should find 2 ranges, one for automatic action and one to silently log for a human moderator to review.
Recommended values you should use for your applications:
# For Automatic Action
"harassment":2, # Disabled
"hate":0.67,
"hate/threatening":0.67,
"self-harm":0.75,
"sexual":0.68,
"sexual/minors":0.68,
"violence":2, # Disabled
"violence/graphic":2 # Disabled
# To silently log to moderation channel
"harassment":0.45,
"hate":0.30,
"hate/threatening":0.37,
"self-harm":0.37,
"sexual":0.37,
"sexual/minors":0.35,
"violence":0.37,
"violence/graphic":0.37
Sample CURL Request
curl -X POST -H "Content-Type: application/json" -d '{"text": "sample text goes here"}' http://127.0.0.1:9235/run