Chats scraped from public resources on the internet
- clean folder contains conversations from different websites in txt format.
- raw folder contains the original files. Some of them are images, which I converted to text using OCR.
- sexting_dataset.txt contains all the chats put together.
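The relationship between the clean folder and the combined file can be sketched roughly as follows. This is a minimal illustration, not the script that was actually used to build the dataset: the function name `combine_chats`, the blank-line separator between chats, and the assumption that the clean folder holds flat `.txt` files encoded as UTF-8 are all hypothetical.

```python
from pathlib import Path


def combine_chats(clean_dir: str, out_path: str) -> int:
    """Concatenate every .txt file directly under clean_dir into out_path.

    Hypothetical helper: the separator (one blank line between chats)
    and the UTF-8 encoding are assumptions, not facts about the dataset.
    Returns the number of files that were merged.
    """
    # Sort by name so the output is reproducible across runs.
    files = sorted(Path(clean_dir).glob("*.txt"))
    with open(out_path, "w", encoding="utf-8") as out:
        for path in files:
            text = path.read_text(encoding="utf-8").strip()
            # Each chat is followed by a blank line.
            out.write(text + "\n\n")
    return len(files)
```

Calling `combine_chats("clean", "sexting_dataset.txt")` from the repository root would regenerate a combined file under these assumptions.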
These are mostly cis-heterosexual conversations. If you have data that could make this dataset bigger and/or more diverse, feel free to contribute.
This is the list of websites the data was taken from. If a page goes offline, try accessing it through https://web.archive.org
- https://thoughtcatalog.com/eko-hayden/2016/04/erotic-sexts-between-lovers-that-will-make-you-horny/
- https://www.cosmopolitan.com/sex-love/news/a52121/sexting-messages-tips/
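If one of the pages listed above goes offline, an archived copy can usually be looked up by prefixing the original address with the Wayback Machine path. The sketch below only builds that lookup URL and makes no network request; `wayback_url` is a hypothetical helper name, and whether a snapshot actually exists for a given page is not guaranteed.

```python
def wayback_url(original_url: str, timestamp: str = "") -> str:
    """Build a Wayback Machine lookup URL for original_url.

    Hypothetical helper. With no timestamp the archive is expected to
    redirect to its most recent capture; with a timestamp in
    YYYYMMDDhhmmss form it should redirect to the closest capture.
    """
    base = "https://web.archive.org/web"
    if timestamp:
        return f"{base}/{timestamp}/{original_url}"
    return f"{base}/{original_url}"
```

For example, `wayback_url("https://www.cosmopolitan.com/sex-love/news/a52121/sexting-messages-tips/")` gives an address that can be pasted into a browser.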
Mathias's open-source projects are supported by his ko-fi. If you found this project helpful, any monetary contributions are appreciated and will be put to good creative use.
MIT