Dark Web ChatGPT Unleashed: Meet DarkBERT
A group of researchers from South Korea has released a ChatGPT-like large language model trained on data from the Dark Web, for research purposes. The launch of ChatGPT, created by OpenAI, has taken the world by storm, and as we know, ChatGPT itself can be coaxed into producing highly efficient malware.
DarkBERT is a Large Language Model (LLM) aimed explicitly at Dark Web research and threat identification. Specialized models like it will only grow in number over time, with each applied LLM having a distinct area of expertise and being trained on carefully selected data for a particular objective.
One of these applications, trained on information from the Dark Web itself, has now launched.
DarkBERT, created by South Korean researchers, is based on the RoBERTa architecture, an AI approach introduced back in 2019. It has since turned out that the model was severely undertrained at release, leaving more performance on the table than was realized at the time.
To train the model, the researchers crawled the Dark Web through the anonymizing Tor network, then filtered the raw data with techniques such as deduplication, category balancing, and pre-processing to produce a refined Dark Web database.
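The article does not publish the researchers' actual filtering code, but the three steps it names (deduplication, category balancing, pre-processing) can be sketched roughly as follows. The page format, category labels, and balancing-by-truncation strategy here are all illustrative assumptions, not the paper's method:

```python
from collections import defaultdict

def preprocess(text: str) -> str:
    # Minimal illustrative cleanup: collapse whitespace and lowercase.
    return " ".join(text.split()).lower()

def build_corpus(pages):
    """pages: iterable of (category, raw_text) tuples (hypothetical format).

    Applies deduplication, per-category balancing, and pre-processing,
    loosely mirroring the filtering steps described above.
    """
    seen = set()
    by_category = defaultdict(list)
    for category, raw in pages:
        text = preprocess(raw)
        if text in seen:  # deduplication: drop exact repeats
            continue
        seen.add(text)
        by_category[category].append(text)
    # Category balancing (one naive strategy): truncate every
    # category to the size of the smallest one.
    cap = min(len(docs) for docs in by_category.values())
    return {cat: docs[:cap] for cat, docs in by_category.items()}

corpus = build_corpus([
    ("forum", "Hello   WORLD"),
    ("forum", "hello world"),  # duplicate after normalization
    ("forum", "second forum post"),
    ("market", "listing one"),
])
# corpus now holds one deduplicated document per category.
```

Real pipelines at this scale would of course use near-duplicate detection and far richer text normalization; this only shows where each named step sits in the flow.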
Pre-training the RoBERTa Large Language Model on that database produced DarkBERT, which can analyze fresh Dark Web content, written in the Dark Web's own dialects and heavily coded messages, and extract crucial information from it.
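DarkBERT itself is not publicly downloadable, and the extraction above is done by the trained model, not by hand-written rules. Purely as a stand-in to illustrate what "extracting crucial information" from a page might surface downstream, here is a simple pattern-based indicator flagger; the indicator names and regexes are hypothetical examples, not part of DarkBERT:

```python
import re

# Hypothetical indicator patterns a threat-intel consumer might flag:
# v2/v3 onion hostnames and Bitcoin addresses (simplified regexes).
INDICATORS = {
    "onion_url": re.compile(r"\b[a-z2-7]{16,56}\.onion\b"),
    "btc_address": re.compile(r"\b(?:bc1|[13])[a-zA-Z0-9]{25,39}\b"),
}

def extract_indicators(text: str) -> dict:
    # Return every match for each indicator pattern found in the text.
    return {name: pat.findall(text) for name, pat in INDICATORS.items()}

hits = extract_indicators(
    "Contact us at abcdefghij234567.onion, "
    "pay to 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
)
```

A model-based extractor is far more capable than regexes, especially against coded slang; this sketch only shows the shape of the output such a system feeds into analyst workflows.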
The Dark Web is a collection of hidden internet sites accessible only through a specialized web browser. Saying the Dark Web uses English as its native language wouldn't be incorrect, but the language found there is a distinctive enough concoction that the researchers believed a dedicated LLM had to be trained on it.
Further training is ongoing to refine DarkBERT and improve its results; the model remains under development and is not yet widely accessible.