April 16, 2025


Can ChatGPT Be Detected? Tools, Methods, and Limits

Q: Can professors tell if you use ChatGPT?

Many educational institutions use plagiarism-checking tools with integrated AI detection. While not infallible, these systems can flag AI-assisted writing.


Yes, ChatGPT-generated content can be detected using a combination of statistical analysis, machine learning classifiers, and linguistic pattern recognition tools.

‍

As the use of large language models, such as OpenAI's ChatGPT, becomes increasingly common in academia, content marketing, education, and journalism, the challenge of distinguishing between human-written and AI-generated text has taken on new urgency.

‍

This article explores how ChatGPT-generated content can be detected, the tools available, and the evolving arms race between generative AI and detection technologies.


What is AI-Generated Content?

AI-generated content refers to text written by large language models (LLMs) like GPT-4, developed by OpenAI. These generative pre-trained transformers are trained on massive datasets and use probability to predict the next word in a sequence, producing highly fluent and often human-like text.

‍

Because LLMs are optimised for coherence and grammatical accuracy, their output can appear nearly indistinguishable from human writing. This raises concerns about plagiarism, misinformation, and the authenticity of written communication.

‍

Generating text or speech in a natural language with software is the focus of Natural Language Generation (NLG), a subfield of Natural Language Processing (NLP) that draws on computational linguistics and Natural Language Understanding (NLU).

Natural language generation has applications ranging from chatbots and virtual assistants to customer service and content generation. It can also produce written content such as reports, summaries, and descriptions.

NLG systems use machine learning algorithms trained on large datasets to generate human-sounding text. Recurrent Neural Networks (RNNs) and Transformers are two examples of deep learning methods that power some of the most advanced NLG systems.

The most common type of AI language model is a neural network, which consists of multiple layers of interconnected nodes. The network is trained on large datasets, such as Wikipedia or news articles, to learn patterns and relationships between words and phrases in human language. Once trained, it can generate new text by repeatedly predicting the most likely next word or phrase based on the context of the previous words.
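
To make next-word prediction concrete, here is a minimal sketch that uses simple word-pair counts instead of a neural network. The corpus, the function names, and the bigram approach are all illustrative assumptions; real LLMs use transformer networks over far more context, but the "pick a likely next word" idea is the same.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus: str) -> dict:
    """Count how often each word is followed by each other word."""
    words = corpus.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(model: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy corpus, made up for illustration only.
TOY_CORPUS = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(TOY_CORPUS)

print(predict_next(model, "the"))   # "cat" follows "the" more often than "mat" or "sofa"
print(predict_next(model, "sofa"))  # None: nothing follows "sofa" in the corpus
```

Because the model always favours the statistically likely continuation, its output is predictable, which is exactly the property many detectors try to measure.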

‍

ChatGPT, OpenAI's large language model based on GPT-4 (for now!), is one of the most popular AI tools. It has been trained on a large amount of data so that it can understand and produce language that sounds like what people say. In other words, ChatGPT is a computer programme designed to converse with people, answer their questions, and give them information, and it can also be used to build chatbots and virtual assistants.

ChatGPT is also capable enough to pass prestigious graduate-level exams, though without particularly high marks. The chatbot has passed both the law bar and the medical board exams.

Because of their ability to generate human-like text, ChatGPT and other AI language models have raised concerns about their potential misuse. Elon Musk has been vocal about his dissatisfaction with OpenAI since stepping down from its board in February 2018, culminating in his signing an open letter that called for a pause on training more powerful AI systems. Still, despite these concerns, Musk has advocated for the research and development of AI technologies such as ChatGPT, recognising their enormous potential.

Determining whether a human or a machine wrote a text is therefore a growing challenge, but doing so can help prevent the spread of misinformation and malicious content, especially in journalism, cybersecurity, and finance.

‍


Why is AI-Generated Text Detection Important?

Researchers have experimented with several methods to identify text produced by AI. This matters because recent NLG models have improved the diversity, control, and quality of machine-generated text. But the ability to create unique, manipulable, human-like text with unprecedented speed and efficiency also makes abuses of NLG models, such as phishing, disinformation, fraudulent product reviews, academic dishonesty, and toxic spam, harder to detect. To maximise the benefits of NLG technology while minimising harm, trustworthy AI must address the risk of abuse.

Real-world abuse of generative language models is emerging. One controversy involved an AI researcher who trained a language model on posts from the message board 4chan so that it would write like the board's users. Having learned from that training data, the model produced many posts on the board, including objectionable ones. He made the model available for download and viewing, but many websites banned it because of the hurtful content it could generate. Many AI leaders (scientific directors, CEOs, and professors) condemned this model's deployment.

One of the potential dangers associated with these models is how accessible they are, even to threat actors without technical expertise, as evidenced by ChatGPT's user-friendly web interface. A prime example is Jasper, an AI writing assistant built on GPT-3 that generates content in collaboration with a human. Thanks to Jasper's capabilities, users without technical expertise can supply the model with prompts, keywords, and a tone of voice to create vast amounts of blog and website content. This process can easily be replicated with open-source models to produce limitless amounts of targeted misinformation designed for popular social media sites and load it onto grey-market account automation tools.

‍

The ability to detect machine-generated content is essential for several reasons:

‍

Academic integrity: Preventing students from submitting AI-written assignments.
Content trust: Publishers and marketers want to ensure their content reflects genuine thought leadership.
Search engine compliance: Google has made it clear that high-quality content matters, regardless of who writes it, but undisclosed AI usage can raise red flags.
Ethical transparency: Readers have a right to know whether what they're reading was written by a human or a machine.

‍

Ultimately, future NLG research will bring new wonders, but bad actors will also use it. To maximise the benefits of this technology while minimising its risks, humans must predict and defend against abuses.

How to Detect AI-Generated Text?

AI detection tools rely on a combination of linguistic analysis, statistical modelling, and machine learning to identify text generated by models like ChatGPT. Below are the most common techniques:

‍

a. Perplexity and Burstiness

Perplexity measures how predictable a piece of text is for a language model. ChatGPT-generated content tends to have lower perplexity because it follows more uniform, statistically likely word patterns. Human writing, by contrast, often features unexpected phrasing or varied sentence structures.

‍

Burstiness refers to how much variation exists between sentence lengths. Human writing typically shows more burstiness — some short, some long, some complex — whereas AI tends to produce more evenly structured sentences.

‍

Example:
AI output: “The economy is recovering. Inflation is slowing. Jobs are increasing.”
Human output: “While the economy shows signs of recovery, ongoing inflation and market shifts complicate the outlook — though employment is rising.”

‍

Tools like GPTZero assess both perplexity and burstiness to determine if content is likely AI-generated.
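
As a rough illustration of burstiness in the sentence-length sense described above, the sketch below computes the spread of sentence lengths relative to their mean. The formula (population standard deviation divided by mean) and the sample texts are assumptions chosen for illustration; they are not how GPTZero or any other tool is known to compute its score.

```python
import re
import statistics


def sentence_lengths(text: str) -> list:
    """Return the number of words in each sentence of `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Spread of sentence lengths relative to their mean (0.0 means perfectly uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # a single sentence has no variation to measure
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# The first sample is the "AI output" example from this section;
# the second is a made-up sample with deliberately uneven sentences.
uniform = "The economy is recovering. Inflation is slowing. Jobs are increasing."
varied = ("Prices rose. Nobody expected that, least of all the analysts who had "
          "spent the spring predicting a slowdown. Odd.")

print(sentence_lengths(uniform), round(burstiness(uniform), 2))
print(sentence_lengths(varied), round(burstiness(varied), 2))
```

Note that the measure needs several sentences to say anything: a single long sentence scores 0.0, which is one reason detectors tend to be unreliable on short inputs.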

‍

b. Watermarking Techniques

Watermarking is an experimental approach developed by OpenAI and others, where invisible signals are embedded in the text itself by subtly adjusting token selection. These patterns don’t alter the meaning but are statistically detectable in bulk.

‍

The benefit of watermarking is that it allows platforms to verify whether content originated from a known model. However, this technique is not yet widely deployed and can be neutralised through paraphrasing or partial rewriting.
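
The sketch below shows one way a statistical watermark check could work, loosely following the "green list" idea from published research on LLM watermarking. It is not OpenAI's scheme, whose details are not described here. The hashing rule, the 50% green fraction, the toy vocabulary, and the toy generator are all assumptions made for illustration.

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # assumed share of the vocabulary that is "green" at each step


def is_green(previous_token: str, token: str) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{previous_token}|{token}".encode("utf-8")).digest()
    return digest[0] / 256 < GREEN_FRACTION


def watermark_z_score(tokens: list) -> float:
    """How far the green-token count sits above what unwatermarked text would give."""
    n = len(tokens) - 1  # number of (previous, current) token pairs
    if n < 1:
        return 0.0
    green = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (green - expected) / std


# Hypothetical vocabulary and generator that always prefers a green token,
# standing in for a model that nudges its token selection.
TOY_VOCAB = ["data", "model", "text", "token", "signal", "pattern", "output",
             "input", "score", "sample", "word", "phrase", "reader", "writer",
             "system", "method", "result", "source", "detail", "margin"]


def toy_watermarked_tokens(length: int, start: str = "data") -> list:
    tokens = [start]
    for _ in range(length):
        prev = tokens[-1]
        choice = next((w for w in TOY_VOCAB if is_green(prev, w)), TOY_VOCAB[0])
        tokens.append(choice)
    return tokens


marked = toy_watermarked_tokens(50)
print(round(watermark_z_score(marked), 2))  # a large positive z-score suggests a watermark
```

A detector would flag text whose z-score exceeds some threshold. Paraphrasing replaces tokens and so pulls the green count back towards the expected value, which is why rewriting can neutralise the signal.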

‍

c. Machine Learning Classifiers

Detection tools like Copyleaks and Turnitin use supervised machine learning classifiers trained on large datasets of AI- and human-written content. These models learn subtle differences in syntax, grammar, pacing, and coherence.

‍

Some classifiers are tuned to specific writing contexts — for example, academic essays or journalistic pieces — and can adjust their predictions accordingly.

‍

The key limitation is that classifiers may produce false positives, especially with non-native English speakers or structured content like lists and summaries, which resemble AI text.
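
For intuition about what a supervised classifier does, here is a deliberately tiny sketch: a word-count Naive Bayes model built with the standard library. This is not how Copyleaks or Turnitin work internally (their models are not public), and the four training sentences are invented toy data, far too small to be meaningful.

```python
import math
from collections import Counter


class NaiveBayesTextClassifier:
    """Minimal multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts.setdefault(label, Counter()).update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def log_scores(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)  # class prior
            for word in words:
                # Laplace smoothing so unseen words do not zero out the score
                score += math.log((counts[word] + 1) / (total_words + len(self.vocabulary)))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy examples; a real detector trains on large labelled corpora.
texts = [
    "in conclusion it is important to note that technology plays a crucial role",
    "overall it is important to consider that education plays a vital role",
    "honestly i nearly missed the bus again this morning",
    "my gran still thinks the internet is a fad lol",
]
labels = ["ai", "ai", "human", "human"]

clf = NaiveBayesTextClassifier()
clf.fit(texts, labels)
print(clf.predict("it is important to note the crucial role"))
print(clf.predict("honestly my gran missed the bus"))
```

The sketch also hints at where false positives come from: a human who happens to write in the formal, formulaic style the model associates with "ai" will be scored accordingly.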

What are the Tools Used to Detect AI-Generated Text?

Here are some tools and manual methods to determine if an AI wrote a text:

‍

AI Detector

AI Detector has been trained on billions of pages of data. It can test up to 25,000 characters (nearly 4,000 words).

To use the tool, paste your writing into the detection field and submit it. In seconds, you'll see a human content score (indicating how likely it is that a human wrote the sample) and a line-by-line breakdown highlighting text that is suspected or obviously AI-generated.

AI makes predictions by recreating patterns. AI generators are trained to recognise patterns and generate results that "fit" them, so text that closely follows pre-existing formats is more likely to be AI-generated.

The differences between AI output and human writing are evaluated through predictability, probability, and pattern scores. Human writing is less predictable because it does not always follow patterns; it varies more and is more inventive. AI writing, on the other hand, tends to reproduce the patterns it has learned.

‍

Originality.ai

Originality is a non-official (third-party) AI content detection tool that works with ChatGPT and GPT-3.5, among the most advanced generative language tools. It is a content checker that detects both AI-generated text and plagiarism, and it determines content predictability using GPT-3 and other natural language models trained on massive amounts of data.

It is pitched as a professional, industry-level checker for content at production scale.

The tool uses a modified version of the BERT classification model to decide whether a piece of text was written by a human or generated by AI. At its core is a pre-trained language model with a new architecture, trained on 160GB of text data and fine-tuned on millions of samples from a training dataset. The model appears to struggle with short texts and is described as reliable for texts longer than 50 tokens.

To use Originality, paste the content into the checker and scan it.

Unlike Content at Scale, Originality saves scans in your account dashboard. This is useful when you need to return to multiple pieces of content frequently.

The AI detection score indicates the likelihood that the selected writing is AI-generated, not the percentage of the text that was written by AI.

‍

a) Scores for Detection

According to the CEO of Originality, content that consistently scores below 10% AI is safe. Only when content scores 40-50% AI should you be suspicious of its origins.

Larger sample sizes improve detection accuracy, but a single accurate result does not make a verdict reliable. The more content you check from a writer, the better you can tell whether it is genuine.

Keep an eye out for false positives and false negatives. It is preferable to evaluate a writer or service on a series of articles rather than a single one.

b) Complete Sites

If detection scores are consistently high across a site, AI-written content is likely. A single article cannot demonstrate that a website or a set of documents was written with the assistance of AI, so these detection tools should be used with extreme caution. More articles from a single source will increase your statistical sample. Still, detection involves many factors beyond what a tool can measure; the "Technical indicators" section goes over sentence structure, repetition, and analysis. Originality has implemented a site-wide checker.

‍

Giant Language Model Test Room

The Giant Language Model Test Room (GLTR), developed by three researchers from the MIT-IBM Watson AI lab and Harvard NLP, is an excellent free tool for detecting machine-generated text. GLTR is currently the simplest way to estimate whether short passages of text were written with AI. Copy and paste the text into the GLTR input box, then click "analyse." Because it is based on GPT-2, this tool might be less powerful than GPT-3-based methods.

For each word, the tool uses the context to its left to rank how likely the model was to predict that word. Words among the model's top 10 predictions are highlighted in green, the top 100 in yellow, the top 1,000 in red, and the rest in violet. AI-generated content tends to be mostly green.
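
The colour scheme can be sketched as a simple bucketing function. The ranks below are made-up values; in GLTR they come from a language model (GPT-2), which this sketch does not include. The function names are hypothetical.

```python
def gltr_colour(rank: int) -> str:
    """Map a word's rank among the model's predictions to a GLTR-style colour."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank <= 10:
        return "green"
    if rank <= 100:
        return "yellow"
    if rank <= 1000:
        return "red"
    return "violet"


def green_share(ranks: list) -> float:
    """Fraction of words that fall in the model's top 10 predictions."""
    colours = [gltr_colour(rank) for rank in ranks]
    return colours.count("green") / len(colours)


# Hypothetical per-word ranks for a five-word sentence.
ranks = [1, 3, 2, 250, 7]
print([gltr_colour(rank) for rank in ranks])
print(green_share(ranks))
```

A high share of green words means the model found the text very predictable, which is the visual cue GLTR relies on.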

‍


‍

Again, not perfect, but a very good predictor. GLTR is a useful visual tool for evaluating AI content but does not provide a score: you will not be given a percentage or a number that says, "Yeah, this is probably AI." By pasting text, you can estimate how likely an AI wrote it, but you should make the final decision.

‍

AI Content Detector at Writer.com

Although the parameters for detecting AI content could be more explicit, Writer.com provides a free and straightforward AI writing detection tool. You can check text by URL or directly paste writing into their tool to run scans.

‍

The detector lets you check up to 1,500 characters of content for free at any time. It detects ChatGPT-generated writing reasonably well.

‍


‍

DetectGPT

The DetectGPT method is based on computing the text's (log-)probabilities. When an LLM generates text, each token has a certain probability of appearing given the tokens that came before it. Multiplying all of these conditional probabilities together (or, equivalently, summing their logarithms) gives the probability of the whole text.

DetectGPT then perturbs the text slightly, for example by rewording small parts of it. If the probability of the perturbed text is much lower than that of the original, the original was probably generated by AI. If it is about the same, a human probably wrote it.
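
The comparison described above (score the original, alter it slightly, score again) can be sketched as follows. The scoring table, the replacement words, and the perturbation rule are toy stand-ins invented for this example; the actual method needs log probabilities from a real language model and a more careful way of rewording text.

```python
import random
import statistics

# Toy "model": log probabilities for a few words it finds likely (assumed values).
TOY_LOG_PROBS = {"the": -1.0, "model": -1.5, "writes": -1.5, "text": -1.5}
UNKNOWN_LOG_PROB = -8.0  # assumed score for any word the toy model finds unlikely
REPLACEMENTS = ["quirky", "zigzag", "marmalade"]


def toy_log_prob(text: str) -> float:
    """Sum of per-word log probabilities under the toy model."""
    return sum(TOY_LOG_PROBS.get(w, UNKNOWN_LOG_PROB) for w in text.lower().split())


def make_toy_perturb(seed: int = 0):
    """Build a function that swaps one randomly chosen word for a replacement word."""
    rng = random.Random(seed)

    def perturb(text: str) -> str:
        words = text.split()
        words[rng.randrange(len(words))] = rng.choice(REPLACEMENTS)
        return " ".join(words)

    return perturb


def perturbation_discrepancy(text, log_prob, perturb, n_perturbations=20):
    """Original log probability minus the average log probability after perturbing."""
    original = log_prob(text)
    perturbed = [log_prob(perturb(text)) for _ in range(n_perturbations)]
    return original - statistics.mean(perturbed)


machine_like = "the model writes text"            # made of words the toy model prefers
human_like = "marmalade zigzag quirky thoughts"   # already unlikely under the toy model

perturb = make_toy_perturb()
d_machine = perturbation_discrepancy(machine_like, toy_log_prob, perturb)
d_human = perturbation_discrepancy(human_like, toy_log_prob, perturb)
print(d_machine, d_human)  # large gap for the machine-like text, none for the other
```

A detector built on this idea would flag text whose discrepancy exceeds a chosen threshold. The dependence on the generating model's own probabilities is the method's main practical limitation.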

‍


‍

GPTZero

GPTZero is a simple linear regression model that estimates the perplexity of the text.

Perplexity is related to the log probability of the text mentioned above for DetectGPT: it is the exponential of the negative log probability, averaged per token. Large language models learn to maximise text probability, which minimises the negative log probability and therefore the perplexity. So, the lower a text's perplexity, the less random it is.

GPTZero then uses the idea that sentences with lower perplexity are more likely to have been generated by an AI. GPTZero also reports the so-called "burstiness" of the text, which it presents as a graph of the perplexity of each sentence.
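
The relationship between log probability and perplexity can be written in a few lines. The token log probabilities here are hypothetical values rather than output from a real model, and the spread calculation is only one assumed way to summarise per-sentence variation, not GPTZero's actual formula.

```python
import math
import statistics


def perplexity(token_log_probs: list) -> float:
    """Exponential of the average negative (natural) log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def per_sentence_perplexities(sentences_log_probs: list) -> list:
    """Perplexity of each sentence, given one list of token log probabilities per sentence."""
    return [perplexity(log_probs) for log_probs in sentences_log_probs]


# Hypothetical sentences: one where every token had probability 0.5,
# and one where every token had probability 0.1.
predictable = [math.log(0.5)] * 4
surprising = [math.log(0.1)] * 3

scores = per_sentence_perplexities([predictable, surprising])
spread = statistics.pstdev(scores)  # how much perplexity varies between sentences
print([round(s, 2) for s in scores], round(spread, 2))
```

A token probability of 0.5 gives a perplexity of about 2 and a probability of 0.1 gives about 10, so lower perplexity corresponds to more predictable text.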

‍


‍

Here are the main features of each tool:

AI Detector
Detection Methodology: Unknown (basic NLP heuristics)
Strengths: Simple interface, immediate results
Limitations: Limited accuracy, lacks transparency
Best Use Case: Casual users seeking quick checks

Originality.ai
Detection Methodology: ML classifier + probability scoring
Strengths: Designed for web publishers, site-wide audits
Limitations: Paid only, may flag heavily edited human content
Best Use Case: SEO agencies, content marketers

Giant Language Model Test Room (GLTR)
Detection Methodology: Perplexity-based statistical scoring
Strengths: Transparent methodology, open access
Limitations: Requires technical understanding, limited UI
Best Use Case: Researchers and developers

Writer.com AI Content Detector
Detection Methodology: Predictive NLP classification
Strengths: Real-time scoring, team workflow integration
Limitations: Lower accuracy on short or informal content
Best Use Case: In-house content creation teams

DetectGPT
Detection Methodology: Log-probability deviation analysis
Strengths: Academic rigour, identifies subtle statistical cues
Limitations: Requires access to original model output probabilities
Best Use Case: Research and educational analysis

GPTZero
Detection Methodology: Burstiness and perplexity scoring
Strengths: Built for educators, scalable for institutional use
Limitations: Sensitive to short content, occasional false positives
Best Use Case: Academic submissions, classroom use

‍

Technical indicators

Another way to tell whether content is AI-generated is to look at technical aspects of the writing. Examine the content closely if the previous tools leave you unsure or if you want to analyse a piece of writing further. Look for these signs:

1. Short sentences. Short sentences are common in AI-generated content. The AI attempts to write like humans but has yet to master complex sentences. This is most obvious when reading a technical blog with code or instructions. AI has yet to pass the Turing test. If GLTR or Originality indicate creative, one-of-a-kind content, you're in good shape, but be wary of technical content that sounds confident yet is wrong.

2. Repetition. Because it doesn't know what it's talking about, the AI fills in the blanks with relevant keywords. As a result, an article written by an AI is more likely to repeat the same word, much like keyword-stuffed articles and the output of spammy AI-based SEO tools. Keyword stuffing is the use of unnaturally repeated words or phrases; some articles include their keyword in nearly every sentence. This distracts from the article and turns readers off.

3. Lack of analysis. AI-written articles are deficient in complex analysis. Machines are excellent at gathering data but weaker at interpreting it. If an article reads like a list of facts without analysis, it was most likely written by artificial intelligence. AI-generated writing excels at static writing (history, facts, etc.) but is weaker at creative or analytical writing. The more information it is given, the better AI writes.
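
The repetition check in point 2 is easy to approximate by hand. The sketch below reports the most frequent non-trivial word and its share of the text. The stopword list and the sample text are assumptions for illustration, and a high share only suggests keyword stuffing; it says nothing certain about who wrote the text.

```python
import re
from collections import Counter

# Small assumed stopword list; a real check would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "it", "for", "on", "with"}


def top_keyword_share(text: str):
    """Return the most frequent non-stopword and its share of all non-stopwords."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    if not words:
        return "", 0.0
    word, count = Counter(words).most_common(1)[0]
    return word, count / len(words)


# Made-up keyword-stuffed sample.
sample = "Cheap shoes are great. Buy cheap shoes online. Cheap shoes for cheap prices."
word, share = top_keyword_share(sample)
print(word, round(share, 2))
```

Here "cheap" accounts for 4 of the 11 remaining words, a share well above what natural prose usually shows for a single content word.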

‍

Agencies and in-house teams are using tools like Originality.ai to verify that content has been written by humans, especially for YMYL (Your Money Your Life) content, where trust is crucial.

‍

There’s also a growing trend of using these tools to blend AI-generated drafts with human editing — aiming to pass detection while scaling production. However, this remains a grey area for search engines and ethics policies.

What are the Challenges in Detecting AI-Generated Text?

While there are techniques for detecting AI-generated text, they have limitations, such as:

‍

AI text detectors can be unreliable with short passages, so make sure the text contains at least 1,000 characters.

AI text detectors sometimes produce false positives, claiming that a text was generated by AI even though a human wrote it.

While some language models can generate text in multiple languages, these AI text detectors are currently only available in English.

Text detectors can detect text generated by other language models, but they work best with ChatGPT text.

They may fail to detect AI-generated text if humans later edit it.

Text from a sufficiently advanced AI language model may be indistinguishable from human-written text if the model has had access to large amounts of data to learn from.

Additionally, some AI language models are specifically designed to mimic human behaviour and intentionally generate text that is difficult to distinguish from human-written text. These are known as "adversarial" models and can be incredibly challenging to detect.

‍

So, to summarise:
‍

False Positives: Non-native English writers and students using structured formats may be wrongly flagged.

False Negatives: AI content that is heavily edited or cleverly prompted may evade detection.

Tool Sensitivity: Most detectors struggle with shorter texts or mixed content (part AI, part human).

Adaptation Lag: As language models evolve, existing detectors need constant retraining.

‍

The detection space is in an ongoing arms race with generative AI since every improvement to ChatGPT or similar tools introduces new challenges for detection systems.

Conclusion

So, can ChatGPT be detected? Yes, but with caveats. While detection tools have become more sophisticated, they are not foolproof. Educators, marketers, and publishers must balance detection results with human judgment and policy.

‍

As generative AI becomes embedded in daily workflows, transparency and tool literacy will be key. The future of AI detection may rely not only on algorithms but on industry standards, ethical disclosures, and intelligent human oversight.

‍

If you're interested in learning more about our data science services, including AI and NLP, contact us. Our expert team is committed to providing cutting-edge solutions to help you harness the power of data and AI in your business.

You can also watch Imaginary Cloud's workshop on "A Watermark for Large Language Models".

‍

‍


Alexandra Mendes

Alexandra Mendes is a Senior Growth Specialist at Imaginary Cloud with 3+ years of experience writing about software development, AI, and digital transformation. After completing a frontend development course, Alexandra picked up some hands-on coding skills and now works closely with technical teams. Passionate about how new technologies shape business and society, Alexandra enjoys turning complex topics into clear, helpful content for decision-makers.

Vítor Bernardes

Data scientist passionate about data science and watchful of its ethical implications. Besides work, I love nerding out on music and reading a good story.
