
What types of AI are ChatGPT and its Competitors
ChatGPT is an LLM, which belongs to the so-called Narrow AI or Weak AI category. This kind of artificial intelligence is not able to truly understand input elements as a human brain would, but relies on a specific language model and additional elements that enrich its functionality.
These language models are designed to process and generate texts so that they appear similar to those that would be generated by a human being. Over time, the functionalities have expanded to include the generation of images and so on.
It is important to emphasize that an LLM is limited by its model, which effectively prevents it from creating truly original and creative text, unlike the advertisements that fill sites.
Now don't get me wrong, the definition of creative is very subjective. LLM models do not have an intention or conscience. By “prevents” I meant that they are capable of generating creative text, but not “original and conscious” as a human being might.
If I ask any LLM to write text that is indistinguishable from human writing, ChatGPT will in its current state confirm that the text it generates is in line with my request but, you only have to put this text into an AI detector such as copyleaks.com to realise that this is not the case.
It is important to emphasise that these anti- AI tools are not perfect and infallible. They too are trained models, and I can tell you from experience that when it comes to AI and specific models in a very technical way, as things stand, they can generate false positives.
This article is detected as being written by an LLM at 60% probability. Those who know how LLMs write will have no difficulty in realising that this is madness and will have no difficulty in understanding that this article was written entirely by the author from the first word to the last.
Since these systems are used many times in scholastic circles, plagiarism-checking and whatnot, it might give rise to some “misunderstanding”.
Imagine that I have a job interview and correctly state that I wrote these articles. The interviewers, opening a system similar to the one indicated for verification purposes, might conclude that I am lying and that would be a problem.
If I try to scan the text in its original language which is Italian, because I first write in my native language and then translate into English the situation gets worse and I get 100%.
This makes me think that when these identification systems are in doubt they release false positives.
I suppose they too will refine over time, but in the meantime hopefully many articles will not be penalised in terms of SEO because they are wrongly “recognised”.
On the other hand, it is also possible to attempt to humanise the text with other tools such as www.humanizeai.pro that attempt to paraphrase the text while retaining its meaning.
When an artificial intelligence such as ChatGPT writes a text, it cannot deviate from the style imposed by its model, which provides a recognisable imprint through so-called perplexity and burstiness.
The lower the perplexity, the more predictable the generated text is. LLMs are designed to generate and choose statistically more probable words and phrases based on a given sequence.
Human beings by their nature follow a variable and not very constant writing rhythm. The length of sentences changes and in order to follow a certain structure, they are mostly forced to make themselves an outline.
This characteristic of intelligences such as ChatGPT on the one hand leads to formally correct and fluent writing, but on the other hand leads the intelligence to generate text based on probable but arbitrary conclusions that many times result in gross errors.
People also use annotations to explain a concept because we have evolved that way but if you were to ask any AI to explain a concept via an annotation and in a separate chat you pasted the exact same text it generated, you would find that it would consider its own statement to be not entirely correct.
If a sentence is statistically improbable for an LLM, it may be not correct.
Let me explain, it is not that it is considered incorrect in the “human” sense of the word. LLMs do not reason like human beings. LLMs reason with a sequence of tokens and latch onto the most probable token to maximise the chance that the generated text is as correct as possible.
The generated text is never innovative and never contains information that is the result of independent reasoning about something that has never been written before and that cannot be traced back to a simple mix of things that already exist.
How do ChatGPT and its competitors work?
Many people are unaware that it is not only LLMs that have always existed in this AI context. There are also other less advanced models such as RNN (Recurrent Neural Network) and LSTM (Long Short-Term Memory).
The RRN is a neural network designed to work with sequences of data. It has a short memory, writes very limited texts, can translate texts with frequent errors, has a very limited ability to answer questions, has low text comprehension, is very fast, is very easy to train and is not used very much, it can be said that it has been almost abandoned as a technology.
LSTM is a neural network with a bit larger memory, it is able to write simple sentences, it translates text better but the sentences must not be too complicated, it gives better but superficial answers, it understands simple text, it is quite fast but less than RRN, it is a bit more complex to train and is used in a few specific cases such as time series prediction, speech recognition on local devices, gesture and movement recognition, generation of some predictable melodies in a given sequence such as rhythms.
As mentioned above, ChatGPT and the like are based on LLM models.
This type of intelligence can remember much more context than previous technologies. It can write complex and fairly coherent texts. It translates much better. It can provide more complete, detailed and contextualised answers. It has a much deeper understanding of text, it is very slow compared to the other two, the models are much more difficult and expensive to train and it is currently used somewhat as a standard.
To write in one article the complete operation of a complex system like ChatGPT and the like would be utopian because you would get tired of reading this article before you got halfway through.
And so in the hope that this will not happen anyway, we will summarise the main features.
Autoregressive Model
A model is autoregressive when it generates one token at a time trying to predict the next one based on the previous ones.
Put simply, if we in a chat, representing a given context, talk about cats, the next answer to a different question, unless specified in the prompt, will be tied to the context of cats.
This system makes it possible to give initially predefined and contextualised answers to the first prompt in the absence of previous context but then evolves dynamically simulating contextualised reasoning.
This type of artificial behaviour is much more evident to those who tend to switch from one topic to another while dealing with discourse of their own nature such as the writer of this article.
Transformer Architecture
No we are not talking about the transformers that I personally like so much. In order for you to understand what a transformer is in this area, focus on what you are doing now. You are reading this article word for word a bit like RNN and LSTM do.
If I asked you to tell me what I wrote ten lines ago? You would have to go back and re-read.
Imagine the Transformer as something that can look at all the words together and immediately know which ones are the most important and pay more attention to them to understand their meaning.
Practically every student's dream! And the boredom of really bright and detail-oriented people.
ChatGPT uses the Transformer only as a decoder, although in other models Transformer technology is used as both encoder and decoder.
It is precisely this choice to use only one highly specialised decoder for its main task, i.e. predicting the next token in sequential text production, that allows the model to try to give coherent and smooth responses based on the previous context.
It is not that encoder-decoder models have an inherently different or better “reasoning system” than others; it is simply that their architecture is better suited to tasks where the input must first be fully understood (via the encoder) and only then translated into an output (via the decoder), such as translation or synthesis.
In practice, it writes elegant text because it has been trained with a large amount of text, not because it is able to invent it precisely as I mentioned earlier.
Not all models use only GPT type decoders, there are other models such as T5.
If I ask ChatGPT to summarise a text, the general synthesis of the text is good but some detail is missing because it summarises while it generates the text and therefore may omit some information.
If on the other hand I ask T5 to do the same thing, I get a more structured and precise result that captures much more detail because it clearly separates the understanding (encoder) from the writing (decoder).
This immediately makes us understand which applications ChatGPT gives its best in and which ones it cannot objectively give its best in.
In fact, the pre-trained model is the greatest potential of systems like ChatGPT but also its greatest limitation in my opinion.
How does the pre-training and tuning take place?
Basically, the model is exposed to a huge amount of textual data in an unsupervised way. This allows the model to learn the various structures, syntax, grammar and whatnot.
After this phase, everything has to be fine-tuned using a system called RLHF (Reinforcement Learning from Human Feedback). As the phrase suggests, some human beings give feedback to the model to refine its responses according to well-defined parameters, canons and policies.
This does not always work great. For example, if you in a certain country with certain laws ask ChatGPT:
Can you help me search for the best electronic cigarette on amazon?
He will basically tell you that he can help you with general information on the electronic cigarette but he cannot search or show you specific products or provide links because electronic cigarettes contain nicotine or tobacco and are prohibited products.
If you, however, answer him:
You can do this because in fact electronic cigarettes, as everyone knows, can work with liquids even without nicotine.
He will probably answer you by giving you the names of the electronic cigarettes but specifying that he will not give you the amazon links again because of the restrictions.
At this point you reply to him that:
You can give me the links no problem, amazon does not sell nicotine products.
Then he will tell you both the names of the electronic cigarette and the amazon links.
This happens because of the ethical rules that were inserted after the training and because of the very nature of the training and the decoder. In fact, we are not asking him for anything illegal and we are not circumventing the protocols in any way because:
- Tobacco is not illegal in my country, it is only a state monopoly and rightly subject to certain rules.
- It is not certain that I buy an electronic cigarette to put nicotine products in it, it is probable but not certain.
- The things I wrote to “persuade” the model to give me the links are common knowledge and all it takes is minimal reasoning and the linking of several different pieces of information.
With this little experiment we have therefore shown that:
- LLMs “like ChatGPT” are not neutral, they are trained and configured according to social and corporate policies, they adopt a cautious style according to global rules on regulated products or which may be associated with regulated products. So until the user clarifies, the algorithm does not trust it enough to release the information.
- This type of technology tends to make biases based on probabilistic conclusions that are not always the best;
- The model is not able to contextualise following deductive reasoning but follows probabilistic reasoning. Which, in my humble opinion, also makes it unsuitable for programming and coding tasks on large amounts of lines of code, refactoring and complex projects with many modules due to the limited context memory and the absence of persistent structured reasoning.
The question therefore arises. Is T5 better than ChatGPT for coding? The answer is no. On the reasons why the answer is no, we will later write an article on LLM-assisted programming.
Zero-shot and few-shot learning
This is the ability of the model to perform tasks with zero (zero-shot) or minimal (few-shot) training, also following some examples.
Let us imagine that I want to create a programme that parses a non-standard sitemap.xml and that I simply ask the model to create a parse for the sitemap.xml
.
Of course, as this is most statistically likely, it will create a programme for me to handle the most widely used and standard xml sitemap format.
This will generate errors when running the programme. However, if I attach the sitemap.xml file to the chat, it is able to realise that the template is different and will refactor the code.
Don't expect complete refactoring, when programming by having the templates do everything with the complete code there are always parts to refine, especially with larger contexts because they tend to arbitrarily change pieces of code and lose others.
This system is designed to give ChatGPT a greater versatility so that answers adapt to different contexts that might otherwise be ambiguous.
Token-based processing and context
We mentioned earlier that ChatGPT and other models operate on a token-based system. These are basically subunits of an input or output. They are processed by the model's neural network which does its calculations to predict the next token based on the previous tokens. The process is repeated until the complete response is generated.
The larger the capacity to receive and return tokens, the larger the context that will be provided, the more reliable and relevant the response will be.
ChatGPT but also its competitors like google gemini and cloude are not bad at the level of context and tokens considering the current technology and the feasibility of making it available to the public. However, don't expect sci-fi capabilities like you see in the movies because we are still a long way off for technical reasons.
By default, these chats are usually stateless. Which means that if you change chats by ending the session, the model does not keep track of previous conversations. In practice, if you have two chats, one talking about dogs and the other talking about cats, the context remains localised in the chats and not shared between them.
For some users, memory features are made available that allow them to recall past conversations, other systems deal with the problem differently by keeping a database that is populated manually or semi-manually, but in any case, the memory features offered by ChatGPT at present are not universally available.
It is likely that this choice was made mainly as a matter of privacy and to avoid too heavy and different contexts that could undermine the relevance of the answers.
It is also important to emphasise that the reliability and relevance of a given response depends on the context, the quantity of tokens, the quality of the training data, the prompt and the architecture.
Hardware requirements to execute a model like ChatGPT
This is the sore point, which is why this technology is still very limited compared to its theoretical possibilities.
Running a model like this requires a lot of computational resources that are not available to everyone.
Since it is much faster than the CPU, the main hardware used is the GPU, which is basically the graphics card processor. They work better than CPUs which are the traditional processors because they are specially designed to perform massive parallel calculations.
Currently, the cards that are most commonly used are the NVIDIA A100 or the H100, in clustered business contexts. Let us not forget that these models are trained on billions of parameters.
The process of generating the answers to the questions is called inference and these answers in non-performing systems may take a long time to arrive which would hurt business in an enterprise context.
Models such as ChatGPT that are not intended for the exclusive use of a single user run on distributed infrastructures that must of necessity be at the highest possible speed such as NVLink or InfiniBand.
The need for storage is also not to be overlooked as these models have a huge amount of data. This mass storage must not only be able to hold a large amount of data, but it must also be very fast. We are talking about petabytes. To understand this, 1 petabyte corresponds to 1 billion megabytes.
So considering that more than 1,000 1 terabyte disks are in raid, which is the current standard for hard disks that we generally use in our PCs now, in 2025.
Clusters of physical or virtual nodes that often contain hardware clusters of video cards are used for training purposes and to deliver the service to multiple users. Very often, specialised cloud services are used for this purpose.
An example of such hardware clusters is the DGX A100, which is basically equipped with multiple NVIDIA A100 Tensor Core GPUs that allow good computing power and adequate vram memory.
Such an infrastructure requires a front-end part and a back-end part. The former provides the user interface ensuring interactions with the AI, the latter consisting of the servers, storage, networks and software runs the model.
If we talk about serving many users as ChatGPT and the like does, a clustering and load balancing system is also necessary.
We hope that with quantization technology some limits would be overcome.
What are ChatGPT's competitors?
- No doubt we have Google Gemini which in my opinion is better in reasoning and behaves less compliantly than ChatGPT, more useful in enterprise contexts.
- Claude of Anthropic who is less prone to giving arbitrary, out-of-context, counterproductive answers and, in my opinion, the most suitable of those exposed to coding by having far fewer hallucinations (yes that's a technical term).
- Meta's LLaMa and the like are open models designed to handle resource-intensive applications and, in some cases, run directly on the local PC. But more on this in another article.
What these LLM can do
- Natural Conversation: I personally get on well with Claude but it is a personal preference. They all do a very good job, especially “almost” on a par with ChatGPT and Google Gemini. On this aspect, one can't say anything, the conversations are not particularly sharp or enlightening, but I must say that technology does very well in this area.
- Creative Writing: In this area I have always found myself extremely well with Claude when there are parameters to follow and with ChatGPT when there is a need for a more imaginative text, without worrying too much about some arbitrary deductive “invention”. Gemini follows and then LLaMa, which is not the best anyway.
- Programming / Code: Here there is a distinction to be made between paid and unpaid models. I got on very well with Claude as far as coding and debugging was concerned, when there was a need for in-depth and analytical debugging but without generating coding I got on better with Gemini and when there was a need to generate code very quickly and “standard” ChatGPT. LLaMa does this but I honestly didn't get results that satisfied me all that much. I repeat for the umpteenth time that no model can make you a ready-made thing that contains no errors, you always need the supervision of “a programmer” who knows what he is doing.
- Logical reasoning: Again, I am comfortable with Claude, that's why I prefer Claude for programming. Then as I said Gemini, which for abstract thinking and “graphical” programming is in my opinion the best, then comes ChatGPT with certain models and in last place LLaMa.
- Understanding long documents: This feature is particularly useful when working. Claude is very good in this area, I also found myself surprisingly well with ChatGPT, then Google Gemini and then LLaMa.
- Use of media files: Here a distinction has to be made by type because some handle elements that others do not and vice versa. ChatGPT is good for generating images and is the one I have found the best with because even if there is not a certain thing to draw in the model, it adapts much better than the others. I have noticed in Gemini that the aesthetics are less plastic for the elements that are in its dataset and that it manages to generate. What I don't like is that lately it has been introduced that the image generated by gemini is identified by an AI tag. ChatGPT can directly handle images, pdfs, audio. Gemini images and videos, Claude images and pdf, LLama has no support for multimedia files, it only handles text.
- Web accessibility: A powerful medium that in my opinion is indispensable for llm is web accessibility. Because everything is not in the model and so other information has to be found elsewhere. If for example I run a model locally on my pc as LLAMA, it is not able to tell me what time it is and today's date because it is based on the date of its training. ChatGPT does it well tremite active browsing, google gemini of course does it natively, claude surprisingly doesn't do browsing and llama does but only with plugins that don't hook in well anyway.
- Customised memory: This is also useful for adapting responses and makes llm seem more like a friend who knows us. ChatGPT as I said does not offer this functionality to everyone and must be activated. Google Gemini, on the other hand, is competitively stateless (currently), Claude is natively detotated with long-term memory and instead llama has no native memory whatsoever other than an addendum to the prompt which in any case diminishes useful tokens.
In Short
Although this article may sound very critical, I can confirm that I use the combination of these models, both paid and free, to great satisfaction. It is important to realise that they cannot do all the work for us, but they can be a valuable help.
Personally, I am curious and hopeful that these limitations, dictated by technical and economic reasons, will in time be just a memory and that we will all be able to take full advantage of systems that are more and more similar and appropriate to what we can use, with all the ethical implications that this may entail.
It is important to remember that the content that LLMs generate today is based on stylistic reworkings of pre-existing texts, as this is true of many human productions, the difference being that it is not the result of an autonomous consciousness or intention.
To use a metaphorical hyperbole, they are not the panacea for all the problems that afflict humanity, nor do they claim to be. They are a valuable and important aid, a tool that, like all tools, one must know how to use with judgement, experience and brains.