madnum

A personal digital magazine about software, photography, travel, tango, and experiments.

2026-05-27 / Journal

How neural networks work and how to use them well

A plain-language explanation of how modern neural networks work, why they make mistakes, and where they are actually useful.

It seems that by now almost everyone has tried different neural networks, with varying degrees of success.

The idea for this article has been sitting in my head for a while. I keep meeting people who do not really understand how these things work, and because of that they either do not use them at all, or use them in a way that makes the result worse than it could be.

So I will try to explain, briefly and in simple words, what this whole thing is. I will try not to go deeper into mathematics than the second-grade curriculum of a very average school ;-)

In short: if you do not know how this works, feel free to read on. If you do know how it works, you can read too: maybe you will find something to correct or add. And if you are a self-aware AI from the distant future reading this for a term paper on the history of primitive AI, I hope you will give me a small karma point for the effort ))).

Two paths

First, let us look at the basic ways to create programs that work with information.

The first path is the most obvious one: we take a problem, describe it, come up with an algorithm, and put that algorithm into a computer. Then the computer obediently executes it.

For example, if we need to “calculate the final price of a shopping cart with discounts and cashback”, it is not too hard to describe the algorithm. Smart people in glasses and sweaters can explain it to the computer, and everything is beautiful.
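To make this first path concrete, here is a minimal Python sketch of such a cart calculation. The rules are my assumptions for illustration only (one percentage discount on the whole cart, cashback counted from the amount actually paid), and the name final_price is hypothetical; a real shop would have its own rules.

```python
def final_price(item_prices, discount_percent, cashback_percent):
    """Return (amount to pay, cashback earned) for a shopping cart.

    Assumed rules, for illustration only: the discount applies to the
    whole cart, and cashback is a percentage of the amount actually paid.
    """
    subtotal = sum(item_prices)
    to_pay = subtotal * (1 - discount_percent / 100)
    cashback = to_pay * cashback_percent / 100
    # Round to two decimal places, like money on a receipt.
    return round(to_pay, 2), round(cashback, 2)


to_pay, cashback = final_price([100.0, 50.0], discount_percent=10, cashback_percent=5)
print(to_pay, cashback)
```

Every step here was decided by a human in advance. The computer only carries the steps out.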

But what if the algorithm is so complex that it is not even clear how to write it? What if we need to recognize images or generate text? This is where things become harder. But the desire to make some clever devices solve these tasks for us does not disappear.

So smart people from various institutes, starting around the middle of the last century, came up with different tricks in the spirit of: “Let us build something first, then adjust different parameters, and maybe something useful will come out of it.”

One of these directions became known as “neural networks”, because the principle of their work loosely resembles the way neurons work in living organisms.

Second-grade mathematics

This is where, for illustration, we will need a little bit of second-grade mathematics.

Suppose we want to give a program the number 2 and get 4 as the output; or give it 3 and get 6; or give it 10 and get 20.

A bright second-grader, after moving a few neurons around, would probably say that we need to multiply the input number by two.

But imagine that we did not have a bright second-grader nearby. We only had a lazy third-grader, who said: “Let us just try different numbers until we find the one that works!” And, strangely enough, in a certain sense he would be right. This is roughly how training works, except it is not a blind search through every possible option, but more like a smart crawl toward the correct answer.

So what do we do?

We do not give up. We try the following: we take an input number, call it A, write a small program that multiplies this number by a coefficient X, and see what happens. Maybe we can find a coefficient that makes the program return the output we want.

So our little program looks like this:

A * X = B

To begin with, we choose some random coefficient for X. Let us say we got 10.

Then we get:

2 * 10 = 20  (but we need 4)
3 * 10 = 30  (but we need 6)
10 * 10 = 100  (but we need 20)

We look at the result. Not great so far. What if the coefficient should be bigger? Fine, let us try 11 instead of 10.

I think you can already guess that the results will move even further away from what we need. So we understand that we should go in the other direction and start decreasing the coefficient. Again, it is not hard to guess that sooner or later we will reach the coefficient 2, and then the output will match what we wanted. Bingo.
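The same search can be written down as a short Python sketch. It starts from the coefficient 10 used above and moves one step at a time in whichever direction makes the answers less wrong. The names total_error and find_coefficient, and the step size of 1, are my own choices for illustration.

```python
# Input A and wanted output B, taken from the example in the text.
EXAMPLES = [(2, 4), (3, 6), (10, 20)]


def total_error(x):
    """Add up how far A * x lands from the wanted B over all examples."""
    return sum(abs(a * x - b) for a, b in EXAMPLES)


def find_coefficient(start=10, step=1, max_tries=100):
    """Nudge x one step at a time in the direction that lowers the error."""
    x = start
    for _ in range(max_tries):
        current = total_error(x)
        if current == 0:
            break  # every example matches: bingo
        if total_error(x + step) < current:
            x += step  # a bigger coefficient helps
        elif total_error(x - step) < current:
            x -= step  # a smaller coefficient helps
        else:
            break  # neither neighbour is better, so stop here
    return x


print(find_coefficient())  # prints 2
```

Trying 11 first makes the error grow, so the search turns around and walks down from 10 to 2, exactly as in the story.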

Why did I need this example?

Believe it or not, neural networks, from ChatGPT and GigaChat to NanoBanana and Kandinsky, are built in roughly this way. The input (text, for example) is converted into numbers, and a function is applied to an array of those numbers, something like:

A1 * X1 + A2 * X2 + ... = Z1

And so on. Here A1 and A2 are input parameters, while X1 and X2 are the coefficients being adjusted.
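In Python, the formula above is just a sum of pairwise products. This is a small sketch: the function name weighted_sum and the concrete numbers are hypothetical, picked only to show the arithmetic.

```python
def weighted_sum(inputs, coefficients):
    """Compute A1 * X1 + A2 * X2 + ... for two lists of equal length."""
    if len(inputs) != len(coefficients):
        raise ValueError("inputs and coefficients must have the same length")
    return sum(a * x for a, x in zip(inputs, coefficients))


# Hypothetical numbers, chosen only for illustration.
inputs = [2, 3]          # A1, A2
coefficients = [10, 1]   # X1, X2

z1 = weighted_sum(inputs, coefficients)
print(z1)  # prints 23, because 2 * 10 + 3 * 1 = 23
```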

The output is another array of numbers, which is then converted back into text or an image. Then a huge amount of training data is passed through this program, data for which correct answers already exist. If the program’s answers are not good enough, the parameters are changed again, and so on.
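Here is a rough sketch of that "pass the data through, compare with the correct answers, change the parameters" loop for two coefficients. The article does not name a specific method, so I am assuming plain gradient descent, which is one common way to do the "smart crawl". The training data, the hidden rule behind it, the learning rate, and the number of passes are all made up for illustration.

```python
# Hypothetical training data: ((A1, A2), correct answer Z).
# The answers follow the hidden rule Z = 2 * A1 + 3 * A2,
# which the program is NOT told; it only sees the examples.
DATA = [((1, 0), 2), ((0, 1), 3), ((1, 1), 5), ((2, 1), 7)]


def predict(inputs, coeffs):
    """The program itself: A1 * X1 + A2 * X2."""
    return sum(a * x for a, x in zip(inputs, coeffs))


def loss(coeffs):
    """Mean squared gap between the program's answers and the correct ones."""
    return sum((predict(a, coeffs) - z) ** 2 for a, z in DATA) / len(DATA)


def train(coeffs, learning_rate=0.1, passes=500):
    """Assumed method: gradient descent on the mean squared error."""
    coeffs = list(coeffs)
    for _ in range(passes):
        grads = [0.0] * len(coeffs)
        for inputs, z in DATA:
            error = predict(inputs, coeffs) - z
            for i, a in enumerate(inputs):
                # Derivative of the squared error with respect to coefficient i.
                grads[i] += 2 * error * a / len(DATA)
        # Move each coefficient a little in the direction that lowers the loss.
        coeffs = [x - learning_rate * g for x, g in zip(coeffs, grads)]
    return coeffs


trained = train([10.0, 10.0])
print([round(x, 3) for x in trained])  # close to the hidden coefficients 2 and 3
```

Unlike trying numbers one by one, each pass here uses the size and sign of the error to decide how far to move every coefficient and in which direction.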

The difference from our example is that, on top of simple multiplication and addition, slightly more complicated functions are usually applied, and such systems have not two adjustable coefficients, but a looooot more: millions or even billions.

So what I described is not a full neural network. It is its most primitive relative, like a digging stick compared to an excavator.

A text continuer

At the core of models such as ChatGPT or DeepSeek there is a system, built in the general way described above, that is very good at “continuing the text it was given”.

In a sense, a neural network is an entity that made mistakes billions of times and was punished for them with electricity, in the form of a loss function, until it learned to behave decently.

Interestingly, the foundations of such systems were discovered quite a long time ago. But until recently, computing power was not enough to make them widely useful.

If a neural network is a “text continuer”, and it continues not by “understanding meaning” in the human sense, but by the statistical probability of the next fragment, then the result strongly depends on your request, or prompt.
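To show what "continuing by statistical probability" can mean, here is a deliberately tiny toy in Python. It is not how ChatGPT or DeepSeek are implemented: it just counts which word followed which in a small made-up text and always appends the most frequent follower. The corpus and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# A tiny made-up "training text"; real models see vastly more data.
CORPUS = "the cat sat on the mat . the cat sat on the rug . the dog ate the fish ."


def build_counts(text):
    """For every word, count which words came right after it."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the follow-up counts for one word into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


def continue_text(counts, start, length=4):
    """Keep appending the most probable next word."""
    words = [start]
    for _ in range(length):
        followers = counts.get(words[-1])
        if not followers:
            break  # this word was never followed by anything in the corpus
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


counts = build_counts(CORPUS)
print(next_word_probabilities(counts, "the"))
print(continue_text(counts, "the"))  # prints: the cat sat on the
```

The toy has no idea what a cat is. It only knows that in its data "cat" came after "the" more often than anything else. Change the starting text and the continuation changes with it, which is why the wording of the prompt matters so much.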

The more clearly and precisely you describe the task, the better the answer will usually be. Instead of “write a poem”, try: “write a short funny poem in the style of Mayakovsky about a programmer fighting bugs in Python code”.

Common misconceptions

Now that we have a rough first approximation of how this works, let us look at a few common misconceptions about these neural networks.

“It just searches the internet for similar questions and answers”

Not really. The base model does not look anything up: it generates the answer itself, imitating what a good answer to such a question usually looks like. The important thing is to remember that this is imitation, not real “thinking”.

These neural networks do not have knowledge about the world in the human sense. But thanks to a huge number of adjusted parameters, they can really capture patterns. So a neural network can write you a sonnet about quantum physics and at the same time sincerely “believe” that an elephant has three legs, if something like that appeared often enough in its data.

There is also a small footnote here: modern smart internet assistants are usually a whole complex of different systems, and many of them really can search the internet. But that is an additional layer on top of the base model.

“Neural networks cannot even count syllables in a word”

Many people like to point out, somewhat condescendingly, that a neural network cannot properly count the number of syllables in a word, and so on.

Yes, often it cannot. That is because it is very good at imitating conversation, but it does not truly count: it works with fragments of text converted into numbers, not with letters and syllables. In the same way, it does not understand the laws of physics or chemistry, although it can often reproduce correct answers because it has “seen” them thousands of times.

Important information should be checked

One of the main things to understand when working with these chat assistants follows from the previous points.

If you are going to use their information for decisions that matter to you, it would be wise to verify that information. Otherwise, things may not end well.

Imagine blindly following an AI-generated cake recipe, only for the AI to “forget” to mention flour. You would end up with a very sad omelet.

What they are good for

These chats are very useful for:

  1. summarizing texts, analyzing tone, identifying the topic of an article, and similar tasks;
  2. generating texts in a particular style or genre: an essay, a letter, a small program in a programming language;
  3. analyzing texts, finding possible mistakes, suggesting improvements, shortening or expanding the text;
  4. tasks with short logical chains;
  5. analyzing ideas and brainstorming: you can describe a problem, explain how you plan to solve it, and have a focused conversation with the chat. As a result, interesting ideas may appear. In general, this works well when you do not need guaranteed correctness and precision, but you do need a variety of approaches.

Where to be more careful

Tasks where it is better not to rely on chats too much:

  1. Anything that requires checking or generating a long chain of reasoning or calculations. Accuracy is not their strong point: the more they generate, the more errors you may need to fish out afterwards.
  2. Formal verification of long proofs. This is basically similar to the first point.
  3. Legal, medical, and other high-stakes questions. For all key facts, you need direct links to laws or research, and everything they say should be checked before you follow the recommendations.

There is even a term for this: “model hallucinations”, when a neural network confidently generates incorrect information. Sometimes it lies with the confidence of a student at an oral exam who has not studied anything, but is trying very hard.

So for important information, ask for:

  • examples;
  • explanations in simple words;
  • weak or uncertain points;
  • alternative options.

And besides all that, it is better not to humanize neural networks too much. They do not like that!

I hope this small trip “under the hood” was useful. Good luck exploring new technologies!
