AI
Last Update: Dec 15, 2024

From Proteins to Silicon

Our Crown is Getting Heavy

Remember when being the smartest species on Earth was our whole personality? Well, we've created algorithms that are starting to show us up in some pretty specific tasks. We alone created languages, solved complex problems, wrote literature, devised intricate mathematics, and built skyscrapers, satellites, and the internet. In the last few decades, however, this exclusivity has started to fade. Machines, once limited to printing "Hello World" on a screen, now throw a "Sorry, I can't assist with that" at every prompt asking them to teach you how to end the world (JK).

Today, artificial intelligence (AI) can detect diseases from medical images with higher accuracy than many doctors, drive cars more reliably than some humans, and compose text that can pass as our own. I would be lying if I said I didn't try to write this post with ChatGPT, but in the end, it couldn't do it the way I like, and that's exactly why you shouldn't worry about AI, which is the reason I started this post in the first place.

So before you start planning for the robot apocalypse, let's break down what's actually happening in these digital beings we've created...

"A.I." is Actually a Broad Term

We start with a big-picture view: “Artificial Intelligence” is the general name for machines doing tasks that normally require human intelligence, which means they are "smarter" than your average algorithm.
This "smartness" comes in levels, divided by sophistication, or in other words, by how similar they are to us.
Within AI, we have “Machine Learning” (ML), algorithms that learn patterns from data rather than following explicit instructions.
Nested within ML is “Deep Learning” (DL), which uses neural networks (basically a digital version of the neurons in our brains) with many layers to extract increasingly abstract representations from raw inputs. (We will get to that later.)
And inside this deep learning world, we find “Large Language Models” (LLMs), which we have been getting familiar with since November 30, 2022, when ChatGPT was born, alongside other specialized architectures. The big paradigm shift in human history...

AI can be literally any product on this map


These layers aren’t random marketing terms; they help us understand how computers evolved from simple rule-following devices (like a traffic light that changes color with a timer) to systems capable of recognizing faces, translating languages, and generating coherent essays. Just as humans rely on multiple levels of cognition (senses, intuition, and reasoning), AI stacks various levels of abstraction to achieve remarkable feats. This is actually how we are going to move forward in this post: comparing human beings to AI.

How Does the Mind Work?

Before getting deeper into how machines “think,” let’s revisit our own cognition first. We process a continuous stream of sensory data (sights, sounds, smells, tastes, and touches), and these signals travel into our minds. Psychologists like Daniel Kahneman describe our thinking in terms of two systems:
System 1 and System 2.

System 1 is fast, intuitive, and emotion-driven, often working automatically without conscious effort. System 2 is slower, more deliberate, and logical, engaging when we need to carefully reason through a problem, focus attention, or handle more complex tasks. While this framework was originally intended to help us understand human cognition, it’s useful to map these ideas into how AI systems and computational models might operate.

I suggest you learn more about these two systems in this video because it explains them much better than I can, but for the sake of the conversation, here are the key takeaways related to our topic:

System 1 (Unconscious Mind)

Intuition and Pattern Recognition: In humans, System 1 quickly recognizes faces, reads simple words, and makes snap judgments based on familiarity and emotion. The machine equivalent is a trained AI model that can rapidly infer a result once trained: think of AI that can quickly identify an image (like Face ID on your phone), recognize a spoken word (like auto-generated subtitles on YouTube), or suggest the next word in a sentence (like ChatGPT). They don’t “deliberate”; they simply apply the patterns they have previously learned. (And that's why sometimes they don't work properly.)

Statistical “Intuition”: Just as our brain’s System 1 relies on heuristics gleaned from past experience, a trained neural network relies on statistical patterns learned from data. Once trained, the network’s forward pass is similar to a System 1 response: it takes an input (for example an image) and quickly produces an output (labeling it as “cat” vs. “dog”) based on vast amounts of prior training. This is fast and efficient but not reflective or logical in a deep sense—it’s recognition, not reasoning.
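To make that concrete, here is a minimal sketch of a System 1-style forward pass. The network and its weights are made up for illustration; in a real system they would come from training:

```python
import numpy as np

# A tiny "already trained" network: these weights are invented for
# illustration; in practice they are the product of a training process.
W1 = np.array([[0.9, -0.4],
               [0.2,  0.7]])      # first-layer weights
W2 = np.array([0.6, -0.8])        # output-layer weights

def forward(x):
    """One fast forward pass: no search, no deliberation,
    just applying previously learned patterns to the input."""
    hidden = np.maximum(0, W1 @ x)          # ReLU activation
    score = W2 @ hidden                     # a single output score
    return "cat" if score > 0 else "dog"

print(forward(np.array([1.0, 0.5])))        # an instant, System 1-style answer
```

Note that nothing in `forward` questions itself; it is recognition, not reasoning.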

Heuristics and Biases: Human System 1 is prone to certain biases and errors due to its reliance on heuristics. Similarly, AI models can exhibit biases based on their training data. They may rapidly produce an answer, but if the data is skewed or not representative, the answer might be systematically biased. Like System 1, these models don’t question their reasoning process; they just apply what they’ve internalized.

Yes, you probably didn't notice that there is an extra "THE" in the sentence.

System 2 (Conscious Mind)

Slower, More Focused Processing: In humans, System 2 is what we engage when we solve a math problem, plan a route without GPS, or consider the pros and cons of an important decision. In machines, System 2 analogs appear in processes that involve more explicit reasoning steps, such as search algorithms, symbolic reasoning engines, or “chain-of-thought” prompting in large language models.

Logical Inference and Long-Chain Reasoning: Consider a system that uses a knowledge graph or logical inference rules to solve a puzzle. Rather than instantly producing an answer from statistical associations, it methodically examines possibilities, applies logical constraints, and eventually arrives at a well-grounded conclusion. This is a form of machine-based System 2 thinking—slower, more resource-intensive, but capable of handling complexity and ambiguity better than a fast pattern-recognition system.

Explainability and Step-by-Step Reasoning: One hallmark of human System 2 is that we can explain our reasoning—how we arrived at a conclusion. Certain AI approaches can similarly provide “rationales” or at least a reasoning trail. For instance, a planning algorithm that enumerates different paths before selecting one can show its steps. This makes it closer to System 2, as it “knows” the chain of decisions it took.
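To make this concrete, here is a minimal sketch of machine-style System 2 thinking: a breadth-first search that solves a tiny grid maze (in the spirit of the maze image below) by methodically exploring options, and returns the full chain of steps it took, its reasoning trail:

```python
from collections import deque

# 0 = open cell, 1 = wall; a deliberately tiny maze for illustration.
MAZE = [
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

def solve(start=(0, 0), goal=(3, 3)):
    """Breadth-first search: examine possibilities one step at a time,
    remembering how we reached each cell (the chain of decisions)."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path                      # the explicit reasoning trail
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < 4 and 0 <= nc < 4 and MAZE[nr][nc] == 0 \
                    and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None                              # no route exists

print(solve())   # every step of the "reasoning" is inspectable
```

It is slower and more resource-hungry than a single forward pass, but it can justify its answer.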

Meta-Cognition in Machines: Humans engage System 2 not only for complex tasks but also for monitoring and correcting System 1’s outputs. In AI, there are now techniques where a model’s quick answer (System 1) can be critiqued by another layer or component (System 2), which can verify, refine, or correct the initial guess. This meta-process is reminiscent of how a person might catch a “gut feeling” error by calmly reasoning through details.
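A toy sketch of that arrangement, with a sloppy "fast guesser" audited by an exact checker; both functions are hypothetical stand-ins for a model and its verifier:

```python
def fast_guess(a, b):
    """System 1 stand-in: a quick, rounded mental estimate of a * b."""
    return round(a, -1) * round(b, -1)     # e.g. 23 * 48 -> 20 * 50

def verify_and_fix(a, b, guess):
    """System 2 stand-in: a slow, exact computation that audits the guess
    and corrects it when the gut feeling turns out to be wrong."""
    exact = a * b
    return guess if guess == exact else exact

guess = fast_guess(23, 48)                  # 1000: fast but wrong
answer = verify_and_fix(23, 48, guess)      # 1104: caught and corrected
print(guess, answer)
```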

You can't solve this maze with intuition; you need to think step by step. But your "gut feeling" quickly tells you that this image is not centered properly; you don't need a ruler for that...

Thinking hurts if you do it correctly

Humans handle complexity by using both systems. System 1 is like muscle memory for your brain, giving quick judgments based on previous experience (like calculating 2x5). System 2 is a careful problem-solver, stepping in when precision and reasoning are needed (like calculating 14x17). That little pressure you feel in your brain when the calculation gets harder is called Cognitive Load. Machines feel the same pressure, but on their CPUs and GPUs. Some algorithms rely purely on pattern recognition like System 1, and sometimes they do it messily but quickly (like when you say "Hey Siri" and your friend's phone answers instead; it works, but still has flaws), while others incorporate more deliberate, logical steps and planning like System 2 (for example, when you ask an AI how many "r"s are in "strawberry").
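The strawberry question is a nice example because a deliberate, step-by-step procedure simply cannot get it wrong, while pure pattern-matching can. A tiny sketch of the deliberate route:

```python
def count_letter(word, letter):
    """Deliberate, System 2-style counting: walk through the word one
    character at a time instead of pattern-matching an answer."""
    count = 0
    for ch in word:          # examine every character explicitly
        if ch == letter:
            count += 1
    return count

print(count_letter("strawberry", "r"))   # 3, every single time
```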

You feel less cognitive load when you use the picture guide, and that's why you prefer that method, even if the written one takes fewer steps.

The more you practice a task, the better your System 1 gets at it, and the more experience your System 2 gains, so you can later do the task not just faster but better!

The same is true of AI: the more training resources (data, computation power, and time) you provide, the faster and smarter the model you get. That's why you see models getting smarter, and unexpectedly cheaper, as well.

To wrap up this topic, I leave you with my favorite quote from Daniel Kahneman before we move on:

"Nothing in life is as important as you think it is when you are thinking about it."
- Daniel Kahneman (Thinking, Fast and Slow)

Acknowledging the World

Our brain creates this experience we call "life". This experience happens in a reality we call the "world", and we are continuously interacting with it. We humans naturally group the world’s endless complexity into manageable chunks. We notice colors, shapes, and movements instantly, thanks to pre-attentive attributes and gestalt principles. These principles are the reason you can say one of the charts is random, two of them carry some sort of meaning, and the other one is definitely fake:

Gestalt principles guide our visual perception and help explain how we effortlessly make sense of complex visual environments.

The principles of Gestalt Theory have enhanced our understanding of human perception related to visual design and perceptual grouping. In the 1920s, Gestalt psychologists in Germany studied how people make sense of discrete visual elements by subconsciously organizing them into groups or patterns (System 1 in action again). The German word gestalt means "shape or form." One of its founders, psychologist Kurt Koffka, described the Gestalt Theory as "the whole is something else than the sum of its parts" which means the unified whole takes on a different meaning than the individual parts.

Pattern recognition is a fundamental capability that underlies both human intelligence and artificial intelligence. While humans excel at intuitive pattern recognition, machines approach it through systematic analysis of numerical features. We see faces in clouds and patterns in noise. AI, similarly, uses algorithms to detect patterns and anomalies. While we see a cat at a glance, a machine might see a grid of pixels. But by extracting features (edges, curves, textures), AI models learn to recognize objects just as reliably, sometimes more so. They don't need Gestalt principles for that; those just describe how we naturally perceive and organize visual information. AI does it with data and calculations...
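To get a feel for what "extracting features" means, here is a minimal sketch: a tiny image as a grid of numbers, convolved with a hand-picked edge-detection kernel. (Real models learn their own filters from data; this one is written by hand for illustration.)

```python
import numpy as np

# A 5x5 "image": a bright square on a dark background.
image = np.zeros((5, 5))
image[1:4, 1:4] = 1.0

# A classic vertical-edge kernel, hand-picked here; a CNN learns its own.
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

def convolve(img, k):
    """Slide the kernel over the image and record how strongly each
    3x3 neighborhood matches the edge pattern."""
    h, w = img.shape[0] - 2, img.shape[1] - 2
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * k)
    return out

print(convolve(image, kernel))   # strong responses right where the edges are
```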

Simple Pattern Recognition in Machines to Classify Handwritten Numbers

Feeding the Machines

"Data! Data! Data!" he cried impatiently. "I can't make bricks without clay."
- Sir Arthur Conan Doyle (Sherlock Holmes)

As I mentioned, in our human experience, perception doesn’t happen in a vacuum. We rely heavily on our five senses. These senses deliver information about the world: colors, sounds, textures, flavors, and scents. This raw information is the foundation upon which we build understanding, discover patterns, and form memories. Without data, without something to perceive, our capacity to learn and reason would be inert.

Similarly, artificial intelligence systems require data as their fundamental input. However, “data” in the context of AI looks quite different from human sensory experience. Instead of the rich tapestry of human sensations, machines typically process data as numbers, symbols, and encoded signals. For example (a small sketch follows this list):

Visual Data: For an AI, an image is a grid of pixel values, each pixel represented by a set of numerical intensities. Where you see a cat’s whiskers and a glint in its eye, the machine sees a matrix of brightness levels and color channels.

Audio Data: While humans hear melodies and voices, an AI “hears” sound as digital waveforms: amplitudes and frequencies sampled at rapid intervals.

Textual Data: Humans read meaningful sentences and understand their implications. A machine breaks text down into characters, words, or tokens, often turning them into vectors of numbers that capture statistical relationships rather than semantics (we will talk about that later).

Sensor Data: Robots and IoT devices rely on temperature readings, pressure sensors, GPS coordinates, and other measurements. Each sensor output is a number, a piece of data AI can interpret and use for tasks like navigation, monitoring, or prediction.
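To see how un-sensory this really is, here is a small sketch turning each of the four kinds of input above into plain numbers (all values invented for illustration):

```python
import numpy as np

# Visual: a 2x3 grayscale "image" is just a grid of brightness values.
image = np.array([[0, 128, 255],
                  [64, 192, 32]], dtype=np.uint8)

# Audio: one second of a 440 Hz tone, sampled 8000 times per second.
t = np.linspace(0, 1, 8000)
audio = np.sin(2 * np.pi * 440 * t)        # just an array of amplitudes

# Text: characters mapped to integer codes (real LLMs use learned tokens).
tokens = [ord(ch) for ch in "hello"]       # [104, 101, 108, 108, 111]

# Sensor: a reading is simply a number with a meaning attached to it.
temperature_c = 21.7

print(image.shape, audio[:3], tokens, temperature_c)
```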

As the modern world becomes increasingly digitized, everything from financial transactions and social media posts to satellite images and medical scans turns into an endless supply of data. This explosion in volume, variety, and velocity of data is what we now call Big Data (which means data that is being generated faster than it can be processed). Just as a human child learns more by having rich, varied experiences (touching, seeing, hearing new things), an AI system grows “smarter” as it’s fed larger, more diverse datasets. More data helps AI models identify more subtle patterns, make more accurate predictions, and adapt to new circumstances.

Yet, there’s a key difference: humans have evolved sophisticated filtering and attention systems. Even as you read this, you’re bombarded by countless stimuli (slight noises, faint smells), most of which never fully register in your conscious mind. We focus on what matters, guided by instincts and reasoning. AI, on the other hand, doesn’t automatically know what to ignore. Without careful data selection, preprocessing, or algorithms designed to highlight the right features, an AI model might attempt to learn from irrelevant or noisy data.

Still, the parallels remain. Humans and AI both rely on raw inputs from their environment to learn. Humans convert sensory experiences into perceptions and memories. AI transforms raw numeric inputs into patterns and models. One does it organically with neurons and synapses; the other does it with algorithms and memory chips. While you can smell the gas, an AI can detect it with its sensors; same deal. Data is the thread that connects both forms of intelligence to the world around them.

This understanding of data as the “food” of AI sets the stage for how learning occurs. Just as a child’s mind matures from raw sensation to conceptual understanding, AI systems progress from raw data inputs to meaningful pattern recognition. With a clearer grasp of the importance of data, we can move on to the next part: how these machines learn...

AI learns just like humans

When we talk about “learning” in the context of AI, especially in the analogy of System 1 and System 2, it’s worth dissecting what’s really happening behind the scenes. Machines do not learn in the human sense, where we integrate knowledge into a rich tapestry of experience and context. Instead, machines adjust parameters or manipulate symbolic representations to better perform a given task. But the process is very similar to how humans learn, and it has multiple steps.

Before any learning can happen, a machine needs data. As humans, we automatically start forming internal representations of what we see or hear. A newborn baby doesn’t understand language yet, but by continuously receiving audio and visual input, they gradually discern patterns, like which sounds are associated with a parent’s face.

For a machine, data could be a collection of images, text documents, sensor readings, or historical financial transactions. This raw data is analogous to the sensory stream the human infant receives. Without it, there’s no foundation from which the machine can learn.

When a human infant encounters the world, there are no labels attached to objects. Before a parent ever says “This is a dog,” the infant’s brain is clustering shapes, sounds, and motions into rough categories. In the world of AI, we call that Unsupervised Learning, which means grouping similar data points without labels, like a baby noticing that round objects go together.

For example, a clustering algorithm might discover that a batch of images naturally separates into groups: one group of round objects (balls), one of four-legged animals (dogs), another of leafy shapes (trees). There’s no label “dog” here—just a recognition that certain patterns reoccur together.
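A minimal sketch of that idea: a bare-bones k-means that groups unlabeled 2-D points purely by proximity, the way the baby groups round things together. (Libraries like scikit-learn ship a polished version; the loop below just exposes the mechanics.)

```python
import numpy as np

# Unlabeled 2-D points: two blobs, but the algorithm is never told that.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.5, (20, 2)),    # one natural group
                  rng.normal(5, 0.5, (20, 2))])   # another natural group

def kmeans(points, k=2, steps=10):
    """Minimal k-means: assign each point to its nearest center, then move
    each center to the mean of its points. No labels are ever involved."""
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(steps):
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        assign = dists.argmin(axis=1)
        centers = np.array([points[assign == i].mean(axis=0)
                            if np.any(assign == i) else centers[i]
                            for i in range(k)])
    return assign, centers

groups, centers = kmeans(data)
print(centers)   # two centers near (0, 0) and (5, 5): structure found, unnamed
```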

When a parent points to a dog and says, “Dog,” the human child connects the sound/word “dog” to that pattern of fur, four legs, and tail-wagging. The child refines their mental model: not only are these shapes and movements one category, but now they have a name. For machines, Supervised Learning is the exact same concept. An image of a dog comes with the label “dog.” The model uses these pairs (input, label) to incrementally adjust its parameters so that it can predict “dog” for any similar image in the future. Over time, the machine builds a powerful mapping from visual features to the concept of “dog.” Without these labels, the model might understand groups of similar images but not what they represent.
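A tiny supervised sketch using scikit-learn; the features, numbers, and labels below are all invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented (input, label) pairs. Features: [weight_kg, ear_length_cm].
X = np.array([[30, 8], [25, 7], [35, 9],     # dogs
              [4, 6], [5, 5], [3, 7]])       # cats
y = ["dog", "dog", "dog", "cat", "cat", "cat"]

# Fit: the model adjusts its parameters to map features -> labels.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Predict: a new, unseen animal gets a label from the learned mapping.
print(model.predict([[28, 8]]))   # ['dog']
```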

Now, we all know that these are not the only ways we learn in life. We mostly learn from the different experiences we have, and to be more specific, from our mistakes. You can see a baby crawling towards the red shiny thing on the table; they reach out to touch it just to find out that it hurts their hands, because it was a hot cup of tea. And that's how they learn they should not touch things with steam coming out of them. The same happens when they say "Mama" for the first time and see their parents laughing and kissing them, so they do it more and more. In the example where touching a hot mug is painful, there’s no label, just an action (touching) followed by a consequence (pain). Gradually, the child learns to avoid the behavior. In the world of AI, we call this method Reinforcement Learning.

An AI agent tries actions in an environment and receives rewards or penalties. Over time, it learns a policy: an internal mapping from states to actions that maximizes long-term reward. Instead of associating images with labels, it associates situations and behaviors with outcomes. This is more similar to System 2 engagement because it often involves planning and foresight: to achieve a long-term goal, the agent might need to take a series of steps and reflect on consequences, much like a human might strategize several moves ahead in a board game.
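Here is a minimal sketch of that trial-and-error loop: a tabular Q-learning agent in a toy five-state corridor, where only reaching the last state is rewarded. All the numbers are illustrative:

```python
import random

# A tiny world: states 0..4 in a row; reaching state 4 earns reward 1.
# Actions: 0 = step left, 1 = step right.
GOAL = 4
Q = [[0.0, 0.0] for _ in range(GOAL + 1)]   # learned value of (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration

def choose(state):
    """Epsilon-greedy: sometimes explore; otherwise exploit what is known
    (picking randomly on ties, e.g. before any learning has happened)."""
    if random.random() < epsilon or Q[state][0] == Q[state][1]:
        return random.randrange(2)
    return 0 if Q[state][0] > Q[state][1] else 1

for episode in range(200):
    state = 0
    while state != GOAL:
        action = choose(state)
        nxt = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if nxt == GOAL else 0.0      # a consequence, not a label
        # Nudge the estimate toward reward plus discounted future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

# The learned policy: which action each state now prefers.
print([0 if Q[s][0] > Q[s][1] else 1 for s in range(GOAL)])  # expected: [1, 1, 1, 1]
```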

With practice, even a simple machine can get better generation by generation.

There are more learning methods in the real world, for both humans and machines, and for the last example in this post we have one of the most widely used. Humans can often learn a new skill more easily if it’s related to something they already know. For instance, if you’ve learned one romance language, picking up another is easier. Or when you go to medical school, you first learn the fundamentals and then pick a major you want to specialize in. In AI terms, Transfer Learning allows a model trained on one task (e.g., recognizing and writing text) to be repurposed for another (e.g., writing in your style or generating insights from your own knowledge base) with less data and time. The model’s internal representations serve as a starting point, just as your knowledge of Spanish helps you tackle Portuguese with fewer lessons. In fact, in "ChatGPT" the P in GPT refers to this exact concept. GPT is short for Generative Pre-trained Transformer: these models are trained on a large body of text data beforehand. This means the model starts with a broad understanding of language, which can then be adapted (or “Fine-Tuned”) to specific tasks or domains with relatively small amounts of additional data.
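Here is a schematic sketch of the idea, with a made-up "pre-trained" feature extractor standing in for the early layers of a big model; only the small new head is trained on our tiny task-specific dataset:

```python
import numpy as np

# Hypothetical "pre-trained" layer: in reality this took huge data and
# compute to learn; here random weights just stand in for it. It is
# frozen: we reuse it as-is instead of learning from scratch.
rng = np.random.default_rng(1)
W_pretrained = rng.normal(size=(8, 2))

def features(x):
    return np.maximum(0, W_pretrained @ x)   # reusable general representation

# Fine-tuning: train only a small new "head" on a tiny labeled dataset.
X = np.array([[1.0, 0.2], [0.9, 0.1], [0.1, 1.0], [0.2, 0.9]])
y = np.array([1, 1, 0, 0])

head = np.zeros(8)
for _ in range(100):                         # simple perceptron-style updates
    for xi, yi in zip(X, y):
        pred = 1 if head @ features(xi) > 0 else 0
        head += 0.1 * (yi - pred) * features(xi)   # only the head changes

print([1 if head @ features(xi) > 0 else 0 for xi in X])  # should match y
```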

Now that we have covered the most common methods and techniques in Machine Learning, let's discuss the process of actually using them...

Understanding the problem is half of the answer

The best way to teach this part is to start with a few real-life examples. Real-life examples help ground abstract concepts in familiar contexts, making them easier to understand and relate to.

Imagine that we have found a chest on the beach and we decide to open it and see what's inside. The good news is that we have found lots of coins, but the bad news is that they are so old and rusty that we can’t tell what coins they are.

So we decide to sort them by size just to find out how many types of coins we have; it seems like a good start. This trick is very similar to the unsupervised learning method, because we can’t label them yet; we are just categorizing them. Once we finish with the grouping, we can see we have two groups: small coins and big coins. But it’s still very vague; there are lots of small and big coins, so we add another parameter and divide each group by weight.

Now we have two new groups of coins: heavy ones and light ones. Now that we feel more confident in finding the type of each coin using these parameters, it's time to introduce the labels. We bring out the coin collection we own and start measuring the same parameters, size and weight. We find 4 clean coins that fall in the range of the coins from the chest, so we bring these 4 clean coins in to use as labels for our rusty coins. By placing all of the coins (rusty and new ones) on the same chart, we can see they are a perfect match, so we assume that the rusty ones must be the same exact coins as the new ones we added to their groups. This part of the solution is exactly what we see in the Supervised Learning approach in Machine Learning.
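The whole coin workflow fits in a tiny sketch. The measurements and coin names below are invented; each rusty coin simply takes the label of its nearest clean reference coin (a 1-nearest-neighbor classifier on size and weight):

```python
import numpy as np

# Rusty coins measured as (diameter_mm, weight_g); numbers are invented.
rusty = np.array([[20.1, 5.2], [19.8, 5.0], [27.9, 8.4], [28.3, 8.6]])

# The four clean coins from our collection, with the same two measurements.
reference = np.array([[20.0, 5.1], [23.0, 6.0], [28.0, 8.5], [31.0, 10.2]])
names = ["penny", "nickel", "quarter", "dollar"]   # hypothetical labels

# Label each rusty coin by its nearest clean reference coin.
for coin in rusty:
    dists = np.linalg.norm(reference - coin, axis=1)
    print(coin, "->", names[dists.argmin()])
```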

Congrats, you have just solved the mystery. In real-life scenarios, when we are dealing with raw data, it's more complicated: there might be more noise in the data, more candidates for the labels, and more than two parameters. But overall, this was a simple example of what we do in machine learning projects.

We usually have two types of problems in Machine Learning: you are either trying to predict discrete labels (like tagging emails as "spam" or "not spam"), which we call Classification, or you are trying to predict continuous values (like estimating traffic), which we call Regression.

What we did in the coins example was a simplified classification problem, and we saw how adding new dimensions to our data helped us with the solution. Now let's see a simple regression example...

Let’s say we are a farmer and we want to price our oranges for selling at the market. If we price them very low, we might lose some profit, and if we price them really high, there is a chance that nobody buys them. So how can we find the optimal price for our oranges? The first thing we can do is gather data. We go to the market, buy one of every type of orange we find, and write down the prices. This helps us figure out how other people are pricing their oranges. Once we have them all, we start the same process we did with the coins: dividing them by their parameters.

First we sort them by size, and since we want to find the price, we also sort them by price. Then it becomes very obvious that the larger the oranges get, the higher their price rises; there is a direct correlation between size and price. So once we draw the line we saw in the chart, we can simply place our orange where it belongs in the sorting and read its price range off the other axis.
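That line-drawing step is exactly a one-variable linear regression. A sketch with invented market numbers:

```python
import numpy as np

# (size_cm, price) pairs gathered at the market; all numbers invented.
size = np.array([6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0])
price = np.array([0.30, 0.35, 0.42, 0.45, 0.55, 0.58, 0.65])

# Fit the straight line that best explains price as a function of size.
slope, intercept = np.polyfit(size, price, deg=1)

# Read our own orange's price off the line, exactly as in the chart.
our_size = 7.8
print(round(slope * our_size + intercept, 2))   # an estimated fair price
```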

Other real-life regression patterns

Bad Learning (Overfitting)

We have now seen how adding new parameters to our data makes it multi-dimensional and easier to understand.

In classification, we mentioned it's useful for detecting spam, but how? We have to do the same thing we did with the coins...
