The other day, I found myself in a conversation trying to explain generative AI (GenAI) to a colleague who, like many, was grappling with the concept. As we talked, I reached for an analogy that I hoped would make the complex idea more tangible. I asked, "Do you remember playing with LEGO bricks as a kid?" Instantly, their face lit up with recognition.
We talked about how, with just a handful of LEGO bricks, you could build nearly anything your imagination could conjure—houses, cars, spaceships and beyond. The creative possibilities felt limitless, even with just a few pieces. And that’s when I made the connection for them: "Think of generative AI like a vast collection of LEGO bricks," I said. "Each brick represents a piece of knowledge, a concept, a pattern the AI has learned from its training. When you ask it to generate something—a piece of writing, an image, an idea—it’s like giving it the freedom to build with those bricks."
But here’s where it gets interesting. If you handed someone a box of LEGO pieces and asked them to recreate the famous LEGO Death Star without instructions, it would be a monumental task. They might get close, but chances are, the result would be something new, something unique. And that’s how GenAI works—it doesn’t just copy what it’s seen; it constructs something original each time, based on the vast array of "bricks" it has access to.
So, for instance, we know that there are 915,103,765 different ways to combine six eight-stud LEGO bricks. That’s nearly a billion results from just six LEGO bricks. Six! The mathematician who worked out this number, Søren Eilers, continued his quest and tried to count the possibilities with 25 bricks.
According to Eilers, “With the current efficiency of our computer programs, we further estimate that it would take us something like 130,881,177,000,000,000,000,000,000,000,000,000,000,000 years to compute the correct number. After some 5,000,000,000 years, we will have to move our computer out of the Solar system, as the Sun is expected to become a red giant at about that time.”
This was in 2017, so computation has advanced somewhat since then. Let’s be generous and say that the computation available to us has increased by 10 orders of magnitude. So now we would only have to wait 13,088,117,700,000,000,000,000,000,000,000,000, or 1.30881177×10^37, years. Sweet. For reference, that is more than the number of stars estimated to exist in the observable universe. Current estimates suggest there are approximately 10^22 to 10^24 stars in the universe.
So now imagine the LEGO Death Star. That pack-, piece-, knee- and back-destroying set has 3,803 pieces. I would argue that the number of possible combinations is, and will be for some time, incalculable. I would guess it’s larger than 10^80, the estimated number of elementary particles of matter that exist in the visible universe.
Now consider this. You are given these LEGO bricks and told to recreate the LEGO Death Star exactly. No missing pieces, all in the correct place. Whilst possible, I’d imagine it would take more time than there has been, or potentially ever will be, for you to do so.
But what if I told you to create anything you wanted (house, boat, plane, car, whatever, with no limitations) using just what you have there? You could, right? Easy. You have all the bricks you could want, you know how they go together and, working from memory, you could put something together.
This is how to view GenAI. It has a model of the world from its training: the bricks. These go together to create countless potential answers, including the final build. The LEGO Death Star is in there and, academically, you could get the model to recreate it with the right prompt, but you would have to know that the GenAI was trained on it. You’d also have to know the specific prompt, which is like having the instruction manual. It’s possible, but not what the model was designed to do.
Here’s another way of looking at it. If you were to answer a question or retell a story for someone, you’d draw from the vocabulary, experiences and knowledge you’ve built up over your lifetime. The GenAI builds its vocabulary and conceptual capabilities based on what you feed it, just like how we consume info every day since birth to establish our own “knowledge base.” When we answer that question, we aren’t pulling from one single book or article we’ve read—we are pulling from the amalgamation of everything we’ve ever learned on the topic to formulate an educated response. GenAI functions in much the same way. It determines its response based on the sum of its training data—not one particular reference, with specific words from a single source. Unless, of course, you ask it to.
Why am I talking about this? What’s with the LEGOs, Death Stars and atoms? Well, I am reminded that there is still a lot to learn about GenAI, LLMs and this new generation of AI tools, and we should start by understanding how they work. There is a misconception that GenAI will plagiarize the works it has been trained on. But that isn’t the case. It simply fills the gap in front of it with the most appropriate output (text, image, video, etc.) based on its prompt and the model. Now, we can improve on this output. We can provide it with more context, such as pages of the Death Star manual, to steer it in a certain direction or give us the result we are looking for. But unless we tell it to, it doesn’t simply copy what it has learned or can “see” online. I like to describe it (inaccurately and probably annoyingly to those who know more than I do on the topic) as a weighted random number generator: the weights come from the training and the finished model, and the context for the generation comes from the prompt, possibly supplemented with other data to improve accuracy. But it still has that random element. That’s why it hallucinates, but also why it isn’t copying or plagiarizing works.
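To make the “weighted random number generator” description a little more concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how any real model is implemented: the candidate words and their weights in CANDIDATE_WEIGHTS are invented for illustration, as are the helper names pick_next_word and sample_many. A real model works over a very large vocabulary and recomputes the weights from the prompt at every step.

```python
import random

# Hypothetical weights a model might assign to candidate next words
# after a prompt such as "The LEGO set is a ...". These numbers are invented.
CANDIDATE_WEIGHTS = {
    "spaceship": 0.50,
    "castle": 0.30,
    "car": 0.15,
    "sandwich": 0.05,  # unlikely, but never impossible
}


def pick_next_word(weights, rng):
    """Draw one word at random, in proportion to its weight."""
    words = list(weights)
    return rng.choices(words, weights=[weights[w] for w in words], k=1)[0]


def sample_many(weights, draws, seed=0):
    """Count how often each word is chosen over many draws."""
    rng = random.Random(seed)
    counts = {word: 0 for word in weights}
    for _ in range(draws):
        counts[pick_next_word(weights, rng)] += 1
    return counts


if __name__ == "__main__":
    print(sample_many(CANDIDATE_WEIGHTS, draws=1000))
```

Run it and the likely words dominate the counts, but the unlikely ones still turn up now and then. That small random element is the point of the analogy.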
So, the next time you think about the potential of GenAI, imagine those billions of LEGO combinations. Remember that while the AI might have the pieces, it’s the prompt—your own creativity—that ultimately shapes the final build. In that sense, GenAI is less about copying and more about unlocking the vast possibilities of what can be created from an expansive foundation of training data.
If you want to take advantage of this emerging technology while reducing hallucinations, trusting the accuracy of the output, governing the data that might be passed to your GenAI application and scaling to enterprise levels, check out our Progress AI Solutions page.
A Large Language Model (LLM) is a type of artificial intelligence that can process and generate natural language, such as text or speech. LLMs are trained on huge quantities of data, such as web content, published works and periodicals. From that, the system learns the patterns and rules of language and can effectively predict words to complete sentences, answer questions, summarize text and compose lengthy responses. Examples include GPT-3 and BERT, used in applications such as chatbots and automated content creation. In addition, LLMs can help programmers and developers write code. They can write functions upon request, help users review code to identify and fix faults or, given some code as a starting point, finish writing a program or command.
Generative AI (GenAI) is a broader AI category that aims to produce original content across multiple media, including text, images and music. It encompasses LLMs and extends to models like DALL-E for image creation and Jukedeck for music generation. While the goal is original content, because GenAI sources information from existing data sets, outputs may include verbatim lifts from sources or fabricated responses, known as hallucinations.
Retrieval-Augmented Generation (RAG) combines generative AI with contextually relevant data to improve accuracy and reliability. It reduces AI "hallucinations" by grounding responses in a structured knowledge graph and validating them against a comprehensive knowledge model. Key components include contextual data enrichment, knowledge graphs, prompt enhancement and response validation.
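As a rough illustration of the retrieval and prompt enhancement steps, here is a minimal Python sketch. It makes several assumptions that are not part of the description above: the in-memory DOCUMENTS list stands in for a governed data source, simple word overlap stands in for a knowledge graph or semantic search, and the function names tokenize, retrieve and build_prompt are hypothetical. It stops at building the enhanced prompt and does not call a model or validate a response.

```python
import re

# A tiny, made-up document store. A real system would query a governed
# data platform or knowledge graph instead of a Python list.
DOCUMENTS = [
    "The LEGO Death Star set contains 3,803 pieces.",
    "Six eight-stud LEGO bricks can be combined in 915,103,765 ways.",
    "A red giant is a late stage in the life of a star.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Prepend the retrieved context to the question (prompt enhancement)."""
    context = "\n".join(retrieve(question, documents, top_k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How many pieces are in the Death Star set?", DOCUMENTS))
```

The prompt that reaches the model now carries the relevant fact with it, so the answer can be grounded in supplied data rather than left entirely to the weighted guesswork of the model.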
RAG improves accuracy, speed, cost-efficiency, scalability and security in AI applications, making it ideal for customer service, knowledge management and research and development. The Progress Data Platform supports RAG by integrating various data sources, harmonizing, classifying and contextualizing the data and maintaining high data quality and governance.
For more details on how our Semantic RAG approach to generative AI can help you in your organization, check the full blog.
The "Human-in-the-Loop" (HITL) concept integrates human oversight to provide accurate, reliable and repeatable data classification and AI outcomes. This approach leverages human judgment to review and validate AI outputs, improving quality control and addressing complex scenarios that AI alone might struggle to address. HITL is vital to many enterprise environments, where human intervention can prevent errors, promote governance and align decisions with societal values. By combining the strengths of both humans and machines, HITL creates a more adaptable, trustworthy and effective AI system, enabling technology to serve human needs responsibly and efficiently.
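One common way to put a human in the loop is to accept an AI output automatically only when the model is confident, and to queue everything else for a person to review. The Python sketch below assumes that pattern. The Prediction class, the route and apply_review functions, the sample data and the 0.9 threshold are all hypothetical choices made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    text: str          # the item being classified
    label: str         # label proposed by the model
    confidence: float  # model's self-reported score between 0 and 1


def route(predictions, threshold=0.9):
    """Split predictions into auto-accepted and needs-human-review lists."""
    accepted, review_queue = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else review_queue).append(p)
    return accepted, review_queue


def apply_review(prediction, corrected_label):
    """Record a reviewer's decision; a human-checked label is fully trusted."""
    return Prediction(prediction.text, corrected_label, 1.0)


if __name__ == "__main__":
    batch = [
        Prediction("Invoice 1042 is overdue", "finance", 0.97),
        Prediction("Death Star build log", "finance", 0.41),
    ]
    accepted, review_queue = route(batch)
    # A person inspects the low-confidence item and fixes the label.
    accepted.append(apply_review(review_queue[0], "hobby"))
    for p in accepted:
        print(p.label, "-", p.text)
```

Where to set the threshold is a judgment call: a higher value sends more work to people, while a lower one trusts the model more often.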
Philip Miller serves as the Senior Product Marketing Manager for AI at Progress. He oversees the messaging and strategy for data and AI-related initiatives. A passionate writer, Philip frequently contributes to blogs and lends a hand in presenting and moderating product and community webinars. He is dedicated to advocating for customers and aims to drive innovation and improvement within the Progress AI Platform. Outside of his professional life, Philip is a devoted father of two daughters, a dog enthusiast (with a mini dachshund) and a lifelong learner, always eager to discover something new.