On June 12, 2017, eight Google researchers published a paper titled Attention Is All You Need. At first, it was read by only a small circle of specialists.
At that moment, almost no one outside specialized circles imagined what this text would set in motion. In companies, artificial intelligence was not a strategic topic. It was seen more as a research project — something interesting in theory, but still far from real operations.
This paper introduced the transformer. Before it, language models mostly processed text in order, one word at a time. That worked, but the limits showed quickly as ambitions grew: sequential processing was slow, hard to train at scale, and less effective at capturing relationships between words that were far apart in a sentence or a text.
The transformer changes that. Instead of moving word by word, it processes the whole sequence in parallel and uses an attention mechanism to weigh the connections between every pair of elements in the text. Put simply, this isn't a minor technical refinement. It's a different starting point. And it's on this foundation that everything that follows gets built.
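To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention, the core computation of the paper, written in plain NumPy. This is a toy illustration only, not the full transformer architecture: it omits the multiple attention heads, masking, and learned projection matrices that a real model uses.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Every position scores its relevance to every other position,
    then takes a weighted average of the values. All positions are
    handled at once, which is what makes the approach parallelizable."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                             # blend values by relevance

# Toy input: 4 tokens, each an 8-dimensional vector
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)        # self-attention: Q = K = V
print(out.shape)  # (4, 8)
```

Because the relevance scores are computed between every pair of positions in one matrix product, a word at the start of a long passage can directly influence a word at the end, with no information relayed step by step in between.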
In 2018, things start to move. OpenAI releases GPT-1, then Google releases BERT. Models built on the transformer begin to appear that understand context better and deliver significantly better results on several language-related tasks.
In 2019, OpenAI presents GPT-2. The announcement makes more noise than usual, partly because the company initially withholds the full model, citing concerns about misuse. In hindsight, the episode probably had more media impact than technical impact, but it helped bring AI back into public discussions.
Then GPT-3 arrives in 2020. For many people closely following the field, something genuinely changes. The model isn't simply better. It feels like a threshold has been crossed. It can write, summarize, translate, answer questions, generate code, rephrase text, adapt to a tone. And above all, it can do so with very few examples.
That's when many people begin to understand that these models won't just be better research tools. They will probably become general-purpose tools.
That said, GPT-3 remains relatively inaccessible. Access runs mainly through an API. The general public doesn't really touch it, and inside companies, the topic remains fairly distant from day-to-day operations.
In 2021, other advances confirm that the pace is accelerating. DALL-E shows that images can be generated from text. GitHub Copilot begins assisting developers directly in their work. Diffusion models progress rapidly. Demonstrations become more convincing. Use cases start moving beyond the strictly experimental frame.
Despite this, in many organizations, people continue watching from a distance. They find it impressive, but not yet concrete enough to make it a real leadership topic.
Then comes ChatGPT, on November 30, 2022.
What changes at that moment isn't just the model. It's mostly the way you access it. Suddenly, anyone can open an interface, ask a question in plain language, and get a usable answer in seconds. No API needed, no development environment, no need to understand what's running underneath.
That's the shift that brings AI into everyday life.
Within days, everyone tries it. Students, developers, consultants, marketing teams, executives, support staff. Very quickly, the topic moves out of technical teams and up through organizations. People start asking what these models can concretely do, where they can be integrated, what they will change, and how fast.
In 2023, the market expands quickly. GPT-4 ships. Claude arrives. Meta releases LLaMA, which significantly broadens access to open-weight models. Other players gain importance. What, a few years earlier, was mostly the domain of a small number of labs becomes a much larger market.
But the real change isn't only there. It's in the way organizations begin to approach the subject.
At first, AI was perceived as a theme to monitor. Then as something to test. Then as a risk to manage. Today, in the most advanced companies, it's already a matter of execution.
Since 2024, we've seen more and more systems built around these models. They no longer just produce text in a demo interface. They're connected to tools, databases, internal software, business processes. They sort, write, classify, analyze, assist, trigger actions and, in some cases, take over entire sections of a workflow.
That's where the gap begins to really widen.
Some organizations have already accumulated several years of experience. They've tested, failed, corrected, started again. They've learned what works, what doesn't, where the technology creates value and where it destroys it. They have a better understanding of how to integrate these tools into the reality of operations.
Others are just starting to ask the right questions.
And that gap doesn't close overnight. Because it doesn't depend only on access to models. It depends on accumulated experience, the quality of chosen use cases, execution speed, and the ability to turn new technology into concrete advantage.
In 2026, the dividing line is no longer between those who have heard of AI and those who haven't.
It's between those who have already started building with it, and those who are still discussing it.