Why do universities and businesses struggle to keep up with AI advancements?
Perhaps because the world is quickly moving into a “boutique” approach to technology, but institutions have a “suite” mindset, looking for one application that works for everything.
We pay for the Office 365 or Google Workspace subscription, right? That’s good enough.
Not really.
Researching and teaching AI technology from an AI operations or content operations perspective is a Herculean task in universities these days.
The reason? The traditional university frameworks for technology procurement and funding are ill-suited to the rapidly evolving tech landscape.
For example, I want my students to experience ecosystems outside of Microsoft, because most people in our field don’t exclusively use Office 365.
Tech writers have told me they use up to 50 applications for their work.
This means piecing together your own workflow and customizing your app stack for specific purposes.
That’s what it means to be a writer or content creator these days.
And when it comes to AI, I'm convinced the future won't be dominated by one or two models.
Just as we value diversity in our human teams, we should value diversity in our AI models. A diversity of models means a diversity of perspectives, leading to more creative, robust, and inclusive solutions.
We'll see a flourishing ecosystem of hundreds of models, each with unique strengths and applications. To conduct meaningful research and teaching in AI operations, we need diverse app stacks, not monolithic suites.
This need for diversity is particularly evident in generative AI. Yet our university is pushing us toward Microsoft Copilot, simply because it's the 'secure' option we're already paying for.
But is this really the best approach?
How we experimented.
In my recent classes, I've been working with students to use Microsoft Copilot. After all, it's the tool our university has invested in, or more accurately, the tool our students are already paying for.
Naturally, I want to make it work for them. However, despite being built on the same core technology as ChatGPT, Copilot has consistently fallen short in our exercises.
These shortcomings certainly provide teaching moments, but they limit students' exposure to the full capabilities of genAI models.
By restricting them to just one model, we risk giving them a narrow view of AI capabilities, even if it's the "best" one available.
To illustrate this, I conducted an experiment: I gave the same research curator prompt to both Microsoft Copilot and ChatGPT.
The prompt casts the AI as an expert curator who suggests ways to communicate research to first-year students in short microessays or LinkedIn posts.
[ROLE] You are an expert curator who gathers research on conflict management for first-year students and communicates it in useful and engaging ways.
[GOAL] Our goal is to find research that will help students solve roommate problems in a useful and engaging way.
[CONSTRAINTS] Do not write the essay. Just give me ideas.
[TASK] I will give you an article, summary, or synthesis of research, and you will give me ideas on how to communicate it to my audience in a short microessay or LinkedIn post. Just give me advice, no draft text.
Are you ready?
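For readers who want to experiment with this structure programmatically, the bracketed sections above can be assembled into a single reusable prompt string. This is a minimal sketch; the function name and section labels here are my own convention, not part of any particular SDK:

```python
def build_prompt(role: str, goal: str, constraints: str, task: str) -> str:
    """Assemble a structured curator prompt from labeled sections.

    The [ROLE]/[GOAL]/[CONSTRAINTS]/[TASK] labels mirror the
    prompt pattern used in the experiment above.
    """
    sections = {
        "ROLE": role,
        "GOAL": goal,
        "CONSTRAINTS": constraints,
        "TASK": task,
    }
    return "\n".join(f"[{label}] {text}" for label, text in sections.items())

prompt = build_prompt(
    role="You are an expert curator who gathers research on conflict "
         "management for first-year students.",
    goal="Find research that helps students solve roommate problems.",
    constraints="Do not write the essay. Just give me ideas.",
    task="I will give you an article and you will give me ideas on how "
         "to communicate it in a short microessay. No draft text.",
)
print(prompt)
```

Keeping the sections separate makes it easy to paste the identical prompt into Copilot, ChatGPT, or any other model and compare outputs side by side.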
The purpose of this experiment was twofold: to compare the outputs of the two models and to demonstrate that different AI models, even those built on the same core technology, can produce vastly different outputs.
What I noticed.
In class, I started with the prompt in Copilot, was disappointed with its performance, and switched to ChatGPT using the same prompt. The difference was clear.
Here's what we found:
Focus and Structure: ChatGPT's output was more focused and readable, while Copilot's output was mostly an outline.
Lack of Responsiveness: Even when asked to write out each bullet as a paragraph of advice, Copilot essentially gave the same response, just shorter. ChatGPT expanded its focus on the audience with the additional prompt.
Repetition: Microsoft Copilot tends to repeat the same text more than ChatGPT, even when adjusting the temperature settings (which are less robust than ChatGPT's). This repetition makes the output redundant and less engaging.
Usability and Interface: The user experience differed significantly between the two models. With Copilot, my conversation history disappeared when I went back to take test screenshots. The inability to edit earlier parts of a conversation, which is crucial for learning to prompt, was another limitation. Copilot's Notebook feature is interesting, but it produced even more repetitive results.
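The repetition we noticed can be quantified rather than just eyeballed. One rough way, sketched below with a helper name I invented for this post, is to measure what fraction of a text's word trigrams are duplicates; a higher score means more recycled phrasing:

```python
def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that repeat an earlier n-gram.

    0.0 means every n-gram is unique; values near 1.0 mean the text
    keeps recycling the same phrases. A rough proxy for the kind of
    redundancy we saw in Copilot's outputs.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return 1 - len(set(ngrams)) / len(ngrams)

varied = "Each point builds on the last with a fresh angle and new evidence."
loopy = ("Keep it short and engaging. Keep it short and engaging. "
         "Keep it short and engaging.")
print(repeated_ngram_ratio(varied))  # low: no repeated trigrams
print(repeated_ngram_ratio(loopy))   # high: the same trigrams recur
```

A metric like this is no substitute for reading the outputs, but it gives students a concrete way to compare models on the same prompt.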
These results highlight the strengths and weaknesses of each model, emphasizing the importance of choosing the right model and leveraging its capabilities effectively.
These are things we should teach students and research, and that requires access to more models (or more ways to build models).
It’s not just about the core technology.
Both Microsoft Copilot and ChatGPT are built on transformer technology, a type of machine learning model that enables AI to process data more holistically.
But this is just one piece of the puzzle. The model's training, the data it's trained on, and its deployment all shape the output.
Copilot is trained to increase productivity, not creativity. And the goal is to keep people in the Microsoft ecosystem.
This approach may work for some people and uses, but not for everyone. It may not even be right for the majority.
ChatGPT seems trained on a wider dataset for more creative and broader use cases. This is the model I use the most, but maybe it doesn’t work for you?
That’s fine. That’s why we need access to multiple models or tools.
The core technology provides the foundation, but the training and the ecosystem shape the model's behavior.
At some point, we'll all likely have our own personal model, tailored to our personality and purposes. That’s what I’m aiming for.
Curious how I’m doing this? I gave a sneak peek to paid subscribers earlier this month. More to come!
Diversity is the name of the game.
In the world of AI, it's tempting to search for the "one model to rule them all" — a single, all-powerful AI that can handle any task. But my experiment with Microsoft Copilot and ChatGPT shows this approach may not be the best way forward.
Different models, trained on different data and fine-tuned with different guidelines, can produce vastly different outputs. And that's a good thing.
So, instead of looking for one model that can do everything, let's embrace the diversity of AI models.
Let's explore different models for different tasks, and let's appreciate the unique strengths and perspectives that each model brings to the table.
In the end, the goal of AI development isn't to create a single model that can do everything.
It's to create a variety of models that can work together, complementing each other's strengths and compensating for each other's weaknesses.
That’s what universities should be funding. That’s what businesses should be supporting.