How Microsoft offers tailored AI Copilots for you

PLUS: Sketch + Text = Super Search

Howdy fellas! Greetings from our dynamic duo - Spark & Trouble!

Welcome to the inaugural edition of our newsletter, “The Vision, Debugged;”
April Fools' might be a joke, but our content here is the real deal 😉

Here’s a sneak peek into this week’s edition 👀

  • 🤖 Decoding Copilot GPTs

  • 🖼️ Combining doodles & complementary text for search

  • ❣️ Can AI feel the same emotions as humans now?

  • 🖌️ How will patterns and experiences evolve in a world shaped by AI?

Time to jump in!😄

PS: Got thoughts on our content? Share 'em through a quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.

Product Labs🔬: Decoding “Copilot GPTs”

Copilot GPTs, one of the latest additions to Microsoft’s suite of Copilot features, allow customization of the behaviour of the GPT by assigning it roles of your choice.

As of now, Microsoft offers 5 pre-defined GPTs:
🤖 Copilot - The OG one, technically it is also a GPT
🎨 Designer
🏋️‍♂️ Fitness
🌴 Vacation Planner
🍳 Cooking Assistant

An extra slice of cake with icing for Pro users!
Microsoft goes a step further for Copilot Pro users, allowing the creation of custom GPTs beyond the ones listed above.

Product Labs: Decoding the AI Matrix - Copilot GPTs (created by the authors)

PS: Find the two tech revolutionaries hidden in the Product Labs!

What’s in it for you?

Copilot GPTs make life easier for lazy folks (like Spark 😛) who are not willing to use their brain cells to write a neat, detailed prompt and then choose to blame the AI for subpar results 🙊.

It’s very common for us to tweak our queries to get better answers, right? Copilot GPTs probe users for the details they need to understand the full context before generating spot-on answers - which goes a long way when you struggle to write well-articulated prompts.

To illustrate this, imagine our buddies Spark & Trouble have invited a friend over for dinner. They turn to Copilot (the default OG one) for help deciding the menu.

OG Copilot’s response to “help me plan a dinner menu for 3 people”

Clearly, Copilot assumed an interest in Indian cuisine & did not factor in dietary preferences. Not very impressive 😕

Instead, if they ask “Cooking Assistant” the same question, here’s what the interaction could look like…

Cooking Assistant’s response to “help me plan a dinner menu for 3 people”

See the difference? The insane level of detail this GPT dives into, to factor in aspects such as cuisine, dietary preferences, and skill level, before providing its recommendations - just looking like a wow! 🤩
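For the curious: one way to picture this "ask first, answer later" behaviour is a simple slot-filling loop. The snippet below is only a rough sketch under our own assumptions - the slot names, the `MenuRequest` class and the `next_turn` function are made up for illustration and say nothing about how Copilot actually works under the hood.

```python
from dataclasses import dataclass, field

# Hypothetical details a cooking-focused GPT might want before answering.
REQUIRED_SLOTS = ("cuisine", "dietary_preferences", "skill_level")


@dataclass
class MenuRequest:
    query: str
    details: dict = field(default_factory=dict)

    def missing_slots(self):
        """Return the required details the user has not supplied yet."""
        return [slot for slot in REQUIRED_SLOTS if slot not in self.details]


def next_turn(request):
    """Ask a clarifying question while details are missing, else answer."""
    missing = request.missing_slots()
    if missing:
        label = missing[0].replace("_", " ")
        return f"Before I suggest anything: what is your {label}?"
    summary = ", ".join(f"{k}={v}" for k, v in sorted(request.details.items()))
    return f"Planning '{request.query}' with {summary}"


request = MenuRequest("dinner menu for 3 people")
print(next_turn(request))  # asks about the cuisine first
request.details.update(
    cuisine="Italian", dietary_preferences="vegetarian", skill_level="beginner"
)
print(next_turn(request))  # all details known, so it moves on to planning
```

The point of the design is that the assistant refuses to guess: it only produces a plan once every required detail has been collected.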

Pro Tip: With the Instacart plugin, you can get those ingredients in a snap!

Make your own GPT!

This one’s for the Pros! 

The Copilot GPT Builder offers a neat interface that lets users configure the GPT beyond the initial instructions. The actual prompt used to create the GPT can be tweaked further to specify any nitty-gritties. You can also upload files that this custom GPT can leverage as a knowledge bank while chatting.

Check out this cool tutorial to understand better how to create custom GPTs.

The interface to “Configure” your custom GPT in Copilot GPT Builder

Want to go crazy and tell your Teacher GPT to teach physics concepts only in limericks? Now you can!
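If you like to think in code, a custom GPT boils down to a name, a set of instructions and (optionally) some knowledge files. Here is a minimal sketch of that idea - the `CustomGPTConfig` class, its fields and the file name are our own inventions for illustration, not the builder's real format or API.

```python
from dataclasses import dataclass, field


@dataclass
class CustomGPTConfig:
    """Hypothetical stand-in for what a GPT builder form collects."""
    name: str
    instructions: str
    knowledge_files: list = field(default_factory=list)

    def system_prompt(self):
        """Fold the configuration into one prompt string for a chat model."""
        lines = [f"You are {self.name}.", self.instructions]
        if self.knowledge_files:
            lines.append("Reference material: " + ", ".join(self.knowledge_files))
        return "\n".join(lines)


teacher = CustomGPTConfig(
    name="Teacher GPT",
    instructions="Explain physics concepts only in limericks.",
    knowledge_files=["physics_notes.pdf"],  # made-up file name
)
print(teacher.system_prompt())
```

Tweaking the behaviour then just means editing `instructions`, which mirrors how the "Configure" screen lets you refine the prompt.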

If you’re proud of your creation and have fallen prey to the IKEA effect, you can also share the GPT you created with a simple link.

What are some other cool applications?

  • Custom Business Applications: Just feed GPT your biz docs, and it’ll crunch numbers and chat with customers like a pro. Share it with your team to boost productivity and maintain consistency. Tired of doing the same tasks on a loop? Voila, now you have the perfect companion. Of course, proofreading is highly recommended!

  • Niche Expertise: Another way to leverage custom data is to provide very specific datasets on niche hobbies or research areas. Think an Astrophysics GPT could help write complex simulations & analyze patterns? Well, go ahead & try it 😉

Why is this intriguing?

The direct sparring partner for Copilot GPTs is ChatGPT’s Build Your GPT. OpenAI has enabled any degree of customization only for paid “Plus” users, while Microsoft opted for a tiered model - pre-defined GPTs for all and custom creations reserved for Pro users.

OpenAI has also rolled out a marketplace, for creators to showcase their GPTs.

OpenAI’s “Explore GPTs” Marketplace

What's cool is that creators can add links to their website during customization, turning their GPTs into slick marketing tools. There is also a shared revenue model being worked on under wraps.

Want to make a quick buck? Spin up a few custom GPTs and share ‘em in the marketplace!🚀💸

Will Microsoft also soon catch up and create a marketplace for its GPTs?
A more interesting question is how these companies devise the revenue-sharing model for the creators.
Spark & Trouble will be geared up to get the lowdown on this!

Hot off the Wires 🔥

We're eavesdropping on the smartest minds in research. 🤫 Don't miss out on what they're cooking up! In this section, we dissect some of the juiciest tech research that holds the key to what's next in tech.⚡

You know that annoying feeling when you’re searching for something online but can’t find the right words? 🤔 Imagine if you could just doodle that tricky part and add a bit of text to find exactly what you’re looking for!

Well, guess what? The brilliant minds at IIT Jodhpur & Microsoft have made this a reality with a new “CSTBIR” (short for Composite Sketch+Text Based Image Retrieval) system. It’s a new problem formulation wherein you retrieve relevant images from natural scenes using a combination of doodles & complementary text.

So, what’s the big deal?

Until now, image search relied on text, similar images, or super-detailed sketches. But CSTBIR lets you search with messy doodles and keywords, making it way faster and easier to find what you're looking for.

How cool is that!?😎

Imagine you want to search for “a pair of markhors climbing cliffs on a sunny day” - you don’t know the word ‘markhor’, while the interaction ‘climbing cliffs on a sunny day’ is hard to draw.

Under the hood…

The team has shared this awesome CSTBIR Dataset (108k images + 562k sketches + 2M text queries) publicly, along with their SoTA Sketch+Text Network (STNet) model.

This impressive multimodal transformer model uses a Vision Transformer & CLIP’s text encoder to understand your hand-drawn sketch & text in the query respectively, while relying on the CLIP-ViT image encoder to embed scene images.

STNet Model Architecture & Training
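To make the retrieval idea concrete, here is a toy sketch of how a sketch embedding and a text embedding could be combined and matched against scene-image embeddings. Everything here is an assumption for illustration: the 3-d vectors stand in for real encoder outputs, and simple averaging stands in for STNet's actual fusion, which this newsletter does not spell out.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def fuse_query(sketch_emb, text_emb):
    """Combine the two query embeddings. Averaging is an assumption of this sketch."""
    s, t = normalize(sketch_emb), normalize(text_emb)
    return normalize([(a + b) / 2 for a, b in zip(s, t)])


def rank_images(query_emb, image_embs):
    """Return image ids sorted from most to least similar to the query."""
    scores = {
        image_id: sum(q * i for q, i in zip(query_emb, normalize(emb)))
        for image_id, emb in image_embs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings standing in for the outputs of the three encoders.
sketch_emb = [1.0, 0.0, 0.0]  # the doodle of a goat-like animal
text_emb = [0.0, 1.0, 0.0]    # "climbing cliffs on a sunny day"
image_embs = {
    "markhors_on_cliff": [0.9, 0.9, 0.1],
    "goat_in_barn": [1.0, 0.1, 0.2],
    "empty_cliff": [0.1, 1.0, 0.3],
}
print(rank_images(fuse_query(sketch_emb, text_emb), image_embs))
```

In this toy setup, the image that matches both the doodle and the text ends up ahead of images that match only one of them, which is the whole appeal of composite queries.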

During training, the model is optimized across 5 different objective functions jointly:

  • Contrastive Training - pulls together sketches and descriptions that are similar and pushes dissimilar ones apart

  • Object Classification (applied separately to the text & image encodings, which is how four bullets add up to five objectives) - the good old multiclass classification objective

  • Sketch-Guided Object Localization - finds the object in the scene that looks most like the sketch

  • Sketch Reconstruction - tries to recreate the original sketch from the scene image
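The objectives above are easier to digest with a small example. Below is a minimal sketch of a contrastive (InfoNCE-style) loss for a single query, plus a weighted sum that combines several objectives into one training signal. The temperature of 0.07, the equal weights and the placeholder loss values are our assumptions, not numbers from the paper.

```python
import math


def info_nce(similarities, positive_index, temperature=0.07):
    """Contrastive loss for one query: softmax cross-entropy over its similarities."""
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[positive_index]


def joint_loss(losses, weights=None):
    """Weighted sum of named objective values; equal weights are an assumption."""
    weights = weights or {name: 1.0 for name in losses}
    return sum(weights[name] * value for name, value in losses.items())


# Similarity of one sketch+text query to three scene images; index 0 is the match.
good = info_nce([0.9, 0.1, 0.2], positive_index=0)
bad = info_nce([0.1, 0.9, 0.2], positive_index=0)
print(good < bad)  # the query that sits closest to its true match gets the lower loss

total = joint_loss({
    "contrastive": good,
    "classification": 0.4,  # placeholder values, not results from the paper
    "localization": 0.3,
    "reconstruction": 0.2,
})
print(round(total, 3))
```

Minimizing the contrastive term is what "pulls together" matching pairs and "pushes apart" the rest, while the weighted sum lets all objectives shape the same model at once.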

Why could this be huge?

The paper clearly showcases how this technique outperforms several strong baselines. What’s most intriguing is the whopping winning margin on an open-category test set (object classes that are unseen during training), demonstrating its generalizability & robustness!

Comparing the performance of STNet against competitive baselines on an Open-Category test set of 750 samples containing 70 unseen object classes.

Well, Spark & Trouble find CSTBIR a game-changer! Imagine this:

  • You’re shopping on Myntra or Shein, and you find your dream dress just by sketching the neckline or sleeve design and describing the fabric or vibe of the dress. How cool would that be? 🛍️👗

  • On a more serious note, it could be a silver bullet for the police searching for missing people or identifying crime suspects from sketches & visual descriptors provided by witnesses.

It is evident that the future of search is multimodal, and CSTBIR-related integrations will only make it crazier (in a good way, of course 😋)! And since STNet is one of the first models for this task, we can’t wait to see what comes next!

Want to learn more? Check out the paper & code implementation.

Whatcha Got There?!🫣

Buckle up, tech fam! Every week, our dynamic duo “Spark” & “Trouble”😉 share some seriously cool learning resources we stumbled upon.

Spark’s Selections

😉 Trouble’s Tidbits

Spark 'n' Trouble Shenanigans 😜

We love memes! Here’s what AI thinks our favorite meme templates would look like, when zoomed out…

Thanks for reading 😊

See you next week with more mind-blowing tech insights 💻

Until then,
Stay Curious🧠 Stay Awesome🤩

PS: Do catch us on LinkedIn - Sandra & Tezan
