How Microsoft offers tailored AI Copilots for you

PLUS: Sketch + Text = Super Search

Howdy fellas! Greetings from our dynamic duo - Spark & Trouble!

Welcome to the inaugural edition of our newsletter, “The Vision, Debugged;”
April Fools' might be a joke, but our content here is the real deal 😉

Here’s a sneak peek into this week’s edition 👀

  • 🤖 Decoding Copilot GPTs

  • 🖼️ Combining doodles & complementary text for search

  • ❣️ Can AI feel the same emotions as humans now?

  • 🖌️ How will patterns and experiences evolve in a world shaped by AI?

Time to jump in! 😄

PS: Got thoughts on our content? Share 'em through a quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.

Product Labs 🔬: Decoding “Copilot GPTs”

Copilot GPTs, one of the latest additions to Microsoft’s suite of Copilot features, allow customization of the behaviour of the GPT by assigning it roles of your choice.

As of now, Microsoft offers 5 pre-defined GPTs:
🤖 Copilot - The OG one; technically, it is also a GPT
🎨 Designer
🏋️‍♂️ Fitness
🌴 Vacation Planner
🍳 Cooking Assistant

An extra slice of cake with icing for Pro users!
Microsoft goes a step further for Copilot Pro users, allowing the creation of custom GPTs beyond the ones listed above.

Product Labs: Decoding the AI Matrix - Copilot GPTs (created by the authors)

PS: Find the two tech revolutionaries hidden in the Product Labs!

What’s in it for you?

Copilot GPTs make life easier for lazy folks (like Spark 😛) who are not willing to use their brain cells to write a neat, detailed prompt and then choose to blame the AI for subpar results 🙊.

It’s very common for us to tweak our queries to get better answers, right? Copilot GPTs probe users for the details needed to understand the full context before generating spot-on answers, which goes a long way toward solving the problem of poorly articulated prompts.

To illustrate this, imagine our buddies Spark & Trouble have invited a friend over for dinner. They turn to Copilot (the default OG one) for help deciding the menu.

OG Copilot’s response to “help me plan a dinner menu for 3 people”

Clearly, Copilot assumed an interest in Indian cuisine & did not factor in dietary preferences. Not very impressive 😕

Instead, if they ask “Cooking Assistant” the same question, here’s what the interaction could look like…

Cooking Assistant’s response to “help me plan a dinner menu for 3 people”

See the difference? The insane level of detail this GPT dives into, factoring in aspects such as cuisine, dietary preferences, and skill level before providing its recommendations - just looking like a wow! 🤩

Pro Tip: With the Instacart plugin, you can get those ingredients in a snap!

Make your own GPT!

This one’s for the Pros!

Copilot GPT Builder is a neat interface that lets users configure the GPT beyond the initial instructions. The actual prompt used to create the GPT can be tweaked further to specify any nitty-gritty details. You can also upload files that this custom GPT can leverage as a knowledge bank while chatting.

Check out this cool tutorial to understand better how to create custom GPTs.

The interface to “Configure” your custom GPT in Copilot GPT Builder

Want to go crazy and tell your Teacher GPT to teach physics concepts only in limericks? Now you can!

If you’re proud of your creation and have fallen prey to the IKEA effect, you can also share the GPT you created with a simple link.

What are some other cool applications?

  • Custom Business Applications: Just feed GPT your biz docs, and it’ll crunch numbers and chat with customers like a pro. Share it with your team to boost productivity and maintain consistency. Tired of doing the same tasks on a loop? Voila, now you have the perfect companion. Of course, proofreading is highly recommended!

  • Niche Expertise: Another way to leverage customization on your own data is to provide very specific datasets covering niche hobbies or research areas. Think an Astrophysics GPT could help write complex simulations & analyze patterns? Well, go ahead & try it 😉

Why is this intriguing?

The direct sparring partner for Copilot GPTs is ChatGPT’s Build Your GPT. OpenAI offers customization of any kind only to paid “Plus” users, while Microsoft opted for a tiered model: pre-defined GPTs for all, with custom creations reserved for Pro users.

OpenAI has also rolled out a marketplace, for creators to showcase their GPTs.

OpenAI’s “Explore GPTs” Marketplace

What's cool is that creators can add links to their website during customization, turning their GPTs into slick marketing tools. There is also a shared revenue model being worked on under wraps.

Want to make a quick buck? Spin up a few custom GPTs and share ‘em in the marketplace! 🚀💸

Will Microsoft also soon catch up and create a marketplace for its GPTs?
A more interesting question is how these companies devise the revenue-sharing model for the creators.
Spark & Trouble will be geared up to get the lowdown on this!

Hot off the Wires 🔥

We're eavesdropping on the smartest minds in research. 🤫 Don't miss out on what they're cooking up! In this section, we dissect some of the juiciest tech research that holds the key to what's next in tech. ⚡

You know that annoying feeling when you’re searching for something online but can’t find the right words? 🤔 Imagine if you could just doodle that tricky part and add a bit of text to find exactly what you’re looking for!

Well, guess what? The brilliant minds at IIT Jodhpur & Microsoft have made this a reality with a new “CSTBIR” (short for Composite Sketch+Text Based Image Retrieval) system. It’s a new problem formulation wherein you retrieve relevant images from natural scenes using a combination of doodles & complementary text.

So, what’s the big deal?

Until now, image search relied on text, similar images, or super-detailed sketches. But CSTBIR lets you search with messy doodles and keywords, making it way faster and easier to find what you're looking for.

How cool is that!? 😎

Imagine you want to search for “a pair of markhors climbing cliffs on a sunny day” - you don’t know the word ‘markhor’, while the interaction ‘climbing cliffs on a sunny day’ is hard to draw.

Under the hood…

The team has shared this awesome CSTBIR Dataset (108k images + 562k sketches + 2M text queries) publicly, along with their SoTA Sketch+Text Network (STNet) model.

This impressive multimodal transformer model uses a Vision Transformer & CLIP’s text encoder to understand your hand-drawn sketch & text in the query respectively, while relying on the CLIP-ViT image encoder to embed scene images.

STNet Model Architecture & Training
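
For the curious, here is a minimal sketch of what "searching with a doodle plus text" boils down to once every input has been turned into an embedding. Heads up: the tiny 3-number vectors, the image names, and the `fuse` function (a plain average of the two query vectors) are all made up for illustration. STNet learns its own way of combining sketch and text, and this is not it.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def fuse(sketch_vec, text_vec):
    """Hypothetical fusion (an assumption): average the unit vectors, then re-normalize."""
    s, t = normalize(sketch_vec), normalize(text_vec)
    return normalize([(a + b) / 2 for a, b in zip(s, t)])

def rank_images(query_vec, image_vecs):
    """Return image ids sorted by cosine similarity to the query, best match first."""
    scores = {
        image_id: sum(q * x for q, x in zip(query_vec, normalize(vec)))
        for image_id, vec in image_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings standing in for real encoder outputs (invented values).
sketch = [1.0, 0.0, 0.0]  # doodle of the animal you cannot name
text = [0.0, 1.0, 0.0]    # "climbing cliffs on a sunny day"
images = {
    "goats_on_cliff": [0.9, 0.9, 0.1],
    "goats_in_field": [1.0, 0.1, 0.2],
    "empty_cliff": [0.0, 1.0, 0.3],
    "city_street": [0.0, 0.1, 1.0],
}

composite_ranking = rank_images(fuse(sketch, text), images)
sketch_only_ranking = rank_images(normalize(sketch), images)
print(composite_ranking)
print(sketch_only_ranking)
```

In this toy setup, the doodle alone prefers the image that merely contains the animal, while the doodle plus text pulls up the image that matches both the object and the scene. That is the whole pitch of composite queries in miniature.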

During training, the model is optimized across 5 different objective functions jointly:

  • Contrastive Training - pulls together sketches and descriptions that are similar and pushes dissimilar ones apart

  • Object Classification (one objective each w.r.t. text & image encodings) - the good old multiclass classification objective

  • Sketch-Guided Object Localization - finds the object in the scene that looks most like the sketch

  • Sketch Reconstruction - tries to recreate the original sketch from the scene image
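
To make "optimized jointly" a little more concrete, here is a rough sketch of a contrastive loss and of adding several objectives into one number to minimize. Everything here is an assumption for illustration: the InfoNCE-style formula, the temperature of 0.1, the equal weights, and the placeholder loss values are ours, not taken from the paper.

```python
import math

def info_nce(sim, temperature=0.1):
    """Contrastive loss over a square similarity matrix.

    sim[i][j] is the similarity between query i and image j;
    the matching pairs sit on the diagonal.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        total += log_denominator - logits[i]  # -log softmax of the true pair
    return total / len(sim)

def joint_loss(losses, weights=None):
    """Weighted sum of the individual objectives (equal weights if none given)."""
    weights = weights or {name: 1.0 for name in losses}
    return sum(weights[name] * value for name, value in losses.items())

# Matching pairs score high on the diagonal vs. the wrong pairs scoring high.
aligned = [[0.9, 0.1], [0.2, 0.8]]
confused = [[0.1, 0.9], [0.8, 0.2]]
aligned_loss = info_nce(aligned)
confused_loss = info_nce(confused)
print(aligned_loss, confused_loss)

# Placeholder values for the other objectives (invented, not results).
losses = {
    "contrastive": aligned_loss,
    "classification_text": 0.4,
    "classification_image": 0.5,
    "localization": 0.3,
    "sketch_reconstruction": 0.2,
}
total = joint_loss(losses)
print(total)
```

The contrastive term is small when each query is most similar to its own image and large when it is not, which is the "pull together, push apart" behaviour described above. Training on the summed loss nudges the model to get better at all the objectives at once.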

Why could this be huge?

The paper clearly showcases how this technique outperforms several strong baselines. What’s most intriguing is the whopping winning margin on an open-category test set (object classes that are unseen during training), demonstrating its generalizability & robustness!

Comparing the performance of STNet against competitive baselines on an Open-Category test set of 750 samples containing 70 unseen object classes.

Well, Spark & Trouble find CSTBIR a game-changer! Imagine this:

  • You’re shopping on Myntra or Shein, and you find your dream dress just by sketching the neckline or sleeve design and describing the fabric or vibe of the dress. How cool would that be? 🛍️👗

  • On a more serious note, it could be a silver bullet for the police searching for missing people or identifying crime suspects from sketches & visual descriptors provided by witnesses.

It is evident that the future of search is multimodal, and CSTBIR-related integrations will only make it crazier (in a good way, of course 😋)! And since STNet is one of the first models for this task, we can’t wait to see what comes next!

Want to learn more? Check out the paper & code implementation.

Whatcha Got There?! 🫣

Buckle up, tech fam! Every week, our dynamic duo “Spark” ✨ & “Trouble” 😉 share some seriously cool learning resources we stumbled upon.

✨ Spark’s Selections

😉 Trouble’s Tidbits

Spark 'n' Trouble Shenanigans 😜

We love memes! Here’s what AI thinks our favorite meme templates would look like when zoomed out…

Thanks for reading 😊

See you next week with more mind-blowing tech insights 💻

Until then,
Stay Curious 🧠 Stay Awesome 🤩

PS: Do catch us on LinkedIn - Sandra & Tezan
