How Microsoft offers tailored AI Copilots for you
PLUS: Sketch + Text = Super Search
Howdy fellas! Greetings from our dynamic duo - Spark & Trouble!
Welcome to the inaugural edition of our newsletter, "The Vision, Debugged;"
April Fools' might be a joke, but our content here is the real deal.
Here's a sneak peek into this week's edition:
Decoding Copilot GPTs
Combining doodles & complementary text for search
Can AI feel the same emotions as humans now?
How will patterns and experiences evolve in a world shaped by AI?
Time to jump in!
PS: Got thoughts on our content? Share 'em through a quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.
Product Labs: Decoding "Copilot GPTs"
Copilot GPTs, one of the latest additions to Microsoft's suite of Copilot features, let you customize the assistant's behaviour by assigning it a role of your choice.
As of now, Microsoft offers 5 pre-defined GPTs:
Copilot - the OG one; technically, it is also a GPT
Designer
Fitness
Vacation Planner
Cooking Guide
An extra slice of cake with icing for Pro users!
Microsoft goes a step further for Copilot Pro users, allowing the creation of custom GPTs beyond the ones listed above.
Product Labs: Decoding the AI Matrix - Copilot GPTs (created by the authors)
PS: Find the two tech revolutionaries hidden in the Product Labs!
What's in it for you?
Copilot GPTs make life easier for lazy folks (like Spark) who are not willing to use their brain cells to write a neat, detailed prompt and then choose to blame the AI for subpar results.
It's very common for us to tweak our queries to get better answers, right? Copilot GPTs probe users for the details they need to understand the full context before generating spot-on answers. This goes a long way towards solving the problem of poorly articulated prompts.
To illustrate this, imagine our buddies Spark & Trouble have invited a friend over for dinner. They turn to Copilot (the default OG one) for help deciding the menu.
OG Copilot's response to "help me plan a dinner menu for 3 people"
Clearly, Copilot assumed an interest in Indian cuisine and did not factor in dietary preferences. Not very impressive.
Instead, if they ask "Cooking Assistant" the same question, here's what the interaction could look like...
Cooking Assistant's response to "help me plan a dinner menu for 3 people"
See the difference? This GPT dives into an insane level of detail, factoring in aspects such as cuisine, dietary preferences, and skill level before providing its recommendations. Just wow!
Pro Tip: With the Instacart plugin, you can get those ingredients in a snap!
Make your own GPT!
This one's for the Pros!
The Copilot GPT Builder is a neat interface that allows users to further configure the GPT beyond the instructions. The actual prompt used to create the GPT can be tweaked to specify any nitty-gritty details. You can also upload files that this custom GPT can leverage as a knowledge bank while chatting.
Check out this cool tutorial to understand better how to create custom GPTs.
The interface to "Configure" your custom GPT in Copilot GPT Builder
Want to go crazy and tell your Teacher GPT to teach physics concepts only in limericks? Now you can!
If you're proud of your creation (and have fallen prey to the IKEA effect), you can also share the GPT you created with a simple link.
What are some other cool applications?
Custom Business Applications: Just feed the GPT your biz docs, and it'll crunch numbers and chat with customers like a pro. Share it with your team to boost productivity and maintain consistency. Tired of doing the same tasks on a loop? Voila, now you have the perfect companion. Of course, proofreading is highly recommended!
Niche Expertise: Another way to leverage custom data is to supply very specific datasets for niche hobbies or research areas. Think an Astrophysics GPT could help write complex simulations & analyze patterns? Well, go ahead & try it!
Why is this intriguing?
The direct sparring partner for Copilot GPTs is ChatGPT's Build Your GPT. OpenAI restricts all customization to paid "Plus" users, while Microsoft opted for a tiered model: pre-defined GPTs for everyone and custom creations reserved for Pro users.
OpenAI has also rolled out a marketplace for creators to showcase their GPTs.
OpenAI's "Explore GPTs" Marketplace
What's cool is that creators can add links to their websites during customization, turning their GPTs into slick marketing tools. A shared revenue model is also being worked on under wraps.
Want to make a quick buck? Spin up a few custom GPTs and share 'em in the marketplace!
Will Microsoft also soon catch up and create a marketplace for its GPTs?
A more interesting question is how these companies devise the revenue-sharing model for the creators.
Spark & Trouble will be geared up to get the lowdown on this!
Hot off the Wires
We're eavesdropping on the smartest minds in research. Don't miss out on what they're cooking up! In this section, we dissect some of the juiciest tech research that holds the key to what's next in tech.
You know that annoying feeling when you're searching for something online but can't find the right words? Imagine if you could just doodle that tricky part and add a bit of text to find exactly what you're looking for!
Well, guess what? The brilliant minds at IIT Jodhpur & Microsoft have made this a reality with a new "CSTBIR" (short for Composite Sketch+Text Based Image Retrieval) system. It's a new problem formulation wherein you retrieve relevant images from natural scenes using a combination of doodles & complementary text.
So, what's the big deal?
Until now, image search relied on text, similar images, or super-detailed sketches. But CSTBIR lets you search with messy doodles and keywords, making it way faster and easier to find what you're looking for.
How cool is that!?
Imagine you want to search for "a pair of markhors climbing cliffs on a sunny day": you may not know the word "markhor", while the interaction "climbing cliffs on a sunny day" is hard to draw. A rough doodle of the animal plus the text covers both.
Under the hood...
The team has shared this awesome CSTBIR Dataset (108k images + 562k sketches + 2M text queries) publicly, along with their SoTA Sketch+Text Network (STNet) model.
This impressive multimodal transformer model uses a Vision Transformer and CLIP's text encoder to understand the hand-drawn sketch and the text in the query, respectively, while relying on the CLIP-ViT image encoder to embed scene images.
STNet Model Architecture & Training
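To make the retrieval idea concrete, here is a minimal sketch of ranking scene images against a sketch+text query. It is not the STNet implementation: the tiny hand-written vectors stand in for encoder outputs, and `fuse_query` (a plain average of the two normalized embeddings) is our own hypothetical stand-in for the multimodal transformer fusion described above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def fuse_query(sketch_emb, text_emb):
    # ASSUMPTION: averaging is a placeholder for illustration only;
    # the real model fuses the two modalities with a transformer.
    s, t = l2_normalize(sketch_emb), l2_normalize(text_emb)
    return [(x + y) / 2 for x, y in zip(s, t)]


def rank_images(sketch_emb, text_emb, image_embs):
    """Return (name, score) pairs sorted from best to worst match."""
    query = fuse_query(sketch_emb, text_emb)
    scored = [(name, cosine(query, emb)) for name, emb in image_embs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (made up): axis 0 ~ "looks like the doodle",
# axis 1 ~ "matches the text", axis 2 ~ unrelated content.
sketch = [1.0, 0.0, 0.0]
text = [0.0, 1.0, 0.0]
images = {
    "markhors_on_cliff": [0.9, 0.9, 0.1],
    "markhors_in_zoo": [1.0, 0.1, 0.0],
    "sunny_beach": [0.0, 0.2, 1.0],
}

ranking = rank_images(sketch, text, images)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup, the image that matches both the doodle and the text ranks above the one that matches only the doodle, which is the behaviour a composite query is meant to buy you.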
During training, the model is optimized across 5 different objective functions jointly:
Contrastive Training - pulls together sketches and descriptions that are similar and pushes dissimilar ones apart
Object Classification (w.r.t. text & image encodings) - the good old multiclass classification objective
Sketch-Guided Object Localization - finds the object in the scene that looks most like the sketch
Sketch Reconstruction - tries to recreate the original sketch from the scene image
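For the curious, the contrastive objective in the list above can be sketched in a few lines. This is an InfoNCE-style loss, a common choice for contrastive training; the paper's exact formulation may differ. The function name `info_nce`, the `temperature` value, and the pairing of query i with image i are our assumptions for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(query_embs, image_embs, temperature=0.1):
    """Mean cross-entropy loss where query i should match image i.

    ASSUMPTION: an InfoNCE-style loss over a batch of paired
    embeddings; not taken from the STNet code.
    """
    losses = []
    for i, query in enumerate(query_embs):
        logits = [dot(query, img) / temperature for img in image_embs]
        # Numerically stable log-sum-exp over all candidate images.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability assigned to the matching image.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


queries = [[1.0, 0.0], [0.0, 1.0]]

# Each query lines up with its own image: the loss is close to zero.
aligned = info_nce(queries, [[1.0, 0.0], [0.0, 1.0]])

# The images are swapped, so each query points at the wrong one: high loss.
shuffled = info_nce(queries, [[0.0, 1.0], [1.0, 0.0]])

print(f"aligned:  {aligned:.5f}")
print(f"shuffled: {shuffled:.5f}")
```

Minimizing this loss pulls matching pairs together and pushes mismatched pairs apart, which is the behaviour the "Contrastive Training" bullet describes.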
Why could this be huge?
The paper clearly showcases how this technique outperforms several strong baselines. What's most intriguing is the whopping winning margin on an open-category test set (object classes that are unseen during training), demonstrating its generalizability & robustness!
Comparing the performance of STNet against competitive baselines on an Open-Category test set of 750 samples containing 70 unseen object classes.
Well, Spark & Trouble find CSTBIR a game-changer! Imagine this:
You're shopping on Myntra or Shein, and you find your dream dress just by sketching the neckline or sleeve design and describing the fabric or vibe of the dress. How cool would that be?
On a more serious note, it could be a silver bullet for the police searching for missing people or identifying crime suspects from sketches & visual descriptors provided by witnesses.
It is evident that the future of search is multimodal, and CSTBIR-related integrations will only make it crazier (in a good way, of course)! And since STNet is one of the first models for this task, we can't wait to see what comes next!
Want to learn more? Check out the paper & code implementation.
Whatcha Got There?!
Buckle up, tech fam! Every week, our dynamic duo "Spark" & "Trouble" share some seriously cool learning resources we stumbled upon.
Spark's Selections
Trouble's Tidbits
Spark 'n' Trouble Shenanigans
We love memes! Here's what AI thinks our favorite meme templates would look like when zoomed out...
Thanks for reading! Until then,