Unlike lexical search, semantic search attempts to understand the intent and contextual meaning of a query.
Semantic search can be a powerful tool within your SaaS for providing delightful, AI-augmented search experiences.
In this tutorial, we will learn how to add semantic search to your Makerkit SaaS using Supabase Vector. Our use case consists of providing an enhanced search experience and automatic summarization for feedback submissions from users.
What is Supabase Vector?
Supabase Vector is a set of utilities and practices released by the Supabase team to make it easier to build AI applications on top of Supabase. As you may know, Supabase uses Postgres to power their database: thanks to Postgres extensions, more specifically PgVector, Supabase is a prime candidate to build AI applications with.
Semantic search use cases
What can we build that makes use of semantic search? Here are a few examples:
- enhanced search experiences (for text, images, etc.)
- customer support chatbots
- product search and recommendations
- content moderation
- automatic tagging
- and more!
Final Result
Here is the final result of what we will build in this tutorial.
The Feedback Submission Form
The Makerkit Popup plugin (whose creation this article led to) allows you to collect feedback from your users. Here is a small demo:
The Submissions summary
We will generate a daily summary of the feedback submissions using the summarization pipeline from @xenova/transformers. Here is how it looks:
The banner "Today's Feedback" is generated using the summarization pipeline from @xenova/transformers.
The similar submissions list
Finally, the list of submissions that are semantically similar to the one being viewed is generated using the match_feedback_submissions function from Supabase Vector.
Clicking on a similar submission will take you to the individual submission page.
The above is all generated using Supabase Vector and Hugging Face. Let's see how we can build it!
Adding Supabase Vector to your app
Our use-case will be to add semantic search to the feedback submissions from users. We will try to achieve the following:
- Creating Vectors with Supabase: When a user submits feedback, we will automatically create a vector representation (an embedding) of the feedback text and store it using Supabase Vector. This captures the intent and contextual meaning of the feedback and allows us to find semantically similar feedback in the database.
- Searching submissions semantically: We will then use this vector representation to find similar feedback in the database.
- Reporting: We can then generate a daily report based on the day's submissions that shows at a glance what users are saying about our product.
Preface: you don't need to be a Makerkit customer to follow along
Treat the UI components as implementation details, and focus on the code that interacts with Supabase. You will easily be able to adapt the code to your own use case. If you're a Makerkit customer, just get your popcorn ready and follow along! 🍿
Creating the Database Table
The first thing we need to do is to create a new table in our database to store the feedback. We will use the following SQL query to create the table:
create extension if not exists vector with schema extensions;

create type feedback_type as ENUM ('question', 'bug', 'feedback');

create table feedback_submissions (
  id serial primary key,
  user_id uuid references public.users(id) on delete set null,
  type feedback_type not null,
  screenshot_url text,
  text text not null,
  embedding vector (384),
  device_info jsonb,
  metadata jsonb,
  screen_name text,
  email text,
  created_at timestamptz not null default now()
);

alter table feedback_submissions enable row level security;

create or replace function match_feedback_submissions (
  query_embedding vector(384),
  match_threshold float,
  match_count int
)
returns table (
  id bigint,
  content text,
  similarity float
)
language sql stable
as $$
  select
    feedback_submissions.id,
    feedback_submissions.text,
    1 - (feedback_submissions.embedding <=> query_embedding) as similarity
  from feedback_submissions
  where 1 - (feedback_submissions.embedding <=> query_embedding) > match_threshold
  order by similarity desc
  limit match_count;
$$;
The table has the following columns:
- id: the primary key of the table
- user_id: the user who submitted the feedback (if signed in)
- type: the type of feedback (question, bug, feedback)
- screenshot_url: the URL of the screenshot (if any)
- text: the text of the feedback
- device_info: the device information (if any)
- metadata: the metadata of the feedback (if any)
- screen_name: the screen name of the user (if any)
- email: the email of the user (if the type is question)
- created_at: the date the feedback was submitted, which is automatically set to the current date
- embedding: the vector representation of the feedback text
Finally, row level security is turned on for the feedback_submissions table. Since we define no policies, the data in the table is protected: we will only access it using the Supabase Service Role key, which bypasses row level security.
The match_feedback_submissions function uses the pgVector extension to compare the stored vector representations of the feedback text with a query embedding: the <=> operator computes the cosine distance between two vectors, so 1 minus that distance is their cosine similarity. We can use this to retrieve similar feedback from the database: this opens up various interesting applications, such as searching for similar feedback, asking questions to your own chatbot, and more.
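To make the similarity score more concrete, here is a minimal TypeScript sketch of the value that the expression 1 - (embedding <=> query_embedding) yields. The function name cosineSimilarity is ours and not part of pgVector or Supabase; in the tutorial, the calculation runs inside Postgres.

```typescript
// Cosine similarity between two vectors of equal length.
// pgvector's `<=>` operator returns the cosine *distance*, i.e. 1 - this value.
// NOTE: `cosineSimilarity` is a hypothetical helper used only for illustration.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }

  let dot = 0;
  let normA = 0;
  let normB = 0;

  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  // This sketch returns 0 when either vector has no direction (all zeros).
  if (normA === 0 || normB === 0) {
    return 0;
  }

  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A score close to 1 means the two texts point in nearly the same direction in the embedding space, while a score close to 0 means they are unrelated.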
You may need to update the schema
The schema above requires a table called public.users to be present in the database. If you don't have one, you can use the auth.users table provided by Supabase for the relationship.
Creating the Migration
Next, we create the migration file to create the table. We will use the following command to create the migration file:
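supabase migration new feedback
The command above will create a new, timestamp-prefixed migration file in the supabase/migrations folder, whose name ends with feedback. We will then add the SQL query above to the migration file.
When you run supabase start, the migration will be automatically applied to your database. If your local instance is already running, you can re-apply the migrations with supabase db reset (note that this resets your local data).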
Using Next.js API Routes to create a Feedback Submission
Throughout this tutorial, we will use Next.js to create and display feedback submissions: Server Actions handle the form submission, while Server Components fetch the data.
We will abstract functions enough so you can adapt them to any framework (or to plain Next.js API Routes), but the examples we show use Next.js Server Actions and Server Components for handling the feedback submissions and data fetching.
The UI is abstracted away from the tutorial for simplicity - you can use your own UI to display the feedback submission form and the feedback submissions.
Required Packages
We can use the package @xenova/transformers to run pipelines using the Hugging Face models to add AI capabilities to our application.
Install it using the following command:
npm i @xenova/transformers qs zod
If you use Next.js, remember to add the following change to the experimental.serverComponentsExternalPackages property in the next.config.js file:
const nextConfig = {
  experimental: {
    serverActions: true,
    serverComponentsExternalPackages: ['sharp', 'onnxruntime-node'],
  },
};
This is required for the @xenova/transformers package to work properly in Next.js.
Creating a Server Action to Create a Feedback Submission
Next.js Server Actions are extremely handy for submitting forms in Next.js. We will use a Server Action to create a feedback submission.
The Data Model of a Feedback Submission
We assume you have a form that allows users to submit feedback. The form should have the following fields:
- type: the type of feedback (question, bug, feedback)
- text: the text of the feedback
- metadata: the metadata of the feedback (if any)
- device_info: the device information (if any)
- screen_name: the screen name of the user (if any)
- email: the email of the user (if the type is question)
You can simplify the form as much as needed
NB: You can simplify the above as much as needed. In fact, you only need the text field to create a feedback submission if you need it to be that simple.
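As a rough illustration of the data model above, the fields can be written down as a TypeScript type together with a tiny validation helper. This is only a sketch: the names FeedbackFormData and validateFeedback are hypothetical, and the non-empty check on text is our own assumption. The tutorial itself validates the submission with Zod.

```typescript
// The three feedback types accepted by the form.
type FeedbackType = 'question' | 'bug' | 'feedback';

// Hypothetical shape of a submission, mirroring the fields listed above.
interface FeedbackFormData {
  type: FeedbackType;
  text: string;
  metadata?: unknown;
  device_info?: unknown;
  screen_name?: string;
  email?: string;
}

// Returns a list of human-readable problems; an empty list means the data is valid.
function validateFeedback(data: FeedbackFormData): string[] {
  const problems: string[] = [];

  // Assumption of this sketch: an empty text is not a useful submission.
  if (data.text.trim().length === 0) {
    problems.push('text is required');
  }

  // Questions need a way to reach the user back, so the email becomes mandatory.
  if (data.type === 'question' && !data.email) {
    problems.push('email is required when the type is "question"');
  }

  return problems;
}
```

Returning a list of problems rather than throwing makes it straightforward to display every issue next to the form at once.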
Creating the Server Action for the Feedback Submission
Let's walk through how we create the Server Action for the feedback submission:
- We create a schema using Zod to validate the data coming from the form submission
- We create a submitFeedbackAction function that will be called when the form is submitted (NB: the use server directive is required for Server Actions)
- We create a createEmbedding function that will be used to create the vector representation of the feedback text
- We insert the feedback submission into the database using the Supabase admin client
NB: We assume you have created a Supabase SDK Action client. We import it as getSupabaseServerActionClient.
We also use the qs library: if you use Next.js Server Actions as a form action, you may need this library fairly often to send along nested objects or arrays.
'use server';

import { z } from 'zod';
import qs from 'qs';
import getSupabaseServerActionClient from '~/core/supabase/action-client';
import { pipeline } from '@xenova/transformers';

// this is the interface returned by the server action
interface FormStatus {
  success: boolean | undefined;
}

// this is the Zod schema to validate a form submission
const submitFeedbackSchema = z
  .object({
    type: z.enum(['bug', 'feedback', 'question']),
    text: z.string(),
    metadata: z.unknown().optional(),
    screen_name: z.string().optional(),
    device_info: z.unknown().optional(),
    email: z.string().optional(),
  })
  .refine((data) => {
    // if the type is question, we require the email to be present
    return !(data.type === 'question' && !data.email);
  });

// this is the function that will be called when the form is submitted
export async function submitFeedbackAction(
  _: FormStatus,
  data: FormData,
) {
  // we parse the FormData object using "qs"
  // so we can turn nested objects into JSON
  const parsed = qs.parse(
    new URLSearchParams(data as unknown as Record<string, string>).toString(),
  );

  // we validate the data coming from the form submission
  const body = await submitFeedbackSchema.parseAsync(parsed);

  // we use the admin client (eg. using the service role key)
  // so that we can bypass the row level security and insert the feedback
  const adminClient = getSupabaseServerActionClient({
    admin: true,
  });

  // we verify the user is signed in
  // if yes, we can add the user ID to table and record who submitted the feedback
  // otherwise, we leave the user ID as null in the table and record the feedback as anonymous
  const user = await getSupabaseServerActionClient().auth.getUser();
  const userId = user.data?.user?.id ?? null;

  console.info(
    {
      userId,
    },
    `Submitting feedback`,
  );

  // we generate the embedding of the feedback text
  const embedding = await createEmbedding(body.text);

  // we insert the feedback submission into the database
  const table = adminClient.from('feedback_submissions');

  const response = await table.insert({
    type: body.type,
    embedding,
    text: body.text,
    metadata: body.metadata,
    screen_name: body.screen_name,
    device_info: body.device_info,
    email: body.email,
    user_id: userId,
  });

  // check if there was an error
  if (response.error) {
    console.error(
      {
        error: response.error,
      },
      `Error submitting feedback`,
    );

    return {
      success: false,
    };
  }

  // all good! 🎉 we log the submission success
  // and return a success response
  console.info(
    {
      userId,
    },
    `Feedback successfully submitted`,
  );

  return {
    success: true,
  };
}

// this is the function that will be used to create
// the vector representation of the feedback text
async function createEmbedding(text: string) {
  const generateEmbedding = await pipeline(
    'feature-extraction',
    'Supabase/gte-small',
  );

  const output = await generateEmbedding(text, {
    pooling: 'mean',
    normalize: true,
  });

  return Array.from(output.data);
}
Assuming you're using the server action in a form action, it would look like the below:
import { experimental_useFormState as useFormState } from 'react-dom';

function FeedbackForm() {
  const [status, formAction] = useFormState(submitFeedbackAction, {
    success: undefined,
  });

  // we pass the wrapped "formAction" (not the raw Server Action) to the form
  // so that "status" is updated with the result of the submission
  return <form action={formAction}>{/* ... */}</form>;
}
The useFormState hook is a new experimental React.js hook for handling the state of a form submission with Server Actions.
What does the "qs" library do?
The qs library is a handy utility that parses a URL-encoded query string into a (possibly nested) JavaScript object. We first serialize the FormData object into a query string, and then use qs to turn it into an object that we can validate and use to create the feedback submission.
Checking that the submissions make it to the database
If you submit feedback, you should see it in the database. To verify, navigate to your local Supabase Studio instance, locate the feedback_submissions table, and check that it contains the feedback you submitted.
Focusing on the Embeddings Generation
With the gte-small model published by Supabase and the Transformers.js library, we can generate embeddings without needing to use the OpenAI embeddings (as is commonly done). This is a huge advantage, as we can generate embeddings without needing to call an external API.
We use the function createEmbedding to generate the vector representation of the feedback text. We use the @xenova/transformers package to generate the embeddings.
async function createEmbedding(text: string) {
  const generateEmbedding = await pipeline(
    'feature-extraction',
    'Supabase/gte-small',
  );

  const output = await generateEmbedding(text, {
    pooling: 'mean',
    normalize: true,
  });

  return Array.from(output.data);
}
As a result, we can store the vector representation of the submission text in the embedding column of the database. Thanks to pgVector, we can then use the match_feedback_submissions function to find similar feedback in the database.
This opens up interesting use-cases such as:
- Searching for similar feedbacks from the database
- Adding semantic search from the UI (for example, using a chatbot)
And more exciting stuff using @xenova/transformers, such as generating a daily report of the feedback submissions using the summarization pipeline, which we will implement when fetching the submissions.
Exciting, right? 🎉
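To demystify the pooling: 'mean' and normalize: true options passed in createEmbedding, here is a simplified sketch of what they conceptually do: the model produces one vector per token, these are averaged into a single vector, and the result is scaled to unit length. The function meanPoolAndNormalize is hypothetical and ignores details such as attention masks, so treat it as an illustration rather than the library's implementation.

```typescript
// Average the per-token vectors into a single vector ("mean pooling"),
// then scale it to unit length (L2 normalization).
// NOTE: hypothetical helper for illustration; the real pipeline does this internally.
function meanPoolAndNormalize(tokenVectors: number[][]): number[] {
  if (tokenVectors.length === 0) {
    throw new Error('Expected at least one token vector');
  }

  const dimensions = tokenVectors[0].length;
  const pooled = new Array<number>(dimensions).fill(0);

  // Mean pooling: each dimension becomes the average over all tokens.
  for (const vector of tokenVectors) {
    for (let i = 0; i < dimensions; i++) {
      pooled[i] += vector[i] / tokenVectors.length;
    }
  }

  // L2 normalization: divide by the vector's length so the result has norm 1.
  const norm = Math.sqrt(pooled.reduce((sum, value) => sum + value * value, 0));

  return norm === 0 ? pooled : pooled.map((value) => value / norm);
}
```

Because the stored vectors have unit length, comparing two of them by cosine similarity only depends on the angle between them, not on the length of the original text.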
Querying the Feedback Submissions
Now that we have created the feedback submission, we can query the feedback submissions from the database.
To do so, we use the Supabase Client. Additionally, we will use the Transformers.js library to generate a summary of the feedback submissions.
Let's get started!
Creating the Query to Fetch the Feedback Submissions
Since we will reuse the same selection for fetching a list of feedback submissions and for fetching a single feedback submission, we define the selected columns once as a constant.
const QUERY = `
  id,
  type,
  text,
  metadata,
  embedding,
  createdAt: created_at,
  userId: user_id,
  screenName: screen_name,
  deviceInfo: device_info
`;
NB: If you have removed some of the properties, remove them from the query above too.
Fetching a paginated list of feedback submissions using the Supabase Client
We will use the Supabase Client to fetch a paginated list of feedback submissions from the database.
import { SupabaseClient } from '@supabase/supabase-js';
import { Database } from '~/database.types';

const QUERY = `
  id,
  type,
  text,
  metadata,
  embedding,
  createdAt: created_at,
  userId: user_id,
  screenName: screen_name,
  deviceInfo: device_info
`;

export async function getFeedbackSubmissions(
  client: SupabaseClient<Database>,
  params: {
    query?: string;
    page: number;
    perPage: number;
  },
) {
  const startOffset = (params.page - 1) * params.perPage;

  // "range" is inclusive on both ends, so the last index of the page
  // is the start offset plus the page size, minus one
  const endOffset = startOffset + params.perPage - 1;

  let query = client
    .from('feedback_submissions')
    .select(QUERY, {
      count: 'exact',
    })
    .order('created_at', { ascending: false })
    .range(startOffset, endOffset);

  if (params.query) {
    query = query.textSearch('text', `${params.query}`);
  }

  return query;
}
The above query accepts the following parameters:
- client: the Supabase Client, which you would normally inject into the query from the Server Component or API Route
- params: the parameters of the query, which are:
  - query: the query to search for (if any)
  - page: the page number
  - perPage: the number of items per page
In this function, we use the regular full-text search (through textSearch) to search for the query in the text column of the table. We also use the range function to paginate the results.
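Since the range function of the Supabase Client treats both bounds as inclusive, zero-based indexes, the arithmetic for converting a page number into a range is easy to get off by one. Here is a small sketch of that conversion; the helper pageToRange is a hypothetical name and not part of the Supabase SDK.

```typescript
// Converts a 1-based page number into the inclusive, 0-based bounds
// expected by a `range(from, to)` call.
// NOTE: `pageToRange` is a hypothetical helper used only for illustration.
function pageToRange(
  page: number,
  perPage: number,
): { from: number; to: number } {
  // Guard against page 0, negative pages or fractional pages.
  const safePage = Math.max(1, Math.floor(page));

  const from = (safePage - 1) * perPage;
  const to = from + perPage - 1;

  return { from, to };
}
```

For instance, with 8 items per page, the first page covers the indexes 0 to 7 and the second page covers 8 to 15.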
What about semantic search?
We will add semantic search later on when displaying the feedback submissions similar to the one the user is viewing.
Loading Submissions within a Server Component
When using Server Components, we can load the feedback submissions within the Server Component itself.
interface FeedbackSubmissionsPageSearchParams {
  page?: number;
  query?: string;
  type?: FeedbackSubmission['type'];
}

async function FeedbackSubmissionsPage(
  { searchParams }: { searchParams: FeedbackSubmissionsPageSearchParams }
) {
  const { submissions, count, perPage, page } =
    await loadFeedbackSubmissions(searchParams);

  // ...
  // iterate over the submissions and display them
  // count, page and perPage are useful for pagination
}

async function loadFeedbackSubmissions(
  params: FeedbackSubmissionsPageSearchParams,
) {
  // we use the Admin Client to bypass the row level security
  const adminClient = getSupabaseServerClient({
    admin: true,
  });

  // we define some pagination parameters
  // perPage is the number of items per page
  // page is the page number (or 1 if not present)
  // NB: search params arrive as strings at runtime, so we coerce the page to a number
  const perPage = 8;
  const page = Number(params.page ?? 1);

  // we use the function we created above to fetch the feedback submissions
  const submissionsResponse = await getFeedbackSubmissions(adminClient, {
    query: params.query,
    page,
    perPage,
  });

  // we check if there was an error
  if (submissionsResponse.error) {
    throw submissionsResponse.error;
  }

  const data = submissionsResponse.data;

  // we return the submissions, the count, the perPage and the page
  return {
    submissions: data,
    count: submissionsResponse.count,
    perPage,
    page,
  };
}
Et voila! 🎉 Our Server Component can now load the feedback submissions from the database. Additionally, we provide some pagination parameters that we can use to paginate the results.
To reload the submissions, it's as easy as updating the search parameters, which will in turn re-fetch the submissions from the Server Component:
- the page parameter is used to paginate the results
- the query parameter is used to search for a query in the feedback submissions
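Since search parameters come from the URL, their values are strings (or missing altogether) at runtime, and users can type anything into the address bar. Below is a minimal sketch of a defensive parser for the page parameter; parsePageParam is a hypothetical helper, and falling back to the first page is our own design choice.

```typescript
// Parses the "page" search parameter into a positive integer.
// Falls back to 1 when the value is missing or not a valid page number.
// NOTE: `parsePageParam` is a hypothetical helper used only for illustration.
function parsePageParam(value: string | string[] | undefined): number {
  // A repeated parameter (?page=1&page=2) may arrive as an array: use the first value.
  const raw = Array.isArray(value) ? value[0] : value;
  const page = Number.parseInt(raw ?? '', 10);

  return Number.isInteger(page) && page >= 1 ? page : 1;
}
```

Falling back to the first page keeps the page usable even when the URL has been tampered with, instead of surfacing a database error.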
Creating a Daily Summary Report of Feedback Submissions
Now that we have a list of paginated feedback submissions, we can create a daily summary report of the feedback submissions.
To do so, we will fetch all the submissions made in the past day (i.e. from yesterday to today) from the DB, and generate a summary of them using the summarization pipeline from @xenova/transformers.
Let's see how it's done!
Creating the Query to Fetch the Feedback Submissions
We can define the query in the same file where we defined the query for fetching the feedback submissions. We name this function getSubmissionsSummary.
This function will return the feedback submissions made in the past day (i.e. from yesterday to today). We will pass a minDate and a maxDate parameter to the query to fetch the feedback submissions.
export async function getSubmissionsSummary(
  client: SupabaseClient<Database>,
  params: {
    // ISO 8601 date strings delimiting the time window (inclusive)
    minDate: string;
    maxDate: string;
  },
) {
  return client
    .from('feedback_submissions')
    .select(QUERY)
    .gte('created_at', params.minDate)
    .lte('created_at', params.maxDate);
}
The above query accepts the following parameters:
- client: the Supabase Client, which you would normally inject into the query from the Server Component or API Route
- params: the parameters of the query, which are:
  - minDate: the starting date of the feedback submissions to fetch
  - maxDate: the ending date of the feedback submissions to fetch
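As an illustration of the values these two parameters may hold, here is a dependency-free sketch that computes the time window of the past day as ISO strings. The helper getLastDayWindow is hypothetical; it subtracts exactly 24 hours, which may differ slightly from calendar-based helpers around daylight saving changes.

```typescript
const ONE_DAY_IN_MS = 24 * 60 * 60 * 1000;

// Returns the ISO 8601 bounds of the 24 hours preceding `now`.
// NOTE: `getLastDayWindow` is a hypothetical helper used only for illustration.
function getLastDayWindow(
  now: Date = new Date(),
): { minDate: string; maxDate: string } {
  return {
    minDate: new Date(now.getTime() - ONE_DAY_IN_MS).toISOString(),
    maxDate: now.toISOString(),
  };
}
```

Accepting the current date as a parameter makes the helper easy to test with a fixed point in time.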
Now we can use this query to fetch the feedback submissions from the database, and then generate a summary of the feedback submissions.
Generating a Summary of the Feedback Submissions
To generate the summary of the feedback submissions, we use the summarization pipeline from @xenova/transformers.
Below is a function that, given a list of feedback submissions, generates a summary of the feedback submissions:
// we keep the summary in memory so that it is generated only once per server process
let summary: string;

async function createSummary(submissions: FeedbackSubmission[]) {
  if (summary) {
    return summary;
  }

  // nothing to summarize: we return early and avoid loading the model
  if (!submissions.length) {
    return '';
  }

  const { pipeline } = await import('@xenova/transformers');
  const generator = await pipeline('summarization');

  const text = submissions
    .map((submission) => {
      return `${submission.text}`;
    })
    .join('\n');

  const output = await generator(text);

  summary = output[0].summary_text;

  return summary;
}
Finally, we can use the above function to generate a summary of the feedback submissions:
export default async function createDailySubmissionsSummary() {
  const client = getSupabaseServerClient({
    admin: true,
  });

  const today = new Date();
  const yesterday = subDays(new Date(today), 1);

  // we use the function we created above to fetch the feedback submissions
  const { data: submissions, error } = await getSubmissionsSummary(client, {
    minDate: yesterday.toISOString(),
    maxDate: today.toISOString(),
  });

  if (error) {
    throw error;
  }

  // we use the function we created above to generate a summary of the feedback submissions
  return createSummary(submissions ?? []);
}
Here is the full source code:
import { subDays } from 'date-fns';
import getSupabaseServerClient from '~/core/supabase/server-client';
import { getSubmissionsSummary } from '~/plugins/feedback-popup/lib/queries';
import FeedbackSubmission from '~/plugins/feedback-popup/lib/feedback-submission';

// we keep the summary in memory so that it is generated only once per server process
let summary: string;

export default async function createDailySubmissionsSummary() {
  const client = getSupabaseServerClient({
    admin: true,
  });

  const today = new Date();
  const yesterday = subDays(new Date(today), 1);

  const { data: submissions, error } = await getSubmissionsSummary(client, {
    minDate: yesterday.toISOString(),
    maxDate: today.toISOString(),
  });

  if (error) {
    throw error;
  }

  return createSummary(submissions ?? []);
}

async function createSummary(submissions: FeedbackSubmission[]) {
  if (summary) {
    return summary;
  }

  // nothing to summarize: we return early and avoid loading the model
  if (!submissions.length) {
    return '';
  }

  const { pipeline } = await import('@xenova/transformers');
  const generator = await pipeline('summarization');

  const text = submissions
    .map((submission) => {
      return `${submission.text}`;
    })
    .join('\n');

  const output = await generator(text);

  summary = output[0].summary_text;

  return summary;
}
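One caveat worth keeping in mind: summarization models typically accept a limited amount of input, so joining a whole day of submissions may exceed what the model can process on a busy day. The sketch below shows one possible way to cap the text before passing it to the pipeline. The helper buildSummaryInput and the idea of a character budget are assumptions of ours, not something the summarization pipeline requires.

```typescript
// Joins feedback texts with newlines until a character budget is reached.
// Blank entries are skipped, and entries that would exceed the budget are dropped.
// NOTE: `buildSummaryInput` is a hypothetical helper used only for illustration.
function buildSummaryInput(texts: string[], maxChars: number): string {
  const selected: string[] = [];
  let used = 0;

  for (const raw of texts) {
    const text = raw.trim();

    if (!text) {
      continue;
    }

    // The extra character accounts for the newline separating two entries.
    const cost = text.length + (selected.length > 0 ? 1 : 0);

    if (used + cost > maxChars) {
      break;
    }

    selected.push(text);
    used += cost;
  }

  return selected.join('\n');
}
```

A suitable budget depends on the model being used, so it is best treated as a value to tune rather than a fixed constant.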
Loading the summary from the Server Component
Now that we have created the summary, we can load it from the FeedbackSubmissionsPage Server Component:
const [
  { submissions, count, perPage, page },
  summary,
] = await Promise.all([
  loadFeedbackSubmissions(searchParams),
  createDailySubmissionsSummary(),
]);
Then, we can display summary in the UI as we wish.
NB: what we did works, but it's not very efficient as it takes quite some time to generate the summary.
Making the Summary retrieval more efficient
Of course, fetching the summary every time the page is loaded is not very efficient.
You can use various techniques to make the summary retrieval more efficient, such as:
- Caching the summary in a database table and re-generating it when new feedback submissions are made
- Building an automatic cron job that generates the summary every day and stores it in a database table (you can use Postgres for this too)
- Caching the summary in a Redis cache and re-generating it when new feedback submissions are made
In short, there are many ways to make the summary retrieval more efficient. You can use the one that fits your use-case the best.
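As a starting point, here is a minimal sketch of the simplest variant of these ideas: an in-memory cache with an expiry time wrapped around an asynchronous loader. The helper cacheFor is hypothetical, and an in-memory cache only lives as long as the server process, so a database table or Redis remains the better fit for serverless deployments.

```typescript
// Wraps an async loader so that its result is reused until `ttlMs` has elapsed.
// The pending promise itself is cached, so concurrent callers share one load.
// NOTE: `cacheFor` is a hypothetical helper used only for illustration.
function cacheFor<T>(
  loader: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: Promise<T>; expiresAt: number } | undefined;

  return function get(): Promise<T> {
    const currentTime = now();

    if (!cached || currentTime >= cached.expiresAt) {
      const value = loader();

      cached = { value, expiresAt: currentTime + ttlMs };

      // If loading fails, forget the entry so that the next call retries.
      value.catch(() => {
        if (cached?.value === value) {
          cached = undefined;
        }
      });
    }

    return cached.value;
  };
}
```

For example, wrapping the summary generation in cacheFor with a time-to-live of one hour would regenerate the summary at most once per hour per server process.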
Querying an individual Feedback Submission
The individual feedback submission page can be used to view the details of a feedback submission, and list all the similar feedback submissions from the database.
Creating the Query to Fetch an Individual Feedback Submission
From the same file where we created the other queries, we can add the function getFeedbackSubmission to fetch an individual feedback submission from the database by its ID:
export async function getFeedbackSubmission(
  client: SupabaseClient<Database>,
  id: string,
) {
  return client
    .from('feedback_submissions')
    .select(QUERY)
    .eq('id', id)
    .single();
}
From the Server Component, we can fetch both this query and the query to fetch the similar feedback submissions from the database.
Loading a list of similar Feedback Submissions
We can use the match_feedback_submissions function to fetch a list of similar feedback submissions from the database: given the embedding stored for a feedback submission, it returns the submissions whose embeddings are the most similar to it.
As parameters, we pass the following:
- query_embedding: the vector representation of the feedback text
- match_threshold: the minimum similarity a submission must exceed to be included in the results
- match_count: the maximum number of items to fetch
Feel free to play around with the match_threshold and match_count parameters to see how they affect the results.
async function loadFeedbackSubmission(id: string) {
  const adminClient = getSupabaseServerClient({
    admin: true,
  });

  const submissionsResponse = await getFeedbackSubmission(adminClient, id);

  if (submissionsResponse.error) {
    throw submissionsResponse.error;
  }

  const similarSubmissionsResponse = await adminClient.rpc(
    'match_feedback_submissions',
    {
      query_embedding: submissionsResponse.data.embedding as unknown as string,
      match_threshold: 0.8,
      match_count: 5,
    },
  );

  if (similarSubmissionsResponse.error) {
    return {
      submission: submissionsResponse.data,
      similarSubmissions: [],
    };
  }

  const similarSubmissions = similarSubmissionsResponse.data.filter(
    (submission) => {
      return submission.id !== submissionsResponse.data.id;
    },
  );

  return {
    submission: submissionsResponse.data,
    similarSubmissions: similarSubmissions ?? [],
  };
}
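To build an intuition for how match_threshold and match_count shape the results, here is an in-memory sketch of the same filtering logic: score every submission, keep those above the threshold, sort by similarity and cap the list. It assumes the embeddings are normalized to unit length (as ours are), in which case the dot product equals the cosine similarity. The names matchSubmissions and StoredSubmission are hypothetical; in the tutorial this work happens inside Postgres.

```typescript
interface StoredSubmission {
  id: number;
  content: string;
  embedding: number[];
}

interface Match {
  id: number;
  content: string;
  similarity: number;
}

// For unit-length vectors, the dot product equals the cosine similarity.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, value, index) => sum + value * b[index], 0);
}

// In-memory illustration of the threshold/count logic (hypothetical helper).
function matchSubmissions(
  queryEmbedding: number[],
  submissions: StoredSubmission[],
  matchThreshold: number,
  matchCount: number,
): Match[] {
  return submissions
    .map((submission) => ({
      id: submission.id,
      content: submission.content,
      similarity: dotProduct(queryEmbedding, submission.embedding),
    }))
    // keep only the submissions that are similar enough
    .filter((match) => match.similarity > matchThreshold)
    // most similar first
    .sort((a, b) => b.similarity - a.similarity)
    // cap the number of results
    .slice(0, matchCount);
}
```

Raising the threshold yields fewer but more closely related results, while the count only limits how many of the remaining matches are returned.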
Now, let's fetch this data from the Feedback Submission Server component:
interface FeedbackSubmissionsPageParams {
  id: string;
}

async function FeedbackSubmissionsPage({
  params: { id },
}: {
  params: FeedbackSubmissionsPageParams;
}) {
  const { submission, similarSubmissions } = await loadFeedbackSubmission(id);

  // ...
  // render the submission and the similar submissions
}
The submission variable contains the feedback submission, while the similarSubmissions variable contains the list of similar feedback submissions.
We can iterate over the similarSubmissions variable and display the similar feedback submissions in the UI. For example:
<div className={'flex flex-col space-y-4'}>
  <p className={'font-medium'}>
    Similar feedback submissions from other users:
  </p>

  <ol className={'flex flex-col space-y-2 list-decimal pl-4 text-sm'}>
    {similarSubmissions.map((submission) => (
      // the "key" must be set on the outermost element returned by "map"
      <li key={submission.id}>
        <Link
          className={'hover:underline'}
          href={`/admin/feedback/${submission.id}`}
        >
          {submission.content.slice(0, 100)} ...{' '}
        </Link>
      </li>
    ))}
  </ol>
</div>
Conclusion
In this tutorial, we used Supabase Vector and Hugging Face to add AI capabilities to our simple feedback system.
We learned how to:
- Create a table to store the feedback submissions
- Create a Server Action to create a feedback submission
- Use Transformers.js to generate the vector representation of the feedback text, and Supabase pgVector to store it
- Use Supabase pgVector to find similar feedback submissions from the database
- Use Hugging Face to generate a summary of the feedback submissions
- Use Next.js Server Components to fetch the feedback submissions from the database
In short - you now know everything you need to know to add AI capabilities to your Next.js SaaS using Supabase Vector and Hugging Face.
What's next?
You can think of many ways to improve the feedback system we built.
Here are a few ideas:
- Tagging: Automatically tagging the feedback submissions using the zero-shot-classification pipeline from @xenova/transformers. For example, pre-assigning a topic to the feedback submission (eg. "billing", "feature request", "bug", etc.)
- Q&A: Using the question-answering pipeline from @xenova/transformers to answer questions from the feedback submissions
- Auto Reply: Using the text-generation pipeline from @xenova/transformers to generate possible responses to the feedback submission
- Commentary: Using the text-generation pipeline from @xenova/transformers to generate a more complex summary of the feedback submission than the one we did, perhaps as a commentary to the feedback submission
- Sentiment Analysis: Using the sentiment-analysis pipeline from @xenova/transformers to understand the sentiment of each feedback submission and generate a report of the overall sentiment
And more!
This article led to the creation of the Feedback Plugin for the Makerkit Supabase Next.js SaaS Starter. All customers will have access to the plugin, and can use it to add a feedback system to their SaaS!
If you have any questions, feel free to reach out to me on Twitter or on the Makerkit Discord.