If you are also starting out building products (SaaS), here are a few things you might want to consider from my learnings.
In this article, I will share the mistakes & challenges I faced while building a full-fledged AI app used by thousands of users.
I found this issue while building Summarify.
We are all used to web apps responding to our actions within milliseconds, but AI generally takes anywhere from a few seconds to several minutes to process even a single query.
This causes two major issues:
- waiting that long can be really frustrating for users
- the server might time out during the API call
To ease the pain of that long wait, the best you can do is add some interactive animations that give users an idea of what's going on and roughly how much longer they need to wait for the result.
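For instance, here is a minimal sketch (the component and the stage messages are hypothetical, purely to illustrate the idea) of a loading state that cycles through progress messages while the request is in flight:

import { useEffect, useState } from "react";

// Hypothetical stage messages shown while the AI request is processing
const STAGES = [
  "Reading your text...",
  "Thinking really hard...",
  "Polishing the summary...",
];

export function ProcessingIndicator({ isLoading }: { isLoading: boolean }) {
  const [stage, setStage] = useState(0);

  useEffect(() => {
    if (!isLoading) return;
    setStage(0);
    // Advance to the next message every few seconds so the wait feels shorter
    const timer = setInterval(
      () => setStage((s) => Math.min(s + 1, STAGES.length - 1)),
      4000
    );
    return () => clearInterval(timer);
  }, [isLoading]);

  if (!isLoading) return null;
  return <p>{STAGES[stage]}</p>;
}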
Next up, server timeout — the bigger issue (at least from an engineer’s perspective)!
There is a very good chance that your frontend (assuming a Next.js application) is deployed on Vercel's free plan. On the free plan, Vercel enforces a timeout limit of 10s on function calls, i.e. any processing you await in your API route can run for a maximum of 10 seconds; after that, it'll give you a 504 ERROR.
To deal with it, you can pick any one of the following:
- upgrade to Vercel's Pro plan, which raises the timeout limit to 15 minutes (a costly approach; see the snippet after this list)
- take advantage of a callback function (webhook response), if the AI service you are using supports it
- or move the function that processes the AI query into a separate project (typically called the backend lol) and deploy & run it independently
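If you do upgrade, note that the longer limit typically still has to be opted into per route. A minimal sketch, assuming a Next.js App Router route handler deployed on Vercel (the 300-second value is just an illustrative choice within a paid plan's limits):

// app/api/summarize/route.ts
// Route segment config: Vercel reads this export to raise this
// function's timeout beyond the default (only effective on paid plans)
export const maxDuration = 300;

export async function POST(req: Request) {
  const { text } = await req.json();
  // ...a long-running AI call on `text` can now run for up to maxDuration seconds...
  return Response.json({ ok: true, received: text.length });
}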
To run the function separately, you can simply create an Express application that does the whole processing and exposes an API which makes a webhook call back to your frontend project once the processing is done.
You can deploy this project in minutes on DigitalOcean, where you will get $200 of free credits upon signing up. You can also find the complete tutorial on how to host your Node.js application on DigitalOcean here.
Here is a quick example.
Let's assume you are building an application to summarise the content of a given text using the OpenAI service.
We will take a request from the frontend, summarise the content using the OpenAI API, and send the output back to the frontend as a webhook call.
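Before diving into the backend, here is roughly what the kick-off call from the frontend could look like (the host names and the /api/summary-callback path are placeholders; the payload shape matches the backend code below):

// Hypothetical client-side trigger: a fire-and-forget request to the Express backend.
// The backend responds immediately and delivers the summary later via callbackUrl.
async function requestSummary(text: string) {
  const res = await fetch("https://api.example.com/api/summary", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text,
      // The frontend endpoint that will receive the finished summary
      callbackUrl: "https://myapp.example.com/api/summary-callback",
    }),
  });
  return res.json(); // { status: 200, message: "Getting your summary ready..." }
}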
1. Create an index.ts file (the server entry point) for the Express app:
import "dotenv/config"; // loads PORT, OPENAI_KEY, etc. from a .env file
import express, { Express, Request, Response } from "express";
import cors from "cors";
import summaryRoute from "./src/modules/summary/route";

const app: Express = express();
const PORT = process.env.PORT || 6001;

// Accept JSON bodies up to 5 MB, since the text to summarise can be large
app.use(express.json({ limit: "5mb" }));
app.use(express.urlencoded({ extended: true }));
app.use(cors());

app.use("/api/summary", summaryRoute);

// Simple health-check endpoint
app.get("/api", (req: Request, res: Response) => {
  res.send("Api Working!");
});

app.listen(PORT, () => {
  console.log(`⚡️[server]: Server is running at http://localhost:${PORT}`);
});
2. Now, create route.ts for the API endpoint:
import express, { Request, Response } from "express";
import { summaryService } from "./summarify.service";

const route = express.Router();

route.post("/", async (req: Request, res: Response) => {
  // Responds right away; the summary itself arrives later via the webhook
  const response = await summaryService(req);
  res.status(response.status).json(response);
});

export default route;
3. Finally, create the driver function that will carry out the whole process. Name it summarify.service.ts:
import { Request } from "express";
import {
  ChatCompletionCreateParamsNonStreaming,
  ChatCompletionMessageParam,
} from "openai/resources";
import { OpenAI } from "openai";
import axios from "axios";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_KEY,
});

export const summaryService = async (
  req: Request
): Promise<{ status: number; message: string }> => {
  const { text, callbackUrl } = req.body;

  // Kick off the summary generation without awaiting it,
  // so the HTTP response below goes out immediately
  generateSummary(text)
    .then(async ({ summary }) => {
      // Send the summary as a callback response to the frontend application
      await axios.post(callbackUrl, { summary });
    })
    .catch((error) => {
      // We have already responded with 200, so report failures via the webhook too
      axios
        .post(callbackUrl, { error: error.message })
        .catch((err) => console.error("Callback delivery failed:", err));
    });

  return {
    status: 200,
    message: "Getting your summary ready in a few seconds",
  };
};
// The actual (potentially slow) OpenAI call
async function generateSummary(
  text: string
): Promise<{ summary: string }> {
  const prompt = "Give me a summary of the given content";
  const messages: ChatCompletionMessageParam[] = [
    {
      role: "system",
      content: prompt,
    },
    {
      role: "user",
      content: text,
    },
  ];

  const params: ChatCompletionCreateParamsNonStreaming = {
    // Fall back to a default model if the env variable is not set
    model: process.env.OPENAI_MODEL || "gpt-3.5-turbo",
    messages,
    max_tokens: 1000,
    temperature: 0.8,
  };
  const response = await openai.chat.completions.create(params);
  const summary = response.choices[0].message?.content ?? "";
  return { summary };
}
The backend will send the response (the summary) to your callback URL, i.e. the endpoint that receives this data and stores it in your DB or runs whatever other process you want to carry out.
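On the frontend side, that receiving endpoint could be a simple Next.js route handler. A minimal sketch (the route path and the saveSummary helper are hypothetical):

// app/api/summary-callback/route.ts (hypothetical path)
// Receives the webhook from the Express backend once the summary is ready
export async function POST(req: Request) {
  const { summary, error } = await req.json();

  if (error) {
    console.error("Summary generation failed:", error);
    return Response.json({ received: true });
  }

  // Persist the result; replace this stub with your DB write,
  // a websocket push to the client, etc.
  await saveSummary(summary);
  return Response.json({ received: true });
}

// Hypothetical persistence helper
async function saveSummary(summary: string): Promise<void> {
  console.log("Storing summary:", summary.slice(0, 80));
}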
That's all, yaaayyy! 🥳 You have just created an API that does all the heavy lifting in the backend & sends you the response accordingly!