
Image-Based Product Search Using Spring Boot, Google Cloud Vertex AI, and Gemini Model

Published: 2024-11-03

Introduction

Imagine you’re shopping online and come across a product you love but don’t know its name. Wouldn’t it be amazing to upload a picture and have the app find it for you?

In this article, we’ll show you how to build exactly that: an image-based product search feature using Spring Boot and Google Cloud Vertex AI.

Overview of the Feature

This feature allows users to upload an image and receive a list of products that match it, making the search experience more intuitive and visually driven.

The image-based product search feature leverages Google Cloud Vertex AI to process images and extract relevant keywords. These keywords are then used to search for matching products in the database.
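
The keyword-matching step described above can be sketched in plain Java. This is only an illustration under stated assumptions: the `Product` record and the in-memory catalog are hypothetical stand-ins for the PostgreSQL table, and the real application queries the database through a repository instead of looping over a list.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical in-memory stand-in for the product table (assumption for illustration).
final class KeywordSearchSketch {

    // Minimal product shape, assumed for this sketch only.
    record Product(long id, String name) {}

    // Returns every product whose name contains at least one keyword (case-insensitive).
    // A LinkedHashSet drops duplicates when several keywords hit the same product
    // and keeps the order in which products were first matched.
    static List<Product> search(List<Product> catalog, List<String> keywords) {
        Set<Product> matches = new LinkedHashSet<>();
        for (String keyword : keywords) {
            if (keyword.isBlank()) {
                continue; // a blank keyword would match every product
            }
            String needle = keyword.toLowerCase(Locale.ROOT);
            for (Product product : catalog) {
                if (product.name().toLowerCase(Locale.ROOT).contains(needle)) {
                    matches.add(product);
                }
            }
        }
        return List.copyOf(matches);
    }

    public static void main(String[] args) {
        List<Product> catalog = List.of(
                new Product(1, "Rolls-Royce Wraith"),
                new Product(2, "Rolls-Royce Ghost"),
                new Product(3, "Toyota Camry"));
        // Keywords in the shape the model is asked to return.
        System.out.println(search(catalog, List.of("rolls", "royce", "wraith")));
    }
}
```

Because the results are the union over all keywords, a broad keyword such as a brand name pulls in every product of that brand, not only the exact model in the picture.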

Technology Stack

  • Java 21
  • Spring Boot 3.2.5
  • PostgreSQL
  • Vertex AI
  • ReactJS

We’ll walk through the process of setting up this functionality step-by-step.

Step-by-Step Implementation

1. Create a new project in the Google Cloud Console

First, we need to create a new project in the Google Cloud Console.

Go to https://console.cloud.google.com and create an account if you don't already have one. If you do, sign in to it.

If you add your billing information, Google Cloud will offer you a free trial.

Once you have created an account or signed in to an existing one, you can create a new project.

2. Enable Vertex AI Service

In the search bar, find Vertex AI and enable all recommended APIs.

Vertex AI is Google Cloud’s fully managed machine learning (ML) platform, designed to simplify the development, deployment, and management of ML models. It allows you to build, train, and deploy ML models at scale by providing tools and services like AutoML, custom model training, hyperparameter tuning, and model monitoring.

Gemini 1.5 Flash is part of Google’s Gemini family of models and is designed for efficient, high-performance inference. Gemini models are a series of advanced AI models developed by Google, used in natural language processing (NLP), vision tasks, and other AI-powered applications.

Note: For other frameworks, you can use the Gemini API directly at https://aistudio.google.com/app/prompts/new_chat. Use the structured prompt feature: it lets you shape the output to match your input, which gives better results.

3. Create a prompt that matches your application

In this step, we customize a prompt to suit the application.

Vertex AI Studio provides many sample prompts in its Prompt Gallery. We use the "Image text to JSON" sample to extract keywords related to the product image.

My application is a car shop, so I built the prompt below. I expect the model to respond with a list of keywords related to the image.

My prompt: Extract the name car to a list keyword and output them in JSON. If you don't find any information about the car, please output the list empty.\nExample response: ["rolls", "royce", "wraith"]

Once the prompt suits your application, we can integrate it with the Spring Boot application.

4. Integrate with Spring Boot Application

I have built an e-commerce application for cars, so I want to find cars by image.

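First, update the dependencies in your pom.xml file:

<properties>
    <spring-cloud-gcp.version>5.1.2</spring-cloud-gcp.version>
    <google-cloud-bom.version>26.32.0</google-cloud-bom.version>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>spring-cloud-gcp-dependencies</artifactId>
            <version>${spring-cloud-gcp.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>

        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>libraries-bom</artifactId>
            <version>${google-cloud-bom.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-vertexai</artifactId>
    </dependency>
</dependencies>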

After configuring pom.xml, create a config class, GeminiConfig.java, with these constants:

  • MODEL_NAME: "gemini-1.5-flash"
  • LOCATION: the location you chose when setting up the project
  • PROJECT_ID: your project ID

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration(proxyBeanMethods = false)
public class GeminiConfig {

    private static final String MODEL_NAME = "gemini-1.5-flash";
    private static final String LOCATION = "asia-southeast1";
    private static final String PROJECT_ID = "yasmini";

    @Bean
    public VertexAI vertexAI() {
        return new VertexAI(PROJECT_ID, LOCATION);
    }

    @Bean
    public GenerativeModel getModel(VertexAI vertexAI) {
        return new GenerativeModel(MODEL_NAME, vertexAI);
    }
}
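
If you prefer not to hard-code the project ID and location, one option is to read them from the environment. The sketch below is an assumption rather than part of the original project: the variable names GEMINI_PROJECT_ID, GEMINI_LOCATION, and GEMINI_MODEL_NAME are hypothetical, and the defaults simply mirror the constants shown above. In a Spring Boot application the same idea is usually expressed through application properties instead.

```java
import java.util.Map;

// Sketch: resolve Vertex AI settings from environment variables with fallbacks.
// The variable names are hypothetical; they are not defined by Google Cloud or Spring.
final class GeminiSettingsSketch {

    record GeminiSettings(String modelName, String location, String projectId) {}

    static GeminiSettings fromEnv(Map<String, String> env) {
        String projectId = env.get("GEMINI_PROJECT_ID");
        if (projectId == null || projectId.isBlank()) {
            // The project ID has no sensible default, so fail fast.
            throw new IllegalStateException("GEMINI_PROJECT_ID must be set");
        }
        return new GeminiSettings(
                env.getOrDefault("GEMINI_MODEL_NAME", "gemini-1.5-flash"),
                env.getOrDefault("GEMINI_LOCATION", "asia-southeast1"),
                projectId);
    }

    public static void main(String[] args) {
        // Demo with an explicit map; an application would pass System.getenv().
        System.out.println(fromEnv(Map.of("GEMINI_PROJECT_ID", "my-project")));
    }
}
```

The resolved values could then be passed to the VertexAI and GenerativeModel constructors in place of the constants.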

Second, create the Service and Controller layers to implement the find-car function. Start with the service class.

Because the Gemini API can wrap its response in a Markdown code fence, we need a helper that strips the fence to get plain JSON, which we then convert to a List of strings in Java.

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.api.Part;
import com.google.cloud.vertexai.generativeai.*;
import com.learning.yasminishop.common.entity.Product;
import com.learning.yasminishop.common.exception.AppException;
import com.learning.yasminishop.common.exception.ErrorCode;
import com.learning.yasminishop.product.ProductRepository;
import com.learning.yasminishop.product.dto.response.ProductResponse;
import com.learning.yasminishop.product.mapper.ProductMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.web.multipart.MultipartFile;

import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

@Service
@RequiredArgsConstructor
@Slf4j
@Transactional(readOnly = true)
public class YasMiniAIService {

    private final GenerativeModel generativeModel;
    private final ProductRepository productRepository;

    private final ProductMapper productMapper;


    public List<ProductResponse> findCarByImage(MultipartFile file){
        try {
            var prompt = "Extract the name car to a list keyword and output them in JSON. If you don't find any information about the car, please output the list empty.\nExample response: [\"rolls\", \"royce\", \"wraith\"]";
            // Send the image bytes and the prompt together as one multimodal request
            var content = this.generativeModel.generateContent(
                    ContentMaker.fromMultiModalData(
                            PartMaker.fromMimeTypeAndData(Objects.requireNonNull(file.getContentType()), file.getBytes()),
                            prompt
                    )
            );

            String jsonContent = ResponseHandler.getText(content);
            log.info("Extracted keywords from image: {}", jsonContent);
            List<String> keywords = convertJsonToList(jsonContent).stream()
                    .map(String::toLowerCase)
                    .toList();

            // A Set removes products matched by more than one keyword
            Set<Product> results = new HashSet<>();
            for (String keyword : keywords) {
                List<Product> products = productRepository.searchByKeyword(keyword);
                results.addAll(products);
            }

            return results.stream()
                    .map(productMapper::toProductResponse)
                    .toList();

        } catch (Exception e) {
            log.error("Error finding car by image", e);
            return List.of();
        }
    }

    private List<String> convertJsonToList(String markdown) throws JsonProcessingException {
        ObjectMapper objectMapper = new ObjectMapper();
        String parseJson = markdown;
        // Strip the Markdown code fence only when the model added one
        if(markdown.contains("```json")){
            parseJson = extractJsonFromMarkdown(markdown);
        }
        return objectMapper.readValue(parseJson, List.class);
    }

    private String extractJsonFromMarkdown(String markdown) {
        return markdown.replace("```json\n", "").replace("\n```", "");
    }
}
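
The fence-stripping step can be tried in isolation. The sketch below is a dependency-free approximation, not the service's actual code: it replaces Jackson with a simple regular expression so that it runs on the JDK alone, and it only handles a flat array of strings without escape sequences. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: turn a possibly fenced model response into lower-case keywords.
final class MarkdownJsonSketch {

    // Three backticks, built at runtime to keep this listing easy to embed.
    private static final String TICKS = "`".repeat(3);

    // Matches an opening fence (with or without the "json" tag) or a closing fence.
    private static final Pattern FENCE = Pattern.compile(
            "^\\s*" + TICKS + "(?:json)?\\s*|\\s*" + TICKS + "\\s*$");

    // Matches one double-quoted string that contains no quotes or backslashes.
    private static final Pattern STRING_LITERAL = Pattern.compile("\"([^\"\\\\]*)\"");

    static String stripFence(String markdown) {
        return FENCE.matcher(markdown).replaceAll("").trim();
    }

    static List<String> toKeywords(String modelOutput) {
        List<String> keywords = new ArrayList<>();
        Matcher matcher = STRING_LITERAL.matcher(stripFence(modelOutput));
        while (matcher.find()) {
            keywords.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return keywords;
    }

    public static void main(String[] args) {
        String fenced = TICKS + "json\n[\"Rolls\", \"Royce\", \"Wraith\"]\n" + TICKS;
        System.out.println(toKeywords(fenced));
    }
}
```

For anything beyond a flat list of plain strings, a real JSON parser such as the Jackson ObjectMapper used in the service is the safer choice.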


Next, create a controller class that exposes an endpoint for the front end.


import com.learning.yasminishop.product.dto.response.ProductResponse;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import java.util.List;

@RestController
@RequestMapping("/ai")
@RequiredArgsConstructor
@Slf4j
public class YasMiniAIController {

    private final YasMiniAIService yasMiniAIService;


    @PostMapping
    public List<ProductResponse> findCar(@RequestParam("file") MultipartFile file) {

        var response = yasMiniAIService.findCarByImage(file);
        return response;
    }
}
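
To try the endpoint without the React front end, a multipart request can be built with the JDK's own HTTP client. This is a sketch under assumptions: the address http://localhost:8080/ai assumes the application runs locally on Spring Boot's default port with no extra context path or security in front of it, and the class and method names are hypothetical. Only the part name "file" comes from the controller above.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Sketch: build a multipart/form-data POST that carries one image.
final class ImageUploadRequestSketch {

    // Builds a body with a single part named "file",
    // the parameter name the controller reads via @RequestParam("file").
    static byte[] multipartBody(String boundary, String fileName, String mimeType, byte[] data) {
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + mimeType + "\r\n\r\n";
        String footer = "\r\n--" + boundary + "--\r\n";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(header.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(data);
        out.writeBytes(footer.getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    static HttpRequest uploadRequest(URI endpoint, String fileName, String mimeType, byte[] data) {
        // A random boundary makes a collision with the image bytes unlikely.
        String boundary = "sketch-" + UUID.randomUUID();
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(
                        multipartBody(boundary, fileName, mimeType, data)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = uploadRequest(
                URI.create("http://localhost:8080/ai"), // assumed local address
                "car.jpg", "image/jpeg", new byte[] {1, 2, 3});
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request is then a matter of passing it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and reading the JSON list of products from the response body.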



5. IMPORTANT step: Log in to Google Cloud with the Google Cloud CLI

By default, the Spring Boot application cannot verify who you are, so it is not allowed to access resources in Google Cloud.

So we need to log in to Google and grant authorization.

5.1 First, install the gcloud CLI on your machine

Installation guide: https://cloud.google.com/sdk/docs/install
Follow the link above and install it on your machine.

5.2 Log in

  1. Open a terminal in the project directory (you must cd into the project)
  2. Type: gcloud auth login
  3. Press Enter, and a window will open where you can log in

gcloud auth login


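Note: After you log in, the credentials are stored on your machine by the gcloud CLI, not in the project or its Maven packages, so you don’t need to log in again when you restart the Spring Boot application. If the application still cannot find credentials, Google's documentation describes gcloud auth application-default login as the command that sets up Application Default Credentials, which Google client libraries read.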

Conclusion

The implementation above is based on my e-commerce project; you can adapt it to your own project and framework. With frameworks other than Spring Boot (NestJS, etc.), you can use https://aistudio.google.com/app/prompts/new_chat and don’t need to create a Google Cloud account.

You can check the detailed implementation at my repo:

Backend: https://github.com/duongminhhieu/YasMiniShop
Front-end: https://github.com/duongminhhieu/YasMini-Frontend

Happy learning !!!

Reprinted from: https://dev.to/duongminhhieu/image-based-product-search-using-spring-boot-google-cloud-vertex-ai-and-gemini-model-7io?1