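Introduction
I used to implement search with plain MySQL LIKE queries. That works for small tables, but once the data grew to tens of thousands of rows the queries became painfully slow.
So I decided to move full-text search to Elasticsearch, and this post records how I integrated it.
I had heard of Elasticsearch long ago but never had a chance to use it; this time I finally have a real project for it.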
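To make the difference from LIKE concrete, here is a toy sketch of the inverted-index idea that full-text engines build on. It is only an illustration: the class name `TinyInvertedIndex` and the whitespace tokenizer are my own assumptions, and real Elasticsearch analyzers and Lucene data structures are far more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each token to the ids of the documents containing it.
// Illustration only; this is not how Elasticsearch is implemented internally.
class TinyInvertedIndex {
    private final Map<String, Set<Long>> postings = new HashMap<>();

    // Lower-case and split on whitespace; a stand-in for a real analyzer such as IK.
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index a document: record its id under every token it contains.
    public void add(long docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // A single map lookup, instead of scanning every row the way LIKE '%kw%' does.
    public Set<Long> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1L, "Apple phone new model");
        index.add(2L, "Android phone budget");
        index.add(3L, "Apple laptop");
        System.out.println(index.search("phone"));
        System.out.println(index.search("apple"));
    }
}
```

The cost of a lookup here depends on the number of matching documents rather than on the size of the table, which is roughly why a search engine copes better than a leading-wildcard LIKE as the data grows.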
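Environment setup
Installing Elasticsearch
I installed it with Docker, which is the most convenient route:
```bash
docker pull elasticsearch:8.8.0
# discovery.type=single-node is added here: without it a standalone ES 8
# container usually fails its bootstrap checks and exits.
# Disabling security is acceptable for local development only.
docker run -d \
  --name es \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -e "xpack.security.enabled=false" \
  -p 9200:9200 \
  -p 9300:9300 \
  elasticsearch:8.8.0
```
Once the container is up, open http://localhost:9200; if you get a JSON response, the installation succeeded.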
Adding the dependency
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
```
Configuration
```yaml
spring:
  elasticsearch:
    uris: http://localhost:9200
    connection-timeout: 10s
    socket-timeout: 30s
```
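Creating the entity class
```java
import java.util.Date;
import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = "product")
@Data
public class Product {
    @Id
    private Long id;

    // Text fields are analyzed; ik_max_word requires the IK plugin to be installed.
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String name;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String description;

    @Field(type = FieldType.Double)
    private Double price;

    // Keyword fields are stored as-is and matched exactly, not analyzed.
    @Field(type = FieldType.Keyword)
    private String category;

    @Field(type = FieldType.Date)
    private Date createTime;
}
```
The @Document annotation marks the class as an ES document and names the index it maps to.
The @Field annotation sets each field's type and analyzer. I used the IK analyzer, which handles Chinese well. It is a plugin, so it has to be installed before this index is created; installation is covered near the end of this post.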
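Creating the Repository
```java
import java.util.List;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface ProductRepository extends ElasticsearchRepository<Product, Long> {

    // Derived queries: Spring Data builds the ES query from the method name.
    List<Product> findByNameContaining(String name);

    List<Product> findByPriceBetween(Double min, Double max);

    List<Product> findByCategory(String category);
}
```
Extending ElasticsearchRepository gives you the basic CRUD methods for free, and you can declare custom queries by following the method naming conventions.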
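Basic CRUD operations
```java
import java.util.ArrayList;
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    @Autowired
    ProductRepository productRepository;

    public Product save(Product product) {
        return productRepository.save(product);
    }

    // saveAll returns an Iterable, so copy the results into a List.
    public List<Product> saveAll(List<Product> products) {
        List<Product> saved = new ArrayList<>();
        productRepository.saveAll(products).forEach(saved::add);
        return saved;
    }

    // Returns null when no document has this id.
    public Product findById(Long id) {
        return productRepository.findById(id).orElse(null);
    }

    // findAll returns an Iterable that is not guaranteed to be a List,
    // so casting it can fail at runtime; copy it instead.
    public List<Product> findAll() {
        List<Product> all = new ArrayList<>();
        productRepository.findAll().forEach(all::add);
        return all;
    }

    public void deleteById(Long id) {
        productRepository.deleteById(id);
    }
}
```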
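Complex queries
For more complex queries you can use ElasticsearchRestTemplate. Note that the code below targets Spring Data Elasticsearch 4.x (Spring Boot 2.x). In 5.x (Spring Boot 3.x), ElasticsearchRestTemplate and NativeSearchQuery are deprecated or removed in favor of ElasticsearchOperations with NativeQuery, so the code would need porting there.
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageImpl;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.stereotype.Service;
import org.springframework.util.StringUtils;

@Service
public class ProductSearchService {

    @Autowired
    ElasticsearchRestTemplate elasticsearchRestTemplate;

    // Keyword search with an optional price range.
    public List<Product> search(String keyword, Double minPrice, Double maxPrice) {
        BoolQueryBuilder query = QueryBuilders.boolQuery();

        // must: contributes to the relevance score.
        if (StringUtils.hasText(keyword)) {
            query.must(QueryBuilders.multiMatchQuery(keyword, "name", "description")
                    .analyzer("ik_max_word"));
        }

        // filter: narrows the results without affecting the score.
        if (minPrice != null || maxPrice != null) {
            RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("price");
            if (minPrice != null) {
                rangeQuery.gte(minPrice);
            }
            if (maxPrice != null) {
                rangeQuery.lte(maxPrice);
            }
            query.filter(rangeQuery);
        }

        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(query)
                .build();

        SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

        return searchHits.getSearchHits().stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());
    }

    // Paged search; page is zero-based.
    public Page<Product> searchPage(String keyword, int page, int size) {
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.multiMatchQuery(keyword, "name", "description"))
                .withPageable(PageRequest.of(page, size))
                .build();

        SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

        List<Product> products = searchHits.getSearchHits().stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

        return new PageImpl<>(products, PageRequest.of(page, size), searchHits.getTotalHits());
    }

    // Search that also returns highlighted fragments for name and description.
    public List<Map<String, Object>> searchWithHighlight(String keyword) {
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.multiMatchQuery(keyword, "name", "description"))
                .withHighlightFields(
                        new HighlightBuilder.Field("name"),
                        new HighlightBuilder.Field("description")
                )
                .build();

        SearchHits<Product> searchHits = elasticsearchRestTemplate.search(searchQuery, Product.class);

        List<Map<String, Object>> results = new ArrayList<>();
        for (SearchHit<Product> hit : searchHits) {
            Map<String, Object> map = new HashMap<>();
            Product product = hit.getContent();
            map.put("product", product);

            // Only fields that actually matched have highlight fragments.
            Map<String, List<String>> highlightFields = hit.getHighlightFields();
            if (highlightFields.containsKey("name")) {
                map.put("highlightName", highlightFields.get("name").get(0));
            }
            if (highlightFields.containsKey("description")) {
                map.put("highlightDescription", highlightFields.get("description").get(0));
            }

            results.add(map);
        }

        return results;
    }
}
```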
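When debugging, I find it helps to know roughly which request body the keyword-plus-price search above corresponds to. The sketch below builds that Query DSL JSON by hand with plain string concatenation. The class and method names are made up for illustration, and the JSON that Spring Data actually sends may differ in detail (extra defaults, field order).

```java
import java.util.ArrayList;
import java.util.List;

// Hand-built Query DSL mirroring the bool query used for keyword + price search.
// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
class ProductQueryJson {

    // Escapes backslashes and double quotes only; a real application
    // should use a JSON library instead of concatenating strings.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String build(String keyword, Double minPrice, Double maxPrice) {
        List<String> must = new ArrayList<>();
        List<String> filter = new ArrayList<>();

        // The keyword goes into "must", so it influences the relevance score.
        if (keyword != null && !keyword.trim().isEmpty()) {
            must.add("{\"multi_match\":{\"query\":\"" + escape(keyword)
                    + "\",\"fields\":[\"name\",\"description\"],\"analyzer\":\"ik_max_word\"}}");
        }

        // The price range goes into "filter", which does not affect scoring.
        if (minPrice != null || maxPrice != null) {
            List<String> bounds = new ArrayList<>();
            if (minPrice != null) {
                bounds.add("\"gte\":" + minPrice);
            }
            if (maxPrice != null) {
                bounds.add("\"lte\":" + maxPrice);
            }
            filter.add("{\"range\":{\"price\":{" + String.join(",", bounds) + "}}}");
        }

        return "{\"query\":{\"bool\":{\"must\":[" + String.join(",", must)
                + "],\"filter\":[" + String.join(",", filter) + "]}}}";
    }

    public static void main(String[] args) {
        // The printed body can be pasted into a POST to /product/_search for comparison.
        System.out.println(build("手机", 1000.0, 6000.0));
    }
}
```

With no keyword and no price bounds, both arrays are empty and the bool query matches every document, which is the same behavior as the service method.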
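Controller
```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/product")
public class ProductController {

    @Autowired
    ProductService productService;

    @Autowired
    ProductSearchService productSearchService;

    @PostMapping
    public Product save(@RequestBody Product product) {
        return productService.save(product);
    }

    // All parameters are optional; omitting them returns every product.
    @GetMapping("/search")
    public List<Product> search(
            @RequestParam(required = false) String keyword,
            @RequestParam(required = false) Double minPrice,
            @RequestParam(required = false) Double maxPrice) {
        return productSearchService.search(keyword, minPrice, maxPrice);
    }

    // page is zero-based.
    @GetMapping("/search/page")
    public Page<Product> searchPage(
            @RequestParam String keyword,
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "10") int size) {
        return productSearchService.searchPage(keyword, page, size);
    }
}
```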
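One thing the paged endpoint above does not do is validate page and size. Elasticsearch rejects requests where from + size exceeds index.max_result_window (10,000 by default), so a small guard before calling the search service seems worthwhile. Below is a minimal sketch; `PageGuard` and the cap of 100 items per page are my own assumptions, not something from the project.

```java
import java.util.Arrays;

// Clamps page/size before they reach Elasticsearch. Hypothetical helper.
class PageGuard {
    // Default value of the index.max_result_window setting in Elasticsearch.
    static final int MAX_RESULT_WINDOW = 10_000;
    // Assumed application-level cap on page size.
    static final int MAX_PAGE_SIZE = 100;

    // Returns {page, size} adjusted so that page * size + size <= MAX_RESULT_WINDOW.
    static int[] clamp(int page, int size) {
        int safeSize = Math.min(Math.max(size, 1), MAX_PAGE_SIZE);
        int safePage = Math.max(page, 0);
        // Index of the last zero-based page that still fits inside the window.
        int lastPage = MAX_RESULT_WINDOW / safeSize - 1;
        return new int[] { Math.min(safePage, lastPage), safeSize };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(clamp(0, 10)));
        System.out.println(Arrays.toString(clamp(5000, 10)));
        System.out.println(Arrays.toString(clamp(2, 500)));
    }
}
```

Clamping silently is only one option; returning a 400 response for out-of-range values would be just as reasonable. For genuinely deep pagination, search_after is the usual alternative to raising the window.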
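Syncing MySQL data to ES
The data itself normally still lives in MySQL and ES is only used for search, so the MySQL data has to be synced into ES.
Option 1: scheduled sync
```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Requires @EnableScheduling on a configuration class to take effect.
@Component
public class DataSyncScheduler {

    @Autowired
    ProductMapper productMapper;

    @Autowired
    ProductRepository productRepository;

    // Runs every day at 02:00.
    @Scheduled(cron = "0 0 2 * * ?")
    public void syncData() {
        // Loads the whole table into memory, so this only suits small tables.
        List<Product> products = productMapper.selectList(null);
        productRepository.saveAll(products);
    }
}
```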
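The scheduled job above reads the whole table in one go, which gets uncomfortable as the table grows. A common alternative is to copy the rows in fixed-size batches. Here is a minimal sketch of that loop in plain Java; `fetchPage` and `indexBatch` are hypothetical hooks standing in for a paged mapper query and a bulk save, not methods from this project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Copies rows to the index in fixed-size batches instead of loading everything at once.
class BatchSync {

    // fetchPage.apply(offset, limit) returns up to `limit` rows starting at `offset`.
    // indexBatch receives each non-empty batch. Both are hypothetical hooks.
    // Returns the total number of rows handed to indexBatch.
    static <T> int syncAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                           Consumer<List<T>> indexBatch,
                           int batchSize) {
        int offset = 0;
        int total = 0;
        while (true) {
            List<T> batch = fetchPage.apply(offset, batchSize);
            if (batch.isEmpty()) {
                break;
            }
            indexBatch.accept(batch);
            total += batch.size();
            if (batch.size() < batchSize) {
                break; // a short page means the end of the table was reached
            }
            offset += batchSize;
        }
        return total;
    }

    public static void main(String[] args) {
        // An in-memory list stands in for the MySQL table.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            table.add(i);
        }
        int synced = BatchSync.<Integer>syncAll(
                (offset, limit) -> table.subList(Math.min(offset, table.size()),
                                                 Math.min(offset + limit, table.size())),
                batch -> System.out.println("indexing " + batch.size() + " docs"),
                10);
        System.out.println("synced " + synced);
    }
}
```

Offset-based paging can skip or repeat rows if the table changes while the job runs; paging by primary key (id greater than the last one seen) is usually more robust, at the cost of a slightly different query.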
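Option 2: real-time sync (recommended)
Update ES in the same service methods that insert, update, or delete in MySQL. Keep in mind that the two writes are not atomic: if the ES call fails after the MySQL write has succeeded, the two stores drift apart, so failures need to be logged or retried.
```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    @Autowired
    ProductMapper productMapper;

    @Autowired
    ProductRepository productRepository;

    public void save(Product product) {
        // Write to MySQL first; it remains the source of truth.
        productMapper.insert(product);

        // Then index the same object in ES.
        productRepository.save(product);
    }

    public void updateById(Product product) {
        productMapper.updateById(product);

        // save() overwrites the existing document with the same id.
        productRepository.save(product);
    }

    public void deleteById(Long id) {
        productMapper.deleteById(id);

        productRepository.deleteById(id);
    }
}
```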
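The weak spot of writing to both stores in one method is the case where MySQL succeeds and ES fails. One simple way to handle it is to park the failed item and retry later. The sketch below shows the idea with plain Java; `DualWriter` is a hypothetical class, the two `Consumer` hooks stand in for the mapper and repository calls, and the in-memory queue is an assumption that would not survive a restart (a real setup would more likely use a table or a message queue).

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Dual write where the database is the source of truth and
// failed ES writes are queued for a later retry. Hypothetical sketch.
class DualWriter<T> {
    private final Consumer<T> dbWrite; // stands in for e.g. a mapper insert
    private final Consumer<T> esWrite; // stands in for e.g. a repository save
    private final Queue<T> retryQueue = new ArrayDeque<>();

    DualWriter(Consumer<T> dbWrite, Consumer<T> esWrite) {
        this.dbWrite = dbWrite;
        this.esWrite = esWrite;
    }

    // Database errors propagate to the caller; ES errors are caught and the item is parked.
    void save(T item) {
        dbWrite.accept(item);
        try {
            esWrite.accept(item);
        } catch (RuntimeException e) {
            retryQueue.add(item);
        }
    }

    // Retries each parked item once; returns how many are still pending afterwards.
    int retryPending() {
        int attempts = retryQueue.size();
        for (int i = 0; i < attempts; i++) {
            T item = retryQueue.poll();
            try {
                esWrite.accept(item);
            } catch (RuntimeException e) {
                retryQueue.add(item);
            }
        }
        return retryQueue.size();
    }

    public static void main(String[] args) {
        boolean[] esDown = { true };
        DualWriter<String> writer = new DualWriter<>(
                item -> System.out.println("db <- " + item),
                item -> {
                    if (esDown[0]) {
                        throw new IllegalStateException("ES unavailable");
                    }
                    System.out.println("es <- " + item);
                });
        writer.save("product-1");
        esDown[0] = false;
        System.out.println("still pending: " + writer.retryPending());
    }
}
```

A scheduled task could call `retryPending()` periodically. Because saving a document by id overwrites the previous version, retrying the same item more than once is harmless.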
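Option 3: Canal
If the project handles a lot of data, I would suggest using Canal to listen to the MySQL binlog and sync changes to ES in real time.
The setup is fairly involved, so I won't go into it here; if you are interested, see the official Canal documentation.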
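Without going into Canal itself, the core of binlog-based sync is easy to picture: every row change becomes an event, and a consumer replays those events onto the index. The sketch below shows only that replay step. `RowChange` and `ChangeApplier` are made-up types for illustration, Canal's real client API looks different, and a `HashMap` stands in for the ES index.

```java
import java.util.HashMap;
import java.util.Map;

// Replays binlog-style row changes onto a search index.
// RowChange is a made-up type; this is not Canal's client API.
class ChangeApplier {
    enum Op { INSERT, UPDATE, DELETE }

    static final class RowChange {
        final Op op;
        final long id;
        final Map<String, Object> row; // column values after the change; unused for DELETE

        RowChange(Op op, long id, Map<String, Object> row) {
            this.op = op;
            this.id = id;
            this.row = row;
        }
    }

    // Stand-in for the ES index: document id -> document source.
    private final Map<Long, Map<String, Object>> index = new HashMap<>();

    void apply(RowChange change) {
        switch (change.op) {
            case INSERT:
            case UPDATE:
                // Upsert by id, so replaying the same event twice is harmless.
                index.put(change.id, new HashMap<>(change.row));
                break;
            case DELETE:
                index.remove(change.id);
                break;
        }
    }

    Map<String, Object> get(long id) {
        return index.get(id);
    }

    public static void main(String[] args) {
        ChangeApplier applier = new ChangeApplier();
        Map<String, Object> row = new HashMap<>();
        row.put("name", "Apple phone");
        applier.apply(new RowChange(Op.INSERT, 1L, row));
        System.out.println(applier.get(1L));
        applier.apply(new RowChange(Op.DELETE, 1L, null));
        System.out.println(applier.get(1L));
    }
}
```

The appeal of this approach is that the application code no longer has to remember to write to ES: anything that changes MySQL, including scripts and other services, ends up in the index.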
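IK analyzer configuration
The default analyzer handles Chinese poorly (it splits Chinese text into single characters), so installing the IK analyzer is recommended. It is also required for the ik_max_word mappings used in the entity class above.
```bash
# Open a shell inside the container
docker exec -it es /bin/bash

# Install the IK plugin; its version must match the ES version
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v8.8.0/elasticsearch-analysis-ik-8.8.0.zip

# Leave the container, then restart it so the plugin is loaded
exit
docker restart es
```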
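Testing
```java
import java.util.List;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

// Needs a running ES instance with the IK plugin installed.
@SpringBootTest
class ProductServiceTest {

    @Autowired
    ProductRepository productRepository;

    @Test
    void testSearch() {
        // Index one sample product.
        Product product1 = new Product();
        product1.setId(1L);
        product1.setName("苹果手机");
        product1.setDescription("新款苹果手机,性能强劲");
        product1.setPrice(5999.0);
        productRepository.save(product1);

        // Search by part of the name and print the hits.
        List<Product> products = productRepository.findByNameContaining("手机");
        System.out.println(products);
    }
}
```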
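Summary
For full-text search, Elasticsearch really is far better than MySQL, especially for Chinese text.
It does have drawbacks, though:
- You have to maintain an ES cluster on top of everything else
- Keeping the data in sync is a hassle
- It uses a lot of memory
Choose based on what your project actually needs; if the data volume is small, MySQL can cope just fine.
That's all I'll record for now.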