Use Hibernate Search in Standalone mode with Elasticsearch/OpenSearch
You have a Quarkus application? You want to provide a full-featured full-text search to your users? You’re at the right place.
With this guide, you’ll learn how to index entities into an Elasticsearch or OpenSearch cluster in a heartbeat with Hibernate Search. We will also explore how you can query your Elasticsearch or OpenSearch cluster using the Hibernate Search API.
If you want to index Hibernate ORM entities, see this dedicated guide instead. |
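Prerequisites
To complete this guide, you need:
-
Roughly 20 minutes
-
An IDE
-
JDK 17+ installed with
JAVA_HOME
configured appropriately
-
Apache Maven 3.9.6
-
A working container runtime (Docker or Podman)
-
Optionally the Quarkus CLI if you want to use it
-
Optionally Mandrel or GraalVM installed and configured appropriately if you want to build a native executable (or Docker if you use a native container build)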
Architecture
The application described in this guide allows you to manage a (simple) library: you manage authors and their books.
The entities are stored and indexed in an Elasticsearch cluster.
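Solution
We recommend that you follow the instructions in the next sections and create the application step by step. However, you can go right to the completed example.
Clone the Git repository: git clone https://github.com/quarkusio/quarkus-quickstarts.git
, or download an archive.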
The solution is located in the hibernate-search-standalone-elasticsearch-quickstart
directory.
The provided solution contains a few additional elements such as tests and testing infrastructure. |
Creating the Maven project
First, we need a new project. Create a new project with the following command:
For Windows users:
-
If using cmd, (don’t use backward slash
\
and put everything on the same line)
-
If using Powershell, wrap
-D
parameters in double quotes, e.g. "-DprojectArtifactId=hibernate-search-standalone-elasticsearch-quickstart"
This command generates a Maven structure importing the following extensions:
-
Hibernate Search Standalone + Elasticsearch,
-
Quarkus REST (formerly RESTEasy Reactive) and Jackson.
If you already have your Quarkus project configured, you can add the hibernate-search-standalone-elasticsearch
extension
to your project by running the following command in your project base directory:
quarkus extension add hibernate-search-standalone-elasticsearch
./mvnw quarkus:add-extension -Dextensions='hibernate-search-standalone-elasticsearch'
./gradlew addExtension --extensions='hibernate-search-standalone-elasticsearch'
This will add the following to your build file (pom.xml
first, then the Gradle equivalent):
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-hibernate-search-standalone-elasticsearch</artifactId>
</dependency>
implementation("io.quarkus:quarkus-hibernate-search-standalone-elasticsearch")
Creating the bare classes
First, let’s create our Book
and Author
classes in the model
subpackage.
package org.acme.hibernate.search.elasticsearch.model;
import java.util.List;
import java.util.UUID;
public class Author {
public UUID id; (1)
public String firstName;
public String lastName;
public List<Book> books;
public Author(UUID id, String firstName, String lastName, List<Book> books) {
this.id = id;
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
}
1 | We’re using public fields here,
because it’s shorter and there is no expectation of encapsulation on what is essentially a data class.
However, if you prefer using private fields with getters/setters,
that’s totally fine and will work perfectly as long as the getters/setters follow the JavaBeans naming convention
( |
package org.acme.hibernate.search.elasticsearch.model;
import java.util.UUID;
public class Book {
public UUID id;
public String title;
public Book(UUID id, String title) {
this.id = id;
this.title = title;
}
}
Using Hibernate Search annotations
Enabling full text search capabilities for our classes is as simple as adding a few annotations.
Let’s edit the Author
entity to include this content:
package org.acme.hibernate.search.elasticsearch.model;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.UUID;
import org.hibernate.search.engine.backend.types.Sortable;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.DocumentId;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IdProjection;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IndexedEmbedded;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.KeywordField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.ProjectionConstructor;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.SearchEntity;
@SearchEntity (1)
@Indexed (2)
public class Author {
@DocumentId (3)
public UUID id;
@FullTextField(analyzer = "name") (4)
@KeywordField(name = "firstName_sort", sortable = Sortable.YES, normalizer = "sort") (5)
public String firstName;
@FullTextField(analyzer = "name")
@KeywordField(name = "lastName_sort", sortable = Sortable.YES, normalizer = "sort")
public String lastName;
@IndexedEmbedded (6)
public List<Book> books = new ArrayList<>();
public Author(UUID id, String firstName, String lastName) {
this.id = id;
this.firstName = firstName;
this.lastName = lastName;
}
@ProjectionConstructor (7)
public Author(@IdProjection UUID id, String firstName, String lastName, List<Book> books) {
this( id, firstName, lastName );
this.books = books;
}
}
1 | First, let’s mark the Author type as an entity type.
In short, this implies the Author type has its own, distinct lifecycle (not tied to another type),
and that every Author instance carries an immutable, unique identifier. |
2 | Then, let’s use the @Indexed annotation to register our Author entity as part of the full text index. |
3 | And let’s end the mandatory configuration by defining a document identifier. |
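4 | The @FullTextField annotation declares a field in the index specifically tailored for full text search. In particular, we have to define an analyzer to split and analyze the tokens (~ words) - more on this later. |
5 | As you can see, we can define several fields for the same property. Here, we define a @KeywordField with a specific name. The main difference is that a keyword field is not tokenized (the string is kept as one single token) but can be normalized (i.e. filtered) - more on this later. This field is marked as sortable because we intend to use it for sorting our authors. |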
6 | The purpose of @IndexedEmbedded is to include the Book fields into the Author index.
In this case, we just use the default configuration: all the fields of the associated Book instances are included in the index (i.e. the title field).
@IndexedEmbedded also supports nested documents (using the structure = NESTED attribute), but we don’t need it here.
You can also specify the fields you want to embed in your parent index using the includePaths /excludePaths attributes if you don’t want them all. |
7 | We mark a (single) constructor as a @ProjectionConstructor ,
so that an Author instance can be reconstructed from the content of the index. |
Now that our authors are indexed, we will want to map books,
so that this @IndexedEmbedded
annotation actually embeds something.
Open the Book
class and include the content below.
package org.acme.hibernate.search.elasticsearch.model;
import java.util.Objects;
import java.util.UUID;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.KeywordField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.ProjectionConstructor;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.SearchEntity;
@SearchEntity (1)
public class Book {
@KeywordField (2)
public UUID id;
@FullTextField(analyzer = "english") (3)
public String title;
@ProjectionConstructor (4)
public Book(UUID id, String title) {
this.id = id;
this.title = title;
}
}
1 | We also mark the Book type as an entity type,
but we don’t use @Indexed , because we decided we don’t need a dedicated index for books. |
2 | We index the book’s ID, so it can be projected (see below). |
3 | We use a @FullTextField similar to what we did for Author but you’ll notice that the analyzer is different - more on this later. |
4 | Like Author , we mark a constructor as a @ProjectionConstructor ,
so that a Book instance can be reconstructed from the content of the index. |
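Analyzers and normalizers
Introduction
Analysis is a big part of full text search: it defines how text will be processed when indexing or building search queries.
The role of analyzers is to split the text into tokens (~ words) and apply filters to them (making it all lowercase and removing accents, for instance).
Normalizers are a special type of analyzer that keeps the input as a single token. They are especially useful for sorting or indexing keywords.
There are a lot of bundled analyzers, but you can also develop your own for your specific purposes.
For more information about the Elasticsearch analysis framework, see the Text analysis section of the Elasticsearch documentation.
Defining the analyzers used
When we added the Hibernate Search annotations to our entities, we defined the analyzers and normalizers used. Typically:
@FullTextField(analyzer = "english")
@FullTextField(analyzer = "name")
@KeywordField(name = "lastName_sort", sortable = Sortable.YES, normalizer = "sort")
We use:
-
an analyzer called
name
for person names,
-
an analyzer called
english
for book titles,
-
a normalizer called
sort
for our sort fields
but we haven’t set them up yet.
Let’s see how you can do it with Hibernate Search.
Setting up the analyzers
It is an easy task: we just need to create an implementation of ElasticsearchAnalysisConfigurer
(and configure Quarkus to use it, more on that later).
To fulfill our requirements, let’s create the following implementation: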
package org.acme.hibernate.search.elasticsearch.config;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurationContext;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurer;
import io.quarkus.hibernate.search.standalone.elasticsearch.SearchExtension;
@SearchExtension (1)
public class AnalysisConfigurer implements ElasticsearchAnalysisConfigurer {
@Override
public void configure(ElasticsearchAnalysisConfigurationContext context) {
context.analyzer("name").custom() (2)
.tokenizer("standard")
.tokenFilters("asciifolding", "lowercase");
context.analyzer("english").custom() (3)
.tokenizer("standard")
.tokenFilters("asciifolding", "lowercase", "porter_stem");
context.normalizer("sort").custom() (4)
.tokenFilters("asciifolding", "lowercase");
}
}
1 | Annotate the configurer implementation with the @SearchExtension qualifier
to tell Quarkus it should be used in Hibernate Search Standalone, for all Elasticsearch indexes (by default).
The annotation can also target a specific backend or index; see "Plugging in custom components" later in this guide. |
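2 | This is a simple analyzer separating the words on spaces, replacing any non-ASCII characters with their ASCII counterparts (and thus removing accents) and putting everything in lowercase. It is used in our examples for the authors’ names. |
3 | We are a bit more aggressive with this one and we include some stemming: we will be able to search for mystery and get a result even if the indexed input contains mysteries . It is definitely too aggressive for person names, but it is perfect for the book titles. |
4 | Here is the normalizer used for sorting. It is very similar to our first analyzer, except we don’t tokenize the words as we want one and only one token. |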
For more information about configuring analyzers, see this section of the reference documentation.
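To make the difference between the name analyzer and the sort normalizer more concrete, here is a minimal plain-Java sketch that only approximates the two chains configured above. It is not how Elasticsearch implements the standard tokenizer or the asciifolding filter (which cover many more cases), and the class and method names (AnalysisSketch, analyzeName, normalizeForSort) are hypothetical.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Plain-Java approximation of the analysis chains configured above.
// Assumption: accent stripping via Unicode decomposition stands in for "asciifolding".
class AnalysisSketch {

    // Rough stand-in for "asciifolding" + "lowercase": strip combining marks, then lowercase.
    static String foldAndLowercase(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Like the "name" analyzer: the input is split into several tokens, one per word.
    static List<String> analyzeName(String input) {
        return Arrays.stream(foldAndLowercase(input).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Like the "sort" normalizer: same filters, but the whole input stays one token.
    static String normalizeForSort(String input) {
        return foldAndLowercase(input);
    }

    public static void main(String[] args) {
        System.out.println(analyzeName("\u00C9mile Zola"));
        System.out.println(normalizeForSort("\u00C9mile Zola"));
    }
}
```

The analyzer output is what full-text predicates match against, one word at a time, while the single normalized token is what a sortable keyword field compares when ordering results.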
Implementing the REST service
Create the org.acme.hibernate.search.elasticsearch.LibraryResource
class:
package org.acme.hibernate.search.elasticsearch;
import java.util.ArrayList;
import java.util.UUID;
import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.DELETE;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.NotFoundException;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.PUT;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.core.MediaType;
import org.acme.hibernate.search.elasticsearch.model.Author;
import org.acme.hibernate.search.elasticsearch.model.Book;
import org.hibernate.search.mapper.pojo.standalone.mapping.SearchMapping;
import org.hibernate.search.mapper.pojo.standalone.session.SearchSession;
import org.jboss.resteasy.reactive.RestForm;
import org.jboss.resteasy.reactive.RestPath;
@Path("/library")
public class LibraryResource {
@Inject
SearchMapping searchMapping; (1)
@PUT
@Path("author")
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void addAuthor(@RestForm String firstName, @RestForm String lastName) {
try (var searchSession = searchMapping.createSession()) { (2)
Author author = new Author(UUID.randomUUID(), firstName, lastName, new ArrayList<>());
searchSession.indexingPlan().add(author); (3)
}
}
@GET
@Path("author/{id}")
public Author getAuthor(@RestPath UUID id) {
try (var searchSession = searchMapping.createSession()) {
return getAuthor(searchSession, id);
}
}
private Author getAuthor(SearchSession searchSession, UUID id) {
return searchSession.search(Author.class) (4)
.where(f -> f.id().matching(id))
.fetchSingleHit()
.orElseThrow(NotFoundException::new);
}
@POST
@Path("author/{id}")
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void updateAuthor(@RestPath UUID id, @RestForm String firstName, @RestForm String lastName) {
try (var searchSession = searchMapping.createSession()) {
Author author = getAuthor(searchSession, id); (5)
author.firstName = firstName;
author.lastName = lastName;
searchSession.indexingPlan().addOrUpdate(author); (5)
}
}
@DELETE
@Path("author/{id}")
public void deleteAuthor(@RestPath UUID id) {
try (var searchSession = searchMapping.createSession()) {
searchSession.indexingPlan().purge(Author.class, id, null); (6)
}
}
@PUT
@Path("author/{authorId}/book/")
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void addBook(@RestPath UUID authorId, @RestForm String title) {
try (var searchSession = searchMapping.createSession()) {
Author author = getAuthor(searchSession, authorId); (7)
author.books.add(new Book(UUID.randomUUID(), title)); // each book gets its own identifier, so deleteBook can target it
searchSession.indexingPlan().addOrUpdate(author);
}
}
@DELETE
@Path("author/{authorId}/book/{bookId}")
public void deleteBook(@RestPath UUID authorId, @RestPath UUID bookId) {
try (var searchSession = searchMapping.createSession()) {
Author author = getAuthor(searchSession, authorId); (7)
author.books.removeIf(book -> book.id.equals(bookId));
searchSession.indexingPlan().addOrUpdate(author);
}
}
}
1 | Inject a Hibernate Search mapping, the main entry point to Hibernate Search APIs. |
2 | Create a Hibernate Search session, which allows executing operations on the indexes. |
3 | To index a new Author, retrieve the session’s indexing plan and call add , passing the author instance in argument. |
4 | To retrieve an Author from the index, execute a simple search — more on search later — by identifier. |
5 | To update an Author, retrieve it from the index, apply changes,
retrieve the session’s indexing plan and call addOrUpdate , passing the author instance in argument. |
6 | To delete an Author by identifier, retrieve the session’s indexing plan
and call purge , passing the author class and identifier in argument. |
7 | Since books are "owned" by authors (they are duplicated for each author and their lifecycle is bound to their author’s), adding/deleting a book is simply an update to the author. |
Nothing groundbreaking here: just a few CRUD operations in a REST service, using Hibernate Search APIs.
The interesting part comes with the addition of a search endpoint.
In our LibraryResource
, we just need to add the following method (and a few import
s):
@GET
@Path("author/search")
public List<Author> searchAuthors(@RestQuery String pattern, (1)
@RestQuery Optional<Integer> size) {
try (var searchSession = searchMapping.createSession()) { (2)
return searchSession.search(Author.class) (3)
.where(f -> pattern == null || pattern.isBlank()
? f.matchAll() (4)
: f.simpleQueryString()
.fields("firstName", "lastName", "books.title").matching(pattern)) (5)
.sort(f -> f.field("lastName_sort").then().field("firstName_sort")) (6)
.fetchHits(size.orElse(20)); (7)
}
}
1 | Use the org.jboss.resteasy.reactive.RestQuery annotation type to avoid repeating the parameter name. |
2 | Create a Hibernate Search session, which allows executing operations on the indexes. |
3 | We indicate that we are searching for Author instances. |
4 | We create a predicate: if the pattern is empty, we use a matchAll() predicate. |
5 | If we have a valid pattern, we create a simpleQueryString() predicate on the firstName , lastName and books.title fields matching our pattern. |
6 | We define the sort order of our results. Here we sort by last name, then by first name. Note that we use the specific fields we created for sorting. |
7 | Fetch the size top hits, 20 by default. Obviously, paging is also supported. |
The Hibernate Search DSL supports a significant subset of the Elasticsearch predicates (match, range, nested, phrase, spatial…). Feel free to explore the DSL using autocompletion. When that’s not enough, you can always fall back to defining a predicate using JSON directly. |
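As an illustration of how a client might call the search endpoint, here is a minimal sketch using the JDK HTTP client API. It only builds the request. The base URL http://localhost:8080 is an assumption taken from the local setup used in this guide, and the class name SearchRequestSketch is hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Builds a GET request for the author/search endpoint of LibraryResource.
class SearchRequestSketch {

    // Assumption: the application runs locally on the default HTTP port.
    static final String BASE_URL = "http://localhost:8080";

    static HttpRequest searchAuthors(String pattern, int size) {
        // Query parameter names match the @RestQuery parameters of searchAuthors().
        String query = "pattern=" + URLEncoder.encode(pattern, StandardCharsets.UTF_8)
                + "&size=" + size;
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/library/author/search?" + query))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        System.out.println(searchAuthors("john irving", 5).uri());
    }
}
```

To actually send the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while the application is running.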
Automatic data initialization
For the purpose of this demonstration, let’s import an initial dataset.
Let’s add a few methods in LibraryResource
:
void onStart(@Observes StartupEvent ev) { (1)
// Index some test data if nothing exists
try (var searchSession = searchMapping.createSession()) {
if (0 < searchSession.search(Author.class) (2)
.where(f -> f.matchAll())
.fetchTotalHitCount()) {
return;
}
for (Author author : initialDataSet()) { (3)
searchSession.indexingPlan().add(author); (4)
}
}
}
private List<Author> initialDataSet() {
return List.of(
new Author(UUID.randomUUID(), "John", "Irving",
List.of(
new Book(UUID.randomUUID(), "The World According to Garp"),
new Book(UUID.randomUUID(), "The Hotel New Hampshire"),
new Book(UUID.randomUUID(), "The Cider House Rules"),
new Book(UUID.randomUUID(), "A Prayer for Owen Meany"),
new Book(UUID.randomUUID(), "Last Night in Twisted River"),
new Book(UUID.randomUUID(), "In One Person"),
new Book(UUID.randomUUID(), "Avenue of Mysteries"))),
new Author(UUID.randomUUID(), "Paul", "Auster",
List.of(
new Book(UUID.randomUUID(), "The New York Trilogy"),
new Book(UUID.randomUUID(), "Mr. Vertigo"),
new Book(UUID.randomUUID(), "The Brooklyn Follies"),
new Book(UUID.randomUUID(), "Invisible"),
new Book(UUID.randomUUID(), "Sunset Park"),
new Book(UUID.randomUUID(), "4 3 2 1"))));
}
1 | Add a method that will get executed on application startup. |
2 | Check whether there is already data in the index; if so, bail out. |
3 | Generate the initial dataset. |
4 | For each author, add it to the index. |
Configuring the application
As usual, we can configure everything in the Quarkus configuration file, application.properties
.
Let’s create a src/main/resources/application.properties
file with the following content:
quarkus.ssl.native=false (1)
quarkus.hibernate-search-standalone.mapping.structure=document (2)
quarkus.hibernate-search-standalone.elasticsearch.version=8 (3)
quarkus.hibernate-search-standalone.indexing.plan.synchronization.strategy=sync (4)
%prod.quarkus.hibernate-search-standalone.elasticsearch.hosts=localhost:9200 (5)
1 | We won’t use SSL, so we disable it to have a more compact native executable. |
2 | We need to tell Hibernate Search about the structure of our entities.
In this application we consider an indexed entity (the author) is the root of a "document": the author "owns" books it references through associations, which cannot be updated independently of the author. See |
3 | We need to tell Hibernate Search about the version of Elasticsearch we will use.
It is important because there are significant differences between Elasticsearch mapping syntax depending on the version.
Since the mapping is created at build time to reduce startup time, Hibernate Search cannot connect to the cluster to automatically detect the version.
Note that, for OpenSearch, you need to prefix the version with |
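4 | This means that we wait for the entities to be searchable before considering a write complete. On a production setup, the write-sync default will provide better performance. Using sync is especially important when testing, as you need the entities to be searchable immediately. |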
5 | For development and tests, we rely on Dev Services,
which means Quarkus will start an Elasticsearch cluster automatically.
In production mode, however,
we will want to start an Elasticsearch cluster manually,
which is why we provide Quarkus with this connection info in the prod profile (%prod. prefix). |
Because we rely on Dev Services, the Elasticsearch schema
will automatically be dropped and re-created on each application startup
in tests and dev mode
(unless
If for some reason you cannot use Dev Services, you can get similar behavior by setting the following property:
|
For more information about configuration of the Hibernate Search Standalone extension, refer to the Configuration Reference. |
Creating a frontend
Now let’s add a simple web page to interact with our LibraryResource
.
Quarkus automatically serves static resources located under the META-INF/resources
directory.
In the src/main/resources/META-INF/resources
directory, overwrite the existing index.html
file with the content from this
index.html file.
Time to play with your application
You can now interact with your REST service:
-
start your Quarkus application with:
CLI: quarkus dev
Maven: ./mvnw quarkus:dev
Gradle: ./gradlew --console=plain quarkusDev
-
open a browser to
http://localhost:8080/
-
search for authors or book titles (we initialized some data for you)
-
create new authors and books and search for them too
As you can see, all your updates are automatically synchronized to the Elasticsearch cluster.
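If you prefer to exercise the REST service without the web page, a minimal client sketch for the PUT /library/author endpoint of LibraryResource could look like the following. It only builds the request. The base URL is an assumption (local dev mode on port 8080), and the class name AddAuthorRequestSketch and the sample author are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Builds a form-encoded PUT request matching addAuthor(@RestForm firstName, @RestForm lastName).
class AddAuthorRequestSketch {

    // Assumption: the application runs locally on the default HTTP port.
    static final String BASE_URL = "http://localhost:8080";

    static String formBody(String firstName, String lastName) {
        return "firstName=" + URLEncoder.encode(firstName, StandardCharsets.UTF_8)
                + "&lastName=" + URLEncoder.encode(lastName, StandardCharsets.UTF_8);
    }

    static HttpRequest addAuthor(String firstName, String lastName) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/library/author"))
                // The endpoint consumes application/x-www-form-urlencoded.
                .header("Content-Type", "application/x-www-form-urlencoded")
                .PUT(HttpRequest.BodyPublishers.ofString(formBody(firstName, lastName)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = addAuthor("Toni", "Morrison");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because the guide configures the sync indexing plan synchronization strategy, an author added this way should be searchable as soon as the request completes.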
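Building a native executable
You can build a native executable with the usual command:
quarkus build --native
./mvnw install -Dnative
./gradlew build -Dquarkus.native.enabled=true
As usual with native executable compilation, this operation consumes a lot of memory. It might be safer to stop the two containers while you are building the native executable and start them again once you are done. |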
Running it is as simple as executing ./target/hibernate-search-standalone-elasticsearch-quickstart-1.0.0-SNAPSHOT-runner
.
You can then point your browser to http://localhost:8080/
and use your application.
The startup is a bit slower than usual: it is mostly due to us dropping and recreating the Elasticsearch mapping every time at startup. We also index some initial data. In a real life application, it is obviously something you won’t do on every startup. |
Dev Services (Configuration Free Datastores)
Quarkus supports a feature called Dev Services that allows you to start various containers without any config.
In the case of Elasticsearch this support extends to the default Elasticsearch connection.
What that means practically, is that if you have not configured quarkus.hibernate-search-standalone.elasticsearch.hosts
,
Quarkus will automatically start an Elasticsearch container when running tests or in dev mode,
and automatically configure the connection.
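When running the production version of the application, the Elasticsearch connection needs to be configured as normal. So if you want to include a production database config in your application.properties
and continue to use Dev Services, we recommend that you use the %prod.
profile to define your Elasticsearch settings.
Dev Services for Elasticsearch is currently unable to start multiple clusters concurrently, so it only works with the default backend of the default persistence unit: named persistence units or named backends won’t be able to take advantage of Dev Services for Elasticsearch. |
For more information, see the Dev Services for Elasticsearch guide.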
Programmatic mapping
If, for some reason, adding Hibernate Search annotations to entities is not possible,
mapping can be applied programmatically instead.
Programmatic mapping is configured through the ProgrammaticMappingConfigurationContext
that is exposed via a mapping configurer (StandalonePojoMappingConfigurer
).
A mapping configurer ( |
Below is an example of a mapping configurer that applies programmatic mapping:
package org.acme.hibernate.search.elasticsearch.config;
import org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurationContext;
import org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurer;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.TypeMappingStep;
import io.quarkus.hibernate.search.standalone.elasticsearch.SearchExtension;
@SearchExtension (1)
public class CustomMappingConfigurer implements StandalonePojoMappingConfigurer {
@Override
public void configure(StandalonePojoMappingConfigurationContext context) {
TypeMappingStep type = context.programmaticMapping() (2)
.type(SomeIndexedEntity.class); (3)
type.searchEntity(); (4)
type.indexed() (5)
.index(SomeIndexedEntity.INDEX_NAME); (6)
type.property("id").documentId(); (7)
type.property("text").fullTextField(); (8)
}
}
1 | Annotate the configurer implementation with the @SearchExtension qualifier
to tell Quarkus it should be used by Hibernate Search Standalone. |
2 | Access the programmatic mapping context. |
3 | Create mapping step for the SomeIndexedEntity type. |
4 | Define SomeIndexedEntity as an entity type for Hibernate Search. |
5 | Define the SomeIndexedEntity entity as indexed. |
6 | Provide an index name to be used for the SomeIndexedEntity entity. |
7 | Define the document id property. |
8 | Define a full-text search field for the text property. |
OpenSearch compatibility
Hibernate Search is compatible with both Elasticsearch and OpenSearch, but it assumes it is working with an Elasticsearch cluster by default.
To have Hibernate Search work with an OpenSearch cluster instead,
prefix the configured version with opensearch:
,
as shown below.
quarkus.hibernate-search-standalone.elasticsearch.version=opensearch:2.11
All other configuration options and APIs are exactly the same as with Elasticsearch.
You can find more information about compatible distributions and versions of Elasticsearch in this section of Hibernate Search’s reference documentation.
CDI integration
Injecting entry points
You can inject Hibernate Search’s main entry point, SearchMapping
, using CDI:
@Inject
SearchMapping searchMapping;
Plugging in custom components
The Quarkus extension for Hibernate Search Standalone will automatically
inject components annotated with @SearchExtension
into Hibernate Search.
The annotation can optionally target a specific
backend (@SearchExtension(backend = "nameOfYourBackend")
), index (@SearchExtension(index = "nameOfYourIndex")
),
or a combination of those
(@SearchExtension(backend = "nameOfYourBackend", index = "nameOfYourIndex")
),
when it makes sense for the type of the component being injected.
This feature is available for the following component types:
org.hibernate.search.engine.reporting.FailureHandler
-
A component that should be notified of any failure occurring in a background process (mainly index operations).
Scope: one per application.
See this section of the reference documentation for more information.
org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurer
-
A component used to configure the Hibernate Search mapping, in particular programmatically.
Scope: one or more per persistence unit.
See this section of this guide for more information.
org.hibernate.search.mapper.pojo.work.IndexingPlanSynchronizationStrategy
-
A component used to configure how to synchronize between application threads and indexing.
Scope: one per application.
Can also be set to built-in implementations through
quarkus.hibernate-search-standalone.indexing.plan.synchronization.strategy
.
See this section of the reference documentation for more information.
org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurer
-
A component used to configure full text analysis (e.g. analyzers, normalizers).
Scope: one or more per backend.
See this section of this guide for more information.
org.hibernate.search.backend.elasticsearch.index.layout.IndexLayoutStrategy
-
A component used to configure the Elasticsearch layout: index names, index aliases, …
Scope: one per backend.
Can also be set to built-in implementations through
quarkus.hibernate-search-standalone.elasticsearch.layout.strategy
.
See this section of the reference documentation for more information.
Offline startup
By default, Hibernate Search sends a few requests to the Elasticsearch cluster on startup. If the Elasticsearch cluster is not necessarily up and running when Hibernate Search starts, this could cause a startup failure.
To address this, you can configure Hibernate Search to not send any request on startup:
-
Disable Elasticsearch version checks on startup by setting the configuration property
quarkus.hibernate-search-standalone.elasticsearch.version-check.enabled
to false
.
-
Disable schema management on startup by setting the configuration property
quarkus.hibernate-search-standalone.schema-management.strategy
to none
.
Of course, even with this configuration, Hibernate Search still won’t be able to index anything or run search queries until the Elasticsearch cluster becomes accessible.
If you disable automatic schema creation by setting See this section of the reference documentation for more information. |
Loading
As an alternative to using Elasticsearch as a primary datastore, this extension can also be used to index entities coming from another datastore.
In such a scenario, you will need to set
|
In order to do this, entities need to be loaded from that other datastore, and such loading must be implemented explicitly.
You can refer to Hibernate Search’s reference documentation for more information about configuring loading:
-
To load entities from an external datasource in order to reindex them, see Mass loading strategy.
-
To load entities from an external datasource when returning search hits, see Selection loading strategy.
In Quarkus, the entity loader mentioned in Hibernate Search’s reference documentation
can be defined as a CDI bean,
but will still need to be attached to particular entities using |
Management endpoint
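Hibernate Search’s management endpoint is considered preview. In preview, backward compatibility and presence in the ecosystem are not guaranteed. Specific improvements might require changing configuration or APIs, or even storage formats, and plans to become stable are under way. Feedback is welcome on our mailing list or as issues in our GitHub issue tracker. |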
The Hibernate Search extension provides an HTTP endpoint to reindex your data through the management interface. By default, this endpoint is not available. It can be enabled through configuration properties as shown below.
quarkus.management.enabled=true (1)
quarkus.hibernate-search-standalone.management.enabled=true (2)
1 | Enable the management interface. |
2 | Enable Hibernate Search Standalone specific management endpoints. |
Once the management endpoints are enabled, data can be re-indexed via /q/hibernate-search/standalone/reindex
, where /q
is the default management root path
and /hibernate-search/standalone/
is the default Hibernate Search root management path.
It (/hibernate-search/standalone/
) can be changed via configuration property as shown below.
quarkus.hibernate-search-standalone.management.root-path=custom-root-path (1)
1 | Use a custom custom-root-path path for Hibernate Search’s management endpoint.
If the default management root path is used then the reindex path becomes /q/custom-root-path/reindex . |
This endpoint accepts POST
requests with application/json
content type only.
All indexed entities will be re-indexed if an empty request body is submitted.
In order to reindex an entity type, it needs to be configured for loading from an external source. Without that configuration, reindexing through the management endpoint (or through any other API) will fail. |
If only a subset of entities must be re-indexed or if there is a need to have a custom configuration of the underlying mass indexer then this information can be passed through the request body as shown below.
{
"filter": {
"types": ["EntityName1", "EntityName2", "EntityName3", ...] (1)
},
"massIndexer":{
"typesToIndexInParallel": 1 (2)
}
}
1 | An array of entity names that should be re-indexed. If unspecified or empty, all entity types will be re-indexed. |
2 | Sets the number of entity types to be indexed in parallel. |
The full list of possible filters and available mass indexer configurations is presented in the example below.
{
"filter": { (1)
"types": ["EntityName1", "EntityName2", "EntityName3", ...], (2)
"tenants": ["tenant1", "tenant2", ...] (3)
},
"massIndexer":{ (4)
"typesToIndexInParallel": 1, (5)
"threadsToLoadObjects": 6, (6)
"batchSizeToLoadObjects": 10, (7)
"cacheMode": "IGNORE", (8)
"mergeSegmentsOnFinish": false, (9)
"mergeSegmentsAfterPurge": true, (10)
"dropAndCreateSchemaOnStart": false, (11)
"purgeAllOnStart": true, (12)
"idFetchSize": 100, (13)
"transactionTimeout": 100000 (14)
}
}
1 | Filter object that limits the scope of reindexing. |
2 | An array of entity names that should be re-indexed. If unspecified or empty, all entity types will be re-indexed. |
3 | An array of tenant ids, in case of multi-tenancy. If unspecified or empty, all tenants will be re-indexed. |
4 | Mass indexer configuration object. |
5 | Sets the number of entity types to be indexed in parallel. |
6 | Sets the number of threads to be used to load the root entities. |
7 | Sets the batch size used to load the root entities. |
8 | Sets the cache interaction mode for the data loading tasks. |
9 | Whether each index is merged into a single segment after indexing. |
10 | Whether each index is merged into a single segment after the initial index purge, just before indexing. |
11 | Whether the indexes and their schema (if they exist) should be dropped and re-created before indexing. |
12 | Whether all entities are removed from the indexes before indexing. |
13 | Specifies the fetch size to be used when loading primary keys of objects to be indexed. |
14 | Specifies the timeout of transactions for loading ids and entities to be re-indexed.
Note all the properties in the JSON are optional, and only those that are needed should be used. |
For more detailed information on mass indexer configuration see the corresponding section of the Hibernate Search reference documentation.
Submitting the reindexing request will trigger indexing in the background. Mass indexing progress will appear in the application logs.
For testing purposes, it can be useful to know when indexing has finished. Adding the wait_for=finished query parameter to the URL makes the management endpoint return a chunked response that reports when indexing starts and then when it finishes.
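As a rough illustration of the request described above, the following sketch builds (but does not send) a reindexing request with the JDK's built-in HTTP client. The class name, the method name and the endpoint URL are hypothetical placeholders: the actual URL depends on how the management interface and the reindexing root path are configured in your application. The `Content-Type` header is also an assumption. Only two of the optional `massIndexer` properties are set, since every property in the JSON body is optional.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ReindexRequestSketch {

    // Only the properties that are needed are listed; everything else keeps its default.
    static final String BODY = """
            {
              "massIndexer": {
                "typesToIndexInParallel": 1,
                "purgeAllOnStart": true
              }
            }
            """;

    // endpointUrl is a placeholder supplied by the caller, not a documented path.
    static HttpRequest build(String endpointUrl, boolean waitForFinished) {
        // wait_for=finished asks the endpoint to report when indexing starts and finishes.
        String url = waitForFinished ? endpointUrl + "?wait_for=finished" : endpointUrl;
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json") // assumption: JSON body
                .POST(HttpRequest.BodyPublishers.ofString(BODY))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:9000/reindex-endpoint-placeholder", true);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `HttpClient.send(...)` and reading the response body line by line would be one way for a test to block until indexing has finished.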
Limitations
-
The Hibernate Search Standalone extension cannot be used in the same application as the Hibernate Search extension with Hibernate ORM.
See #39517 to track progress.
-
AWS request signing is not available at the moment, unlike in the Hibernate Search extension with Hibernate ORM.
See #26991 to track progress.
-
Optimistic concurrency control is not available at the moment.
See HSEARCH-5105 to track progress.
-
Elasticsearch/OpenSearch do not support transactions, so multi-document updates may fail partially and leave the index in an inconsistent state.
Unlike in the Hibernate Search extension with Hibernate ORM, this cannot be avoided through coordination with outbox polling, because that coordination requires Hibernate ORM and relies on the data being derived from a (transactional) relational database.
Further reading
If you are interested in learning more about Hibernate Search, the Hibernate team publishes extensive reference documentation, as well as a page listing other relevant resources.
FAQ
Why Elasticsearch only?
Hibernate Search supports both a Lucene backend and an Elasticsearch backend.
In the context of Quarkus and to build scalable applications, we thought the latter would make more sense. Thus, we focused our efforts on it.
We don’t have plans to support the Lucene backend in Quarkus for now, though there is an issue tracking progress on such an implementation in the Quarkiverse: quarkiverse/quarkus-hibernate-search-extras#180.
Configuration Reference for Hibernate Search Standalone
Configuration property fixed at build time - All other configuration properties are overridable at runtime
Configuration property | Type | Default
---|---|---
Whether Hibernate Search Standalone is enabled during the build. If Hibernate Search is disabled during the build, all processing related to Hibernate Search will be skipped, but it will not be possible to activate Hibernate Search at runtime. | boolean |
A bean reference to a component that should be notified of any failure occurring in a background process (mainly index operations). The referenced bean must implement the corresponding Hibernate Search interface. See this section of the reference documentation for more information. | string |
One or more bean references to the component(s) used to configure the Hibernate Search mapping, in particular programmatically. The referenced beans must implement the corresponding Hibernate Search interface. See Programmatic mapping for an example of how mapping configurers can be used to apply programmatic mappings. | list of string |
The structure of the Hibernate Search entity mapping. This must match the structure of the application model being indexed with Hibernate Search. | |
Whether Hibernate Search Standalone should be active at runtime. If Hibernate Search Standalone is not active, it won’t start with the application, and accessing the SearchMapping for search or other operations will not be possible. Note that if Hibernate Search Standalone is disabled (i.e. …) | boolean |
The schema management strategy, controlling how indexes and their schema are created, updated, validated or dropped on startup and shutdown. See this section of the reference documentation for the available values and more information. | SchemaManagementStrategyName |
How to synchronize between application threads and indexing, in particular when relying on (implicit) listener-triggered indexing on entity change. Defines how complete indexing should be before resuming the application thread. This property also accepts a bean reference to a custom implementation of the synchronization strategy. See this section of the reference documentation for the available values and more information. | string |
Root path for reindexing endpoints. This value will be resolved as a path relative to the root path of the management interface. | string |
If the management interface is turned on, the reindexing endpoints will be published under the management interface. This property enables that functionality when set to true. | boolean |
Configuration property | Type | Default
---|---|---
The version of Elasticsearch used in the cluster. As the schema is generated without a connection to the server, this item is mandatory. It doesn’t have to be the exact version. There’s no rule of thumb here as it depends on the schema incompatibilities introduced by Elasticsearch versions. In any case, if there is a problem, you will have an error when Hibernate Search tries to connect to the cluster. | ElasticsearchVersion |
A bean reference to the component used to configure the Elasticsearch layout: index names, index aliases, … The referenced bean must implement the corresponding Hibernate Search interface. See this section of the reference documentation for the available built-in implementations and more information. | string |
Path to a file in the classpath holding custom index settings to be included in the index definition when creating an Elasticsearch index. The provided settings will be merged with those generated by Hibernate Search, including analyzer definitions. When analysis is configured both through an analysis configurer and these custom settings, the behavior is undefined; it should not be relied upon. See this section of the reference documentation for more information. | string |
Path to a file in the classpath holding a custom index mapping to be included in the index definition when creating an Elasticsearch index. The file does not need to (and generally shouldn’t) contain the full mapping: Hibernate Search will automatically inject missing properties (index fields) in the given mapping. See this section of the reference documentation for more information. | string |
One or more bean references to the component(s) used to configure full text analysis (e.g. analyzers, normalizers). The referenced beans must implement the corresponding Hibernate Search interface. See Setting up the analyzers for more information. | list of string |
The list of hosts of the Elasticsearch servers. | list of string |
The protocol to use when contacting Elasticsearch servers. Set to "https" to enable SSL/TLS. | |
The username used for authentication. | string |
The password used for authentication. | string |
The timeout when establishing a connection to an Elasticsearch server. | |
The timeout when reading responses from an Elasticsearch server. | |
The timeout when executing a request to an Elasticsearch server. This includes the time needed to wait for a connection to be available, send the request and read the response. | |
The maximum number of connections to all the Elasticsearch servers. | int |
The maximum number of connections per Elasticsearch server. | int |
Defines if automatic discovery is enabled. | boolean |
Refresh interval of the node list. | |
The size of the thread pool assigned to the backend. Note that this number is per backend, not per index. Adding more indexes will not add more threads. As all operations happening in this thread pool are non-blocking, raising its size above the number of processor cores available to the JVM will not bring a noticeable performance benefit. The only reason to alter this setting would be to reduce the number of threads; for example, in an application with a single index with a single indexing queue, running on a machine with 64 processor cores, you might want to bring down the number of threads. Defaults to the number of processor cores available to the JVM on startup. | int |
Whether partial shard failures are ignored. | boolean |
Whether Hibernate Search should check the version of the Elasticsearch cluster on startup. | boolean |
The minimal Elasticsearch cluster status required on startup. | |
How long we should wait for the status before failing the bootstrap. | |
The number of indexing queues assigned to each index. Higher values will lead to more connections being used in parallel, which may lead to higher indexing throughput, but incurs a risk of overloading Elasticsearch, i.e. of overflowing its HTTP request buffers and tripping circuit breakers, leading to Elasticsearch giving up on some requests and resulting in indexing failures. | int |
The size of indexing queues. Lower values may lead to lower memory usage, especially if there are many queues, but values that are too low will reduce the likelihood of reaching the max bulk size and increase the likelihood of application threads blocking because the queue is full, which may lead to lower indexing throughput. | int |
The maximum size of bulk requests created when processing indexing queues. Higher values will lead to more documents being sent in each HTTP request sent to Elasticsearch, which may lead to higher indexing throughput, but incurs a risk of overloading Elasticsearch, i.e. of overflowing its HTTP request buffers and tripping circuit breakers, leading to Elasticsearch giving up on some requests and resulting in indexing failures. Note that raising this number above the queue size has no effect, as bulks cannot include more requests than are contained in the queue. | int |
Configuration property | Type | Default
---|---|---
Path to a file in the classpath holding custom index settings to be included in the index definition when creating an Elasticsearch index. The provided settings will be merged with those generated by Hibernate Search, including analyzer definitions. When analysis is configured both through an analysis configurer and these custom settings, the behavior is undefined; it should not be relied upon. See this section of the reference documentation for more information. | string |
Path to a file in the classpath holding a custom index mapping to be included in the index definition when creating an Elasticsearch index. The file does not need to (and generally shouldn’t) contain the full mapping: Hibernate Search will automatically inject missing properties (index fields) in the given mapping. See this section of the reference documentation for more information. | string |
One or more bean references to the component(s) used to configure full text analysis (e.g. analyzers, normalizers). The referenced beans must implement the corresponding Hibernate Search interface. See Setting up the analyzers for more information. | list of string |
The minimal Elasticsearch cluster status required on startup. | |
How long we should wait for the status before failing the bootstrap. | |
The number of indexing queues assigned to each index. Higher values will lead to more connections being used in parallel, which may lead to higher indexing throughput, but incurs a risk of overloading Elasticsearch, i.e. of overflowing its HTTP request buffers and tripping circuit breakers, leading to Elasticsearch giving up on some requests and resulting in indexing failures. | int |
The size of indexing queues. Lower values may lead to lower memory usage, especially if there are many queues, but values that are too low will reduce the likelihood of reaching the max bulk size and increase the likelihood of application threads blocking because the queue is full, which may lead to lower indexing throughput. | int |
The maximum size of bulk requests created when processing indexing queues. Higher values will lead to more documents being sent in each HTTP request sent to Elasticsearch, which may lead to higher indexing throughput, but incurs a risk of overloading Elasticsearch, i.e. of overflowing its HTTP request buffers and tripping circuit breakers, leading to Elasticsearch giving up on some requests and resulting in indexing failures. Note that raising this number above the queue size has no effect, as bulks cannot include more requests than are contained in the queue. | int |
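About the Duration format
To write duration values, use the standard Duration format. You can also use a simplified format, starting with a number:
In other cases, the simplified format is translated to the standard format for parsing.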
|
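To make the idea of a simplified duration format more concrete, here is a minimal sketch of how such values could be normalized before being handed to `java.time.Duration.parse`. The exact rules are an assumption made for this example (a bare number is read as seconds, an `ms` suffix as milliseconds, an `h`/`m`/`s` suffix is prefixed with `PT`, a `d` suffix with `P`); this is not the converter Quarkus actually uses, and the class and method names are hypothetical.

```java
import java.time.Duration;

public class DurationFormatSketch {

    // Hypothetical helper: converts a simplified value into a java.time.Duration.
    static Duration parse(String value) {
        String v = value.trim();
        if (v.matches("\\d+")) {
            // Assumption: a bare number is a number of seconds.
            return Duration.ofSeconds(Long.parseLong(v));
        }
        if (v.matches("\\d+ms")) {
            // Assumption: a number followed by "ms" is a number of milliseconds.
            return Duration.ofMillis(Long.parseLong(v.substring(0, v.length() - 2)));
        }
        if (v.matches("\\d+[hHmMsS]")) {
            // Assumption: hours, minutes and seconds get the ISO-8601 time prefix.
            return Duration.parse("PT" + v.toUpperCase());
        }
        if (v.matches("\\d+[dD]")) {
            // Assumption: days get the ISO-8601 period prefix.
            return Duration.parse("P" + v.toUpperCase());
        }
        // Anything else is expected to already be in the standard ISO-8601 form.
        return Duration.parse(v);
    }

    public static void main(String[] args) {
        System.out.println(parse("30"));
        System.out.println(parse("250ms"));
        System.out.println(parse("2h"));
    }
}
```

Values that are already in the standard form, such as `PT1M`, pass through unchanged, so the simplified notation remains a convenience on top of the standard one.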
About bean references
First, be aware that referencing beans in configuration properties is optional and, in fact, discouraged: you can achieve the same results by annotating your beans.
If you really do want to reference beans using a string value in configuration properties, know that the string is parsed; here are the most common formats:
Other formats are also accepted, but are only useful for advanced use cases. See this section of Hibernate Search’s reference documentation for more information. |