I have researched how to ignore the difference between those characters. However, PGroonga's current normalizer can't normalize them, because إ and ا are handled as different characters in Unicode.
We need to implement a dedicated normalizer, or a normalizer option, to realize this feature.
So, could you create a new issue in PGroonga on GitHub and write a detailed specification of this feature in it, please?
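To make the requested behavior concrete for the specification, here is a minimal sketch in Python. It uses only the standard `unicodedata` module, not Groonga's or PGroonga's normalizer API, and `fold_alef` is a hypothetical helper name. It assumes that "ignoring the difference" means folding hamza-carrying alef forms to bare alef.

```python
import unicodedata


def fold_alef(text: str) -> str:
    """Hypothetical helper: fold hamza-carrying alef forms to bare alef."""
    # NFD splits U+0625 (alef with hamza below) into U+0627 + U+0655.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping nonspacing marks (category Mn) leaves the bare alef.
    # Caution: this also drops other Arabic diacritics (harakat).
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", stripped)


print("\u0625" == "\u0627")             # False: distinct code points
print(fold_alef("\u0625") == "\u0627")  # True after folding
```

Whether other diacritics should also be dropped is exactly the kind of detail the specification should settle.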
@salmagomaa Thank you for creating that.
However, pgroonga.github.io is a repository for documentation.
So, I recommend creating this issue in the issues of the pgroonga repository.
@Oniry
The document about indexes with weight is not available yet, but if you are interested in this feature, the following example will help you understand it.
table_create Bigram TABLE_PAT_KEY ShortText \
--default_tokenizer TokenBigramSplitSymbolAlpha \
--normalizer NormalizerAuto
table_create Diaries TABLE_HASH_KEY ShortText
column_create Diaries title COLUMN_SCALAR ShortText
column_create Diaries content COLUMN_SCALAR Text
column_create Bigram titles COLUMN_INDEX|WITH_POSITION Diaries title
column_create Bigram contents COLUMN_INDEX|WITH_POSITION Diaries content
load --table Diaries
[
{
"_key": "2013-02-06",
"title": "groonga",
"content": "I found it that is a fast fulltext search engine!"
},
{
"_key": "2013-02-07",
"title": "mroonga",
"content": "I found mroonga that is a MySQL storage engine to use groonga!"
},
]
Then execute queries like this:
select Diaries \
--match_columns 'Bigram.titles * 10 || Bigram.contents' \
--query 'groonga' \
--output_columns 'title,_score' --sort_keys -_score --output_pretty yes
┌───────┬──────┐
│title  │_score│
├───────┼──────┤
│groonga│10    │
│mroonga│1     │
└───────┴──────┘
select Diaries \
--match_columns 'Bigram.titles || Bigram.contents' \
--query 'groonga' \
--output_columns 'title,_score' --sort_keys -_score --output_pretty yes
┌───────┬──────┐
│title  │_score│
├───────┼──────┤
│groonga│1     │
│mroonga│1     │
└───────┴──────┘
Without the * 10 weight, a record gets the same score whether groonga is contained in the title column or in the content column.
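The arithmetic behind those two results can be sketched in Python. This is a simplified model, not Groonga's actual scorer: it assumes each matched column contributes its weight times the number of hits, and `weighted_score` is a hypothetical helper name.

```python
def weighted_score(record, query, weights):
    """Sum weight * hit count over the columns (simplified model)."""
    needle = query.lower()
    return sum(
        weight * record[column].lower().count(needle)
        for column, weight in weights.items()
    )


# Same two records as in the load command above.
diaries = [
    {"title": "groonga",
     "content": "I found it that is a fast fulltext search engine!"},
    {"title": "mroonga",
     "content": "I found mroonga that is a MySQL storage engine to use groonga!"},
]

for weights in ({"title": 10, "content": 1}, {"title": 1, "content": 1}):
    print(weights, [weighted_score(d, "groonga", weights) for d in diaries])
```

Under this model the weighted query gives 10 and 1, and the unweighted query gives 1 and 1, which agrees with the _score columns above.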
&~
supports score.
Umm. The instructions are too rough. I couldn't reproduce your case with them.
I tried the following and it works well:
1. Install PostgreSQL 11.3 by installer: https://www.enterprisedb.com/thank-you-downloading-postgresql?anid=1256619
2. C:\Program Files\PostgreSQL\11 (I needed to enable administrator permission)
3. Run psql from the startup menu
4. Connect to the postgres database with the postgres user
5. Create the pgroonga_test database by CREATE DATABASE pgroonga_test;
6. \c pgroonga_test
7. CREATE EXTENSION pgroonga; (It doesn't report any error)
CREATE TABLE memos (
id integer,
content text
);
CREATE INDEX pgroonga_content_index ON memos USING pgroonga (content);
INSERT INTO memos VALUES (1, 'PostgreSQL is a relational database management system.');
INSERT INTO memos VALUES (2, 'Groonga is a fast full text search engine that supports all languages.');
INSERT INTO memos VALUES (3, 'PGroonga is a PostgreSQL extension that uses Groonga as index.');
INSERT INTO memos VALUES (4, 'There is groonga command.');
SET enable_seqscan = off;
SELECT * FROM memos WHERE content &@ 'engine';
-- id | content
-- ----+------------------------------------------------------------------------
-- 2 | Groonga is a fast full text search engine that supports all languages.
-- (1 row)
So, it is probably an issue specific to your environment.
SET enable_seqscan = off;
https://www.postgresql.org/docs/current/app-psql.html
\d[S+] [ pattern ]
base_roonga_test-# \d score_memos
Table « public.score_memos »
Colonne | Type | Collationnement | NULL-able | Par défaut
---------+---------+-----------------+-----------+------------
id | integer | | not null |
content | text | | |
Index :
"score_memos_pkey" PRIMARY KEY, btree (id)
"pgroonga_score_memos_content_index" pgroonga (content)
base_roonga_test-#
I got it.
You need to use the pgroonga_text_regexp_ops_v2 operator class (https://pgroonga.github.io/reference/#text-regexp-ops-v2) to search with a regular expression using an index.
CREATE INDEX pgroonga_score_memos_content_index ON public.score_memos USING pgroonga (content pgroonga_text_regexp_ops_v2);
See also: https://pgroonga.github.io/reference/operators/regular-expression-v2.html#usage
I have a question concerning the score. If I use an inner join, I get an SQL error message saying: ERROR: column "tableoid" does not exist
LINE 1: select f.fid_recherchefichiers, pgroonga_score(tableoid, cti...
^
HINT: There is a column named "tableoid" in table "p", but it cannot be referenced from this part of the query.
SQL state: 42703
Character: 48
Could you show the SQL (CREATE TABLE, CREATE INDEX, INSERT, SELECT, ...) to reproduce the case?