    Sutou Kouhei
    @kou
    I've fixed these links.
    weldomha
    @weldomha
    hi all, pgroonga 11 debian
    there is no package following this :
    doesn't follow debian directory tree for postgres
    any idea how to install pgroonga 11 on debian ??
    weldomha
    @weldomha
    without compiling the source
    Sutou Kouhei
    @kou
    I've uploaded deb packages for PostgreSQL 11: https://pgroonga.github.io/install/debian.html
    Zhanzhao (Deo) Liang
    @DeoLeung
    Hi guys, are there any hints about using RK search on Chinese with pinyin?
    Sutou Kouhei
    @kou
    You can't use prefix RK search for pinyin. You can just use prefix search.
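
A minimal sketch of the plain prefix search mentioned above. It assumes a hypothetical `brands` table whose `pinyin` column holds a romanized reading filled in by the application; neither the table nor the column comes from the conversation. The `&^` operator and the `pgroonga_text_term_search_ops_v2` operator class are the ones PGroonga documents for prefix search.

```sql
-- Hypothetical table: the application stores the pinyin reading itself.
CREATE TABLE brands (
  name text,
  pinyin text
);

-- Term search operator class, which supports the prefix search operator (&^).
CREATE INDEX brands_pinyin_prefix ON brands
  USING pgroonga (pinyin pgroonga_text_term_search_ops_v2);

INSERT INTO brands VALUES ('欧莱雅', 'oulaiya');
INSERT INTO brands VALUES ('百度', 'baidu');

-- Plain prefix search on the stored reading.
SELECT name FROM brands WHERE pinyin &^ 'ou';
```

The prefix RK operator (`&^~`) is built around Japanese kana readings, which is why only the plain prefix operator applies to pinyin.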
    Zhanzhao (Deo) Liang
    @DeoLeung
    Hi, I have an index question. I created a table with two columns, id and name, and around 3M records, with a pgroonga index on name. But the query doesn't use the index, it just performs a seq scan. Is there any technique to force it?
    -- Table Definition ----------------------------------------------
    
    CREATE TABLE datamarket.test_name (
        id integer PRIMARY KEY,
        name character varying
    );
    
    -- Indices -------------------------------------------------------
    
    CREATE UNIQUE INDEX test_name_pkey ON datamarket.test_name(id int4_ops);
    CREATE INDEX test_name_name_pgroonga ON datamarket.test_name(name pgroonga_text_full_text_search_ops_v2);
    
    explain SELECT _default_.name AS name
            FROM datamarket.test_name AS _default_
            WHERE _default_.name &@ '欧莱雅' limit 10;
    
    Limit  (cost=10000000000.00..10000005192.86 rows=10 width=38)
      ->  Seq Scan on test_name _default_  (cost=10000000000.00..10001984712.56 rows=3822 width=38)
            Filter: (name &@ '欧莱雅'::character varying)
    
    -- this doesn't seem to work either
    set enable_seqscan = off;
    show enable_seqscan; -- it does show off
    Zhanzhao (Deo) Liang
    @DeoLeung
    if I use ilike '%xx%', it uses the index, I tried other operators with no luck
    Zhanzhao (Deo) Liang
    @DeoLeung
    Casting the column to text works: alter table datamarket.test_name alter column name type text; So is it required that the column be of text type?
    Horimoto Yasuhiro
    @komainu8

    I think that if the type of the column is varchar, we need to specify the pgroonga_varchar_full_text_search_ops_v2 operator class.

    For example, how about the following?

    CREATE INDEX ${INDEX_NAME} ON ${TABLE_NAME} USING pgroonga (${COLUMN_NAME} pgroonga_varchar_full_text_search_ops_v2);
    Zhanzhao (Deo) Liang
    @DeoLeung
    i c, let me have a try
    Zhanzhao (Deo) Liang
    @DeoLeung
    it works, thx @komainu8, and I checked the documentation: the defaults for varchar & text are different :)
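
A sketch of the fix applied to the table from the question, assuming the existing index is dropped and recreated rather than changing the column type. The table and index names are the ones shown earlier in the conversation. The plan that EXPLAIN reports still depends on the planner and the data, so no output is shown.

```sql
-- Recreate the index with the operator class that matches a varchar column.
DROP INDEX datamarket.test_name_name_pgroonga;
CREATE INDEX test_name_name_pgroonga ON datamarket.test_name
  USING pgroonga (name pgroonga_varchar_full_text_search_ops_v2);

-- Check whether the planner now chooses the PGroonga index.
EXPLAIN SELECT name
  FROM datamarket.test_name
 WHERE name &@ '欧莱雅'
 LIMIT 10;
```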
    Zhanzhao (Deo) Liang
    @DeoLeung

    hi, we are experiencing a pg crash that we suspect may be caused by pgroonga. We found a lot of

    6790878 2019-09-03 10:35:19.987139|n|11768: pgroonga: initialize: <2.2.1>
    6790879 2019-09-03 10:35:19.989661|n|11654: io(base/23522/pgrn) collisions(1000/7): lock failed 1000 times
    6790880 2019-09-03 10:35:20.039066|n|11664: io(base/23522/pgrn) collisions(1000/7): lock failed 1000 times

    around the time of the crash

    also, will pgroonga create tcp connections? we see the connections keep increasing

    Sutou Kouhei
    @kou
    Did your PostgreSQL process crash?
    Zhanzhao (Deo) Liang
    @DeoLeung
    argh, not a process crash, it's cpu and memory increasing until the process is killed by the system
    as we run it in docker, the server is brought back automatically, then cpu/memory keep increasing again and it gets killed again
    Sutou Kouhei
    @kou
    It's the problem.
    You need to assign more resources to the docker container.
    Zhanzhao (Deo) Liang
    @DeoLeung
    i c, is there any hardware recommendation about it? we have a total db around 1TB and single table around 500MB, the docker has full 16c32g resource(normally without pgroonga it uses <50%)
    Zhanzhao (Deo) Liang
    @DeoLeung
    good, will give it a try, thx
    anthonydafc
    @anthonydafc
    Hello, could you explain to me how this regular expression works:
    select * from trecherchepages where (ftexte &~ 'BARRY') gives results
    select * from trecherchepages where (ftexte &~ 'SOPHIE') gives results
    select * from trecherchepages where (ftexte &~ 'BARRY') AND (ftexte &~ 'SOPHIE') gives no results
    Sutou Kouhei
    @kou
    Could you also show sample data?
    anthonydafc
    @anthonydafc
    BARRY ANDRE CORINNE RESOPHIE ERIC
    sorry i used select * from trecherchepages where (ftexte &~ '.*BARRY') AND (ftexte &~ '.*SOPHIE')
    Sutou Kouhei
    @kou
    select * from trecherchepages where (ftexte &~ 'BARRY') AND (ftexte &~ 'SOPHIE') matches a record that contains both BARRY and SOPHIE.
    Your sample data doesn't include such a record.
    anthonydafc
    @anthonydafc
    the word RESOPHIE ends with SOPHIE, is it not possible to search for that with &~?
    Sutou Kouhei
    @kou
    RESOPHIE doesn't contain BARRY.
    anthonydafc
    @anthonydafc
    i was looking for both words : (ftexte &~ 'BARRY') AND (ftexte &~ 'SOPHIE')
    Sutou Kouhei
    @kou
    If you need more help, could you create a new issue https://github.com/pgroonga/pgroonga/issues/new with SQL to reproduce your case?
    CREATE TABLE ...;
    CREATE INDEX ...;
    INSERT ...;
    SELECT ...;
    anthonydafc
    @anthonydafc
    is there any documentation for using &~ (regular expression) ?
    ふりだしにもどる
    @kenhys_twitter
    @anthonydafc there is documentation for &~: https://pgroonga.github.io/reference/operators/regular-expression-v2.html , but as kou mentioned, creating a new issue may be the appropriate way to get more help in this case.
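
A small reproduction sketch in the shape kou asked for, showing how `AND` combines two `&~` conditions. The `pages` table and its rows are hypothetical stand-ins for trecherchepages. The patterns are written in lower case on the assumption that the indexed text is normalized to lower case, as in the examples in the linked documentation.

```sql
-- Hypothetical stand-in for the real table.
CREATE TABLE pages (ftexte text);
CREATE INDEX pages_ftexte_regexp ON pages
  USING pgroonga (ftexte pgroonga_text_regexp_ops_v2);

INSERT INTO pages VALUES ('BARRY ANDRE');
INSERT INTO pages VALUES ('CORINNE RESOPHIE ERIC');
INSERT INTO pages VALUES ('BARRY ANDRE CORINNE RESOPHIE ERIC');

-- Both conditions are evaluated against the same row, so only a row whose
-- text contains both patterns can satisfy the AND.
SELECT * FROM pages WHERE ftexte &~ 'barry' AND ftexte &~ 'sophie';
```

If each word lives in its own row, no single row can satisfy both conditions. Under that assumption, this is consistent with the empty result reported above.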
    frankietaylor
    @frankietaylor
    I tried to create a simple pat table with UInt32 key type and it's failing to load data
    table_create --name Site --flags TABLE_PAT_KEY --key_type UInt32 --default_tokenizer TokenBigram --normalizer NormalizerAuto
    column_create --table Site --name title --type ShortText
    load --table Site
    [{"_key":"1","title":"This is test record 1!"}]
    
    or 
    load --table Site
    [{"_key":1,"title":"This is test record 1!"}]
    
    Error:
    [[-22,1570939085.112,0.03200006484985352,"[table][add][pat] failed to add: #<key 1 table:#<pat Site key:UInt32>>",[["grn_table_add","db.c",1642]]],0]
    If I switch to key_type ShortText, it works
    ふりだしにもどる
    @kenhys_twitter
    @frankietaylor If you use key_type UInt32, there is no need to specify --default_tokenizer TokenBigram --normalizer NormalizerAuto. As you mentioned, key_type ShortText works because the tokenizer and normalizer are meant to be applied to text types.
    Foxie
    @WinteryFox
    how can I match against any form of any verb? for instance I search for 支えている and it'll return anything containing any form of 支える
    and what about plurals?
    this doesn't seem to support most of the important features of psql's own full text search solutions
    (like plurals, verb forms etc)
    ふりだしにもどる
    @kenhys_twitter

    It seems that there is no simple way to support stemming for '支えている' at the moment. But there is a workaround that does something similar.

    1. Use the pgroonga_query_expand function to expand the query with synonyms. (ref https://pgroonga.github.io/reference/functions/pgroonga-query-expand.html )
    2. Use the expanded query for searching. This means that you can expand '支える' to '支える' OR '支えている' OR '支えた' and so on. The one drawback is that you need to maintain a synonyms table for the required verbs. (See the Synonym groups section in the above document)

    Here is the sample to do it:

    -- Synonym groups: each row lists the verb forms that should match each other.
    CREATE TABLE synonym_groups ( synonyms text[]);
    CREATE INDEX synonym_groups_synonyms ON synonym_groups USING pgroonga (synonyms pgroonga_text_array_term_search_ops_v2);
    INSERT INTO synonym_groups VALUES (ARRAY['支える', '支えている', '支えた']);

    -- Assumed definition of memos; it is not shown in the original message.
    CREATE TABLE memos (title text);
    INSERT INTO memos (title) VALUES ('〇〇を支える技術');
    INSERT INTO memos (title) VALUES ('〇〇を支えている技術');
    INSERT INTO memos (title) VALUES ('〇〇を支えた技術');
    INSERT INTO memos (title) VALUES ('〇〇を支えし技術');

    -- Expand '支える' into its synonym group, then search with the expanded query.
    SELECT * FROM memos WHERE title &@~ pgroonga_query_expand('synonym_groups', 'synonyms', 'synonyms', '支える');
    --  〇〇を支える技術
    --  〇〇を支えている技術
    --  〇〇を支えた技術

    Since maintaining many verbs this way is costly, supporting stemming queries in your application layer may be better.

    Foxie
    @WinteryFox
    wouldn't that mean running a query for every form of every verb though? seems like it'd be more tedious
    Foxie
    @WinteryFox
    I've found a solution that better suits my needs (https://github.com/oknj/textsearch_ja), thanks for the help anyway
    Horimoto Yasuhiro
    @komainu8

    We added a new option (use_base_form) to the tokenizer (TokenMecab).
    With this option, we can match against any form of any verb.

    For example, if we search for "支えた" using this option, "支える" is also hit.

    This option can be used as below from the next release.

    CREATE EXTENSION pgroonga;
    
    CREATE TABLE memos (
      id integer,
      content text
    );
    
    CREATE INDEX base_form_index on memos
      USING pgroonga (content)
      WITH (tokenizer='TokenMecab("use_base_form", true)');
    
    INSERT INTO memos VALUES (1, '支える');
    INSERT INTO memos VALUES (2, '支えた');
    
    SELECT * FROM memos WHERE content &@ '支える';
    
    --  id | content 
    -- ----+---------
    --   1 | 支える
    --   2 | 支えた
    -- (2 rows)