Analyzers in ElasticSearch not working

- March 15, 2014

I am using Elasticsearch to store tweets that I receive from the Twitter Streaming API. Before storing them I'd like to apply an English stemmer to the tweet content, and to do that I'm trying to use Elasticsearch analyzers, with no luck.

This is the current template I am using:

PUT _template/twitter
{
  "template": "139*",
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "english": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "en_stemmer", "stop_english", "asciifolding"]
          }
        },
        "filter": {
          "stop_english": {
            "type": "stop",
            "stopwords": ["_english_"]
          },
          "en_stemmer": {
            "type": "stemmer",
            "name": "english"
          }
        }
      }
    }
  },
  "mappings": {
    "tweet": {
      "_timestamp": {
        "enabled": true,
        "store": true,
        "index": "analyzed"
      },
      "_index": {
        "enabled": true,
        "store": true,
        "index": "analyzed"
      },
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        },
        "text": {
          "type": "string",
          "analyzer": "english"
        }
      }
    }
  }
}

When I start streaming and the index is created, all the mappings I've defined seem to apply correctly, but the text is stored exactly as it comes from Twitter, completely raw. The index metadata shows:

"settings" : {
    "index" : {
        "uuid" : "xiokecoysaezorr7pjetng",
        "analysis" : {
            "filter" : {
                "en_stemmer" : {
                    "type" : "stemmer",
                    "name" : "english"
                },
                "stop_english" : {
                    "type" : "stop",
                    "stopwords" : [
                        "_english_"
                    ]
                }
            },
            "analyzer" : {
                "english" : {
                    "type" : "custom",
                    "filter" : [
                        "lowercase",
                        "en_stemmer",
                        "stop_english",
                        "asciifolding"
                    ],
                    "tokenizer" : "standard"
                }
            }
        },
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
            "created" : "1010099"
        }
    }
},
"mappings" : {
    "tweet" : {
        [...]
        "text" : {
            "analyzer" : "english",
            "type" : "string"
        },
        [...]
    }
}

What am I doing wrong? The analyzers seem to be applied correctly, but nothing is happening :/

Thank you!

PS: This is the search query I use to check that the analyzer is not being applied:

curl -XGET 'http://localhost:9200/_all/_search?pretty' -d '{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "query_string": {
                "query": "_index:1397574496990"
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "match_all": {}
            },
            {
              "exists": {
                "field": "geo.coordinates"
              }
            }
          ]
        }
      }
    }
  },
  "fields": [
    "geo.coordinates",
    "text"
  ],
  "size": 50000
}'

This should return the stemmed text as one of the fields, but the response is:

{
   "took": 29,
   "timed_out": false,
   "_shards": {
      "total": 47,
      "successful": 47,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0.97402453,
      "hits": [
         {
            "_index": "1397574496990",
            "_type": "tweet",
            "_id": "456086643423068161",
            "_score": 0.97402453,
            "fields": {
               "geo.coordinates": [
                  -118.21122533,
                  33.79349318
               ],
               "text": [
                  "happy turtle tuesday ! week crawling wednesday morning 🌊🐢🐢🐢☀️#turtles… http://t.co/wavmcxnf76"
               ]
            }
         },
         {
            "_index": "1397574496990",
            "_type": "tweet",
            "_id": "456086701451259904",
            "_score": 0.97333175,
            "fields": {
               "geo.coordinates": [
                  -81.017636,
                  33.998741
               ],
               "text": [
                  "tuesday twins day on here, apparently (it's far occurrence) #tuesdaytwinsday… http://t.co/umhtp6sox6"
               ]
            }
         }
      ]
   }
}

The text field is exactly the same as it came from Twitter (I'm using the Streaming API). I expected the text fields to be stemmed, since the analyzer is applied.

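Analyzers don't affect the way data is stored. No matter which analyzer you use, you will get the same text back from the source and from stored fields. The analyzer only determines the terms that go into the index and how the query text is processed when you search. So if you search for text:twin and find records containing the word twins, you know the stemmer was applied.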
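To make the distinction between stored text and indexed terms concrete, here is a minimal Python sketch of the idea. It is a toy model, not Elasticsearch code: `ToyIndex`, `analyze`, `toy_stem`, and the tiny stopword list are all hypothetical names invented for illustration, and `toy_stem` only strips a trailing "s" rather than running a real English stemmer. The filter order mirrors the one in the template above (lowercase, stemmer, stop, ASCII folding).

```python
import unicodedata

# Tiny illustrative stopword list (an assumption, not Elasticsearch's _english_ set).
STOPWORDS = {"a", "an", "and", "is", "on", "the", "to"}


def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def ascii_fold(token):
    """Drop diacritics, roughly what an ASCII-folding filter does."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Split on whitespace, then lowercase -> stem -> drop stopwords -> fold."""
    tokens = []
    for raw in text.split():
        token = raw.strip(".,!?()#'\"").lower()
        if not token:
            continue
        token = toy_stem(token)
        if token in STOPWORDS:
            continue
        tokens.append(ascii_fold(token))
    return tokens


class ToyIndex:
    def __init__(self):
        self.sources = {}   # doc id -> original text, never modified
        self.postings = {}  # analyzed term -> set of doc ids

    def index(self, doc_id, text):
        self.sources[doc_id] = text  # stored exactly as received
        for term in analyze(text):   # only the index sees analyzed terms
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        """Analyze the query the same way and return the stored originals."""
        hits = set()
        for term in analyze(query):
            hits |= self.postings.get(term, set())
        return [self.sources[doc_id] for doc_id in sorted(hits)]


idx = ToyIndex()
idx.index(1, "Tuesday is twins day on here")
idx.index(2, "Happy turtle Tuesday!")
results = idx.search("twin")
print(results)
```

In this sketch a search for `twin` matches the document containing `twins`, yet the text handed back is the unmodified original. That is the same behaviour seen in the response above: the raw tweet is returned even though stemmed terms are what the search matches against.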
