
Re-Index With Elasticsearch


When dealing with indices, it’s inevitable that the mapped fields will need to change at some point. For example, in a firewall log, the default dynamic mappings stored a field like “RepeatCount” as text instead of an integer. To fix this, first write an ingest pipeline (via the Kibana console) to convert the field from text to integer:

PUT _ingest/pipeline/string-to-long
{
  "description": "convert RepeatCount field from string into long",
  "processors": [
      {
        "convert": {
          "field": "RepeatCount",
          "type": "long",
          "ignore_missing": true
        }
      }
    ]
}
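
Before wiring the pipeline into a reindex, it can be sanity-checked with the pipeline _simulate API. The sample document below is made up purely for illustration:

```
POST _ingest/pipeline/string-to-long/_simulate
{
  "docs": [
    {
      "_source": {
        "RepeatCount": "42"
      }
    }
  ]
}
```

The response should show "RepeatCount" coming back as the number 42 rather than a string.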

Next, run a POST _reindex to copy the old index into a new one, applying the pipeline for conversion along the way:

POST _reindex 
{ 
   "source": { 
      "index": "fwlogs-2019.02.01" 
   }, 
   "dest": { 
      "index": "fwlogs-2019.02.01-v2", 
      "pipeline": "string-to-long"
   } 
}
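
Afterwards, the field mapping on the new index can be checked to confirm the conversion took effect; with dynamic mapping, the converted values should now map as long:

```
GET fwlogs-2019.02.01-v2/_mapping/field/RepeatCount
```

For full control over the destination mapping, the new index can also be created with an explicit mapping before running the reindex.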

If there are multiple indices, such as “fwlogs-2019.02.01”, “fwlogs-2019.02.02”, and so on, it’s easier to use a shell script to work through each index systematically.

#!/bin/sh
# Read the index names, one per line, from rlist.txt
while read -r index; do
  curl -H 'Content-Type: application/json' --user elastic:password \
    -XPOST "https://mysearch.domain.net:9200/_reindex?pretty" -d'{
    "source": {
      "index": "'"$index"'"
    },
    "dest": {
      "index": "'"$index"'-v2",
      "pipeline": "string-to-long"
    }
  }'
done < rlist.txt
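
The rlist.txt file itself can be built from the _cat/indices API. This is a sketch assuming the same example host, credentials, and index naming as above:

```shell
#!/bin/sh
# Keep only the fwlogs-2019.02.* dailies and drop any "-v2" copies
# that have already been reindexed.
filter_fwlogs() {
  grep '^fwlogs-2019\.02\.' | grep -v -- '-v2$' | sort
}

# "_cat/indices?h=index" prints one bare index name per line.
curl -s --connect-timeout 5 --user elastic:password \
  "https://mysearch.domain.net:9200/_cat/indices?h=index" \
  | filter_fwlogs > rlist.txt
```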

Finally, clean up the old indices by deleting them. It’s tempting to use Kibana to DELETE fwlogs-2019.02*, but beware: the new indices also match that wildcard (they differ only by the “-v2” suffix) and would be deleted along with the old ones. Instead, use a shell script that deletes only the names explicitly listed in the txt file.

#!/bin/sh
# Read the index names, one per line, from rlist.txt
while read -r index; do
  curl --user elastic:password -XDELETE "https://mysearch.domain.net:9200/$index"
done < rlist.txt
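
Before (and after) the cleanup, the _cat API is a quick way to double-check which indices actually match the pattern, and to compare doc counts between each original and its “-v2” copy:

```
GET _cat/indices/fwlogs-2019.02*?v
```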

Installing Elasticsearch Client on PHP

For a simple demonstration of using Elasticsearch programmatically in a web app, PHP is a practical starting point for learning how to connect and display search results. The quick-start instructions on the Elastic site serve as a guideline; the steps below expand on that out-of-the-box setup to enable Elasticsearch support in PHP.

First, install the PHP Curl support for Apache on Linux:

apt-get -y install php-curl

Set up PHP Composer in the doc-root folder, as outlined in the elasticsearch-php GitHub readme, then install the PHP libraries via Composer:

curl -s https://getcomposer.org/installer | php
php composer.phar init
php composer.phar install --no-dev

Be sure to add the dependency package “elasticsearch/elasticsearch”, accepting the latest version as the default. The development packages can be skipped, as they aren’t necessary here.

Then, edit the composer.json file to include the directive:

   "require": {
            "elasticsearch/elasticsearch": "~6.0"
   }

Finally, create a test page to see if it can connect to the Elasticsearch server:

<?php

require 'vendor/autoload.php';

use Elasticsearch\ClientBuilder;

$hosts = [
   'http://myelasticsearchhost:9200'
];

$client = ClientBuilder::create()
   ->setHosts($hosts)
   ->build();

$params = [
    'index' => 'myindexname',
    'body' => [
        'query' => [
            'match' => [
                'post_title' => 'elasticsearch'
            ]
        ]
    ]
];

$response = $client->search($params);

$totalhits = $response['hits']['total'];
echo "We have $totalhits total hits\n";

echo "<P>The hits are the following:</P>";
$result = [];
// _search returns at most 10 hits by default; iterate over the hits
// actually returned rather than counting up to the total
foreach ($response['hits']['hits'] as $i => $hit)
{
        $result[$i] = $hit['_source'];
}

foreach ($result as $key => $value)
{
        echo $value['post_title'], "<br>";
}

?>

Output will look something like this:

We have 2 total hits

The hits are the following:
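
The same query can be cross-checked directly from the Kibana console, using the same example index and field names as in the PHP snippet:

```
GET myindexname/_search
{
  "query": {
    "match": {
      "post_title": "elasticsearch"
    }
  }
}
```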



Update Nov/2019: Since Elasticsearch’s basic license now includes basic username/password security, it’s advisable to set credentials up. It’s a straightforward addition:

$hosts = [
   [
      'host' => 'myelasticsearchhost',
      'port' => '9200',
      'scheme' => 'http',
      'user' => 'myElasticUser',
      'pass' => 'myPassword'
   ]
];

Edit November 6, 2020: After an upgrade or re-install of the OS to a newer version (such as from Ubuntu 16.x to 18.x), a different version of PHP, and thus of its cURL binding, may be installed. For example, running php -v reveals:

PHP 7.2.34-8+ubuntu18.04.1+deb.sury.org+1 (cli) (built: Oct 31 2020 16:57:15) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
with Zend OPcache v7.2.34-8+ubuntu18.04.1+deb.sury.org+1, Copyright (c) 1999-2018, by Zend Technologies

Since this is PHP version 7.2, install the matching cURL PHP library: apt-get install php7.2-curl

Recovering Kibana After Upgrade

Kibana

Elastic is doing rapid development on Elasticsearch. As of this writing, they’re on version 6.5.3, when 6.5.2 was released less than two weeks ago! Luckily, with a package install from a repo (such as RPM on CentOS/RHEL), upgrading to minor versions is relatively painless. However, it’s not without pitfalls. For example, an upgrade from version 6.4.x to the latest 6.5.x can leave Kibana unable to start due to incompatible indices.

To alleviate this, shut down the Kibana service, then check the recovery status of the .kibana index in Elasticsearch:

curl --user elasticuser:userpassword -s https://search.mydomain.net:9200/.kibana/_recovery?pretty

If it’s part of a big cluster with a lot of shards, the recovery can be sped up by temporarily dropping the replicas:

curl --user elasticuser:userpassword -H 'Content-Type: application/json' -XPUT 'https://search.mydomain.net:9200/.kibana/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'
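
If the replicas were dropped this way, remember to restore the setting once recovery completes and Kibana is back, for example to one replica via the console:

```
PUT .kibana/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```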

Give it a few minutes (depending on how much data is there) and then start up the Kibana service.  If, for some reason, it still takes a long time, there may be a problem with the migration process.  The kibana.log may show something like this:

{"type":"log","@timestamp":"2018-12-12T17:17:40Z","tags":["warning","stats-collection"],"pid":15141,"message":"Unable to fetch data from kibana_settings collector"}
{"type":"log","@timestamp":"2018-12-12T17:17:42Z","tags":["reporting","warning"],"pid":15141,"message":"Enabling the Chromium sandbox provides an additional layer of protection."}
{"type":"log","@timestamp":"2018-12-12T17:17:42Z","tags":["info","migrations"],"pid":15141,"message":"Creating index .kibana_2."}
{"type":"log","@timestamp":"2018-12-12T17:17:44Z","tags":["warning","migrations"],"pid":15141,"message":"Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_2 and restarting Kibana."}

Shut down Kibana again, and delete the partially created .kibana_2 index:

curl --user elasticuser:userpassword -XDELETE https://search.mydomain.net:9200/.kibana_2

Start the Kibana service again and give it a few more minutes to perform its house-keeping.  Kibana should then be up and running.
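
To confirm the migration completed, list the .kibana indices from the console; the freshly migrated generation (e.g. .kibana_2) should show a green status:

```
GET _cat/indices/.kibana*?v
```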