Tag Archives: shell

Re-Index With Elasticsearch

When dealing with indices, it’s inevitable that mapped fields will eventually need to change. For example, in a firewall log index, the default dynamic mapping stored a field like “RepeatCount” as text instead of an integer. To fix this, first write an ingest pipeline (using Kibana) to convert the field from text to integer:

PUT _ingest/pipeline/string-to-long
{
  "description": "convert RepeatCount field from string into long",
  "processors": [
    {
      "convert": {
        "field": "RepeatCount",
        "type": "long",
        "ignore_missing": true
      }
    }
  ]
}
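
Before reindexing anything, the pipeline can be tested in Kibana with the _simulate API. Here is a quick sketch, feeding it a sample document where “RepeatCount” arrives as a string (the value “3” is just an example); the response should show the field back as a number.

POST _ingest/pipeline/string-to-long/_simulate
{
  "docs": [
    {
      "_source": {
        "RepeatCount": "3"
      }
    }
  ]
}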

Next, run a _reindex POST to copy the old index into a new one, applying the pipeline for the conversion along the way:

POST _reindex
{
  "source": {
    "index": "fwlogs-2019.02.01"
  },
  "dest": {
    "index": "fwlogs-2019.02.01-v2",
    "pipeline": "string-to-long"
  }
}
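
To confirm the conversion actually took effect, the field mapping on the new index can be checked (same example index name as above); it should now report “RepeatCount” as long rather than text:

GET fwlogs-2019.02.01-v2/_mapping/field/RepeatCount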

If there are multiple indices, a shell script is the easiest way to work through each index systematically, e.g. “fwlogs-2019.02.01”, “fwlogs-2019.02.02”, and so on.

#!/bin/sh
# Reindex each index named in rlist.txt into a new "-v2" index,
# running the string-to-long pipeline along the way.
LIST=`cat rlist.txt`
for index in $LIST; do
  curl -H "Content-Type: application/json" --user elastic:password \
    -XPOST "https://mysearch.domain.net:9200/_reindex?pretty" -d'{
    "source": {
      "index": "'$index'"
    },
    "dest": {
      "index": "'$index'-v2",
      "pipeline": "string-to-long"
    }
  }'
done
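
The rlist.txt file can be typed by hand, but it can also be generated from the cluster itself. Here is a sketch using the _cat/indices API (same host and credentials as above) to write every matching index name, one per line; note it should be run before any “-v2” indices exist, or those would match the pattern too.

#!/bin/sh
# Write the names of all matching indices, one per line, into rlist.txt
curl --user elastic:password \
  "https://mysearch.domain.net:9200/_cat/indices/fwlogs-2019.02.*?h=index" > rlist.txt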

Finally, clean up the old indices by deleting them. It’s tempting to use Kibana to DELETE fwlogs-2019.02*, but beware: the new indices differ only by the “-v2” suffix, so they would match the wildcard and be deleted as well. Instead, use a shell script that deletes only the names explicitly listed in the txt file.

#!/bin/sh
# Delete only the indices named in rlist.txt
LIST=`cat rlist.txt`
for index in $LIST; do
  curl --user elastic:password -XDELETE "https://mysearch.domain.net:9200/$index"
done
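
Before running the delete loop above, it may be worth confirming that each “-v2” index holds the same number of documents as its original. A rough sketch using the _cat/count API against the same rlist.txt:

#!/bin/sh
# Print original vs. reindexed document counts side by side for each index
LIST=`cat rlist.txt`
for index in $LIST; do
  OLD=`curl -s --user elastic:password "https://mysearch.domain.net:9200/_cat/count/$index?h=count"`
  NEW=`curl -s --user elastic:password "https://mysearch.domain.net:9200/_cat/count/$index-v2?h=count"`
  echo "$index: $OLD -> $NEW"
done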

Listing Memory Usage by Process

A question I am asked often: “Which processes are using too much memory?”  I generally use top and figure it out manually, but there is a better way: the Solaris pmap command gives a good estimate of each process’s memory usage.  Brandon Hutchinson has a shell script that produces a nice report; I modified it a little to add a column for the process owner.

#!/bin/sh
/usr/bin/printf "%-6s %-9s %-13s %s\n" "PID" "Total" "User" "Command"
/usr/bin/printf "%-6s %-9s %-13s %s\n" "---" "-----" "----" "-------"
for PID in `/usr/bin/ps -ef  | /usr/bin/awk '$2 ~ /[0-9]+/ { print $2 }'`
do
   USER=`/usr/bin/ps -o user -p $PID | /usr/bin/tail -1`
   CMD=`/usr/bin/ps -o comm -p $PID | /usr/bin/tail -1`
   # Avoid "pmap: cannot examine 0: system process"-type errors
   # by redirecting STDERR to /dev/null
   TOTAL=`/usr/bin/pmap $PID 2>/dev/null | /usr/bin/tail -1 | \
   /usr/bin/awk '{ print $2 }'`
   [ -n "$TOTAL" ] && /usr/bin/printf "%-6s %-9s %-13s %s\n" "$PID" "$TOTAL" "$USER" "$CMD"
done | /usr/bin/sort -rn -k2

Note: this script needs to run as “root” for pmap to have permission to examine every process.

Output looks something like this:

PID    Total     User      Command
---    -----     ----      -------
694    25240K    root      /opt/RICHPse/bin/se.sparcv9.5.9
696    5208K     root      /usr/dt/bin/dtlogin
613    4992K     root      /opt/CA/BABcmagt/caagentd
326    4512K     smmsp     /usr/lib/sendmail
260    4440K     root      /usr/sbin/syslogd
269    2440K     root      /usr/sbin/cron
196    2360K     root      /usr/sbin/keyserv
193    2352K     root      /usr/sbin/rpcbind
103    2336K     root      /usr/lib/sysevent/syseventd
235    2224K     root      /usr/lib/nfs/lockd
206    2184K     root      /usr/lib/netsvc/yp/ypbind
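
For a single known PID, the same total can be read straight from pmap without the script; the last line of its output carries the total mapped size (the PID below is just an example from the listing above):

/usr/bin/pmap 694 | /usr/bin/tail -1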