Intermittently losing Cortex #739

gonrada · 2018-10-01T17:07:35Z

Request Type

Bug

Work Environment

Question	Answer
OS version (server)	Ubuntu 16.04
OS version (client)	Windows 7x64
TheHive version / git hash	3.1.0-1
Package Type	Docker
Cortex version	2.1.0-0.1RC1

Problem Description

TheHive intermittently loses its connection to Cortex. After a few minutes, another message reports that the connection is back up. While the connection between TheHive and Cortex is down, I can still log in to Cortex via the web GUI and run jobs. Both TheHive and Cortex run in Docker containers on the same machine, and the CPU load is not high when this happens. I've checked the logs of the TheHive container and see no errors. I'm not sure where to look for more information to debug this.

nadouani · 2018-10-02T08:09:36Z

Well, these notifications are displayed when TheHive sees a status change in the configured Cortex instances. This status is polled every minute.

Can you provide the result of

curl -H 'Authorization: Bearer THEHIVE-API-KEY' 'http://THEHIVE-SERVER:THEHIVE-PORT/api/status'

gonrada · 2018-10-02T12:51:17Z

I ran that and got the following:

{
    "config": {
        "authType": [
            "key",
            "local",
            "ad"
        ],
        "capabilities": [
            "authByKey",
            "changePassword",
            "setPassword"
        ],
        "protectDownloadsWith": "malware",
        "ssoAutoLogin": false
    },
    "connectors": {
        "cortex": {
            "enabled": true,
            "servers": [
                {
                    "name": "cortex1",
                    "status": "OK",
                    "version": "2.1.0-RC1"
                }
            ],
            "status": "OK"
        },
        "misp": {
            "enabled": true,
            "servers": [
                {
                    "name": "misp",
                    "purpose": "ExportOnly",
                    "status": "ERROR",
                    "version": ""
                }
            ],
            "status": "ERROR"
        }
    },
    "health": {
        "elasticsearch": "WARNING"
    },
    "versions": {
        "Elastic4Play": "1.6.2",
        "Elastic4s": "5.6.6",
        "ElasticSearch": "5.6.9",
        "Play": "2.6.18",
        "TheHive": "3.1.0"
    }
}

I then waited a couple of minutes and ran it again:

{
    "config": {
        "authType": [
            "key",
            "local",
            "ad"
        ],
        "capabilities": [
            "authByKey",
            "changePassword",
            "setPassword"
        ],
        "protectDownloadsWith": "malware",
        "ssoAutoLogin": false
    },
    "connectors": {
        "cortex": {
            "enabled": true,
            "servers": [
                {
                    "name": "cortex1",
                    "status": "ERROR",
                    "version": ""
                }
            ],
            "status": "ERROR"
        },
        "misp": {
            "enabled": true,
            "servers": [
                {
                    "name": "misp",
                    "purpose": "ExportOnly",
                    "status": "ERROR",
                    "version": ""
                }
            ],
            "status": "ERROR"
        }
    },
    "health": {
        "elasticsearch": "WARNING"
    },
    "versions": {
        "Elastic4Play": "1.6.2",
        "Elastic4s": "5.6.6",
        "ElasticSearch": "5.6.9",
        "Play": "2.6.18",
        "TheHive": "3.1.0"
    }
}

nadouani · 2018-10-02T12:57:18Z

OK, so the UI is behaving as expected. Now the question is: why is your TheHive only intermittently reaching your Cortex?

Do you have any logs in /var/log/thehive/application.log?

To-om · 2018-10-02T13:00:53Z

@gonrada Do you use docker-compose? If so, what does your docker-compose file look like?

gonrada · 2018-10-02T13:41:43Z

@To-om

version: "2"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.0
    restart: always
    volumes:
      - /srv/thehive/elasticsearch/data:/usr/share/elasticsearch/data
      - /srv/thehive/elasticsearch/backup:/backup
    environment:
      - http.host=0.0.0.0
      - transport.host=0.0.0.0
      - xpack.security.enabled=false
      - cluster.name=hive
      - script.inline=true
      - thread_pool.index.queue_size=100000
      - thread_pool.search.queue_size=100000
      - thread_pool.bulk.queue_size=100000
      - path.repo=/backup
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    ports:
      - "9200:9200"
  cortex:
    container_name: thecortex
    restart: always
    image: thehiveproject/cortex:latest
    depends_on:
      - elasticsearch
    volumes:
      - /srv/thehive/cortex/application.conf:/etc/cortex/application.conf
      - /srv/thehive/cortex/Cortex-Analyzers:/opt/Cortex-Analyzers
    ports:
      - "9001:9001"
  thehive:
    container_name: thehive
    restart: always
    image: thehiveproject/thehive:3.1.0-1
    volumes:
      - /srv/thehive/keystore.jks:/etc/thehive/keystore.jks
      - /srv/thehive/application.conf:/etc/thehive/application.conf
    depends_on:
      - elasticsearch
      - cortex
    ports:
      - "9443:9443"

@nadouani I was running a tail -f /var/log/thehive/application.log inside the container. There aren't any updates to the log that occur when I get the error. Even while updating this ticket I've seen the error a couple of times and there aren't any new lines in the log file.

rayschippers · 2018-10-05T07:10:10Z

Probably doesn't help a lot but I'm using docker as well and also get this error constantly. Happy to provide any logs or data that might help troubleshoot

sprungknoedl · 2018-10-09T07:43:19Z

We are losing the Cortex connection as well. TheHive 3.1.0 and Cortex 2.1 installed as a "normal" service on 2 Ubuntu servers.

nadouani · 2018-10-09T08:45:11Z

> We are losing the Cortex connection as well. TheHive 3.1.0 and Cortex 2.1 installed as a "normal" service on 2 Ubuntu servers.

Hello, I'm curious about what "losing connection" means: are you seeing TheHive's UI show the Cortex connection as red, or are you getting broken connections when calling Cortex APIs from TheHive?

rayschippers · 2018-10-09T09:04:45Z

In my case, yes: we constantly get the UI pop-up saying it has disconnected and then that it's back, and we get analyzer failures.

nadouani · 2018-10-09T10:26:35Z

@rayschippers and @secdecompiled, can you please run a script like this to poll the status API:

# Poll TheHive's status API every 30 seconds; stop with Ctrl-C.
while true
do
    curl -s -H 'Authorization: Bearer API_KEY' 'http://THEHIVE:9000/api/status' | jq .connectors.cortex.status
    sleep 30
done

This calls the API, then waits 30 seconds before calling it again; you can stop it manually.
Here I use jq to get the cortex status from the API response.
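
A variant of the loop above that prints only status changes, with a UTC timestamp, can make the flapping easier to line up with other logs. This is a minimal sketch, not part of TheHive: the function names report_change and watch_cortex are hypothetical, and the URL and API key are the same placeholders as in the loop above.

```shell
#!/usr/bin/env bash
# Sketch: log only Cortex status changes, with UTC timestamps.
# report_change and watch_cortex are illustrative names, not TheHive tooling.

# Print one line when the current status differs from the previous sample.
# Arguments: previous status, current status, timestamp.
report_change() {
    local prev="$1" cur="$2" ts="$3"
    if [ "$prev" != "$cur" ]; then
        printf '%s cortex status: %s -> %s\n' "$ts" "${prev:-none}" "$cur"
    fi
}

# Poll the status API until interrupted (Ctrl-C).
# Arguments: TheHive base URL, API key, interval in seconds (default 30).
watch_cortex() {
    local url="$1" key="$2" interval="${3:-30}" prev="" cur
    while true; do
        # jq -r strips the JSON quotes from the status string.
        cur=$(curl -s -H "Authorization: Bearer $key" "$url/api/status" \
              | jq -r '.connectors.cortex.status')
        # Empty output (for example, TheHive itself did not answer)
        # is recorded as UNREACHABLE.
        [ -n "$cur" ] || cur="UNREACHABLE"
        report_change "$prev" "$cur" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        prev="$cur"
        sleep "$interval"
    done
}

# Example: watch_cortex 'http://THEHIVE:9000' 'API_KEY' 30
```

Because only transitions are printed, a long run of identical samples collapses into a single line, and the timestamps show how long each state lasted.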

gonrada · 2018-10-16T15:56:37Z

@nadouani is there any further information I can provide?

nadouani · 2018-10-16T15:59:08Z

I need the result of my last request, to see how TheHive's polling of the Cortex connection behaves.

rayschippers · 2018-10-16T16:10:03Z

Hi @nadouani, I spent today upgrading everything to the latest versions to see if there were any improvements. It's happening less often but still happening. Output when it's broken:
"cortex":{"enabled":true,"servers":[{"name":"XXCORTEX","version":"","status":"ERROR"}]

and when it's back

"cortex":{"enabled":true,"servers":[{"name":"XXCORTEX","version":"2.1.2","status":"OK"}]

nadouani · 2018-10-16T16:13:29Z

Well, again, I need to know whether the status polling works. Unless that script is run for a few minutes, I cannot investigate. Thanks.

rayschippers · 2018-10-16T16:28:37Z

Ran it for a few minutes and the output for Cortex status
OK
OK
ERROR
ERROR
ERROR
ERROR
ERROR
ERROR
ERROR
ERROR
OK
OK
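
With samples taken every 30 seconds, the eight consecutive error samples above suggest an outage of roughly four minutes. A small helper can collapse such a sample list into runs and estimate their durations. This is only a sketch: summarize_runs is a hypothetical name, and the durations are approximations based on the polling interval rather than measured times.

```shell
# Collapse consecutive identical samples into runs and estimate each
# run's duration from the polling interval (default 30 seconds).
# Reads one status per line on stdin; blank lines are ignored.
summarize_runs() {
    local interval="${1:-30}" prev="" count=0 line
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        if [ "$line" = "$prev" ]; then
            count=$((count + 1))
        else
            # A new status starts: flush the previous run, if any.
            if [ "$count" -gt 0 ]; then
                printf '%s x%d (~%ds)\n' "$prev" "$count" "$((count * interval))"
            fi
            prev="$line"
            count=1
        fi
    done
    # Flush the final run.
    if [ "$count" -gt 0 ]; then
        printf '%s x%d (~%ds)\n' "$prev" "$count" "$((count * interval))"
    fi
    return 0
}

# Example: summarize_runs 30 < samples.txt
```

Here samples.txt is a hypothetical file holding the polled statuses, one per line.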

nadouani · 2018-10-17T09:47:10Z

Well, this looks like a bug in the status polling, which has a very small timeout.

It will be fixed in the next hotfix.
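
Until a fix is available, one way to check whether slow responses line up with the error notifications is to time requests to Cortex directly from the TheHive host. The sketch below makes several assumptions: that Cortex answers on CORTEX:9001 (the port from the compose file earlier in the thread), that it exposes a status endpoint at /api/status, and that 1.0 second is a useful threshold. The thread does not state the actual timeout value, so the threshold is purely illustrative, and the function names are hypothetical.

```shell
# Succeed (exit 0) if an elapsed time in seconds exceeds a threshold.
# awk does the comparison because shell arithmetic is integer-only.
is_slower_than() {
    awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t > limit) }'
}

# Time a single request to the given URL and print the total time in
# seconds. The response body is discarded; curl gives up after 10 seconds.
time_request() {
    curl -s -o /dev/null --max-time 10 -w '%{time_total}\n' "$1"
}

# Example (hypothetical host and threshold):
#   t=$(time_request 'http://CORTEX:9001/api/status')
#   is_slower_than "$t" 1.0 && echo "slow response: ${t}s"
```

If the measured times are occasionally much larger than usual, that would be consistent with a short polling timeout reporting ERROR while Cortex is in fact still reachable.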

BrijJhala · 2020-10-02T19:56:18Z

We have been running Cortex for almost 3 hours with 200 users in a loop of 5, i.e. 1000 samples, continuously. When we reach about 12k Cortex jobs, our codebase can no longer hit Cortex's /api/run or /api/<<jobid//results endpoints. We use the axios HTTP client to communicate with the Cortex endpoint, and it stops responding. We thought it was an issue on the axios side, but once we restarted Cortex, communication between our service (the axios client) and Cortex worked again, so it sounds like Cortex is holding on to client connections. Very important: we can run scans from the UI without any issue, just not from our HTTP client. Restarting Cortex fixes the issue; restarting the pod of our service does not restore communication with Cortex. We really need some insight into this.

nadouani added the need:investigation label Oct 9, 2018

nadouani added bug and removed need:investigation labels Oct 17, 2018

nadouani assigned To-om Oct 17, 2018

To-om added this to the 3.1.3 milestone Oct 17, 2018

To-om added a commit that referenced this issue Nov 5, 2018

#739 Remove the timeout when retrieving the remote service status

9715f37

To-om closed this as completed Nov 5, 2018

To-om modified the milestones: 3.1.3, 3.2.0 (Cerana 2) Nov 15, 2018
