I have been playing with TheHive, Cortex and MISP for a few months now, and have been working a way forward to have it programmatically analyse new observables whilst the SOC operator is working on other tasks. Effectively having TheHive start core jobs on observables automatically and then waiting for the operator to check over the results as required.

Here is Version 2 of what I have been putting together to expedite the process of observable analysis...

My requirement came out of a personal investigation into some spam I have been getting. Nothing too special, just a UPS-branded shipping delivery error. My objective was to follow the threads of the observables and see just where this campaign originated - using nothing but the core TheHive and Cortex analysers.

What started as a single URL observable grew into thousands of observables as I pulled data from Shodan and VirusTotal on domains, nameservers and IP addresses. So how was I going to run analysis on that many objects without breaking various API limitations? I couldn't search them all in one hit, or I would trip the rate limiter - and any analysis I did needed to end up back in TheHive for the operator to look over afterwards.

This is where I wrote the Python script below, which looks over currently Open cases for observables compatible with the analyzer being run:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
from __future__ import unicode_literals

import time
from thehive4py.api import TheHiveApi
from thehive4py.query import *

api = TheHiveApi('http://127.0.0.1:9000', 'API Key here')

cases = api.find_cases(query=Eq('status', 'Open'), range='all', sort=[])

for case in cases.json():
  case_id = case['id']
  case_observables = api.get_case_observables(case_id)
  print('Querying CaseID {}'.format(case_id))
  time.sleep(1)  # crude throttle between case queries
  for observable in case_observables.json():
    # Only query dataTypes the MISP analyzer supports
    # (a chained 'or' here would always evaluate to 'domain')
    if observable['dataType'] in ('domain', 'url', 'fqdn', 'ip', 'hash'):
      # Skip observables that already carry a MISP_2_0 report
      if 'MISP_2_0' not in observable['reports']:
        print('Querying MISP for observable {}'.format(observable['id']))
        response = api.run_analyzer('CORTEX-SERVER-ID', observable['id'], 'MISP_2_0')
        if response.status_code == 200:
          print('OK')
  print('####################################################################')

What I have done here is write a Python script that looks at all Open cases, then for every observable whose dataType is domain, url, fqdn, ip or hash, queries my local MISP instance to find any existing reports which match those elements.
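The part most worth getting right is that filter - which observables to submit and which to skip. Here is a minimal, self-contained sketch of it, using plain dicts shaped like what `api.get_case_observables(...).json()` returns (the field names come from the script above; the `needs_analysis` helper itself is illustrative, not part of thehive4py):

```python
# dataTypes the MISP analyzer accepts, per the script above
SUPPORTED_TYPES = {'domain', 'url', 'fqdn', 'ip', 'hash'}

def needs_analysis(observable, analyzer='MISP_2_0'):
    """Return True when the analyzer supports this observable's
    dataType and has not already produced a report for it."""
    return (observable.get('dataType') in SUPPORTED_TYPES
            and analyzer not in observable.get('reports', {}))

# Illustrative observables mimicking TheHive's JSON shape
observables = [
    {'id': 'ob1', 'dataType': 'domain', 'reports': {}},
    {'id': 'ob2', 'dataType': 'domain', 'reports': {'MISP_2_0': {}}},
    {'id': 'ob3', 'dataType': 'mail', 'reports': {}},
]

to_run = [o['id'] for o in observables if needs_analysis(o)]
print(to_run)  # ['ob1'] - ob2 already has a report, ob3 is unsupported
```

Keeping the check in one helper also makes it easy to swap in a different analyzer name and type set for each iteration of the script.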

Based on the above, I have written several separate iterations of this script to cover the following sources, with rate limiting implemented to prevent API rate-limit violations:

  • Malware Information Sharing Platform (MISP)
  • Shodan
  • Urlscan.io
  • VirusTotal (GetReport)
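The rate limiting in those iterations is nothing clever - a fixed pause between submissions. As a minimal sketch of the idea, here is a small helper that enforces a minimum interval between calls (the class, names and one-second interval are illustrative; each service's real limits come from its own API documentation):

```python
import time

class RateLimiter:
    """Illustrative helper: block until at least `min_interval`
    seconds have passed since the previous call."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

# Usage sketch: throttle analyzer submissions per observable.
limiter = RateLimiter(min_interval=1.0)
start = time.monotonic()
for _ in range(3):
    limiter.wait()  # would wrap api.run_analyzer(...) here
elapsed = time.monotonic() - start
print(round(elapsed))  # 2 - two enforced one-second gaps
```

One limiter instance per external service keeps the per-API quotas independent, which is why each iteration of the script carries its own delay.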

Each of these implementations may or may not be against the various TOS / EULA depending on your API access level, but this has certainly reduced the clicking and the potential for human-fatigue error.

In the process, though, I have come up against another emerging issue - I now need to tune Elasticsearch to handle the volume of transactions I am storing and querying.