An interesting use case came up recently: scanning a large quantity of hash observables in TheHive using the VirusTotal analyzer. Unfortunately two things get in the way - the VT rate limiting, and the sheer volume of hashes, which makes launching each analysis by hand infeasible.

What follows is my first attempt at scripting the VT hash lookup so that newly submitted hashes are analysed automatically.

First of all we need to put in the basics of our Python file and the appropriate imports. You will need to install thehive4py with pip (pip install thehive4py) before this can be run.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from __future__ import print_function
from __future__ import unicode_literals

import requests
import sys
import json
import time
from thehive4py.api import TheHiveApi
from thehive4py.query import *
from thehive4py.models import Case, CaseObservable, CaseTask, CustomFieldHelper

Set our variables with the particulars of our environment. In my case I am running this locally before deploying it to a production system, so the values below are harmless test values.

urlTheHive='http://192.168.1.250:9000'
apiKeyTheHive = 'vudgLSyw1cDq6N3mVSmrdSsZ9AbUaQOR'
apiTheHive = TheHiveApi(urlTheHive,apiKeyTheHive)
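
Once the script moves to a production system, you may prefer not to keep the URL and API key in the file itself. A minimal sketch, assuming the values are supplied through environment variables; the names THEHIVE_URL and THEHIVE_API_KEY and the helper load_thehive_settings are my own invention for illustration, not something TheHive or thehive4py defines:

```python
import os


def load_thehive_settings(env=os.environ):
    """Return (url, api_key) read from a mapping of environment variables.

    THEHIVE_URL and THEHIVE_API_KEY are hypothetical names chosen for this sketch.
    """
    url = env.get("THEHIVE_URL", "http://127.0.0.1:9000")
    api_key = env.get("THEHIVE_API_KEY")
    if not api_key:
        # Fail early rather than sending unauthenticated requests
        raise RuntimeError("THEHIVE_API_KEY is not set")
    # Drop a trailing slash so the URL has a consistent form
    return url.rstrip("/"), api_key
```

The two returned values can then be passed to TheHiveApi in place of the hard-coded urlTheHive and apiKeyTheHive.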

So now we need to find the cases whose hashes will be checked on VirusTotal. My scenario is a case for rogue applications which need to be identified, categorised and investigated.

The target in this case will be open cases, with the keyword HASHCHECK in the title.

def search(title, query, range, sort):
        # title is only a label for the debug output below; it is not sent to TheHive
        #print(title)
        #print('-----------------------------')
        response = apiTheHive.find_cases(query=query, range=range, sort=sort)

        if response.status_code == 200:
                #print(json.dumps(response.json(), indent=4, sort_keys=True))
                #print('')
                return response
        else:
                print('ko: {}/{}'.format(response.status_code, response.text))
                # exit with a non-zero status so a scheduler or wrapper script sees the failure
                sys.exit(1)

searchCases = search("Hash observables",And(String("title:'HASHCHECK'"),Eq('status','Open')),'all',[])
cases = searchCases.json()

Here comes the fun part. We need to step through the JSON response, extract the observables from each case, filter out everything but a dataType of 'hash', then launch the VirusTotal analyzer against each remaining observable. As such...

for case in cases:
        CaseID = case['id']
        Observables = apiTheHive.get_case_observables(CaseID)
        CaseObservables = Observables.json()
        #print(json.dumps(Observables.json(), indent=4, sort_keys=True))

        for Observable in CaseObservables:
                if Observable['dataType'] == 'hash':
                        ObservableID = Observable['id']
                        print('Waiting 15 seconds to prevent API over utilisation')
                        time.sleep(15)
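                        # Arguments: Cortex server ID, observable ID, analyzer name
                        ObservableResponse = apiTheHive.run_analyzer('CORTEX-SERVER-ID',ObservableID,'VirusTotal_GetReport_3_0')

                        if ObservableResponse.status_code == 200:
                                print('VirusTotal request for {} queued.'.format(ObservableID))
                        else:
                                # report the failure but carry on with the remaining hashes
                                print('ko: {}/{}'.format(ObservableResponse.status_code, ObservableResponse.text))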
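
The loop above resubmits every hash observable each time the script runs. To get closer to analysing only newly submitted hashes, the filtering step could be pulled into a small helper that also skips IDs it has already seen. This is only a sketch: pending_hash_observables and the already_submitted set are hypothetical names of mine, and the sample list is made-up data shaped like the 'id' and 'dataType' fields used above.

```python
def pending_hash_observables(observables, already_submitted):
    """Yield the IDs of hash observables that have not been submitted yet."""
    for observable in observables:
        # Keep only observables whose dataType is 'hash'
        if observable.get("dataType") != "hash":
            continue
        observable_id = observable["id"]
        # Skip anything submitted on an earlier pass
        if observable_id in already_submitted:
            continue
        yield observable_id


# Made-up sample data for illustration only
sample = [
    {"id": "obs1", "dataType": "hash"},
    {"id": "obs2", "dataType": "ip"},
    {"id": "obs3", "dataType": "hash"},
]
```

With this sample, list(pending_hash_observables(sample, {"obs1"})) gives ['obs3']. How the set of submitted IDs is kept between runs (a file, a database, a tag on the observable) is left open here.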

The end result is a pass through all the hash observables attached to the open HASHCHECK cases. Each submission is separated by a 15 second pause, which limits the load on the VT API and keeps it from returning rate-limit errors.
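
A fixed 15 second pause works out to four submissions a minute. If your quota differs, the delay can be derived from the quota instead of being hard-coded. A rough sketch, assuming a simple per-minute limit; throttled and its per_minute parameter are hypothetical names, and you should check the limit that applies to your own VT key:

```python
import time


def throttled(items, per_minute=4, sleep=time.sleep):
    """Yield items one at a time, pausing 60/per_minute seconds between them."""
    delay = 60.0 / per_minute
    for index, item in enumerate(items):
        if index:
            # No need to wait before the very first item
            sleep(delay)
        yield item
```

Wrapping the list of observable IDs in throttled(...) would replace the explicit time.sleep(15) call in the loop. The sleep argument exists only so the pause can be swapped out when testing.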