13 October 2018

Security Operations Center Maturity – a step-by-step DIY Guide

A 3-year roadmap to take your SOC from concept to maturity

As the threat landscape evolves rapidly, cyber protection needs to keep pace. Your Security Operations Center (SOC) is the front line of network defense, charged with preventing an attack before it happens. To be effective, your SOC must be mature and capable of detecting, investigating and responding to complex and persistent attacks.

However, global findings indicate that many SOCs are below target maturity levels and unable to detect advanced attacks. This makes organizations vulnerable, placing their most sensitive and valuable assets at risk.

While powerful protection is the prime reason for developing, maturing and improving the capabilities of your SOC, increasing regulatory pressure and compliance requirements also play a part.

In this step-by-step DIY guide, you will learn how to assess the current maturity levels of your SOC, establish your desired level and chalk out your developmental roadmap.

People (trained and skilled security specialists), processes (for incident response and management) and technology (tools to collect and analyze data) are the foundation of SOC operations. This guide focuses on the Security Information and Event Management (SIEM) solution, which is an established platform for maturity modeling.  

SIEM is implemented to

  • Meet compliance requirements and perform log collection, correlation and monitoring.
  • Alert on incidents and critical actions, acting as an early warning indicator for monitoring threats, detecting cyber-attacks and responding to them.
  • Augment capability to respond to and recover from a cyber-attack with minimal impact to business.
  • Trace back to the vulnerability after any cyber-attack or incident caused by an exposure, vulnerability or targeted attack, and establish a process for dealing with similar incidents that ensures a minimum Recovery Time Objective (RTO) and, where possible, defines the Recovery Point Objective (RPO).
  • Provide effective and efficient incident monitoring and detection.
  • Deliver near real-time incident response.
  • Incorporate excellent Cyber Security Governance practices
  • Provide meaningful metrics, reporting & dashboards which reflect SOC performance, operational efficiencies and incident response and management
  • Permit unhindered 24x7 operations
  • Allow end to end policy management and review of various devices for security monitoring

These objectives can only be achieved if all five components of the SOC, namely Governance, Services, People, Process and Technology, are defined in line with risk and maturity and in alignment with the business objectives. SOCs the world over are adopting technological innovations like automation, security analytics, machine learning and several other applications of cognitive computing. But which technologies to adopt, and how extensively to use them, are critical decisions for improving SOC effectiveness. It is best to adopt a balanced approach that augments your people, processes and technologies through the right mix of automation, analytics, real-time monitoring, and hybrid staffing models.

Our experienced security experts recommend this balanced maturity assessment model for SOCs. This is based on real-world data and our experience helping thousands of security teams measure capability and maturity.

Comprehensive Governance Model

  • Clear objectives for SOC; identifying immediate and burning needs
  • Executive support; with decision makers, stakeholders and budget called out
  • Plan and approval; outlining resources, platform and processes needed, the framework and roadmap to execute the framework
  • Target customers (both internal and external) and service catalog
  • Strategy & vision to provide a secure environment that enables business to deliver
  • Collaborative breach detection and coordinated response

Well-defined Processes

  • Incident Management – Monitoring and Notification process, Triage and Escalation process, Incident Response and recommendation process and knowledge base management
  • Shift handover process, Roster Management Process
  • Log Management process - Capabilities to ensure that system logs are retained in accordance with company policies as well as regulatory and legal mandates
  • Threat Intelligence, emerging Indicators of Compromises (IoCs) and Hunting process
  • Problem Management process
  • SIEM Administration process – HA, user creation, device onboarding, use case management and role-based reporting, e.g. executive reports/dashboards, analyst dashboards, compliance dashboards etc.
  • Change Management - Plans and processes to ensure that operations will be maintained and that the business will survive in the event of unexpected incidents and outages
  • Patch Management processes - Capabilities to ensure that patches are up-to-date according to policy in order to reduce risk
  • Risk Management - Capabilities to ensure that systems are properly configured to mitigate risk. (SIEM & SOC Components)

Service Catalog and SOC Operations

  • Log management
  • Alert management, case management, escalation and reporting
  • Incident Management
  • Security Vulnerability Management
  • Compliance management for Security Monitoring including Reporting and “State of Security” definition
  • Malware Management

Skilled People

  • Shift Management, Rotation
  • Training and Skill Management
  • Performance management
  • SOC people metrics

Integrated Security Technology

  • SIEM Technology Maturity
    • Asset Modelling/ Log Baseline

    • Log source Integration and custom parser/connector development

    • Use cases management - single device, multi device and threat aligned and compliance use cases

    • Flow Management and integration with SIEM

    • Full packet capture for forensics

    • Vulnerability integration for asset, incident and vulnerability visualization

  • Threat Intelligence
    • Tactical Threat Intelligence integration into the SIEM

    • Strategic and actionable threat intelligence and analysis

    • Threat Intelligence Platform deployment, analysis and contextual mapping with assets/crown jewels

  • Security Analytics
    • User and Entity Behavior Analytics

    • Network Threat Analytics

  • Incident management tools
    • Visualization, Reporting, Trends and dashboard integration

    • Incident response orchestration

  • Endpoint Security Analytics and Integration
    • EDR/ EPP

When you are starting from scratch (no SIEM, no SOC functions defined), building your SOC and taking it to maturity can seem daunting. Our SOC experts have outlined a roadmap to get you there in 3 years.

Year 1: Log Collection, Enrichment and Management along with out of the box rules / compliance rules for monitoring and threat detection

  • Design and Deploy a basic SIEM with capabilities of Log collection, correlation, visualization and basic workflow integration
  • Consider log baseline and log source integration with the focus on critical assets as mentioned below
    • Network devices like core router, core switches, key network management components

    • All perimeter security devices like Firewall, IPS, Proxy Servers, URL Filters etc.

    • All authentication devices like AD, VPN servers, databases used for authentication, TACACS/RADIUS servers

    • Public-facing servers like web applications used for transactions, along with their middleware and backend, limited to security logs

    • DNS Servers

    • Critical applications like CRM, Core banking channels, API intersection, but limited to security logs

  • Use Cases
    • Monitor “what matters the most” (Web Apps, Core OS, PCI related Application, Databases, Credit Card Information, Customer and Employee PII etc.) from data security point of view

    • Prioritize correlation rules in order of risk: start with 10-12 use cases, each with 3-6 SIEM rules/alerts mapped to a kill chain model. For example:

      • The solution will include out-of-the-box rules for alerting on threats found in log or network data, e.g. failed logins, account changes, expirations, port scans, suspicious file names, default usernames, default passwords, security tools, AV signature updates, successful authentications, bandwidth by IP, email senders, failed privilege escalations, VPN failed logins, group management system configuration changes, traffic to non-standard ports, etc.
          • Monitor all privilege activities

          • Monitor key system file changes

          • Website home page, file auditing and its alerts

          • All anomalous authentication activities

          • System reboot followed by audit logs cleared, or audit logs cleared followed by a system reboot; failed Windows logins for multiple (3) user names from a single workstation

          • Addition or removal of AD admin group membership privileges for another person

          • Forced password reset.

          • Account Management (a user account was locked out, followed by a user account was unlocked, followed by a member was added to a security-enabled local group, followed by a member was removed from a security-enabled local group). Monitor terminated users, i.e. users whose employment has been or will be terminated

          • Audit Privilege Use and enable Privilege Auditing – Audit (- CREATE TABLE; - DROP TABLE; - ALTER TABLE)

          • Correlate logs for DROP USER and revocation of rights from a user

          • Large web file sent; log HTTP requests and responses

          • SQL Injection and XSS detected

          • Large spike in DNS requests

          • A single machine receiving authentication failures from multiple servers

  • Create three basic processes
    • Monitoring and Notification Process

    • Triage and Escalation Management

    • Incident Response and Service Desk/ Ticketing Management

    • Create basic reporting and trends, alerts and notification and IR summary
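
One of the Year 1 use cases listed above, failed Windows logins for multiple (3) user names from a single workstation, can be sketched as a small correlation rule. This is only an illustrative sketch, not how any particular SIEM implements it: the event tuple layout, the function name and the 5-minute window are assumptions (the guide gives the threshold of 3 user names but no time window).

```python
from collections import defaultdict
from datetime import datetime, timedelta


def multi_user_failed_logins(events, min_users=3, window=timedelta(minutes=5)):
    """Return workstations that saw failed logins for at least `min_users`
    distinct user names inside any time window of length `window`.

    `events` is an iterable of (timestamp, workstation, username, outcome)
    tuples. This shape is hypothetical; real log sources will differ.
    """
    failures = defaultdict(list)
    for ts, workstation, user, outcome in events:
        if outcome == "failure":
            failures[workstation].append((ts, user))

    flagged = set()
    for workstation, items in failures.items():
        items.sort()  # order by timestamp
        start = 0
        for end in range(len(items)):
            # Drop events from the left until the span fits the window.
            while items[end][0] - items[start][0] > window:
                start += 1
            users = {user for _, user in items[start:end + 1]}
            if len(users) >= min_users:
                flagged.add(workstation)
                break
    return flagged
```

In practice the same logic would be expressed in the rule language of the chosen SIEM; the point of the sketch is that each use case reduces to a filter, a grouping key, a threshold and a time window that can be tuned as the rule base matures.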

Year 2: Advanced event/flow and application (layer-7) level monitoring correlations, threat detection/hunting and threat modelling

  • SIEM
    • Add flow monitoring and create use cases with flows like large html packets, clear text username and password, clear text card information etc.

    • Achieve 100% coverage of critical log sources; include enterprise applications in order of risk

    • Customization and Integration of application log sources via parsers / uDSM

    • Leverage user behavior analytics functions of an SIEM with authentication sources

    • Improve alert management with context

  • Use Cases – Improve use cases to include multi-device use cases
    • Privileged Access Monitoring – Monitor administrative activities and alert on violations. Accounts with privileged access (e.g. Admin, sudo) should be monitored for the activity performed by each ID. Any unauthorised or suspicious activity must be alerted on.

    • Back Doors – malware, back doors and known remote exploits detected

    • Social (e.g. Phishing, Threat Intel.) - Communication to known malware sites such as botnet C&C, phishing, watering hole etc.; tactical threat feed integration

    • Vulnerability Management - Detects vulnerability scanning of the hosts

    • Anomaly (Behaviour) –

      • Identify the DNS traffic generated from non-DNS Servers or non-standard ports
      • Reports a remote host attempting reconnaissance or suspicious connections on common local web server ports to more than 60 hosts in 10 minutes.
    • DDOS - Detect DOS/DDOS attacks such as sudden spikes in network bandwidth usage with net flow

    • Configuration Changes – Monitor and alert for configuration and system file changes on critical servers, applications and network devices

    • Physical Security – Verify physical security access logs for multiple failures and integrate with logical authentication logs for violation and context

    • Business Policy – Business policy violations like logons during non-working hours, direct database connections/queries.

    • File Transfer - Detect file transfer activity from sensitive servers such as DB or SAP servers or file servers

  • Processes
    • Create Run Book – For each use case, define validation, containment, eradication and recovery steps.

    • Perform simulations of the run books

    • RCA, lessons learned, and back-to-operations processes and analysis

  • Threat Intelligence
    • Add tactical threat Intelligence via TAXII protocols.

    • Have a threat hunter look for IOCs
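
The Year 2 anomaly use case above, a remote host making suspicious connections on common local web server ports to more than 60 hosts in 10 minutes, lends itself to a sliding-window count over flow records. The following is a minimal sketch under stated assumptions: the flow tuple layout, the function name and the particular set of "common web server ports" are hypothetical, and flows are assumed to arrive sorted by time.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Assumed set of "common web server ports"; adjust to your environment.
WEB_PORTS = {80, 443, 8080, 8443}


def detect_web_recon(flows, max_hosts=60, window=timedelta(minutes=10)):
    """Return source IPs that contacted more than `max_hosts` distinct
    destinations on web ports within `window`.

    `flows` is an iterable of (timestamp, src_ip, dst_ip, dst_port) tuples,
    sorted by timestamp. This record shape is hypothetical.
    """
    recent = defaultdict(deque)  # src_ip -> deque of (timestamp, dst_ip)
    offenders = set()
    for ts, src, dst, port in flows:
        if port not in WEB_PORTS:
            continue
        queue = recent[src]
        queue.append((ts, dst))
        # Expire entries that have fallen out of the time window.
        while ts - queue[0][0] > window:
            queue.popleft()
        if len({d for _, d in queue}) > max_hosts:
            offenders.add(src)
    return offenders
```

Recomputing the set of destinations on every flow keeps the sketch short; a production rule would maintain per-destination counts incrementally.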

Year 3: Machine Learning driven Advanced security and business analytics, High performance big data compute

  • SIEM
    • Add full packet capture for forensics

    • Advanced firewall rule simulations and corresponding suggestions for optimized and noise free networks using the risk analysis modules

    • Improve and automate alert management

    • Advanced asset baselining to prioritize the response and recovery of critical assets over low-risk assets in case of a cyber-attack/incident

    • Integrate with CMDB and perform auto change management

    • Integrate with vulnerability management to provide internal threat intelligence and prioritization based on the state of security vulnerabilities, asset criticality and incidents.

    • Compliance-driven reporting based on the line of business the customer operates in, e.g. HIPAA, PCI-DSS, GLBA, FISMA, GDPR, NYDFS etc.

  • Use Cases
    • Monitor and Adapt Rule Bases – Fine tune rules

    • Create an additional 10-12 use cases for business-focused applications and correlate them with flows, which provide deeper contextual information and help in threat modelling and incident forensics

  • Threat Intelligence and Threat Hunting
    • Automate tactical threat intelligence and its response

    • Define a threat intelligence process for threat hunting, asset criticality, vulnerability identification and actionable output. This can be achieved by implementing big data platforms like ELK or other big data threat hunting platforms to look for IOCs, hunt for threats etc.

    • Leverage a threat intelligence platform to fuse threat feeds from multiple sources, contextualize them with asset criticality and provide actionable intelligence

    • Deploy proactive threat hunting using machine learning and automation, drawing on signals like port-protocol mismatches, user behaviour and threat intelligence information to execute hunt missions.

  • Security Analytics
    • Implement User and system behaviour analytics

    • Additional reporting and visualization for key systems and data

    • Network Analytics using full packet capture

    • Consider Endpoint Protection (EPP/EDR) tools that leverage machine learning, intelligence integration and IOC management at the endpoints
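
The port-protocol mismatch signal mentioned under threat hunting above can be illustrated with a simple lookup against expected port assignments. This is a sketch only: it assumes the flow or packet-capture tooling already labels each connection with a detected application protocol, and the mapping, the protocol labels and the function name are hypothetical.

```python
# Assumed mapping of application protocol label -> standard destination port.
STANDARD_PORT = {"ssh": 22, "smtp": 25, "dns": 53, "http": 80, "tls": 443}
EXPECTED_PROTOCOL = {port: proto for proto, port in STANDARD_PORT.items()}


def is_mismatch(dst_port, app_protocol):
    """Return True when the observed protocol does not fit the port.

    Two cases are treated as a mismatch:
    - a well-known port carrying a different protocol (e.g. SSH on 443)
    - a well-known protocol on a port other than its standard one
      (e.g. DNS on 4444)
    Traffic with an unknown protocol on an unknown port is not flagged.
    """
    expected = EXPECTED_PROTOCOL.get(dst_port)
    if expected is not None:
        return app_protocol != expected
    return app_protocol in STANDARD_PORT


def hunt_mismatches(flows):
    """Filter an iterable of (dst_port, app_protocol) pairs to mismatches."""
    return [flow for flow in flows if is_mismatch(*flow)]
```

Hits from a check like this are leads for a hunt mission, not verdicts; legitimate services on non-standard ports would need to be allow-listed before the output is useful.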

Security analytics use cases vary; leverage those that are relevant:

      • Employee monitoring
      • Analysing user behaviour to detect potentially suspicious patterns
      • Analysing network traffic to pinpoint trends indicating potential attacks
      • Identifying improper user account usage, such as shared accounts
      • Detecting data exfiltration by attackers
      • Detecting insider threats
      • Identifying compromised accounts
      • Investigating incidents
      • Threat hunting
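
As one illustration of the analytics use cases listed above, detecting data exfiltration can start from a per-user statistical baseline. The z-score approach below is a simple stand-in chosen for this sketch, not a method prescribed by this guide; the input shapes, the function name and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, pstdev


def exfiltration_suspects(history, today, threshold=3.0):
    """Return users whose outbound volume today is more than `threshold`
    standard deviations above their own historical mean.

    `history` maps user -> list of past daily outbound byte counts and
    `today` maps user -> today's outbound byte count. Both shapes are
    hypothetical.
    """
    suspects = []
    for user, sent in today.items():
        past = history.get(user, [])
        if len(past) < 2:
            continue  # not enough data to form a baseline
        mu = mean(past)
        sigma = pstdev(past)
        if sigma == 0:
            continue  # flat baseline; a z-score is undefined here
        if (sent - mu) / sigma > threshold:
            suspects.append(user)
    return sorted(suspects)
```

Commercial user and entity behaviour analytics tools use far richer models, but the principle is the same: learn what is normal for each user or system, then surface deviations for an analyst to investigate.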