Splunk Live!

I attended the Splunk Live! London event last Thursday.  I am currently assessing Splunk and its suitability as a SIEM (Security Information and Event Management) tool, in addition to its use as a general data collection and correlation tool.  During the day I made various notes that I thought I would share.  I’ll warn you up front that these are relatively unformatted, as they were taken during the talks on the day.

Before I cover off the day, I should highlight that I use the term SIEM to relate to the process of Security Information and Event Management, NOT SIEM ‘tools’.  Most traditional tools labelled as SIEM are inflexible, do not scale in this world of ‘big data’ and are only usable by the security team.  This for me is a huge issue and a waste of resources.  SIEM as a process is performed by security teams every day and will continue to be performed whatever big data tool is chosen.

The background to my investigating Splunk is that I believe a business should have a single log and data collection and correlation system that gets literally everything, from applications to servers to networking equipment to security tool logs / events etc.  This means that everyone from Ops to application support, to the business, to security can use the same tool and be assured of a view encompassing the entire environment.  Each set of users would have different access rights and custom dashboards in order for them to perform their roles.

From a security perspective this is the only way to ensure the complete view that is required to look for anomalies and detect intelligent APT (Advanced Persistent Threat) type attacks.

Having a single tool also has obvious efficiency, management and economies of scale benefits over trying to run multiple largely overlapping tools.

Onto the notes from the day;

Volume – Velocity – Variety – Variability = Big Data

Machine generated data is one of the fastest growing, most complex and most valuable segments of big data.

 

Real time business insights

Operational visibility

Proactive monitoring

Search and investigation

Enables move from ‘break fix’ to real time operations insight (including security operations). 

GUI to create dashboards – write queries and select how to have them displayed (list, graph, pie chart etc.).  Can move things around on the dashboard with drag and drop.

Dev tools – REST API, SDKs in multiple languages.

More data in = more value.

My key goal for the organisation – One log management / correlation solution – ALL data.  Ops (apps, inf, networks etc.) and Security (inc PCI) all use same tool with different dashboards / screens and where required different underlying permissions.

Many screens and dashboards are available free (some, like PCI and Security, cost).  The dashboards’ look and feel helps users feel at home and get started quickly – e.g. VM dashboards look and feel similar to the VMware interface.

Another example – Windows dashboard – created by Windows admins, not Splunk – all the details they think you need.

Exchange dashboard – includes many exchange details around message rates and volumes etc, also includes things like outbound email reputation

VMware – can go down to specific guests and resource use, as well as host details (file use, CPU use, memory use etc.).

Can pivot between data from VMware and email etc. to troubleshoot the cause of issues.

These are free – download from Splunkbase

Can all be edited if not exactly what you need, but are at least a great start..

Developers – from tool to platform – can both support development environments and be used to help teach developers how to create more useful log file data.

Security and Compliance – threat levels growing exponentially – cloud, big data, mobile etc. – the unknown is what is dangerous – move from known threats to unknown threats..

Wired – the internet of things has arrived, and so have massive security threats

Security operations centre, Security analytics, security managers and execs

  • Enterprise Security App – security posture, incident review, access, endpoint, network, identity, audit, resources..

Look for anomalies -things someone / something has not done before

  • can do things like create tasks, take ownership of tasks, report progress etc.
  • When drilling down on issues has contextual pivot points – e.g right click on a host name and asset search, google search, drill down into more details etc.
  • Even though it costs, like all dashboards it is completely configurable.

Splunk App for PCI compliance – Continuous real time monitoring of PCI compliance posture, Support for all PCI requirements (12 areas), State of PCI compliance over time, Instant visibility on compliance status – traffic lights for each area – click to drill down to details.

  • Security prioritisation of in-scope assets
  • Removes much of the manual work from PCI audits / reporting

Application management dashboard

  • Splunk can do math – what is average stock price / how many users on web site in last 15 minutes etc.
  • Real time reporting on impact of marketing emails / product launches and changes etc.
  • for WP – reporting on transaction times, points of latency etc – enable focus on slow or resource intensive processes!
  • hours / days / weeks to create whole new dashboards, not months.
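
To give a feel for the sort of ‘math’ mentioned above, here is a minimal Python sketch of the two examples (users on the site in the last 15 minutes, and an average price).  This is not Splunk’s search language, just the underlying calculation; the event layout, the function names and the sample figures are all made up for illustration.

```python
from datetime import datetime, timedelta


def users_in_window(events, now, minutes=15):
    """Count the distinct users seen in the last `minutes` minutes."""
    cutoff = now - timedelta(minutes=minutes)
    return len({e["user"] for e in events if cutoff <= e["time"] <= now})


def average(values):
    """Arithmetic mean, or None when there is nothing to average."""
    values = list(values)
    return sum(values) / len(values) if values else None


# Hypothetical sample data: an arbitrary 'now' and four web events.
now = datetime(2020, 1, 1, 12, 0)
events = [
    {"time": now - timedelta(minutes=2), "user": "alice"},
    {"time": now - timedelta(minutes=10), "user": "bob"},
    {"time": now - timedelta(minutes=14), "user": "alice"},
    {"time": now - timedelta(minutes=40), "user": "carol"},  # outside window
]

print(users_in_window(events, now))    # distinct users in the last 15 minutes
print(average([101.5, 102.0, 102.5]))  # average of some made-up prices
```

The point of a tool like Splunk is that you would express this as a search over live data rather than write code, but the logic is the same.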

Links with Google earth – can show all customer locations on a map – are we getting connections from locations we don’t support, where / what are our busiest connections / regions.

Industrial data and the internet of things; airlines, medical informatics (electronic health records – mobile, wireless, digital, available anywhere to the right people – were used to putting pads down, so didn’t get charged – Splunk identified this).

Small data, big data problem (e.g. not all big data is actually a massive data volume, but may be complex, rapidly changing, difficult to understand and correlate between multiple disparate systems).

Scale examples;

Barclays – 10TB of security data per year.

HPC – 10TB per day

Trading – 10TB per day

VM – >10TB per year

All via Splunk..

DataShift – Social networking ‘ETL’ with Splunk. ~10TB new data today

Afternoon sessions – Advanced(ish) Splunk..

– Can create lookup / conversion tables so log data can be turned into readable data (e.g. HTTP error codes read as page not found etc. rather than a number)  This can either be automatic, or as a reference table you pipe logs through when searching.

– As well as GUI for editing dashboards, you can also directly edit the underlying XML

– Can have lots of saved searches, should organise them into headings or dashboards by use / application or similar for ease of use.

– Simple and advanced XML – simple has menus, drop downs, drag and drop etc.  Advanced requires you to write XML, but is more powerful.  Advice is to start in simple XML, get layout, pictures etc. sorted, then convert to advanced XML if any more advanced features are required.

– Doughnut chart – like a pie chart with inside and outside layers – good if you have a high level grouping, and a lower level grouping – can have both on one chart.

– Can do a rolling, constantly updating dashboard – built in real time option to refresh / show figures for every xx minutes.
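
The lookup / conversion table idea from the notes above is easy to picture in code.  Below is a minimal Python sketch, not Splunk’s own lookup mechanism: a small table maps HTTP status codes to readable text and each event is ‘piped through’ it.  The table name, the `enrich` function and the event fields are my own assumptions for illustration.

```python
# Hypothetical lookup table: numeric HTTP status code -> readable description.
HTTP_STATUS_LOOKUP = {
    200: "OK",
    301: "Moved Permanently",
    403: "Forbidden",
    404: "Not Found",
    500: "Internal Server Error",
}


def enrich(event, lookup=HTTP_STATUS_LOOKUP):
    """Return a copy of the event with a readable status description added."""
    enriched = dict(event)  # copy, so the raw event is left untouched
    enriched["status_description"] = lookup.get(event["status"], "Unknown")
    return enriched


# Made-up raw log events, as they might look before the lookup is applied.
raw_events = [
    {"uri": "/index.html", "status": 200},
    {"uri": "/missing", "status": 404},
    {"uri": "/odd", "status": 799},  # not in the table
]

for event in raw_events:
    print(enrich(event))
```

Keeping the raw event unchanged and adding the readable value as an extra field mirrors the ‘reference table you pipe logs through when searching’ option, where the original data stays as it was logged.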

High Availability

  • replicate indexes
    • gives HA, gives fidelity, may speed up searches

Advanced admin course;

http://www.splunk.com/view/SPCAAAGNF

Report acceleration

  • can accelerate a qualifying report – more efficiently run large reports covering wide date ranges
  • must be in smart or fast mode

Lots of free and up to date training is available via the Splunk website.

Splunk for security

Investigation / forensics – Correlation, fast to root cause, look for APTs, investigate and understand false positives

Splunk can have all original data – use as your SIEM – rather than just sending a subset of data to your SIEM

Unknown threats – APT / malicious insider

  • “normal” user and machine data – includes “unknown” threats
  • “security” data or alerts from security products etc.  “known” security issues..   Misses many issues

Add context  – increases value and chance of detecting threats.  Business understanding and context are key to increasing value.

Get both host and network based data to have best chance of detecting attacks

Identify threat activity

  • what is the modus operandi
  • who / what are most critical people and data assets
  • what patterns and correlations of ‘weak’ signals in normal IT activities would represent abnormal activity?
  • what in my environment is different / new / changed
  • what deviations are there from the norm
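
As a rough illustration of the ‘what is different / new / changed’ questions above, here is a minimal Python sketch of first-seen detection: build a baseline of what each user has done before, then flag anything outside it.  This is only one simple way to look for deviations, not how Splunk or the Enterprise Security app does it, and the user names and actions are invented.

```python
from collections import defaultdict


def find_new_behaviour(baseline, new_events):
    """Return (user, action) pairs that the user has never been seen doing."""
    seen = defaultdict(set)
    for user, action in baseline:
        seen[user].add(action)

    anomalies = []
    for user, action in new_events:
        if action not in seen[user]:
            anomalies.append((user, action))
            seen[user].add(action)  # report each new behaviour only once
    return anomalies


# Hypothetical history of 'normal' activity.
baseline = [
    ("alice", "login:vpn"),
    ("alice", "read:crm"),
    ("bob", "login:office"),
]

# Hypothetical new activity to compare against the baseline.
new_events = [
    ("alice", "read:crm"),         # seen before, ignored
    ("alice", "read:payroll-db"),  # new for alice
    ("bob", "login:vpn"),          # new for bob
    ("bob", "login:vpn"),          # repeat, already reported
]

for user, action in find_new_behaviour(baseline, new_events):
    print(f"new behaviour: {user} -> {action}")
```

In practice a ‘weak’ signal like this would be combined with business context (who the user is, how critical the asset is) before anyone treated it as an alert.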

Sample fingerprints of an Advanced Threat.

Remediate and Automate

  • Where else do I see the indicators of compromise
  • Remediate infected systems
  • Fix weaknesses, including employee education
  • Turn the Indicators of Compromise into real time search to detect future threats
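
The last point, turning Indicators of Compromise into a standing search, can be sketched in a few lines of Python.  This is purely illustrative and assumes events arrive as simple dictionaries with string values; the indicator values, field names and functions are hypothetical (the IP address is from a documentation-only range).

```python
# Hypothetical indicators gathered during an investigation, keyed by event field.
IOCS = {
    "dest_ip": {"203.0.113.10"},
    "file_hash": {"0123456789abcdef0123456789abcdef"},
    "domain": {"bad.example.com"},
}


def match_iocs(event, iocs=IOCS):
    """Return the sorted (field, value) pairs in an event that match an indicator."""
    return sorted((f, v) for f, v in event.items() if v in iocs.get(f, ()))


def watch(stream, iocs=IOCS):
    """Yield (event, hits) for every event in the stream matching an indicator."""
    for event in stream:
        hits = match_iocs(event, iocs)
        if hits:
            yield event, hits


# Made-up events standing in for a live feed.
events = [
    {"host": "web01", "dest_ip": "198.51.100.7"},
    {"host": "pc42", "dest_ip": "203.0.113.10", "domain": "bad.example.com"},
]

for event, hits in watch(events):
    print(f"ALERT on {event['host']}: {hits}")
```

Because `watch` is a generator it works equally well over a saved list (answering ‘where else do I see the indicators?’) or over events as they arrive.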

– Splunk Enterprise Security (2.4 released next week – 20 something April)

– Predefined normalisation and correlation, extensible and customisable

– F5, Juniper, Cisco, Fireeye etc all partners and integrated well into Splunk.

Move away from talking about security events to all events – especially with advanced threats, any event can be a security event..

I have a further meeting with some of the Splunk security specialists tomorrow so will provide a further update later.

Overall Splunk seems to tick a lot of boxes and certainly taps into the explosion of data we must correlate and understand in order to maintain our environment and spot subtle, intelligent security threats.

K

 

PCI-DSS Virtualisation Guidance

In what was obviously a response to my recent blog post stating
more detailed guidance would be helpful (yes I am that influential!) the ‘PCI
Security Standards Council Virtualisation Special Interest Group’ have just
released the ‘PCI DSS Virtualisation Guidelines’ Information Supplement.

This can be found here;

https://www.pcisecuritystandards.org/documents/Virtualization_InfoSupp_v2.pdf

This is a welcome addition to the PCI-DSS as it makes the
requirements for handling card data in a virtual environment much clearer.
The use of the recommendations in this document along with the reference
architecture linked to in my previous post will provide a solid basis for
designing a PCI-DSS compliant virtual environment.

The document itself is in 3 main sections. These comprise;

– ‘Virtualisation Overview’ which outlines the various components
of a virtual environment such as hosts, hypervisor, guests etc. and under what
circumstances they become in scope of the PCI-DSS

– ‘Risks for Virtualised Environments’ outlines the key risks
associated with keeping data safe in a virtual environment including the
increased attack surface of having a hypervisor, multiple functions per system,
in memory data potentially being saved to disk, guests of different trust
levels on the same host etc. along with procedural issues such as a potential
lack of separation of duties.

– ‘Recommendations’; This section is the meat of the document that
will be of main interest to most of the audience as it details the PCI’s recommended
actions and best practices to meet the DSS requirements. This is split into 4
sections;

– General –

Covering broad topics such as evaluating risk, understanding the standard, restricting physical access, defence in depth, hardening etc.  There is also a recommendation to review other guidance such as that from NIST (National Institute of Standards and Technology), SANS (SysAdmin, Audit, Network, Security) etc. – this is generally good advice for any situation where a solid understanding of how to secure a system is required.

– Recommendations for Mixed Mode Environments –

This is a key section for most businesses, as the reality for most of us is that being able to run a mixed mode environment (where guests in scope of PCI-DSS and guests not hosting card data are able to reside on the same hosts and virtual environment via acceptable logical separation) is the best option in order to gain the maximum benefits from virtualisation.  This section is rather shorter than expected, with little detail other than many warnings about how difficult true separation can be.  On a bright note, it does clearly say that as long as separation of PCI-DSS guests and non-PCI-DSS guests can be configured, and I would imagine audited, then this mode of operating is permitted.  Thus by separating the virtual networks and segregating the guests into separate resource pools, along with the use of virtual IPS appliances and likely some sort of auditing (e.g. a netflow monitoring tool), it should be very possible to meet the DSS requirements in a mixed mode virtual environment.

– Recommendations for Cloud Computing Environments –

This section outlines various cloud scenarios such as Public / Private / Hybrid along with the different service offerings such as IaaS (Infrastructure as a Service), PaaS (Platform as a Service), SaaS (Software as a Service).  Overall it is highlighted that in many cloud scenarios it may not be possible to meet PCI-DSS requirements due to the complexities around understanding where the data resides at all times and multi tenancy etc.

– Guidance for Assessing Risks in Virtual Environments –

This is a brief section outlining areas to consider when performing a risk assessment.  These are fairly standard and include defining the environment and identifying threats and vulnerabilities.

Overall this is a useful step forward for the PCI-DSS as it clearly shows that the PCI are moving with the times and understanding that the use of virtual environments can indeed be secure providing it is well managed, correctly configured and audited.

If you want to make use of virtualisation for the benefits of consolidation, resilience and management etc. and your environment handles card data, this document, along with the aforementioned reference architecture, should be high on your reading list.

K

 

PCI-DSS compliance in a virtual environment

Version 2 of the PCI-DSS (Payment Card Industry Data Security Standard) that was released in October of last year (2010) finally added some much needed, if limited, clarification around the use of virtualised environments.

This change / clarification is an addition to section 2.2.1 of the standard, adding the statements;

Note: Where virtualization technologies are in use, implement only one primary function per virtual system component.

And

2.2.1.b If virtualization technologies are used, verify that only one primary function is implemented per virtual system component or device

While this does not clarify how to set up a virtual environment that handles card data to meet PCI-DSS it does at least make it very clear that the use of virtual environments is acceptable and can meet the standard.
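
To make the ‘one primary function per virtual system component’ check a little more concrete, here is a minimal Python sketch of how it could be run against an inventory.  The inventory format, guest names and function labels are entirely hypothetical, and what counts as a ‘primary function’ is a judgement for you and your assessor rather than something the standard spells out.

```python
def multi_function_guests(inventory):
    """Return the guests that are recorded with more than one primary function."""
    return {
        guest: sorted(set(functions))
        for guest, functions in inventory.items()
        if len(set(functions)) > 1
    }


# Hypothetical inventory: guest name -> primary functions it performs.
inventory = {
    "vm-web01": ["web"],
    "vm-db01": ["database"],
    "vm-legacy": ["web", "database", "dns"],
}

for guest, functions in multi_function_guests(inventory).items():
    print(f"{guest} has multiple primary functions: {', '.join(functions)}")
```

Note that the check is applied per guest, not per physical host, which is exactly the distinction the updated wording introduces.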

This removes the previous confusion around the acceptability of using virtualisation to host environments dealing with card data, which stemmed from the statement in version one of the standard that each server should have only a single function.  By definition the physical hosts in a virtualised environment host multiple guests (the virtual servers) and thus have multiple functions.

Despite not having as much detail as many had hoped this is a great step forward given the ever increasing adoption of virtualisation to reduce costs and make better use of server hardware.

This has also opened the door to the possibility of using cloud based services to provide a PCI-DSS compliant architecture.  During some recent research into virtual architecture that will meet the requirements of PCI-DSS 2 I came across this work from a combination of companies to provide a reference architecture for PCI-DSS compliance in a cloud based scenario;

http://info.hytrust.com/pci_reference_architecture_x1.html

The above links to both a webinar providing an overview of the work undertaken, and a white paper detailing the actual reference architecture.

The architecture design was undertaken by Cisco, VMware, Savvis, Coalfire and Hytrust, and while the solution is understandably made up of the products and services offered by those companies, it clearly outlines a solution that you can adapt for your needs and make use of similar solutions that fit with your company’s tech stack.  As such this is a highly recommended read for anyone involved in designing or auditing solutions that need to be PCI-DSS compliant.

K