Fusion Of Context And Content Awareness: Making Endpoint DLP Effective
The risk-based approach to information security that has dominated the corporate market in recent years has resulted in the IT security industry moving from a network-centric to a data-centric information security model.
It was around 2002-2004 that the first network-resident data leak prevention (DLP) appliances for analysing the content of network communications appeared on the market. These filtered network content such as Web access, emails and instant messages to prevent corporate data leakage as a result of insider misconduct. At the same time, the growing threat of data leaks from corporate computers through their local ports and peripheral devices created a demand for device/port control products and soon after for endpoint DLP solutions with much wider context-based functionality.
As both network DLP appliances and endpoint device control products were targeting the same market but using essentially heterogeneous technologies (content filtering against context-based methods), the vendor-level competition created an “ideological” contradiction between content filtering and context-based DLP technologies.
The proponents of content filtering argued that only these highly intelligent technologies could comprehensively solve the problem of corporate data leakage, because they address it directly by analyzing the data’s meaningful content, that is, the information itself. Device control technologies, by contrast, were “accused” of being unable to “understand” the subject of protection and of relying instead on indirect methods, which was held to be inefficient in principle.
In response, device control vendors quite fairly pointed out the high percentage of “false positives” in content filtering solutions, and highlighted their complete inability to prevent local data leaks from corporate computers.
Since then, the situation has changed: the performance of endpoint computers has dramatically improved and this has enabled both pure endpoint DLP players and some DLP appliance vendors to port content analysis components to their endpoint agents. Does the increasing deployment of content filtering endpoint solutions mean that the inefficiency of context-based DLP technologies has become evident and due to the loss of value they will soon cease to be used?
Not at all! As DLP solutions for endpoint computers, and indeed customer requirements, have matured, it has become clear that the contradiction between context-based and content-based DLP technologies was completely artificial. To understand why, consider how fundamentally the two depend on each other in endpoint computing.
Firstly, the ultimate objective of any DLP solution is to prevent information leakage so it must be able to directly detect and verify the meaning of the data in transfer – that is the content. Given that pure context-based endpoint DLP solutions do not support content detection and analysis, but rather use indirect methods – like device access control – they are essentially incomplete for the purpose of information protection and therefore need to integrate with content filtering in order to provide a complete solution.
On the other hand, it is a fundamental principle that the data’s real meaning, or information, can be understood and rationally used only within a specific application context. With regard to DLP, it is full knowledge of the context of the data transfer that determines whether otherwise abstract data is meaningful, and possibly leaked, information. Without understanding who is transferring the data, where it is from, through which channel or media, and where it is destined to go, it is impossible to define what information the data contains, how sensitive it is, whether the transfer is legitimate, or whether it violates the organization's security policy. In other words, content-aware DLP methods are not feasible without the ability to fully detect the context of a data operation and use it for policy reasoning. This is why, to be meaningful and actionable, any content filtering policy must combine content specifications with relevant context parameters and conditions.
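To make the point concrete, here is a minimal Python sketch of a policy rule that pairs a content specification with context conditions. Every name in it (TransferContext, Rule, the channel and group labels) is a hypothetical chosen for illustration and is not taken from any DLP product; the content specification is assumed to be a simple regular expression.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferContext:
    """Context of a data transfer (all field values are illustrative)."""
    user_group: str    # who is transferring the data
    source: str        # where the data comes from
    channel: str       # channel or media in use
    destination: str   # where the data is going


@dataclass(frozen=True)
class Rule:
    """A content specification combined with context conditions."""
    pattern: str              # content specification (a regular expression)
    channels: frozenset       # context: channels the rule applies to
    exempt_groups: frozenset  # context: groups allowed to make the transfer

    def blocks(self, text: str, ctx: TransferContext) -> bool:
        # The context is evaluated first; without a matching context
        # the content pattern alone decides nothing.
        if ctx.channel not in self.channels:
            return False
        if ctx.user_group in self.exempt_groups:
            return False
        return re.search(self.pattern, text, re.IGNORECASE) is not None


rule = Rule(
    pattern=r"\bconfidential\b",
    channels=frozenset({"removable_storage", "printing"}),
    exempt_groups=frozenset({"legal"}),
)
```

The same text can therefore produce different verdicts: a document marked confidential is blocked when a sales user copies it to removable storage, but not when a member of the exempt group does so, and not on a channel the rule does not cover.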
The dependence of endpoint content analysis on the completeness of context controls comes to light when the complexity of the endpoint DLP structural anatomy is reviewed. The four most dangerous data leakage channels on the endpoint computer are the network channel and the three local channels: removable storage and plug and play devices, data synchronizations with locally connected smartphones and PDAs, and the printing channel. Each channel is a layered aggregate of its physical and logical interfaces, the specific device types that it can connect, the applications or system services that operate the channel, the relevant data types used by these applications, and finally, the content carried by these data.
The essential feature of any endpoint DLP process is that before the content filtering engine can start analyzing the data in any channel, the specific device or network connection needs to be identified and allowed at the interface layer. The DLP agent should then detect the application or service in charge of the connection to identify the channel type and, if required, control its behavior. Another context control layer of the DLP architecture should perform the next step – the detection of the type of transferred data (e.g. file format), which is necessary for extracting their textual content for filtering. Only then will the content filtering engine be able to start its job.
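The layered sequence described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any particular agent is built: the interface names, the application-to-channel table, the two recognized formats and the decision to block whenever a layer cannot do its job are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transfer:
    interface: str    # e.g. "usb", "lpt", "firewire", "network"
    application: str  # process or service operating the channel
    payload: bytes    # raw data being transferred


# Hypothetical configuration, for illustration only.
ALLOWED_INTERFACES = {"usb", "network"}
CHANNEL_BY_APPLICATION = {"spooler": "printing", "explorer": "removable_storage"}


def detect_format(payload: bytes) -> Optional[str]:
    """Data type layer: recognize the format of the transferred data."""
    if payload.startswith(b"%PDF"):
        return "pdf"
    try:
        payload.decode("utf-8")
        return "text"
    except UnicodeDecodeError:
        return None


def extract_text(payload: bytes, fmt: str) -> Optional[str]:
    """Text extraction; this sketch only knows how to handle plain text."""
    if fmt == "text":
        return payload.decode("utf-8")
    return None


def inspect(transfer: Transfer, keywords: list) -> str:
    # Layer 1: the interface must be identified and allowed.
    if transfer.interface not in ALLOWED_INTERFACES:
        return "blocked: interface"
    # Layer 2: the application in charge identifies the channel type.
    channel = CHANNEL_BY_APPLICATION.get(transfer.application)
    if channel is None:
        return "blocked: unknown application"
    # Layer 3: the data type must be known before text can be extracted.
    fmt = detect_format(transfer.payload)
    text = extract_text(transfer.payload, fmt) if fmt else None
    if text is None:
        return "blocked: content not extractable"
    # Layer 4: only now can the content filter run.
    if any(keyword in text.lower() for keyword in keywords):
        return f"blocked: content ({channel})"
    return f"allowed ({channel})"
```

Note that the keyword check is reached only if every earlier layer succeeds. Here a failure at any layer blocks the transfer; an agent that instead let such transfers through would never apply its content filter to them.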
Any gap in the “underlying” puzzle of endpoint context controls immediately limits the effective range of its content filtering “overlay” down to the data leakage scenarios not affected by the gap, because it creates severe inconsistencies in the entire DLP policy enforcement and may lead to the complete inability to prevent specific data leaks.
An example of the negative impact on endpoint DLP capabilities is incomplete context control over the printing channel when the DLP policy specifies that “the documents classified as confidential or secret shall not be printed to network or local printers”. If the DLP agent can detect only USB-connected printers, as is common for many endpoint DLP solutions, then any printers connected to the computer through LPT or FireWire ports, or through a network connection, won’t be recognized as “printing-capable devices”.
Consequently, the DLP agent will mistakenly enforce on their connections port-level access policies that have nothing to do with how the solution is expected to control data in the printing channel. Furthermore, the printout format detection won’t be possible and the textual data won’t be extracted for content filtering. Another result of this “non-USB detection” fault will be that the agent won’t understand whether the printer is locally connected or network-attached. Ultimately, this context control “gap” prevents the agent from consistently enforcing the specified printing DLP policy. Even worse, it leads to a fatal deficiency in the DLP solution: users will be able to print any documents, without any control, to printers connected through LPT or FireWire ports, as well as to network printers, unless all non-USB local ports of the computer are fully blocked and network connections are firewalled.
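The effect of this gap can be shown with a small Python sketch. It is a deliberately simplified model, not a description of any real agent: the function enforce, the connection labels and the assumption that a port with no explicit policy defaults to "allow" are all hypothetical.

```python
SENSITIVE_LABELS = {"confidential", "secret"}


def enforce(connection: str, label: str, printer_aware: set, port_policy: dict) -> str:
    """Decide on a print job sent over the given connection type."""
    if connection in printer_aware:
        # The printer is recognized, so the printing DLP policy applies.
        return "deny" if label in SENSITIVE_LABELS else "allow"
    # The printer is not recognized: only the generic port-level policy
    # applies (assumed here to default to "allow" when nothing is set).
    return port_policy.get(connection, "allow")


usb_only = {"usb"}
all_printers = {"usb", "lpt", "firewire", "network"}
block_other_ports = {"lpt": "deny", "firewire": "deny", "network": "deny"}

# Gap: a secret document goes straight to an LPT printer.
leak = enforce("lpt", "secret", usb_only, {})

# Workaround: block every non-USB port, which also stops harmless jobs.
over_blocked = enforce("lpt", "public", usb_only, block_other_ports)

# Complete context controls: the policy works as written.
denied = enforce("lpt", "secret", all_printers, {})
allowed = enforce("lpt", "public", all_printers, {})
```

With USB-only detection the agent is left with two poor options: let sensitive printouts through on other connections, or block those connections outright and lose legitimate printing along with them.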
As endpoint DLP solutions mature and spread into the mainstream corporate market, including SMBs, organisations need to be fully aware of the essential interdependence between the effectiveness of content-aware DLP components and the completeness of context DLP controls across all the layers and channels on the endpoint computer.
When choosing an endpoint DLP solution, organisations need to consider their requirements carefully and balance them against their insider threat profile. First and foremost, they should ensure that the context-based DLP controls of the solutions they are considering cover every data leak scenario in the threat profile in its entirety. Only after the context-level balance is set should they start developing requirements for the content filtering components. There are many technologies to choose from: word patterns, regular expressions, data fingerprinting, conceptual and adaptive lexical analyses, clustering, etc. As their prices differ considerably, the rational approach is to deploy only those methods that are minimally sufficient to match the threat profile and comply with the organization’s data protection policies.
Ultimately, it is not the contradiction but the fusion of content filtering and context control technologies that will make endpoint DLP solutions truly reliable, effective and complete, yet affordable for organizations of any size. The sooner the market comes to this realization, the better for everyone.
-- by Sacha Chahrvin, Managing Director, DeviceLock UK and Ireland
DeviceLock is exhibiting at Infosecurity Europe, the No. 1 industry event in Europe, held on 27th – 29th April at Earl’s Court, London. The event provides an unrivalled free education programme, with exhibitors showcasing new and emerging technologies and offering practical and professional expertise. For further information please visit www.infosec.co.uk