Data Security Lifecycle

Wednesday, August 10, 2011

Data Security Lifecycle 2.0: Functions, Actors, and Controls
By Rich

In our last post we added location and access attributes to the Data Security Lifecycle. Now let’s start digging into the data flow and controls. To review, so far we’ve completed our topographic map for data:

This illustrates, at a high level, how data moves in and out of different environments, and to and from different devices. It doesn’t yet tell us which controls to use or where to place them. That’s where the next layer comes in, as we specify locations, actors (‘who’), and functions:

Functions
There are three things we can do with a given datum:

• Access: View/access the data, including copying, file transfers, and other exchanges of information.
• Process: Perform a transaction on the data: update it, use it in a business processing transaction, etc.
• Store: Store the data (in a file, database, etc.).

The table below shows which functions map to which phases of the lifecycle:

Each of these functions is performed in a location, by an actor (person).

Controls
Essentially, a control is what we use to restrict a list of possible actions down to allowed actions. For example, encryption can be used to restrict access to data, application controls to restrict processing via authorization, and DRM storage to prevent unauthorized copies/accesses.

To determine the necessary controls, we first list all possible functions, locations, and actors, and then decide which ones to allow. We then determine what controls we need to make that happen (technical or process). Controls can be either preventative or detective (monitoring), but keep in mind that monitoring controls that don’t tie back into some sort of alerting or analysis merely provide an audit log, not a functional control. This might be a little clearer for some of you as a table:

Here you would list a function, the actor, and the location, and then check whether it is allowed or not. Any time you have a ‘no’ in the allowed box, you would implement and document a control.
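
As a rough illustration only (the functions, actors, and locations below are invented examples, not part of the model), the same allowed/not-allowed matrix can be sketched as a simple lookup table in code:

```python
# A minimal sketch of the function/actor/location matrix described above.
# Every function, actor, and location name here is an illustrative placeholder.

# (function, actor, location) -> allowed?
policy = {
    ("access",  "sales rep",  "SaaS CRM"):         True,
    ("process", "sales rep",  "SaaS CRM"):         True,
    ("store",   "sales rep",  "unmanaged mobile"): False,
    ("access",  "contractor", "SaaS CRM"):         False,
}

def is_allowed(function, actor, location):
    """Default-deny: anything not explicitly allowed needs a control."""
    return policy.get((function, actor, location), False)

# Every 'no' is a place to implement and document a preventative or
# detective control.
for combo, allowed in sorted(policy.items()):
    if not allowed:
        print("control required:", combo)
```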

Tying It Together

In essence what we’ve produced is a high-level version of a data flow diagram (albeit not using standard programming taxonomy). We start by mapping the possible data flows, including devices and different physical and virtual locations, and at which phases in its lifecycle data can move between those locations. Then, for each phase of the lifecycle in a location, we determine which functions, people/systems, and more-granular locations for working with the data are possible. We then figure out which we want to restrict, and what controls we need to enforce those restrictions.

This looks complex, but keep in mind that you aren’t likely to do it for all data within an entire organization. For given data in a given application/implementation you’ll be working with a much more restrictive subset of possibilities. This clearly becomes more involved with bigger applications, but practically speaking you need to know where data flows, what’s possible, and what should be allowed, to design your security.

In a future post we’ll show you an example, and down the road we also plan to produce a controls matrix which will show you where the different data security controls fit in. –Rich
Posted at Wednesday 10th August 2011 11:59 am Filed under: (1) Comments • Permalink
Tuesday, August 09, 2011

Data Security Lifecycle 2.0 and the Cloud: Locations and Access
By Rich

In our last post we reviewed the Data Security Lifecycle, but other than some minor wording changes (and a prettier graphic thanks to PowerPoint SmartArt) it was the same as our four-year-old original version. But as we mentioned, quite a bit has changed since then, exemplified by the emergence and adoption of cloud computing and increased mobility. Although the Lifecycle itself still applies to basic, traditional infrastructure, we will focus on these more complex use cases, which better reflect what most of you are dealing with on a day-to-day basis.

Locations
One gap in the original Lifecycle was that it failed to adequately address movement of data between repositories, environments, and organizations. A large amount of enterprise data now transitions between a variety of storage locations, applications, and operating environments. Even data created in a locked-down application may find itself backed up someplace else, replicated to alternative standby environments, or exported for processing by other applications. And all of this can happen at any phase of the Lifecycle.

We can illustrate this by thinking of the Lifecycle not as a single, linear operation, but as a series of smaller lifecycles running in different operating environments. At nearly any phase data can move into, out of, and between these environments – the key for data security is identifying these movements and applying the right controls at the right security boundaries.

As with cloud deployment models, these locations may be internal, external, public, private, hybrid, and so on. Some may be cloud providers, others traditional outsourcers, or perhaps multiple locations within a single data center. For data security, at this point there are four things to understand:

1. Where are the potential locations for my data?
2. What are the lifecycles and controls in each of those locations?
3. Where in each lifecycle can data move between locations?
4. How does data move between locations (via what channel)?
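
A hedged sketch of what answering those four questions might look like as a small inventory structure; every location name, control, and channel below is a hypothetical example:

```python
# Hypothetical data-location inventory: where data can live, what controls
# each location has, and where/how it can move during the lifecycle.
locations = {
    "internal datacenter": {
        "controls": ["file ACLs", "transparent database encryption"],
        "moves_to": {
            "public IaaS backup": {"phase": "archive", "channel": "replication over VPN"},
        },
    },
    "SaaS CRM": {
        "controls": ["provider access controls", "application-level encryption"],
        "moves_to": {
            "unmanaged mobile": {"phase": "share", "channel": "HTTPS via mobile app"},
        },
    },
}

for name, loc in locations.items():
    for dest, move in loc["moves_to"].items():
        print(f"{name} -> {dest} during '{move['phase']}' via {move['channel']}")
```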

Access
Now that we know where our data lives and how it moves, we need to know who is accessing it and how. There are two factors here:

1. Who accesses the data?
2. How can they access it (device & channel)?

Data today is accessed from all sorts of different devices. The days of employees only accessing data through restrictive applications on locked-down desktops are quickly coming to an end (with a few exceptions). These devices have different security characteristics and may use different applications, especially with applications we’ve moved to SaaS providers – who often build custom applications for mobile devices, which offer different functionality than PCs.

Later in the model we will deal with who, but the diagram below shows how complex this can be – with a variety of data locations (and application environments), each with its own data lifecycle, all accessed by a variety of devices in different locations. Some data lives entirely within a single location, while other data moves in and out of various locations… and sometimes directly between external providers.

This completes our “topographic map” of the Lifecycle. In our next post we will dig into mapping data flow and controls. In the next few posts we will finish covering background material, and then show you how to use this to pragmatically evaluate and design security controls. –Rich
Posted at Tuesday 9th August 2011 3:44 pm Filed under: (3) Comments • Permalink

Introducing the Data Security Lifecycle 2.0
By Rich

Four years ago I wrote the initial Data Security Lifecycle and a series of posts covering the constituent technologies. In 2009 I updated it to better fit cloud computing, and it was incorporated into the Cloud Security Alliance Guidance, but I have never been happy with that work. It was rushed and didn’t address cloud specifics nearly sufficiently. Adrian and I just spent a bunch of time updating the cycle and it is now a much better representation of the real world. Keep in mind that this is a high-level model to help guide your decisions, but we think this time around we were able to identify places where it can more specifically guide your data security endeavors.

(As a side note, you might notice I use “data security” and “information-centric security” interchangeably. I think infocentric is more accurate, but data security is more recognized, so that’s what I tend to use.)

If you are familiar with the previous model you will immediately notice that this one is much more complex. We hope it’s also much more useful. The old model really only listed controls for data in different phases of the lifecycle – and didn’t account for location, ownership, access methods, and other factors. This update should better reflect the more complex environments and use cases we tend to see these days.

Due to its complexity, we need to break the new Lifecycle into a series of posts. In this first post we will revisit the basic lifecycle, and in the next post we will add locations and access.

The lifecycle includes six phases from creation to destruction. Although we show it as a linear progression, once created, data can bounce between phases without restriction, and may not pass through all stages (for example, not all data is eventually destroyed).

1. Create: This is probably better named Create/Update because it applies to creating or changing a data/content element, not just a document or database. Creation is the generation of new digital content, or the alteration/updating of existing content.
2. Store: Storing is the act of committing the digital data to some sort of storage repository, and typically occurs nearly simultaneously with creation.
3. Use: Data is viewed, processed, or otherwise used in some sort of activity.
4. Share: Data is exchanged between users, customers, and partners.
5. Archive: Data leaves active use and enters long-term storage.
6. Destroy: Data is permanently destroyed using physical or digital means (e.g., cryptoshredding).
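
For readers who like to see the model in code form, here is a minimal sketch of the six phases as an enumeration; it adds nothing beyond what the list above states:

```python
from enum import Enum

class Phase(Enum):
    CREATE = 1   # create or update a data/content element
    STORE = 2    # commit to a storage repository, usually right after creation
    USE = 3      # view, process, or otherwise use the data
    SHARE = 4    # exchange with users, customers, and partners
    ARCHIVE = 5  # leave active use and enter long-term storage
    DESTROY = 6  # permanent physical or digital destruction (e.g., cryptoshredding)

# The progression is not strictly linear: data can bounce between phases
# without restriction and may never reach DESTROY.
print([p.name for p in Phase])
```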

These high-level activities describe the major phases of a datum’s life, and in a future post we will cover security controls for each phase. But before we discuss controls we need to incorporate two additional aspects: locations and access devices. –Rich
Posted at Tuesday 9th August 2011 2:50 pm Filed under: (1) Comments • Permalink
Thursday, September 17, 2009

Cloud Data Security: Store (Rough Cut)
By Rich

In our last post in this series, we covered the cloud implications of the Create phase of the Data Security Cycle. In this post we’re going to move on to the Store phase. Please remember that we are only covering technologies at a high level in this series on the cycle; we will run a second series on detailed technical implementations of data security in the cloud a little later.

Definition
Store is defined as the act of committing digital data to structured or unstructured storage (database vs. files). Here we map the classification and rights to security controls, including access controls, encryption and rights management. I include certain database and application controls, such as labeling, in rights management – not just DRM. Controls at this stage also apply to managing content in storage repositories (cloud or traditional), such as using content discovery to ensure that data is in approved/appropriate repositories.

Steps and Controls
Control: Access Controls
  Structured/Application: DBMS Access Controls; Administrator Separation of Duties
  Unstructured: File System Access Controls; Application/Document Management System Access Controls

Control: Encryption
  Structured/Application: Field Level Encryption; Application Level Encryption; Transparent Database Encryption
  Unstructured: Media Encryption; File/Folder Encryption; Virtual Private Storage; Distributed Encryption

Control: Rights Management
  Structured/Application: Application Logic; Tagging/Labeling
  Unstructured: Tagging/Labeling; Enterprise DRM

Control: Content Discovery
  Structured/Application: Cloud-Provided Database Discovery Tool; Database Discovery/DAM; DLP/CMP Discovery
  Unstructured: Cloud-Provided Content Discovery; DLP/CMP Content Discovery

Access Controls
One of the most fundamental data security technologies, built into every file and management system, and one of the most poorly used. In cloud computing environments there are two layers of access controls to manage – those presented by the cloud service, and the underlying access controls used by the cloud provider for their infrastructure. It’s important to understand the relationship between the two when evaluating overall security – in some cases the underlying infrastructure may be more secure (no direct back-end access) whereas in others the controls may be weaker (a database with multiple-tenant connection pooling).

1. DBMS Access Controls: Access controls within a database management system (cloud or traditional), including proper use of views vs. direct table access (a sketch of the view pattern follows this list). Use of these controls is often complicated by connection pooling, which tends to anonymize the user between the application and the database. A database/DBMS hosted in the cloud will likely use the normal access controls of the DBMS (e.g., hosted Oracle or MySQL). A cloud-based database such as Amazon’s SimpleDB or Google’s BigTable comes with its own access controls. Depending on your security requirements, it may be important to understand how the cloud-based DB stores information, so you can evaluate potential back-end security issues.

2. Administrator Separation of Duties: Newer technologies implemented in databases to limit database administrator access. On Oracle this is called Database Vault, and on IBM DB2 I believe you use the Security Administrator role and Label Based Access Controls. When evaluating the security of a cloud offering, understand the capabilities to limit both front- and back-end administrator access. Many cloud services support various administrator roles for clients, allowing you to define various administrative roles for your own staff. Some providers also implement technology controls to restrict their own back-end administrators, such as isolating their database access. You should ask your cloud provider for documentation on what controls they place on their own administrators (and superadmins), and what data they can potentially access.

3. File System Access Controls: Normal file access controls, applied at the file or repository level. Again, it’s important to understand the differences between the file access controls presented to you by the cloud service, vs. their access control implementation on the back end. There is an incredible variety of options across cloud providers, even within a single SPI tier – many of them completely proprietary to a specific provider. For the purposes of this model, we only include access controls for cloud-based file storage (IaaS), and the back-end access controls used by the cloud provider. Due to the increased abstraction, everything else falls into the Application and Document Management System category.

4. Application and Document Management System Access Controls: This category includes any access control restrictions implemented above the file or DBMS storage layers. In non-cloud environments this includes access controls in tools like SharePoint or Documentum. In the cloud, this category includes any content restrictions managed through the cloud application or service abstracted from the back-end content storage. These are the access controls for any services that allow you to manage files, documents, and other ‘unstructured’ content. The back-end storage can consist of anything from a relational database to flat files to traditional storage, and should be evaluated separately.
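
To make the “views vs. direct table access” point concrete, here is a minimal sketch using SQLite. SQLite has no user accounts or GRANT/REVOKE, so this shows only the view pattern; on a real DBMS you would also revoke direct table privileges from the application account:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, card_number TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice', '4111111111111111')")

# Expose only non-sensitive columns through a view; the application queries
# the view, never the base table (enforced via privileges on a real DBMS).
conn.execute("CREATE VIEW customers_safe AS SELECT id, name FROM customers")
print(conn.execute("SELECT * FROM customers_safe").fetchall())  # [(1, 'Alice')]
```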

When designing or evaluating access controls you are concerned first with what’s available to you to control your own user/staff access, and then with the back end to understand who at your cloud provider can see what information. Don’t assume that the back end is necessarily less secure – some providers use techniques like bit splitting (combined with encryption) to ensure no single administrator can see your content at the file level, with strong separation of duties to protect data at the application layer.

Encryption
The most overhyped technology for protecting data, but still one of the most important. Encryption is far from a panacea for all your cloud data security issues, but when used properly and in combination with other controls, it provides effective security. In cloud implementations, encryption may help compensate for issues related to multitenancy, public clouds, and remote/external hosting.

1. Application-Level Encryption: Collected data is encrypted by the application, before being sent into a database or file system for storage. For cloud-based applications (e.g., public or private SaaS) this is usually the recommended option because it protects the data from the user all the way down to storage. For added security, the encryption functions and keys can be separated from the application itself, which also limits the access of application administrators to sensitive data. (A sketch follows this list.)

2. Field-Level Encryption: The database management system encrypts fields within a database, normally at the column level. In cloud implementations you will generally want to encrypt data at the application layer, rather than within the database itself, due to the complexity.

3. Transparent Encryption: Encryption of the database structures, files, or the media where the database is stored. For database structures this is managed by the DBMS, while for files it can be the DBMS or third-party file encryption. Media encryption is managed at the storage layer, never by the DBMS. Transparent encryption protects the database data from unauthorized direct access, but does not provide any internal security. For example, you can encrypt a remotely hosted database to prevent local administrators from accessing it, but it doesn’t protect data from authorized database users.

4. Media Encryption: Encryption of the physical storage media, such as hard drives or backup tapes. In a cloud environment, encryption of a complete virtual machine on IaaS could be considered media encryption. Media encryption is designed primarily to protect data in the event of physical loss/theft, such as a drive being removed from a SAN. It is often of limited usefulness in cloud deployments, although it may be used by hosting providers on the back end in case of physical loss of media.

5. File/Folder Encryption: Traditional encryption of specific files and folders in storage by the host platform.

6. Virtual Private Storage: Encryption of files/folders in a shared storage environment, where the encryption/decryption is managed and performed outside the storage environment. This separates the keys and encryption from the storage platform itself, and allows them to be managed locally even when the storage is remote. Virtual Private Storage is an effective technique to protect remote data when you don’t have complete control of the storage environment. Data is encrypted locally before being sent to the shared storage repository, providing complete control of user access and key management. You can read more about Virtual Private Storage in our post.

7. Distributed Encryption: With distributed encryption we use a central key management solution, but distribute the encryption engines to any end-nodes that require access to the data. It is typically used for unstructured (file/folder) content. When a node needs access to an encrypted file it requests a key from the central server, which provides it if the access is authorized. Keys are usually user- or group-based, not specific to individual files. Distributed encryption helps with the main problem of file/folder encryption, which is ensuring that everyone who needs it gets access to the keys. Rather than trying to synchronize keys continually in the background, they are provided as needed.
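
As a hedged sketch of application-level encryption (item 1 above), the snippet below uses the third-party Python ‘cryptography’ package; in practice the key would come from a key-management service separated from both the application and the storage:

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()        # illustrative; use external key management
cipher = Fernet(key)

plaintext = b"4111111111111111"
ciphertext = cipher.encrypt(plaintext)   # this is what actually gets stored

# Storage and database administrators see only ciphertext; only code holding
# the key can recover the value.
assert cipher.decrypt(ciphertext) == plaintext
```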

Rights Management
The actual enforcement of rights assigned during the Create phase. For descriptions of the technologies, please see the post on the Create phase. In future posts we will discuss cloud implementations of each of these technologies in greater detail.

Content Discovery
Content Discovery is the process of using content or context-based tools to find sensitive data in content repositories. Content aware tools use advanced content analysis techniques, such as pattern matching, database fingerprinting, and partial document matching to identify sensitive data inside files and databases. Contextual tools rely more on location or specific metadata, such as tags, and are thus better suited to rigid environments with higher assurance that content is labeled appropriately. Discovery allows you to scan storage repositories and identify the location of sensitive data based on central policies. It’s extremely useful for ensuring that sensitive content is only located where the desired security controls are in place. Discovery is also very useful for supporting compliance initiatives, such as PCI, which restrict the usage and handling of specific types of data.

1. Cloud-Provided Database Discovery Tool: Your cloud service provides features to locate sensitive data within your cloud database, such as locating credit card numbers. This is specific to the cloud provider, and we have no examples of current offerings.

2. Database Discovery/DAM: Tools to crawl through database fields looking for data that matches content analysis policies. We most often see this as a feature of a Database Activity Monitoring (DAM) product. These tools are not cloud specific, and depending on your cloud deployment may not be deployable. IaaS environments running standard DBMS platforms (e.g., Oracle or MS SQL Server) may be supported, but we are unaware of any cloud-specific offerings at this time.

3. Data Loss Prevention (DLP)/Content Monitoring and Protection (CMP) Database Discovery: Some DLP/CMP tools support content discovery within databases, either directly or through analysis of a replicated database or flat file dump. With full access to a database, such as through an ODBC connection, they can perform ongoing scanning for sensitive information.

4. Cloud-Provided Content Discovery: A cloud-based feature to perform content discovery on files stored with the cloud provider.

5. DLP/CMP Content Discovery: All DLP/CMP tools with content discovery features can scan accessible file shares, even if they are hosted remotely. This is effective for cloud implementations where the tool has access to stored files using common file sharing protocols, such as CIFS and WebDAV.
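
As a toy illustration of content-aware discovery (not any specific product), the sketch below walks a directory tree and flags files containing something shaped like a credit card number; real DLP/CMP tools add Luhn validation, database fingerprinting, and partial document matching:

```python
import os
import re

CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")  # crude card-number shape

def discover(root):
    """Yield paths of files that appear to contain card numbers."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as f:
                    if CARD_PATTERN.search(f.read()):
                        yield path
            except OSError:
                continue

# Example (the path is hypothetical): print(list(discover("/mnt/shared")))
```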

Cloud SPI Tier Implications
Software as a Service (SaaS)
As with most security aspects of SaaS, the security controls available depend completely on what’s provided by your cloud service. Front-end access controls are common among SaaS offerings, and many allow you to define your own groups and roles. These may not map to back-end storage, especially for services that allow you to upload files, so you should ask your SaaS provider how they manage access controls for their internal users.

Many SaaS offerings state they encrypt your data, but it’s important to understand just where and how it’s encrypted. For some services, it’s little more than basic file/folder or media encryption of their hosting platforms, with no restrictions on internal access. In other cases, data is encrypted using a unique key for every customer, which is managed externally to the application using a dedicated encryption/key management system. This segregates data between co-tenants on the service, and is also useful to restrict back-end administrative access. Application-level encryption is most common in SaaS offerings, and many provide some level of storage encryption on the back end.

Most rights management in SaaS uses some form of labeling or tagging, since we are generally dealing with applications, rather than raw data. This is the same reason we don’t tend to see content discovery for SaaS offerings.

Platform as a Service (PaaS)
Implementation in a PaaS environment depends completely on the available APIs and development environment. When designing your PaaS-based application, determine what access controls are available and how they map to the provider’s storage infrastructure. In some cases application-level encryption will be an option, but make sure you understand the key management and where the data is encrypted. In some cases, you may be able to encrypt data on your side before sending it off to the cloud (for example, encrypting data within your application before making a call to store it in the PaaS). As with SaaS, rights management and content discovery tend to be somewhat restricted in PaaS, unless the provider offers those features as part of the service.

Infrastructure as a Service (IaaS)
Your top priority for managing access controls in IaaS environments is to understand the mappings between the access controls you manage, and those enforced in the back-end infrastructure. For example, if you deploy a virtual machine into a public cloud, how are the access controls managed both for those accessing the machine from the Internet, and for the administrators that maintain the infrastructure? If another customer in the cloud is compromised, what prevents them from escalating privileges and accessing your content?

Virtual Private Storage is an excellent option to protect data that’s remotely hosted, even in a multi-tenant environment. It requires a bit more management effort, but the end result is often more secure than traditional in-house storage.

Content discovery is possible in IaaS deployments where common network file access protocols/methods are available, and may be useful for preventing unapproved use of sensitive data (especially due to inadvertent disclosure in public clouds). –Rich
Posted at Thursday 17th September 2009 12:59 pm Filed under: (0) Comments • Permalink
Tuesday, September 08, 2009

Cloud Data Security Cycle: Create (Rough Cut)
By Rich

Last week I started talking about data security in the cloud, and I referred back to our Data Security Lifecycle from back in 2007. Over the next couple of weeks I’m going to walk through the cycle and adapt the controls for cloud computing. After that, I will dig in deep on implementation options for each of the potential controls. I’m hoping this will give you a combination of practical advice you can implement today, along with a taste of potential options that may develop down the road.

We do face a bit of the chicken and egg problem with this series, since some of the technical details of controls implementation won’t make sense without the cycle, but the cycle won’t make sense without the details of the controls. I decided to start with the cycle, and will pepper in specific examples where I can to help it make sense. Hopefully it will all come together at the end. In this post we’re going to cover the Create phase:

Definition

Create is defined as the generation of new digital content, either structured or unstructured, or significant modification of existing content. In this phase we classify the information and determine appropriate rights. This phase consists of two steps – Classify and Assign Rights.

Steps and Controls

Control: Classify
  Structured/Application: Application Logic; Tag/Labeling
  Unstructured: Tag/Labeling

Control: Assign Rights
  Structured/Application: Label Security
  Unstructured: Enterprise DRM

Classify
Classification at the time of creation is currently either a manual process (most unstructured data), or handled through application logic. Although the potential exists for automated tools to assist with classification, most cloud and non-cloud environments today classify manually for unstructured or directly-entered database data, while application data is automatically classified by business logic. Bear in mind that these are controls applied at the time of creation; additional controls such as access control and encryption are managed in the Store phase. There are two potential controls (a combined sketch follows this list):

1. Application Logic: Data is classified based on business logic in the application. For example, credit card numbers are classified as such based on field definitions and program logic. Generally this logic is based on where data is entered, or via automated analysis (keyword or content analysis).

2. Tagging/Labeling: The user manually applies tags or labels at the time of creation, e.g., manually tagging via drop-down lists or open fields, manual keyword entry, suggestion-assisted tagging, and so on.
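
A minimal sketch combining both controls: classification by application logic (field definitions plus simple content analysis) and a user-supplied tag. The field names and labels are invented for illustration:

```python
import re

def classify(record, user_tags=None):
    labels = set(user_tags or [])              # tagging/labeling by the user
    if "card_number" in record:                # classification by field definition
        labels.add("PCI")
    for value in record.values():              # crude automated content analysis
        if isinstance(value, str) and re.search(r"\b\d{16}\b", value):
            labels.add("PCI")
    record["labels"] = sorted(labels)
    return record

print(classify({"name": "Alice", "card_number": "4111111111111111"},
               user_tags=["customer data"]))
```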

Assign Rights
This is the process of converting the classification into rights applied to the data. Not all data necessarily has rights applied, in which case security is provided through additional controls during later phases of the cycle. (Technically rights are always applied, but in many cases they are so broad as to be effectively non-existent.) These are rights that follow the data, as opposed to access controls or encryption which, although they protect the data, are decoupled from its creation. There are two potential technical controls here:

1. Label Security: A feature of some database management systems and applications that adds a label to a data element, such as a database row, column, or table, or file metadata, classifying the content in that object. The DBMS or application can then implement access and logical controls based on the data label. Labels may be applied at the application layer, but only count as assigning rights if they also follow the data into storage. (A rough sketch of label-based access checks follows this list.)

2. Enterprise Digital Rights Management (EDRM): Content is encrypted, and access and use rights are controlled by metadata embedded with the content. The EDRM market has been somewhat self-limiting due to the complexity of enterprise integration and assigning and managing rights.
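
A rough sketch of the label-security idea: the label assigned at creation follows the row into storage, and reads are filtered against the session’s clearance. The label names and their ordering are invented for illustration:

```python
LABEL_RANK = {"public": 0, "internal": 1, "confidential": 2}

rows = [
    {"id": 1, "data": "press release",   "label": "public"},
    {"id": 2, "data": "card batch file", "label": "confidential"},
]

def read_rows(session_clearance):
    """Return only rows whose label is at or below the session's clearance."""
    limit = LABEL_RANK[session_clearance]
    return [r for r in rows if LABEL_RANK[r["label"]] <= limit]

print(read_rows("internal"))   # the confidential row is filtered out
```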

Cloud SPI Tier Implications
Software as a Service (SaaS)
Classification and rights assignment are completely controlled by the application logic implemented by your SaaS provider. Typically we see Application Logic, since that’s a fundamental feature of any application – SaaS or otherwise. When evaluating your SaaS provider you should ask how they classify sensitive information and then later apply security controls, or if all data is lumped together into a single monolithic database (or flat files) without additional labels or security controls to prevent leakage to administrators, attackers, or other SaaS customers. In some cases, various labeling technologies may be available. You will, again, need to work with your potential SaaS provider to determine if these labels are used only for searching/sorting data, or if they also assist in the application of security controls.

Platform as a Service (PaaS)
Implementation in a PaaS environment depends completely on the available APIs and development environment. As with internal applications, you will maintain responsibility for how classification and rights assignment are managed. When designing your PaaS-based application, identify potential labeling/classification APIs you can integrate into program logic. You will need to work with your PaaS provider to understand how they can implement security controls at both the application and storage layers – for example, it’s important to know if and how data is labeled in storage, and if this can be used to restrict access or usage (business logic).

Infrastructure as a Service (IaaS)
Classification and rights assignments depend completely on what is available from your IaaS provider. Here are some specific examples:

• Cloud-based database: Work with your provider to determine if data labels are available, and with what granularity. If they aren’t provided, you can still implement them as a manual addition (e.g., a row field or segregated tables), but understand that the DBMS will not be enforcing the rights automatically, and you will need to program management into your application.

• Cloud-based storage: Determine what metadata is available. Many cloud storage providers don’t modify files, so anything you define in an internal storage environment should work in the cloud. The limitation is that the cloud provider won’t be able to tie access or other security controls to the label, which is sometimes an option with document management systems. Enterprise DRM, for example, should work fine with any cloud storage provider. (A sketch of carrying a label as object metadata follows this list.)
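
As a hedged sketch of carrying a classification label as object metadata, the storage client below is an in-memory stand-in, not any real provider SDK; the point is only that the label travels with the object while enforcement stays in your application:

```python
class InMemoryStore:
    """Stand-in for a cloud storage SDK (purely illustrative)."""
    def __init__(self):
        self.objects = {}

    def put(self, bucket, key, body, metadata):
        self.objects[(bucket, key)] = {"body": body, "metadata": metadata}

def upload_with_label(client, bucket, key, body, label):
    # The label rides along as metadata; the provider treats it as opaque,
    # so access checks and encryption based on it stay in your application.
    client.put(bucket, key, body, metadata={"data-classification": label})

store = InMemoryStore()
upload_with_label(store, "backups", "export.csv", b"...", "confidential")
print(store.objects[("backups", "export.csv")]["metadata"])
```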

This should give you a good idea of how to manage classification and rights assignment in various cloud environments. One exciting aspect is that use of tags, including automatically generated tags, is a common concept in the Web 2.0 world, and we can potentially tie this into our security controls. Users are better “trained” to tag content during creation with web-based applications (e.g., photo sharing sites & blogs), and we can take advantage of these habits to improve security. –Rich
Posted at Tuesday 8th September 2009 10:19 am Filed under: (5) Comments • Permalink

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
