Open access peer-reviewed chapter

Data Collection Techniques for Forensic Investigation in Cloud

By Thankaraja Raja Sree and Somasundaram Mary Saira Bhanu

Submitted: June 13th 2018Reviewed: October 15th 2018Published: September 30th 2020

DOI: 10.5772/intechopen.82013

Downloaded: 104


Internet plays a vital role in providing various services to people all over the world. Its usage has been increasing tremendously over the years. In order to provide services efficiently at a low cost, cloud computing has emerged as one of the prominent technologies. It provides on-demand services to the users by allocating virtual instances and software services, thereby reducing customer’s operating cost. The availability of massive computation power and storage facilities at very low cost motivates a malicious individual or an attacker to launch attacks from machines either from inside or outside the cloud. This causes high resource consumption and also results in prolonged unavailability of cloud services. This chapter surveys the systematic analysis of the forensic process, challenges in cloud forensics, and in particular the data collection techniques in the cloud environment. Data collection techniques play a major role to identify the source of attacks by acquiring evidence from various sources such as cloud storage (Google Drive, Dropbox, and Microsoft SkyDrive), cloud log analysis, Web browser, and through physical evidence acquisition process.


  • distributed denial of service attacks
  • digital forensics
  • network forensics
  • web forensics
  • cloud forensics
  • mobile forensics

1. Introduction

In today’s world, users are highly dependent on the cyberspace to perform all day-to-day activities. With the widespread use of Internet technology, cloud computing plays a vital role by providing services to the users. Cloud computing services enable vendors (Amazon EC2, Google, etc.) to provide on-demand services (e.g., CPU, memory, network bandwidth, storage, applications, etc.) to the users by renting out physical machines at an hourly basis or by dynamically allocating virtual machine (VM) instances and software services [1, 2, 3]. Cloud computing moves application software and databases to large data centers, where the outsourcing of sensitive data and services is not trustworthy. This poses various security threats and attacks in the cloud. For instance, the attackers use employee login information to access the account remotely with the usage of cloud [4]. Besides attacking cloud infrastructure, adversaries can also use the cloud to launch an attack on other systems. For example, an adversary can rent hundreds of virtual machine (VM) instances to launch a distributed denial-of-service (DDoS) attack. A criminal can also keep secret files such as child pornography, terrorist documents, etc. in cloud storage to remain clean. To investigate such crimes involved in the cloud, investigators have to carry out forensic investigations in the cloud environment. This arises the need for cloud forensics, which is a subset of network forensics. Cloud forensics is an application of scientific principles, practices, and methods to reorganize the events through identification, collection, preservation, examination, and reporting of digital evidence [5]. Evidence can reside anywhere in the cloud and it is more complex to identify the traces located in the cloud server.

The advancement of new technologies, frameworks, and tools enables the investigator to identify the evidence from trusted third parties, that is, cloud service provider (CSP). There are numerous techniques in cloud forensics that arises on the basis of cloud service models and deployment models. In the Software as a Service (SaaS) and Platform as a Service (PaaS) models, the customer does not have any control of the hardware and they need to depend on CSP for collecting the evidence, whereas, in the case of Infrastructure as a Service (IaaS) model, customers can acquire the virtual machine (VM) image and logs.

The forensic examiner isolates the attacked system in the virtualized environment by segregating and protecting the information from a hard disk, RAM images, log files, etc. This evidence is analyzed based on the artifacts of the attack traces left by the attacker [6, 7]. The forensic investigator relies on finding a series of information such as where, why, when, by whom, what, and how attack has happened. This chapter details the challenges in cloud forensics and also details the data collection techniques in the cloud.

2. Types of forensics

The forensic process is initiated after the crime occurs as a post-incident activity. It follows a set of predefined steps to identify the source of evidence. It is categorized into five groups, namely digital forensics, network forensics, Web forensics, cloud forensics, and mobile forensics.

  • Digital forensics: According to National Institute of Standards and Technology (NIST) standards, it is the application of science to the identification, collection, examination, and analysis of data while preserving the integrity of the information and maintaining a strict chain of custody for the data.

  • Network forensics: It identifies and analyzes the evidence from the network. It retrieves information on which network ports are used to access the information.

  • Web forensics: It identifies the evidence from the user history, temporary log files, registry, chat logs, session log, cookies, etc. as digital crimes occur on the client side with the help of Web browser.

  • Cloud forensics: It is the application of digital forensics in the cloud and it is a subset of network forensics. It is harder to identify evidence in cloud infrastructure since the data are located in different geographical areas. Some examples of evidence sources are system log, application log, user authentication log, database log, etc.

  • Mobile forensics: It is the branch of digital forensics that identifies evidence from mobile devices. The evidence is collected from the mobile device as call history, SMS, or from the memory.

2.1 Cloud forensic process flow

The cloud forensic process flow is shown in Figure 1, which is described as follows:

  • Identification: The investigator identifies whether crime has occurred or not.

  • Evidence collection: The investigator identifies the evidence from the three different sources of cloud service model (SaaS, IaaS, and PaaS) [8]. The SaaS model monitors the VM information of each user by accessing the log files such as application log, access log, error log, authentication log, transaction log, data volume, etc. The IaaS monitors the system level logs, hypervisor logs, raw virtual machine files, unencrypted RAM snapshots, firewalls, network packets, storage logs, backups, etc. The PaaS model identifies the evidence from an application-specific log and accessed through API, patch, operating system exceptions, malware software warnings, etc.

  • Examination and analysis: The analyst inspects the collected evidence and merges, correlates, and assimilates data to produce a reasoned conclusion. The analyst examines the evidence from physical as well as logical files where they reside.

  • Preservation: The information is protected from tampering. The chain of custody has been maintained to preserve the log files since the information is located in a different geographical area.

  • Presentation and reporting: An investigator makes an organized report to state his findings about the case.

Figure 1.

Cloud forensic process flow.

3. Evidence collection

Evidence collection plays a vital role to identify and access the data from various sources in the cloud environment for forensic investigation. The evidence is no longer stored in a single physical host and their data are distributed across a different geographical area. So, if a crime occurs, it is very difficult to identify the evidence. The evidence is collected from various sources such as router, switches, server, hosts, VMs, browser artifacts, and through internal storage media such as hard disk, RAM images, physical memory, etc., which are under forensic investigation. Evidence is also collected through the analysis of log files, cloud storage data collection, Web browser artifacts, and physical memory analysis.

3.1 Cloud log analysis

Logging is considered as a security control which helps to identify the operational issues, incident violations, and fraudulent activities [9, 10]. Logging is mainly used to monitor the system and to investigate various kinds of malicious attacks. Cloud log analysis helps to identify the source of evidence generated from various devices such as the router, switches, server, and VM instances and from other internal components, namely hard disk, RAM images, physical memory, log files etc., at different time intervals. The information about different types of attacks is stored in various log files such as application logs, system logs, security logs, setup logs, network logs, Web server logs, audit logs, VM logs, etc., which are given as follows:

  • Application log is created by the developers through inserting events in the program. Application logs assist system administrators to know about the situation of an application running on the server.

  • System log contains the information regarding date and time of the log creation, type of messages such as debug, error, etc., system-generated messages related to the occurrence, and processes that are affected by the occurrence of an event.

  • Firewall log provides information related to source routed packets, rejected IP addresses, outbound activities from internal servers, and unsuccessful logins.

  • Network log contains detailed information related to different events that happened on the network. The events include recording malicious traffic, packet drops, bandwidth delays, etc. The network administrator monitors and troubleshoots daily activities by analyzing network logs for different intrusion attempts.

  • Web server log records entries related to the Web pages running on the Web server. The entries contain history for a page request, client IP address, date and time, HTTP code, and bytes served for the request.

  • Audit log records unauthorized access to the system or network in a sequential order. It assists security administrators to analyze malicious activities at the time of attack. The information in audit log files includes source and destination addresses, user login information, and timestamp.

  • VM log records information specific to instances running on the VM, such as startup configuration, operations, and the time VM instance finishes its execution. It also records the number of instances running on VM, the execution time of each application, and application migration to assist CSP in finding malicious activities that happen during the attack.

Due to the increase in usage of network or new release of software in the cloud, there is an increase in the number of vulnerabilities or attacks in the cloud and these attacks are reflected in various log files. Application layer attacks are reflected in various logs, namely access log, network log, authentication log, etc., and also reflected in the various log file traces stored on Apache server. These logs are used for forensic examination to detect the application layer attacks. Table 1 indicates the various attack information and the tools used for log analysis of different types of attacks. Figure 2 shows the sample access log trace (Table 2).

  • Sample Network Log Entry

    [**] [1:1407:9] SNMP trap udp [**] [Classification: Attempted Information Leak] [Priority: 2] 03/12–15:14:09.082119 - > UDP TTL:118 TOS:0x0 ID:29101 IpLen:20 DgmLen:87.

  • Sample Firewall Log Entry

    03/12/2015 8:14:07 AM,"Rule ““Block Windows File Sharing”“ blocked (, netbios-ssn(139)).”,"Rule ““Block Windows File Sharing”“ blocked (, netbios-ssn(139)). Inbound TCP connection. Local address,service is (KENT(,netbios-ssn(139)). Remote address,service is (,39922). Process name is ““System”“.”

    03/12/2015 9:04:04 AM,Firewall configuration updated: 398 rules., Firewall configuration updated: 398 rules.

  • Sample Syslog Entries

    Mar 1 06:25:43 server1 sshd[23170]: Accepted publickey for server2 from port 21,011 ssh2.

    Mar 1 07:16:42 server1 sshd[9326]: Accepted password for murugiah from port 1070 ssh2.

    Mar 1 07:16:53 server1 sshd[22938]: reverse mapping checking getaddrinfo for ip10.165.nist.govfailed - POSSIBLE BREAKIN ATTEMPT!

    Mar 1 07:26:28 server1 sshd[22572]: Accepted public key for server2 from port 30,606 ssh2.

    Mar 1 07:28:33 server1 su: BAD SU kkent to root on /dev/ttyp2.

    Mar 1 07:28:41 server1 su: kkent to root on /dev/ttyp2.

Types of logAttacksTools for log analysis
DMesg logThis is not a log file, but this is used for determining anomalous activity from recent bots.
Debugging logStack tracing to determine the nature of application and service-based attacks.
Firewall logDirect method for auditing the firewall.Event Log Analyzer, event logging and monitoring services
System logDetermines if someone is trying or has executed buffer overflow.Syslog-ng, Log & Event Manager
Network logDetermining Web-based attacks and DDoS attacks.Splunk, Log4j2
Web server access logDetermining Web-based attacks (XSS, XSRF, SQLI), remote file inclusion, local file inclusion and flooding attacks.Nihuo Web Log Analyzer
Web server error logDetermining Web-based attacks.Nihuo Web Log Analyzer
VM logDetermining hypervisor-related attacks.Virtual Machine Log Auditor, JVM controller
Authentication logAuditing of attacks on credentials and determines the unauthorized access.
Audit logDetermining unauthorized user access to the system and network. Includes destination addresses, user login information, and timestamp.WP Security Audit Log, auditpol.exe
Database logDetermining database-related attacks.Splunk, Nihuo Web Log Analyzer

Table 1.

Different types of logs, attacks, and the log analysis tool.

Figure 2.

Sample access log trace as evidence.

S. No.FieldsValueDescription
1Remote Host10.1.3.122IP address of the HTTP user who makes HTTP resource request
2Rfc931Identifier used to determine client
3UsernameUser name or user id used for authentication
4Date: time Timezone[17-Mar-2015: 10: 49: 33 + 530]Date and timestamp of the HTTP request
5HTTP requestGET/scripts/root.exe?/c+dir/HTTP/1.0HTTP request containing
(a) HTTP method—GET
(b) HTTP request resource scripts/root.exe?/c+dir/ and
(c) HTTP protocol version −1.0
6Status code200Status of HTTP request, i.e., success or failure
7Bytes578Number of bytes of data transferred during the HTTP request
8Referral URL
OLCLD/view.php?q = book/
Referrer header of the HTTP request (containing URL of the page from which this request was initiated) if present, and “-” otherwise
9User agentMozilla/4.08 [en] (Win98; I; Nav)Browser Identification String

Table 2.

Description of the access log format.

3.2 Evidence collection from cloud storage

It is the process of collecting evidence from cloud storage such as Dropbox, Microsoft SkyDrive, Google drive, etc., using the Web browser and also by downloading files using existing software tools [11, 12, 13]. This helps to identify the illegal modification or access of cloud storage during the uploading or downloading of file contents in storage media and also checks whether the attacker alters the timestamp information in user’s accounts. The Virtual Forensic Computing (VFC) tool is used by forensic investigators to identify evidence from VM image file. The evidence is accessed for each account using the Web browser running in the cloud environment by recording the encoded value of VM image. The packets are captured using network packet tools, namely Wireshark, snappy, etc., of each VM instance running in hosts. The account information is synchronized and downloaded using client accessing software of each device which is used to identify the source of evidence. The evidence is isolated from the files found in VM using “C:\Users\[username]\ Dropbox\” for Dropbox as shown in Figure 3. The zip file contains the name of the folder that can be accessed via the browser to determine the effect of a timestamp in a drive. If an attacker modifies the contents of a file, the evidence is found by analyzing the VM hard drive, history of files stored in the cloud, and also from a cache. It can also be analyzed by computing the hash value of the VM image. The evidence of Google Drive cloud storage is depicted in Figure 4.

Figure 3.

Dropbox evidence.

Figure 4.

Google Drive evidence.

3.3 Evidence collection via a Web browser

The clients communicate with the server in the cloud environment with the help of a Web browser to do various tasks, namely checking email and news, online shopping, information retrieval, etc. [14, 15, 16, 17, 18]. Web browser history is a critical source of evidence. The evidence is found by analyzing the URLs in Web browser history, timeline analysis, user browsing behavior, and URL encoding, and is recovered from deleted information. Here is an example of Web browser URLs,<Foldername>,<FolderID>.

Similarly, the evidence stored in Web browser cache at the root directory of a Web application is used to identify the source of an attack. Table 3 indicates the evidence collection process and recovery method for various Web browsers.

Web browserInformation to be analyzedTools for forensic investigationRecovery method for evidence identification
Internet ExplorerIndex.dat
Web historian 6.13
Index.dat analyzer 2.5
Net analysis 1.52
Encase 6.3
FTK 3.3
Recovery from internet files
Analyzing the index.dat files weekly/daily history
Recovery of the evidence from index.dat file through carving method
Recovery from cookies
Google ChromeBookmark history
Bookmark downloads
List of search words
Chrome analysis 1.0
Net analysis 1.52
Cache back 3.17
Recovery of session file through carving method
Mozilla FirefoxHistory
Cookies history
Download list
Firefox forensic 2.3
Net analysis 5.2
Cache back 3.17
Encase 6.3
FTK 3.3
Recovery of cache files
Web historian 6.13
Net analysis 1.52
Cache back 3.17
Encase 6.3
FTK 3.3
Recovery of session files, cookies
Web historian 6.13
Net analysis 1.52
Cache back 3.17
Encase 6.3
Recovery of cookies

Table 3.

Evidence collection process and recovery method for different Web browsers.

Here is an example of a Chrome forensic tool that captures and analyzes data stored in Google Web browser. It analyzes the data from the history, web logins, bookmarks, cookies, and archived history. It identifies the evidence from C:\Users\USERNAME\Appdata\Local\Google chrome\UserData\Default. Figure 5 depicts the Google Chrome analysis forensic tool.

Figure 5.

Chrome forensic analysis tool.

3.4 Physical memory analysis

This has the ability to provide caches of cloud computing usage that can be lost without passive monitoring such as network socket information, encryption keys, and in-memory database. They are analyzed from the physical memory dump using the “pslist” function, which recovers the process name, process identifier, parent process identifiers, and process initiation time. The processes can be differentiated using the process names ©exe© on the Windows, and ©sync© on the Ubuntu and Mac OS. Table 4 indicates the evidence collection process for cloud forensics in cloud storage and cloud log analysis.

Forensic analysis frameworkEvidence collection for cloud storageEvidence collection for cloud log analysis
Evidence identificationIdentification of evidence from cloud storage (Dropbox, iCloud, SkyDrive and Google Drive, etc.) and also from user account informationIdentification of evidence from cloud log files
Evidence collectionCollecting the evidence from VM image to access the cloud storage account, using packet analysis tools such as Ethernet cap, Wireshark tool, Burp suite, etc. to capture packets between the client and server
Collecting evidence from VM browser such as Google Chrome, chromium browser, Internet Explorer, Apple Safari, Mozilla Firefox, etc.
Collecting the evidence from cloud storage namely, user account and password
Collecting the evidence from client software to access the VM hard drive and also to synchronize the user account to retrieve the files and folders in VMs
Collecting the evidence from various sources in VM as log files, namely network log, access log, authentication log, error log, database log, etc. and through network analysis tools such as Wireshark, Snort, Snappy tool, Burp Suite, etc.
Evidence analysisIdentifying patterns from the evidence collection process to determine the source of attacks in cloud environmentDetermining the attack patterns from cloud log files and analyzing these patterns using cloud traceback mechanism to identify the source of evidence.
Evidence presentation and reportingForensic investigator examines the evidence and presents the evidence in courtIdentifying the evidence from analysis and reporting the evidence

Table 4.

Evidence collection process for cloud forensics.

4. Cloud forensics challenges

This section elucidates the forensic challenges in private and public cloud. It is observed from the literature that most of the challenges are applicable to the public cloud while fewer challenges are applicable to the private cloud environment.

4.1 Accessibility of logs

Logs are generated in different layers of the cloud infrastructures [2, 3, 4, 5, 6, 7]. System administrators require relevant logs to troubleshoot the system, developers need logs for fixing up the errors, and forensic investigators need relevant logs to investigate the case. With the help of an access control mechanism, the logs can be acquired from all the parties, that is, from a user, CSP, and forensic investigator.

4.2 Physical inaccessibility

The data are located in different geographical areas of the hardware device. It is difficult to access these physical access resources since the data reside in different CSPs and it is impossible to collect the evidence from the configured device. If an incident occurs, all the devices are acquired immediately in case of a private cloud environment since an organization has full control over the resources. The same methods cannot be used to access the data in case of a public cloud environment.

4.3 Volatility of data

Data stored in a VM instance in a cloud will be lost when the VM is turned off. This leads to the loss of important evidence such as syslog, network logs, registry entries, and temporary Internet files. It is important to preserve the snapshot of the VM instance to retrieve the logs from the terminated VMs. The attacker launches an attack and turns off the VM instance, hence these traces are unavailable for forensic investigation.

4.4 Identification of evidence at client side

The evidence is identified not only in the provider’s side but also the client side. The user can communicate with the other client through the Web browser. An attacker sends malicious programs with the help of a Web browser that communicates with the third parties to access the services running in the cloud. This, in turn, leads to destroying all the evidence in the cloud. One way of collecting the evidence is from the cookies, user agent, etc., and it is difficult to obtain all the information since the client side VM instance is geographically located.

4.5 Dependence of CSP trust

The consumers blindly depend on CSPs to acquire the logs for investigation. The problem arises when CSPs are not providing the valid information to the consumer that resides in their premises. CSPs sign an agreement with other CSPs to use their services, which in turn leads to loss of confidential data.

4.6 Multitenancy

In cloud infrastructures, multiple VMs share the same physical infrastructure, that is, the logs are distributed across various VMs. The investigator needs to show the logs to court by proving the malicious activities occurring from the different service providers. Moreover, it also preserves the privacy of other tenants.

4.7 Decentralization

In cloud infrastructures, the log information is located on different servers since it is geographically located. Multiple users’ log information may be collocated or spread across several layers and tiers in the cloud. The application log, network log, operating system log, and database log produce valuable information for a forensic investigation. The decentralized nature of the cloud brings the challenge for cloud synchronization.

4.8 Absence of standard format of logs

Logs are available in heterogeneous formats from different layers of a cloud at CSP. The logs provide information such as by whom, when, where, and why some incidents occurred. This is an important bottleneck to provide a generic solution for all CSPs and all types of logs. Table 5 indicates the survey of literature that deals with the challenges of cloud forensics mainly for evidence collection process.

AuthorsDiscussionForensic process
Sang et al.Log accessibility for SaaS & PaaSEvidence collection
Zawood et al.Focus on the integrity of log filesEvidence collection
Dystra et al.Log collection and accessibility of logsEvidence collection
Thorpe et al.VM kernel logs for forensic investigationLog contention
Boeck et al.Confidentiality and log integrityEvidence collection
Zaferulla et al.Uses Eucalyptus logs for forensic investigationEvidence analysis
Marty et al.Collection of logs from different cloud componentsLog retention
Sibiya et al.Uses data mining techniques to collect logs for forensic investigationEvidence collection
Patrascu et al.Collection of specific logsEvidence collection
Nakahara et al.Evidence identification from different types of logsEvidence collection and log retention

Table 5.

Challenges of cloud forensics.

5. Forensic tools

There are many tools to identify, collect, and analyze the forensic data for investigation. Juel et al. developed the PORs tool for the identification of online archives for providing integrity and privacy of files [19]. Dykstra et al. proposed a forensic tool for acquiring the cloud-based data in management plane [6]. It ensures trust in cloud infrastructures. Moreover, Encase and Access data FTK toolkit are used for the identification of trusted data to acquire the evidence. Similarly, tools such as evidence finder and F-response are used to find the evidence related to social networks. Dystra et al. proposed FROST, an open source OpenStack cloud tool for the identification of evidence from virtual disks, API logs, firewall logs, etc. [20].

6. Open research problems in cloud forensics

Many researchers have proposed various solutions to mitigate the challenges of cloud forensics. Some of the researchers have proposed new approaches to test the attacks in real-time environment. CSPs have not adopted the proposed solutions yet. Customers or investigators rely on CSPs to collect the necessary logs since they do not have direct physical access. Customers or investigators depend on CSP to collect the various information from the registry, hard disk, memory, log files etc. Even though various forensic acquisition process is proposed still the dependence of CSP remain unsolved. The critical issue is the usage of bandwidth resources. If the cloud storage is too high, then it results in more utilization of bandwidth. There is insufficient work evolved to preserve the chain of custody to secure provenance. There is no ideal solution for cybercrime scene reconstruction and preservation of evidence. Another critical issue is based on the modification of existing forensic tools that may lose evidence. Some researchers have proposed logging as a service to provide confidentiality, integrity, and authentication [3]. This solution is not suitable for IaaS cloud.

7. Case study

This section introduces a hypothetical forensic case study related to a cloud storage service and also describes a forensic investigation of the case.

7.1 Case study: cloud storage

The organization “X” found that their document named as “X_new.pdf” about the new release of a product has been leaked to their competitor [21, 22, 23, 24]. “Mr. Morgan” was managing the credential files of the document stored in the cloud. At the initial stage of the investigation process, the suspect of the leaked file case was “Mr. Morgan.” The forensic investigator has to identify the suspect by checking the organization network, or by the analysis of log files, or by collecting the trace of relevant file in the network. Mr. Morgan’s network does not have any clue about the secrets since he uses only the personal computer (PC) and Android phone for business. To identify the suspects, the forensic investigator seized the PC and Android phone since these are the target devices used by the adversary. From the suspected devices, the leaked file has not been detected. Later, the investigator started analyzing the unallocated area in the file system, operating system, external devices such as hard drive, tablets, etc., and the Web service, but no evidence was found in the investigation. The investigator found that the Dropbox was installed in the PC and five files of config.db have been accessed recently. The forensic investigator issued the search warrant and identified the evidence in Dropbox by accessing Morgan’s Dropbox storage with the username and password. It was observed that Morgan recently uploaded five files in Dropbox and identified that one of the files named as “XYZ_new.pdf” had the same contents as “X_new.pdf.” Later, he deleted the traces of uploading or downloading the contents in PCs. The investigator found that Mr. Morgan has deleted the traces of the file contents and shared the evidence stored as “XYZ_new.pdf” with the competitor through an external SD card.

7.2 Case study: online railway ticket fraud

An online railway ticket booking service provider claimed that some unknown user had used the internet ticket booking facility to book 44 railway tickets using the stolen credit card details [25]. It has been charged back from the credit card companies for all transactions which led to a huge loss to the service provider. It is inferred from the investigation that the suspects have booked 44 tickets with different names of a person through the website at different locations. Later on, through investigation, the investigator found that the suspects arrived from a particular IP address, thereby seized the contents of the user accounts with the password, and the stolen credit cards were recovered from the suspects.

7.3 Case study: morphed photographs

The user got threatening pornographic emails from the adversary that one photograph was posted on the popular website [25]. The IP address for posting such threatening emails on the website was retrieved and was traced to a company. During an investigation, it was observed that the emails were sent from the company premises from one of the terminals. The log records and cookies were examined from the seized system and the morphed photographs were found in one of the systems used by the suspect. The mirror image of the hard disk was collected and analyzed using disk imaging and forensic analysis tools to recover all the data files required for the case. At the end of the investigation, it was found that the suspect was an ex-colleague of the company.

7.4 Case study: malicious insiders

Mr. X is an intruder who intends to exploit victims by sending malicious Web page in the cloud [26]. He uses a vulnerability to exploit the cloud presence of Buzz Coffee, a legitimate company. He installs a rootkit that injects a malicious payload into Web pages displayed and hides the malicious activity from the operating system. It redirects victims to the website, which infects them with malware. The users complain to the legitimate company that they are being infected, so the company went to the investigator to investigate the case by finding all the traces of the malicious Web page to identify the malicious user.

7.5 Case study: ransomware attack

A securities and brokerage firm became a victim of a ransomware attack [26]. The hacker demanded a ransom of two Bitcoins for each system that was infected. During the investigation process, it was observed that several other critical systems were infected with the same ransomware. Emails with malicious attachments appeared to be originating from a foreign location and were identified as the source of infection. The organization decided to take a proactive approach toward security with the focus on real-time monitoring to thwart such attacks in the future.

8. Conclusion

Cloud computing offers on-demand services (CPU, memory, network bandwidth, storage, applications, etc.) to users by allocating virtual instances and software services. Security is a major concern in the cloud wherein investigation of security attacks and crimes are very difficult. Due to the distributed nature of attacks and crimes in cloud, there is a need for efficient security mechanism. As cloud logs are spread across different virtual/physical machines (VM instances), switches, routers, etc., and also the customer (end user) is not aware of the activities of VM instances, cybercriminals exploit these sources to exhaust all the resources running in the cloud. Hence, evidence collection plays a crucial role to identify the suspects. However, collecting logs from the cloud infrastructure is extremely difficult because the investigator/security analyst has to depend on CSPs for collecting the logs and they have little control over the infrastructure. So, in order to identify the suspicious activity involved in the cloud, this chapter surveys the various forensic processes, evidence collection techniques for cloud forensics and the various challenges faced in cloud environment for forensic investigation.

Conflict of interest

The author does not have any conflict of interest.

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Thankaraja Raja Sree and Somasundaram Mary Saira Bhanu (September 30th 2020). Data Collection Techniques for Forensic Investigation in Cloud, Digital Forensic Science, B Suresh Kumar Shetty and Pavanchand Shetty H, IntechOpen, DOI: 10.5772/intechopen.82013. Available from:

chapter statistics

104total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Detectability of the Psychotropic Substance Cannabis in Head or Body Hair: Update of Forensic Criminalistics

By Thorsten Floren M.A.

Related Book

First chapter

Geomatics and Forensic: Progress and Challenges

By Pablo Rodríguez‐Gonzálvez, Ángel Luis Muñoz‐Nieto, Sandra Zancajo‐Blázquez and Diego González‐Aguilera

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us