
A Case Study of the Capital One Data Breach (Revised)
Nelson Novaes Neto, Stuart Madnick, Anchises Moraes G. de Paula, Natasha Malara Borges
Working Paper CISL# 2020-16
March 2020
Cybersecurity Interdisciplinary Systems Laboratory (CISL)
Sloan School of Management, Room E62-422
Massachusetts Institute of Technology
Cambridge, MA 02142

A Case Study of the Capital One Data Breach

Nelson Novaes Neto
Cybersecurity at MIT Sloan, MIT Sloan School of Management
Massachusetts Institute of Technology

Stuart Madnick
Cybersecurity at MIT Sloan, MIT Sloan School of Management & MIT School of Engineering

Anchises Moraes G. de Paula
C6 Bank

Natasha Malara Borges
C6 Bank

Abstract

In an increasingly regulated world, with companies allocating a large part of their budgets to cyber security protections, why have all of these protection initiatives and compliance standards not been enough to prevent the leak of billions of data points in recent years? New data protection and privacy laws and recent cyber security regulations, such as the General Data Protection Regulation (GDPR) that went into effect in Europe in 2018, demonstrate a strong trend and growing concern about how to protect businesses and customers from the significant increase in cyberattacks. Does the flaw lie in the existing compliance requirements or in how companies manage their protections and enforce compliance controls? The purpose of this research was to answer these questions by means of a technical assessment of the data breach incident at Capital One, one of the largest financial institutions in the U.S. This case study aims to understand the technical modus operandi of the attack, map out the exploited vulnerabilities, and identify the related compliance requirements that existed, based on the National Institute of Standards and Technology (NIST) Cybersecurity Framework, version 1.1, an agnostic framework widely used in the global industry to provide cyber threat mitigation guidelines. The results of this research and the case study will help government entities, regulatory agencies, and companies to improve their cyber security controls for the protection of organizations and individuals.

1. Introduction

Technology is nowadays one of the main enablers of digital transformation worldwide.
The use of information technologies increases each year and directly impacts changes in consumer behavior, the development of new business models, and the creation of new relationships supported by all the information underlying these interactions.

Based on numerous cyberattacks reported by the media (Kammel, Pogkas, & al., 2019), organizations are facing an increasing urgency to understand the threats that can expose their data, as well as the need to understand and comply with the emerging regulations and laws involving data protection within their business.

As privacy has emerged as a priority concern, governments are constantly planning and approving new regulations that companies need to comply with to protect consumer information and privacy (Gesser, Forester, & al., 2019), while regulatory authorities throughout the world are seeking to improve transparency and responsibility involving data breaches. Regulatory agencies are imposing stricter rules: for example, they are demanding disclosure of data breaches, imposing bigger penalties for violating privacy laws, and using regulations to promote public policies to protect information and consumers.

Despite all efforts made by regulatory agencies and organizations to establish investments and proper protection of their operations and information (Dimon), cases of data leaks in large institutions are becoming more frequent and involving higher volumes of data each time. According to our research, the number of data records breached increased from 4.3 billion in 2018 to over 11.5 billion in 2019 1.

There are a number of frameworks, standards and best practices in the industry to support organizations in meeting their regulatory obligations and establishing robust security programs. For this research, the Cybersecurity Framework version 1.1, published by the U.S. National Institute of Standards and Technology (NIST), a critical infrastructure resilience framework widely used by U.S. financial institutions, will be considered as a basis for compliance evaluation. 2

For the purpose of this paper, we selected U.S. bank Capital One as the object of study due to the severity of the security incident they faced in July 2019.

The main research goals and questions of this study are:
- Analyze the Capital One data breach incident;
- Based on the Capital One data breach incident: Why were compliance controls and Cybersecurity legislations insufficient to prevent the data breach?

The result of this study will be valuable to support executives, governments, regulators, companies and specialists in the technical understanding of what principles, techniques, and procedures are needed for the evolution of the normative standards and company’s management in order to reduce the number of data breach cases and security incidents.

2. Related Articles

The academic literature related to the objective of this research is very limited, since the Capital One data breach incident was very recent, and few cyber security incidents have enough information publicly available to provide a detailed technical analysis.

Very few incidents have enough technical records publicly available, indeed. Salane (Salane, 2009) describes the great difficulty associated with studies regarding data leaks: “Unfortunately, the secrecy that typically surrounds a data breach makes answers hard to find.
(...) In fact, the details surrounding a breach may not be available for years since large scale breaches usually result in various legal actions. The parties involved typically have no interest in disclosing any more information than the law requires.” Such court records are a rich resource for research, since they provide a detailed investigation into the cause of the incidents, including details of the modus operandi of the attack and, eventually, existing compliance controls.

Due to the high relevance of the Capital One data leak to US consumers, extensive news coverage exists, which provided valuable help for this paper. The most extensive report, the indictment at the US District Court at Seattle, is available online, including the detailed FBI investigation report (US District Court at Seattle, 2019). In addition, many cyber security consulting companies published blog posts with technical analysis of the incident, such as CloudSploit (CloudSploit, 2019). American journalist Brian Krebs also covered the story, providing some additional technical details about the incident (Krebs, 2019). With this amount of information available, it was possible to identify the technical details that describe how the cyberattack took place.

3. Methodological Considerations

This research required the production of preliminary studies that were relevant to this project, allowing the construction of a database with the latest information on data leak incidents that took place between January 2018 and December 2019 1. This included the identification of relevant information on the type of incidents, who was the target (organization and geography), the existence of a technical assessment of the modus operandi of the attacks, and the regulations related to the organizations that suffered the attacks.

1 Details of this research will appear in a forthcoming report.
2 NIST published a Cybersecurity Framework in 2014 that provides guidelines to protect the critical infrastructure from cyberattacks, organized in five domains. This Cybersecurity Framework is adopted by financial institutions in the U.S. to guide the information security strategy and it is formally recommended by the governance agencies, such as the Federal Financial Institutions Examination Council (FFIEC).

This research required the availability of technical and trustworthy information regarding the details of the attacks, as well as which regulations applied to the companies that suffered the data breach. The correlation between the type of data, organizations, country, region, technical details of the attacks, as well as regulations and laws involved is important to answer the key question of the study: Why were conformity controls and Cybersecurity laws insufficient to prevent data breaches?

One of the greatest difficulties in understanding the modus operandi of the successful attacks that compromised billions of records in recent years is obtaining detailed information on the attack vectors, threats, exploited vulnerabilities, technical details of the technological environments, and the TTPs (Tactics, Techniques, and Procedures) used to compromise the data. Unfortunately, many companies do not disclose the details of the incidents, while some will only report and notify clients that their data was compromised, either to comply with regulations, e.g. the EU General Data Protection Regulation (GDPR), or involuntarily due to disclosure of details of the incidents by hackers, researchers, the media, or other ways.

To properly understand the chain of events that led to the incident related to this case study, the MITRE ATT&CK (Adversarial Tactics, Techniques, and Common Knowledge) framework was adopted to help assess the TTPs behind each technical step that played a significant role in the success of the cyberattack analyzed. 3 Unlike the NIST Framework, MITRE ATT&CK is not a compliance and control framework; instead, it is a framework for mapping each one of a list of well-known cyberattack techniques, describing their TTPs and related mitigation and detection recommendations. As a result, it helped to determine the security controls that failed or should have been in place to mitigate the attack.

3 An extensive ATT&CK description is available online at https://attack.mitre.org.

Our background research comprised:
1. This case study, containing a detailed analysis to identify and understand the technical modus operandi of the attack, as well as what conditions allowed a breach and the related regulations;
2. Technical assessment of the main regulations related to the case study;
3. Answer to the question: Why were the regulations insufficient to protect the data and what are the recommendations for an effective protection?
4. Recommendations for regulatory agencies, organizations, and entities.

4. Technical Criteria for Selection of the Case Study

The first step of the technical analysis was to assess the public records available, if any, about the data leak attacks that were included in the Database of Data Leaks that was built for this study. The objective was to identify the techniques that were deployed in the cyberattack and, as a result, to map the security controls that might have failed.

This study considered as trustworthy sources the targeted companies themselves, third party companies involved in the incident investigation and in the response to the cyberattack, and information published in legal testimonies and reports provided to regulating agencies, such as the U.S. Securities and Exchange Commission (SEC).

4.1 Criteria for regulations analysis (Compliance)

The regulatory scenario is large and permeates several segments in the industry worldwide. When it comes to Cybersecurity, there are strong regulations in the Health and Finance industries (TCDI), among which the most well-known include the Health Insurance Portability and Accountability Act (HIPAA) for healthcare and the Sarbanes Oxley (SOX) and Payment Card Industry – Data Security Standard (PCI DSS) for the financial industry, in addition to the numerous legislations applicable to a particular country or region, such as the General Data Protection Regulation (GDPR) in the European Union, the Brazilian General Personal Data Protection Act (LGPD) and a number of laws in other countries such as the United States. Due to this diversity, it is more productive to select an agnostic framework that is widely used in the industry and offers a mitigation guideline to cyber threats. Thus, the Cybersecurity Framework, version 1.1, published in 2018 by the National Institute of Standards and Technology (NIST) was selected.

4.2 Criteria for Case Study Selection

To choose the Case Study, a survey for a target (company or entity) that suffered a data leak incident between January 2018 and December 2019 was performed under the following two criteria:
1. Had enough technical details publicly available about the incident, and;
2. Public information was available about the regulations to which they were subject and existing compliance report.

Most of the public stories about data leak incidents in 2018 and 2019 did not cover technical details about the incident or have enough compliance information on the targeted organization. Usually, press reports only cover superficial information about the type and the extent of the incident.

A rare exception was the data breach of U.S. bank Capital One. The incident, which was the result of unauthorized access to their cloud-based servers hosted at Amazon Web Service (AWS), took place on March 22 and 23, 2019. However, the company only identified the attack on July 19, resulting in a data breach that affected 106 million customers (100 million in the U.S. and 6 million in Canada) (Capital One - 1, 2019). Capital One’s shares closed down 5.9% after announcing the data breach, losing a total of 15% over the next two weeks (Henry, Capital One Shares Fall Nearly 6% After Breach, 2019).
A class action lawsuit seeking unspecified damages was filed just days after the breach became public (Reeves, 2019).

The Capital One case stood out in this research because there is a lot of public information available on the case, including the FBI investigation report (US District Court at Seattle, 2019). Based on the abundance of details about the incident, as well as the relevant impact on U.S. consumers, the Capital One incident was chosen for the Case Study. In addition, Capital One meets the research criteria since it is an organization working in a highly regulated industry, and the company abides by existing regulations.

5. Hypothesis Procedure

The initial hypothesis of this study was that the current global regulations, normative standards and laws on cybersecurity provide neither the proper guidance nor the protection to help companies avoid new data leak incidents.

An additional hypothesis is that the institutions were deficient in implementing and/or maintaining the controls required by existing regulations.

The recent cases of data leaks from large institutions did not result in a quick evolution of the existing standards and cybersecurity policies to minimize or prevent the occurrence of new leaks. For instance, in the Equifax incident in May 2017, criminals stole credit files from 147 million Americans, as well as British and Canadian citizens, and millions of payment card records. Equifax will have to pay up to 700 million US dollars in fines, as part of a settlement with federal authorities (Whittaker, FTC slaps Equifax with a fine of up to 700M for 2017 data breach, 2019). The Capital One data breach in 2019 impacted 106 million customers (Capital One - 1, 2019), an initial impact not very different from that of the Equifax breach.
The editor of news channel TechCrunch, Zack Whittaker, claimed the Capital One data breach was inevitable because probably nothing was done by the industry after the Equifax incident (Whittaker, Capital One’s breach was inevitable, 2019):

“Companies continue to vacuum up our data — knowingly and otherwise — and don’t do enough to protect it. As much as we can have laws to protect consumers from this happening again, these breaches will continue so long as the companies continue to collect our data and not take their data security responsibilities seriously. We had an opportunity to stop these kinds of breaches from happening again, yet in the two years passed we’ve barely grappled with the basic concepts of internet security.”

6. Case Study: Capital One

6.1 Capital One adoption of technology

Capital One is the fifth largest consumer bank in the U.S. and eighth largest bank overall (Capital One, 2020), with approximately 50 thousand employees and 28 billion US dollars in revenue in 2018 (Capital One - 2, 2019).

Capital One works in a highly regulated industry, and the company abides by existing regulations, such as “the New York Stock Exchange (“NYSE”) corporate governance rules, the Sarbanes-Oxley Act of 2002, the Dodd-Frank Wall Street Reform and Consumer Protection Act of 2010, and the implementing rules of the Securities and Exchange Commission (SEC) thereunder (or any other legal or regulatory requirements, as applicable)” (Capital One - 3, 2019). In addition, Capital One is a member of the Financial Services Sector Coordinating Council (FSSCC), the organization responsible for proposing improvements in the Cybersecurity framework, which was selected for this research. We also found job advertisements on Capital One’s Career website, available online in December 2019, in which Capital One was looking for Managers with experience in the NIST framework, which indicates that the company had adopted it (Capital One - 4, 2019) (Capital One - 5, 2019) (Capital One - 6, 2019).

Capital One is an organization that values the use of technology, and it is a leading U.S. bank in terms of early adoption of cloud computing technologies. According to its 2018 annual investor report (Capital One - 2, 2019), Capital One claims that “We’re Building a Technology Company that Does Banking”. Within this mindset, the company points out that “Today, 85% of our technology workforce are engineers. Capital One has embraced advanced technology strategies and modern data environments. We have adopted agile management practices, (...).
We harness highly flexible APIs and use microservices to deliver and deploy software.” In addition, the report highlights that “The vast majority of our operating and customer-facing applications operate in the cloud (...).”

Capital One was one of the first banks in the world to invest in migrating their on-premise datacenters to a cloud computing environment, which played a key role in the data leak incident in 2019. Indeed, Amazon lists Capital One as a renowned case study (AWS, 2018), since the company has been expanding the use of cloud computing for key financial services since 2014 to reduce its datacenter footprint. From 8 datacenters in 2014, the last 3 are expected to be decommissioned by 2020 (Magana, 2019). In addition, Capital One worked closely with AWS to develop a security model to enable operating more securely in the cloud. According to George Brady, executive vice president at Capital One, “Before we moved a single workload, we engaged groups from across the company to build a risk framework for the cloud that met the same high bar for security and compliance that we meet in our on-premises environments.” (AWS, 2018)

6.2 Technical Assessment of the Capital One Incident

Despite the strong investments in IT infrastructure, in July 2019 Capital One disclosed that the company had sensitive customer data accessed by an external individual.
According to Capital One’s public report released on July 29, 2019 (Capital One - 1, 2019), “On July 19, 2019, we determined that an outside individual gained unauthorized access and obtained certain types of personal information from Capital One credit card customers and individuals (...).” The company claimed that the compromised data corresponded to “personal information Capital One routinely collects at the time it receives credit card applications, including names, addresses, zip codes/postal codes, phone numbers, e-mail addresses, dates of birth, and self-reported income.” The unauthorized access “affected approximately 100 million individuals in the United States and approximately 6 million in Canada”.

According to the FAQ published by Capital One (Capital One - 7, 2019), the company discovered the incident on July 17, 2019 thanks to their Responsible Disclosure Program, rather than through regular cybersecurity operations. The FBI complaint filed with the Seattle court (US District Court at Seattle, 2019) displays an e-mail from an outsider informing that data from Capital One’s customers was available on a GitHub page (see screenshot extracted from FBI report).

Figure 1: E-mail reporting supposed leaked data belonging to Capital One

Capital One reported via a press release (PRNewswire, 2019) that some of the stolen data was encrypted, but the company did not provide any detail on how it was possible for the attacker to access the information: “We encrypt our data as a standard. Due to the particular circumstances of this incident, the unauthorized access also enabled the decrypting of data.”

According to the FBI investigations, “Federal agents have arrested a Seattle woman named Paige A. Thompson for hacking into cloud computing servers rented by Capital One, (...).” The press soon realized that, according to her LinkedIn profile, Thompson worked previously at Amazon (Sandler, 2019). In addition, the U.S. Department of Justice accused Paige Thompson of stealing additional data from more than 30 companies, including an unnamed state agency, a telecommunications conglomerate, and a public research university (U.S. Attorney’s Office, 2019). Thompson created a scanning software tool that allowed her to identify cloud computing servers with misconfigured firewalls, allowing the execution of commands from outside to penetrate and to access these servers.

The complaint filed with the Seattle court indicates that FBI investigations identified a script hosted on a GitHub repository that was deployed to access the Capital One data stored in their cloud servers, comprising three commands allowing the unauthorized access: the first command was used “to obtain security credentials (...) that, in turn, enabled access to Capital One’s folders”, a second one “to list the names of folders or buckets of data in Capital One’s storage space”, and a third command “to copy data from these folders or buckets (...).” In addition, “A firewall misconfiguration allowed commands to reach and to be executed at Capital One’s server.” The FBI adds that Capital One checked its computer logs to confirm that the commands were in fact executed.

After analyzing the records of the Seattle Court, cloud security company CloudSploit published an analysis of the incident in its corporate blog (CloudSploit, 2019), describing that the access to the vulnerable server was possible thanks to a Server-Side Request Forgery (SSRF) attack 4 that was able to bypass the misconfigured Web Application Firewall (WAF) solution deployed by Capital One: “An SSRF attack tricks a server into executing commands on behalf of a remote user, enabling the user to treat the server as a proxy for his or her requests and get access to non-public endpoints.”

American journalist Brian Krebs also concluded that the attacker ran an SSRF attack that exploited a misconfigured WAF tool. Krebs added (Krebs, 2019): “Known as “ModSecurity,” 5 this WAF is deployed along with the open-source Apache Web server to provide protections against several classes of vulnerabilities that attackers most commonly use to compromise the security of Web-based applications.”

4 Server-Side Request Forgery (SSRF) is a software vulnerability class where servers can be tricked into connecting to another server it did not intend to, then making a request that’s under the attacker’s control (Abma, 2017), enabling an attacker to send crafted requests from the back-end server of a vulnerable web application (O'Donnell, 2019).
5 ModSecurity is a popular open-source, host-based Web Application Firewall (WAF) solution.

Figure 2 provides a summary of how the attacker got access to the vulnerable server and executed the commands that led to the access to sensitive data stored in AWS S3 buckets 6.

Figure 2: Capital One attack

The reports from the FBI, CloudSploit and Mr. Brian Krebs made it possible to figure out the steps taken during the cyberattack, as presented in Figure 2:

1. The FBI and Capital One identified several accesses through anonymizing services such as TOR Network and VPN service provider IPredator, both used to hide the source IP address of the malicious accesses;
2. The SSRF attack allowed the criminal to trick the server into executing commands as a remote user, which gave the attacker access to a private server;
3. The WAF misconfiguration allowed the intruder to trick the firewall into relaying commands to a default back-end resource on the AWS platform, known as the metadata service with temporary credentials for such environment (accessed through the URL http://169.254.169.254);
4. By combining the SSRF attack and the WAF misconfiguration, the attacker used the als” to obtain the AccessKeyId and SecretAccessKey credentials from a role described in the FBI indictment as “*****-WAF-Role” (name was partially redacted). The resulting temporary credentials allowed the criminal to run commands in the AWS environment via API, CLI or SDK;
5. By using the credentials, the attacker ran the “ls” command 7 multiple times, which returned a complete list of all AWS S3 Buckets of the compromised Capital One account ("aws s3 ls");

6 Amazon launched its Simple Storage Service (S3) in 2006 as a platform for data storage. Since then, S3 buckets have become one of the most commonly used cloud storage tools.
7 “ls” is a command available at AWS’s command-line interface that lists objects and common prefixes under a prefix or all Simple Storage Service (S3) buckets.

6. Lastly, the attacker used the AWS “sync” command 8 to copy nearly 30 GB of Capital One credit application data from these buckets to the local machine of the attacker ("aws s3 sync s3://bucketone ."). This command gave the attacker access to more than 700 buckets, according to the FBI report.

The steps described above can be mapped to the specific stages of the MITRE ATT&CK framework, as shown in Table 1 below. The ATT&CK framework also describes, for each known attack technique, the main recommendations for mitigation and detection controls that can be used whenever applicable. Therefore, the MITRE ATT&CK Framework provides valuable help in identifying the faulty security controls that made the incident possible.

Stage | Step of the attack | ATT&CK
Command and Control | Use TOR to hide access | T1188 - Multi-hop Proxy (MITRE, 2018)
Initial Access | Use SSRF attack to run commands | T1190 - Exploit Public-Facing Application (MITRE, 2018)
Initial Access | Exploit WAF misconfiguration to relay the commands to the AWS metadata service | Classification unavailable 9
Initial Access | Obtain access credentials (AccessKeyId and SecretAccessKey) | T1078 - Valid Accounts (MITRE, 2017)
Execution | Run commands in the AWS command line interface (CLI) | T1059 - Command-Line Interface (MITRE, 2017)
Discovery | Run commands to list the AWS S3 Buckets | T1007 - System Service Discovery (MITRE, 2017)
Exfiltration | Use the sync command to copy the AWS bucket data to a local machine | T1048 - Exfiltration Over Alternative Protocol (MITRE, 2017)

Table 1: List of attack steps mapped to MITRE ATT&CK Framework

8 “sync” is a command available at AWS’s command-line interface that recursively copies new and updated files from the source directory to a specific destination.
9 MITRE ATT&CK has no specific category that represents the exploitation of a misconfigured cyber security control or tool.

6.3 Technical Assessment of the Regulations Applied to Capital One

To support this article and the selection of the NIST Cybersecurity Framework, both the regulatory aspects required by US governance instruments and the best practices were studied.

Based on the analysis of the regulatory framework applied to Capital One, it was possible to understand the security guidelines provided by the Federal Financial Institutions Examination Council (FFIEC), which is a mandatory cybersecurity-related banking regulation in the United States (Miller, 2015). The FFIEC assumes that the COSO structure (ISACA Control Objectives for Enterprise IT Governance) is the framework elected to support the information security strategy of the financial institutions, associated with the NIST Cybersecurity Framework.

According to information made available by Capital One in their investors’ webpage (Capital One - 8, 2019), in the scope of Corporate Governance Capital One states that “The Board of Directors has adopted Corporate Governance Guidelines to formalize the Board’s governance practices and to provide its view of effective governance. (...) The Board reviews and periodically updates these principles and practices as legal, regulatory, and best practice developments evolve.”

Capital One follows governance practices regarding cyber security and applied normative frameworks. Indeed, to map the best practices that Capital One’s professionals follow, we investigated the job descriptions for Capital One's open positions (Capital One, n.d.) to confirm that the abilities and knowledge related to the NIST Cybersecurity Framework are required for those positions.

While there are numerous regulatory requirements, global standards and best practices covering cybersecurity, this research focused on the NIST framework since it is the most comprehensive one.

6.4 Assessment of Technical Controls Versus Normative Standards Applied to the Capital One Incident

This assessment focused on technical controls that could prevent the Capital One data leak incident, according to the incident details published in the U.S. Department of Justice report (US District Court at Seattle, 2019), as described in section 6.2. In addition, the MITRE ATT&CK framework was used to help map the CSF NIST domains and controls related to the Capital One incident.

For each step performed by the attacker, Table 2 lists the related technical controls and NIST controls, comprising a total of 61 potential NIST security controls that could have b
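As an illustration of the kind of technical control discussed in section 6.2, the following is a minimal defensive sketch, not a description of Capital One's or any vendor's actual configuration. It assumes a proxy or WAF layer that can inspect the target URL of a relayed request, and it refuses targets whose host is a literal link-local, loopback, or private IP address, which includes the metadata service address http://169.254.169.254 mentioned in step 3. The function name is_blocked_target is hypothetical, and hostnames that resolve to such addresses through DNS are deliberately out of scope.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_target(url: str) -> bool:
    """Hypothetical helper: True when a relayed request should be refused.

    Only literal IP addresses are checked. A real control would also need
    to validate the addresses that hostnames resolve to.
    """
    host = urlsplit(url).hostname
    if host is None:
        # No parseable host: refuse rather than guess.
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP (a DNS name); this sketch does not resolve it.
        return False
    # 169.254.0.0/16 is link-local, which covers the metadata service address.
    return addr.is_link_local or addr.is_loopback or addr.is_private


if __name__ == "__main__":
    for candidate in ("http://169.254.169.254/", "https://example.com/"):
        print(candidate, is_blocked_target(candidate))
```

A check of this kind addresses only the relay step of the attack chain. It does not replace least-privilege permissions on the role whose credentials were obtained, nor monitoring of bucket listing and bulk copy operations.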
