Learning Objectives
By the end of this section, you will be able to:
- Understand the meaning of cybersecurity
- Learn how to secure information and communication
- Learn how cybersecurity can be used to protect software solutions
- Learn how cybersecurity can be used to protect Internet mobile/web applications
- Understand how cybersecurity can be used to protect cloud-centric solutions
- Relate to cybersecurity as it applies to Industry 4.0 metaverse smart ecosystems
- Relate to cybersecurity as it applies to Industry 5.0 supersociety solutions
As discussed in the previous section, broadscale adoption of cyber resources brings challenges in ensuring TRM qualities such as security, performance, and reliability. This section focuses on enforcing cybersecurity assurance, which is the confidence that every effort is made to protect IT solutions against undesirable use. Cybersecurity assurance is an in-depth topic, and due to space constraints, this section provides only a brief overview.
What Is Cybersecurity?
The field of cybersecurity includes the policies, procedures, technology, and other tools, including people, on which organizations rely to protect their computer systems and information systems environments from digital threats. Cybersecurity focuses on five categories of security: network, application, critical infrastructure, IoT, and cloud. To assess an organization’s cyber risks and develop cybersecurity assurance, you should ask the following questions:
- What is the threat model?
- Who are the attackers, and what are their capabilities, motivations, and access?
- What are the risks, vulnerabilities, and likelihood of a breach per the risk assessment?
- What are the technical and nontechnical countermeasures, and how much will the countermeasures cost, including both direct and indirect costs? (Nontechnical countermeasures include laws, policies, procedures, training, auditing, and incentives. Indirect costs can be reputation, or future business.)
- What assets do you seek to protect? (This question relates to security policies that address confidentiality, integrity, service availability, privacy, and authenticity.)
- Who and what do you trust to help maintain cybersecurity?
Cybersecurity threats include behaviors such as
- breaching privacy by revealing confidential information such as corporate secrets, private data, or personally identifiable information (PII);
- damaging integrity/authenticity by destroying records, altering data, or installing unwanted software (e.g., spambot, spyware); and
- denying access to a service through activities such as crashing a website for political reasons, launching a denial-of-service attack, or restricting access to services to only certain individuals.
In 2001, the Open Web Application Security Project (OWASP) was launched with the purpose of securing web applications. The project is concerned with actionable controls for the risks involved in the development and deployment of applications, and it publishes a ranked list of the most critical risks (Table 14.3).
Rank | Risk |
---|---|
1 | Broken access control |
2 | Cryptographic failures |
3 | Injection |
4 | Insecure design |
5 | Security misconfigurations |
6 | Vulnerable and outdated components |
7 | Identification and authentication flaws |
8 | Software and data integrity failures |
9 | Security logging and monitoring flaws |
10 | Server-side request forgery |
Link to Learning
To achieve security, we must eliminate defects and design flaws in systems and make them harder for hackers to exploit. This includes developing a foundation for deeply understanding the networked information systems we use and build. We also must be aware that no system is completely secure. To learn more, read this article about the security mindset from the Journal of Cybersecurity.
Why Is Cybersecurity Important?
In 2023, the average cost of a data breach was $4.45 million globally, a 15% increase over 2020 in just three years.12 These costs cover expenses incurred to discover and respond to breaches, lost revenue from downtime, and long-term damage to the business’s reputation and brand. Cybersecurity Ventures recently shared its top ten cybersecurity predictions and statistics for 2024, unveiling the alarming fact that global cybercrime financial damage will likely reach $10.5 trillion by 2025. Based on this total, if global cybercrime were regarded as a country, it would have the world’s third-largest economy, trailing only the United States and China.13 Personally identifiable information (PII) is generally the target of cybercriminals. They collect information, such as names, addresses, identification numbers, and credit card information, and sell it on the dark web or other underground marketplaces. This results in several consequences for the breached organizations, including regulatory fines, legal action, and the loss of customer trust.
Cybersecurity is increasingly important as the cost of breaches increases. However, a comprehensive cybersecurity strategy based on best practices and using machine learning, advanced analytics, and artificial intelligence (AI) can effectively combat cyber threats and reduce the impacts of breaches.
Concepts In Practice
Think like a Hacker
To implement effective cybersecurity policies and procedures, cybersecurity experts need to understand hackers, including how they think and how they are likely to target an organization. What does it mean to think like a hacker?
Skilled hackers tend to be inquisitive, with an in-depth understanding of technology and its capabilities. They stay abreast of technological advances, including cybersecurity measures. They tend to be creative thinkers with excellent analytical skills, which makes them capable of finding innovative ways to circumvent cybersecurity features and hack into a system. They also tend to understand human nature, including our weaknesses and vulnerabilities.
To think like a hacker, you must develop these skills and learn to use them with a hacker mindset, constantly reviewing your system and recognizing how it can be hacked. This includes prioritizing security and remaining aware that hackers constantly look for weaknesses that enable them to exploit a system. As part of this, learn about ethical hacking, which includes penetration testing, and use ethical hacking practices to ensure your system is up-to-date and ready to withstand cyberattacks.
Domains of Cybersecurity and Associated Cryptography Techniques
Cybersecurity includes various domains that comprise overall cybersecurity. Common cybersecurity domains include the following:
- Protection of computer systems and networks that ensure security at a national scale, economic health, and public safety, called infrastructure security. The National Institute of Standards and Technology (NIST) cybersecurity framework is meant to help organizations in this area. Additional guidance is provided by the U.S. Department of Homeland Security (DHS).
- Protection of wired and wireless (Wi-Fi) computer networks from intrusions, called network security. The designs for operating systems, virtual machines, and monitors must include protections against unauthorized use.
- Protection of applications on premise and in the cloud, with security integrated during the design phase for data handling and user authentication, called application security. The process of authentication confirms the identity and authorization of people and devices that use a system. Many resources support these efforts, including books and online materials on security engineering, “robust” programming, and secure programming for specific operating systems.
- Protection of the data, digital files, and other information a system maintains, called information security. For example, database systems must protect against SQL injection by ensuring that queries issued to the database do not somehow contain malignant code that could compromise the security of the database and its infrastructure. Information security requires data protection measures, such as the General Data Protection Regulation (GDPR), that secure sensitive data from unauthorized access, exposure, or theft.
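As an illustration of the SQL injection risk just described, the sketch below (the table and column names are invented for this example) contrasts a query built by string splicing with a parameterized query, using Python's built-in sqlite3 module:

```python
import sqlite3

# Hypothetical users table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def find_user_unsafe(name):
    # VULNERABLE: attacker-controlled input is spliced into the query text.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name):
    # SAFE: the driver binds the value as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

# The classic injection payload matches every row in the unsafe version.
payload = "' OR '1'='1"
print(find_user_unsafe(payload))  # [('alice',)]
print(find_user_safe(payload))    # [] -- treated as a literal string
```

The parameterized version is what "ensuring that queries issued to the database do not somehow contain malignant code" looks like in practice.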
These domains function cooperatively to create a cybersecurity risk management plan (Figure 14.7).
An important pillar of cybersecurity assurance is nonrepudiation, which provides proof of data’s origin, authenticity, and integrity. Nonrepudiation is achieved through cryptography using, for example, digital signatures, which are used in online transactions to ensure that a party cannot later deny placing an order or deny its signature’s authenticity. In addition to digital signatures, nonrepudiation is also used in digital contracts and emails. Email nonrepudiation involves methods such as email tracking to assure senders that messages are delivered and also provides recipients with proof of the sender’s identity. This ensures neither party can deny that a message was sent, received, and processed.
Beyond basic securing of information systems, cybersecurity requires creating and governing processes that protect organizations and individuals against costly breaches. End-user education is a critical part of this process. Organizations must promote security awareness to enhance endpoint security. For example, training users to delete suspicious email attachments and avoid using unknown USB devices can help.
Disaster recovery and business continuity planning are also essential to minimize disruption to key operations. This requires tools and procedures to respond to cybersecurity incidents, natural disasters, and power outages. Data storage is another critical area. Protection measures that promote data resilience with safeguards include encryption and immutable and isolated data copies that can quickly be restored to recover data and minimize the impact of a cyberattack.
Evolvability is an important cyber quality, but the evolution of platforms on which information systems may be deployed creates a need for new security measures. For example, mobile solutions security requires specific protection measures to protect applications, containers, and mobile mail. Furthermore, when using the cloud, organizations must ensure confidential computing by encrypting data at rest (i.e., while data are stored in the cloud), during transfer, and during processing. This ensures that processes meet business requirements and regulatory compliance standards and supports customer privacy.
Cybersecurity Misconceptions
Because of the constant evolution of technology, there are also several misconceptions surrounding cybersecurity risks.
- Threats only come from outside the organization. Instead, breaches can involve people within the organization working maliciously with external hackers or organized groups.
- Risks are known and can be predicted. In reality, the attack surface is constantly expanding with new vulnerabilities, and human error by negligent employees or contractors can also increase the risk of data breaches.
- Attack vectors are limited. Cybercriminals constantly discover new attack vectors through various environments.
- The industry is safe. Every industry faces cybersecurity risks, and adversaries exploit communication networks across government and the private sector. Ransomware attacks are expanding to nonprofits and local governments, and threats are increasing in other areas.14
Supply Chain and Security Issues
Since the COVID-19 pandemic, companies have continued to struggle to get necessary goods and services. Many products sit in ports and warehouses waiting to be delivered because there are not enough truck drivers in the workforce. In the wake of the semiconductor chip shortage, the automobile industry only started to recover in 2023; for many months, car dealership lots were empty of new cars. There is constant pressure on workers to get goods and materials to customers more quickly. Amazon has set a precedent for delivery speed that the rest of the world is struggling to match.
The world has seen increased cyber threats because more business is being conducted online. If you go to any sports stadium or concert event, most now involve cashless transactions: all purchases must be made with credit or debit cards. This means there are many more chances for your information to be leaked because of the number of companies with which you are interfacing. Granted, it is your choice to use these businesses, but remember, security is only as strong as its weakest link. As our culture moves toward cashless transactions, the number of available attack surfaces for hackers increases. If we implement the concepts of supersocieties and immerse ourselves in this technology, a security vulnerability that compromises us may do irreparable harm to our lifestyle.
Common Cyber Threats
Cyber threats are constantly evolving, and there are various common cyber threats of which programmers should be aware. Malicious software variants such as viruses, worms, Trojans, spyware, and botnets that allow for unauthorized access or can damage computers are called malware. Modern malware attacks often bypass traditional detection methods, such as antivirus scans for malicious files, by being “fileless.” Types of malware include the following:
- Viruses contain code that propagates or replicates across systems by arranging to have itself eventually executed, creating new instances of itself; a virus generally infects by altering stored documents or code (e.g., boot sector or memory-resident code), typically with the help of a user.
- Worms contain code that replicates itself across a network, infecting connected computers without user intervention.
- Rootkits modify the operating system (e.g., modify system exploration utilities, replace the target OS with a virtual machine monitor that can attack systems) to hide their existence.
- A backdoor is a concealed feature or command within a program that enables users to execute actions they normally would not be permitted to do; sometimes called a trapdoor (e.g., Easter egg in DVDs and software).
- A Trojan horse is software that seems to serve a beneficial purpose but is designed to execute covert malicious activities. An example is spyware, which can be installed by seemingly legitimate programs and then provides remote access to the computer for activities such as keylogging or sending back documents.
- Botnets involve a network of compromised machines (bots) under (unified) control of an attacker (botmaster); once the botmaster has control, the attacker has access to the devices and their system, enabling the attacker to steal data, execute scams, and perform other malicious tasks.
The scenarios that enable malware to run include
- a vulnerable client, such as a browser, connecting to a remote system that delivers an attack;
- exploitation of a network-accessible service with a buffer overflow vulnerability;
- malicious code introduced into a system component during manufacturing, through a compromised software provider, or via a man-in-the-middle (MitM) attack;
- using the autorun functionality, especially through the insertion of a USB device;
- deceiving a user using social engineering into running or installing malicious software; and
- an attacker with local access directly downloading or running malicious software, potentially using a “local root” exploit for elevated privileges.
Some of the dangers that can result from malware include generating a pop-up message that brags, exhorts, or extorts; trashing files; damaging hardware; launching external activity; stealing information; keylogging; and capturing the screen, audio, or camera. Malware that encrypts data and demands a ransom to unlock it or prevent its exposure is called ransomware. An insider threat is posed by current or former employees, partners, or contractors who misuse their access. It can also include vulnerabilities intentionally created by programmers, such as embedded malware. The form of social engineering that tricks users into providing personal information through fake emails or text messages posing as legitimate companies is called phishing.
Additional threats include the distributed denial-of-service (DDoS) attack, which overloads a server with traffic to crash a server, website, or network. Multiple coordinated systems often overwhelm enterprise networks by attacking devices such as modems, printers, and routers that use the Simple Network Management Protocol (SNMP). An advanced persistent threat (APT) is used by intruders to infiltrate systems to spy on business activities and steal sensitive data while remaining undetected and leaving the networks and systems intact. A man-in-the-middle attack is used by cybercriminals to eavesdrop on and intercept communications between two parties in order to steal data (most often on unsecured Wi-Fi).
Global Issues in Technology
Technology Skills Gap
A big challenge in technology is the growing skills gap: a mismatch between what industry needs and what students learn in colleges, universities, and industry certificate programs. Many companies advertise for software engineers with three to five years’ experience rather than entry-level positions; market pressure forces them to look for workers who can hit the ground running with little supervision. The need for entry-level jobs is increasing, but the pay is often not what graduates can afford to accept. The debt students accrue to earn their degrees is increasing, and many take jobs they know they won’t like just to gain the experience needed to jump to a better-paying one. The skills gap is widening especially in cybersecurity and data analytics.
The cybersecurity skills gap is often framed from the perspective of industry veterans repurposing themselves. People who have spent years in the computer industry are retooling to meet cybersecurity demand. For an IT professional with years of experience and a new certificate or degree in cybersecurity, it is hard to start over at the lower end of the job pool. These individuals want management or higher-end jobs, but they have only entry-level cybersecurity skills. This creates a divide, because such individuals have the experience to be successful in the industry and don’t need as much assistance as a pure entry-level employee.
Key Cybersecurity Technologies and Best Practices
Key cybersecurity technology and associated best practices typically fall under three categories: identity and access management, a data security platform, and security information and event management.
The technology used to manage users’ roles and access privileges is identity and access management (IAM). Approaches used to allow access to systems include single sign-on (SSO), multifactor authentication, privileged user accounts, and user life cycle management. SSO keeps users from entering their credentials multiple times within the same session. Multifactor authentication leverages multiple access credentials provided via different devices. Privileged user accounts are only accessible to specific users. User life cycle management handles user credentials from initial registration to retirement. Cybersecurity professionals use IAM tools to investigate and respond to security issues remotely and contain breach damages.
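Multifactor authentication often relies on time-based one-time passwords, where the server and the user's device independently derive the same short-lived code from a shared secret. The sketch below is a minimal TOTP-style derivation in the spirit of RFC 6238 (the Base32 secret is an arbitrary example, not a real credential):

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, t: int, step: int = 30, digits: int = 6) -> str:
    # Derive the moving counter from the current 30-second time window.
    counter = struct.pack(">Q", t // step)
    mac = hmac.new(secret, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

secret = base64.b32decode("JBSWY3DPEHPK3PXP")  # example shared secret
print(totp(secret, int(time.time())))  # server and phone compute the same code
```

Because both sides run the same derivation, the server can verify a code without the code itself ever being stored or transmitted in advance.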
A data security platform is meant to automate the proactive protection of information via monitoring and detecting data vulnerabilities and risks across multiple environments, including hybrid and multicloud platforms. The protection provided by a data security platform simplifies compliance with data privacy regulations and supports backups and encryption to keep data safe.
The practice of security information and event management (SIEM) focuses proactively on the automated detection and remediation of suspicious user activities based on the analysis of security events. SIEM solutions typically analyze user behavior and use artificial intelligence (AI) techniques to detect and remediate suspicious activities according to the organization’s risk management guidelines. SIEM tools may be integrated with security orchestration, automation, and response (SOAR) platforms designed to fully automate the management of cybersecurity incidents without human intervention.
Some organizations also choose to use a zero-trust strategy, which assumes the system is compromised and sets up controls to continuously validate every user, device, and connection in the system for authenticity and purpose.
Link to Learning
Cybersecurity is big business, and many companies in the technology industry have developed cybersecurity solutions and services that they sell to other organizations. In particular, IBM provides a variety of such solutions and services to cover all aspects of cybersecurity, including AI and cloud security.
Securing Information and Communication
Cryptography, which uses codes to protect data and other information, is an essential tool for securing information systems. With cryptography, information is encrypted so that it is accessible only to those authorized to decrypt and use it.
Cryptography can help ensure properties such as confidentiality (i.e., secrecy, privacy), integrity (i.e., tamper resilience), authenticity, availability, and nonrepudiability (or, where desired, deniability).
As shown in Figure 14.8, the components of cryptography include the following:
- plaintext: the data that need protection
- encryption: the process, handled by an algorithm, of converting plaintext into ciphertext using an encryption key
- ciphertext: the scrambled, unreadable output produced by encrypting plaintext
- decryption: the reverse process, also handled by an algorithm, which uses a decryption key to transform the ciphertext back into plaintext
- encryption key: a value known to the sender of the data; by inputting the encryption key into the encryption algorithm, the sender converts the plaintext into ciphertext
- decryption key: a value known to the receiver of the data; by inputting the decryption key into the decryption algorithm, the receiver converts the ciphertext back into plaintext
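A toy round trip through these components, using a simple XOR keystream (illustrative only, not a secure cipher), might look like this in Python:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with a repeating key byte. Toy only: a short
    # repeating key like this is trivially breakable.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"attack at dawn"
key = b"k3y"                              # shared by sender and receiver
ciphertext = xor_cipher(plaintext, key)   # encryption produces ciphertext
recovered = xor_cipher(ciphertext, key)   # XOR is its own inverse
assert recovered == plaintext
```

Here the same key plays both the encryption-key and decryption-key roles, which is the defining trait of symmetric encryption.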
Cryptography can be used to create different types of ciphers, including the following:
- Substitution: Ciphertext is created by substituting plaintext characters, bits, or character blocks with alternate characters, bits, or character blocks. The substitution can be monoalphabetic, meaning that if the letter D is enciphered as P in one part of the ciphertext, D will be used for P throughout the message. The substitution can also be polyalphabetic, meaning that while D may be enciphered as P in one part of the message, in another part, D may be encoded as a different letter.
- Transposition: Instead of substituting the letters and characters, they are rearranged using a specific algorithm, such as writing a message vertically to produce a ciphertext that is read horizontally. Transposition ciphers are also known as permutation ciphers.
- Polygraphic: Substitution is performed on two or more blocks of letters simultaneously.
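The first two cipher types can be sketched in a few lines of Python; the Caesar shift used for the substitution mapping and the column width used for the transposition are arbitrary choices for illustration:

```python
import string

# Monoalphabetic substitution: one fixed letter mapping for the whole
# message (here, a Caesar shift of 3).
PLAIN = string.ascii_uppercase
CIPHER = PLAIN[3:] + PLAIN[:3]
ENC = str.maketrans(PLAIN, CIPHER)
DEC = str.maketrans(CIPHER, PLAIN)

msg = "DEFEND"
sub = msg.translate(ENC)
assert sub == "GHIHQG"
assert sub.translate(DEC) == msg

# Columnar transposition: write the message in rows of `width`, then
# read it off column by column; letters are rearranged, not replaced.
def transpose(msg: str, width: int) -> str:
    return "".join(msg[i::width] for i in range(width))

assert transpose("DEFEND", 2) == "DFNEED"
```

Note that both ciphers are historically interesting but trivially breakable; modern systems use them only as teaching examples.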
Cryptography can also be performed using asymmetric encryption (also known as public-key cryptography), which uses a pair of private and public keys. Unlike symmetric encryption, where the sender and receiver rely on a preshared secret key for both encryption and decryption, public-key cryptography requires no shared secret: the sender uses the receiver’s public key to encrypt the message and then sends that ciphertext to the receiver. Only the receiver has the private key that is needed to decrypt the message.
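The public/private asymmetry can be seen numerically with the classic textbook RSA parameters p = 61 and q = 53 (far too small for real use, and shown without the padding real systems require):

```python
# Toy RSA with tiny textbook numbers; never use sizes like this in practice.
n = 61 * 53            # public modulus (3233)
e = 17                 # public exponent  -> public key is (n, e)
d = 2753               # private exponent -> private key is (n, d)

m = 65                 # message, encoded as a number smaller than n
c = pow(m, e, n)       # sender encrypts with the receiver's PUBLIC key
assert pow(c, d, n) == m   # only the PRIVATE key recovers the message
```

Anyone may know (n, e) and still be unable to decrypt c; recovering d requires factoring n, which is infeasible at real key sizes.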
Authentication and Passwords
Authentication confirms the identity and authorization of people and devices that use a system. It is a vital two-step process to help ensure that only authorized users have access to a system. Authentication first requires identification followed by verification to establish and confirm a user’s unique credentials and ensure the user is authorized.
A password is a secret string of characters used to gain entry into a system. It is a critical part of authentication. To function as a cybersecurity tool, passwords must be secure, which can be challenging. Attackers can steal passwords by guessing, installing a hardware or software keylogger, finding written passwords, obtaining them via social engineering/phishing, intercepting the password over the network, or stealing them from a service or third party.
To address these cyber threats, the DHS’s Cybersecurity and Infrastructure Security Agency offers three tips for passwords:15
- Create long passwords with at least 16 characters.
- Make passwords random by following NIST’s recommended password rules.
- Ensure that passwords are unique by using a different password for every access point in a system.
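These tips can be paired with secure storage practices on the system side. The sketch below generates a random 16-character password with Python's secrets module and stores only a salted, slow hash of it (the iteration count is an illustrative choice):

```python
import hashlib
import secrets
import string

# Generate a random 16-character password (CISA's minimum-length tip).
alphabet = string.ascii_letters + string.digits + string.punctuation
password = "".join(secrets.choice(alphabet) for _ in range(16))

# Never store the password itself: store a salted, slow hash instead,
# so a stolen database cannot be reversed cheaply.
salt = secrets.token_bytes(16)
digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)

# Verification repeats the derivation and compares in constant time.
candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
assert secrets.compare_digest(digest, candidate)
```

Random generation defeats guessing attacks, while the salted hash limits the damage if the password database itself is breached.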
Access Control
A vital part of cybersecurity is access control, which regulates the people and devices that can use a computer system’s resources. The three most common access control designs include the following:
- Mandatory access control (MAC) is a strict system in which security decisions are fully controlled by a central policy administrator; users cannot set permissions, even on resources they own.
- Discretionary access control (DAC) is a system where users can set permissions for their files, including granting access rights to other users on the same system. DAC is the most common access control design approach in commercial operating systems. While generally less secure than mandatory control, DAC is easier to implement and more flexible.
- Role-based access control (RBAC) is a system in which access depends on an individual or group’s role in an organization and the access they need to meet job requirements. Typically, roles with more responsibility and authority have greater access to the system.
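A minimal RBAC check might look like the following sketch, where the role and permission names are invented for illustration:

```python
# Each role maps to the permissions its job requires; users are granted
# roles, never individual permissions.
ROLE_PERMISSIONS = {
    "admin":   {"read", "write", "delete", "manage_users"},
    "analyst": {"read", "write"},
    "viewer":  {"read"},
}

USER_ROLES = {"dana": {"analyst"}, "sam": {"viewer"}}

def is_allowed(user: str, permission: str) -> bool:
    # A request is allowed if ANY of the user's roles carries the permission.
    return any(
        permission in ROLE_PERMISSIONS[role]
        for role in USER_ROLES.get(user, set())
    )

assert is_allowed("dana", "write")
assert not is_allowed("sam", "delete")
```

Changing what an "analyst" may do then requires editing one role definition rather than touching every analyst's account, which is the main administrative advantage of RBAC.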
The typical attacks launched against cryptography generally involve the following:
- brute force (e.g., try all possible private keys)
- mathematical attacks (e.g., factoring)
- timing attacks (e.g., based on knowledge of the time it takes to decrypt)
- hardware-based fault attack (e.g., take advantage of faulty hardware to generate digital signatures)
- chosen ciphertext attack (e.g., gather information by obtaining the decryptions of chosen ciphertexts and attempt to recover from this information the secret key used for decryption)
- architectural changes (e.g., use knowledge of vulnerabilities)
Anonymity and Privacy
Protecting anonymity and privacy is an important aspect of cybersecurity. Being able to interact on the Internet, even publicly, while concealing your identity, is considered anonymity. Anonymity is not the same as secrecy/confidentiality. As discussed previously, confidentiality is about message contents (i.e., what was said). Anonymity is about identities (i.e., who said it and to whom) and must be preserved to ensure certain civil liberties such as autonomy, free association, free speech, and freedom from censorship and surveillance.
There is a wide spectrum of “nimity,” including
- linkable anonymity (e.g., loyalty cards, prepaid mobile phone),
- pseudonymity (e.g., pen names on blogs),
- unlinkable anonymity (e.g., paying in cash), and
- verinymity (e.g., credit card numbers, driver licenses, addresses).
By remaining anonymous, users make it difficult for hackers to steal their personal data (e.g., passwords, credit card information). It also allows users to preserve their civil liberties (e.g., free speech and association) on social media. Anonymity can also be important for people who are concerned about their safety and do not want their online activities to create safety risks.
Keeping your actions online, such as messages intended only for certain individuals, concealed from the public is called privacy. Privacy is regarded as a basic human right. While not explicit in the U.S. Constitution, privacy rights are implied by the personal protections offered in the First, Third, Fourth, Fifth, and Ninth Amendments in the Bill of Rights. As Figure 14.9 shows, privacy is related to anonymity but is a separate concept. Both anonymity and privacy are important to promote cybersecurity.
Internet anonymity is difficult, if not impossible, for average users to achieve, yet it seems easy for unethical actors. The state of the art in anonymity relies on proxy intermediaries that relay Internet traffic through trusted third parties. Generally, the process requires setting up an encrypted virtual private network (VPN) connection to the third party’s site, through which all your traffic passes.
To understand how such proxies are used, assume the scenario depicted in Figure 14.10, where Alice wants to send message M to Bob. Bob does not know that message M is from Alice, and Eve (“Eve” is short for eavesdropper) cannot determine that Alice is communicating with Bob. The proxy (here labeled HMA) accepts messages encrypted for it, extracts the corresponding destination addresses, and forwards each message accordingly.
Anonymity is meant to counter the type of surveillance that is mandated as part of the Patriot Act (Section 215 and national security letters, or NSLs) and the FISA Amendments Act.
An older Google transparency report (Table 14.4) shows the number of NSLs that FBI agents issued without a judge’s approval to obtain personal information.
Reporting Period | National Security Letters | Users/Accounts |
---|---|---|
January to June 2016 | 0–499 | 500–999 |
July to December 2015 | 1–499 | 500–999 |
January to June 2015 | 0–499 | 500–999 |
July to December 2014 | 0–499 | 500–999 |
January to June 2014 | 500–999 | 500–999 |
July to December 2013 | 500–999 | 1,000–1,499 |
January to June 2013 | 0–499 | 500–999 |
July to December 2012 | 0–499 | 500–999 |
January to June 2012 | 500–999 | 1,000–1,499 |
July to December 2011 | 0–499 | 500–999 |
January to June 2011 | 0–499 | 500–999 |
July to December 2010 | 0–499 | 1,000–1,499 |
January to June 2010 | 500–999 | 1,500–1,999 |
July to December 2009 | 0–499 | 500–999 |
January to June 2009 | 0–499 | 500–999 |
NSLs cover everything except the contents of your communications (i.e., if, when, how much, and with whom you communicated). Even though contents are excluded, such metadata often reveal privileged information and are, according to the FTC, essentially a “proxy for content.” In fact, the U.S. National Security Agency’s (NSA) bulk collection of call data was ruled illegal in 2015.
Encryption tools such as Pretty Good Privacy (PGP), which Phil Zimmermann created in 1991 and which was introduced earlier, can be used; GnuPG is a free software recreation of PGP. These tools allow users to protect emails by creating a hash of the email’s content and signing the hash with a digital signature. Both the message text and the digital signature attached to the message are protected using hybrid encryption. Digital signatures require public-key cryptography (which has separate public and private keys). Before sending a message, the sender signs it with the private signature key, and both the message and its signature are encrypted with the recipient’s public encryption key. Once the message is received, it is decrypted with the recipient’s private key to extract the message and its signature. The sender’s public verification key is then used to check the signature’s authenticity.
Fingerprints can be used to secure this process further. Because Bob’s public key can be obtained from his website, it needs to be verified via out-of-band communication of fingerprints. A fingerprint is a cryptographic hash of a key. To support this approach, you need key servers that store public keys, so that users can look up keys by name or email address and verify them with fingerprints. If you do not know Bob personally, you can rely on the Web of Trust (WoT) or “friend of a friend” mechanism (e.g., Bob introduces Alice to Carol by signing Alice’s key).
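Computing a fingerprint is straightforward. The sketch below hashes placeholder public-key bytes with SHA-256 and formats the digest for human comparison (real PGP fingerprints use a specific, version-dependent construction, so this is an illustration of the idea, not the OpenPGP algorithm):

```python
import hashlib

# A fingerprint is a cryptographic hash of the public key bytes: short
# enough to read aloud over the phone or print on a business card.
public_key = b"-----BEGIN PGP PUBLIC KEY BLOCK----- ..."  # placeholder bytes
fingerprint = hashlib.sha256(public_key).hexdigest()

# Group the hex digits for easier visual comparison.
pretty = " ".join(fingerprint[i:i + 4] for i in range(0, 40, 4))
print(pretty)
```

If the fingerprint Bob reads to Alice out of band matches the one Alice computes from the downloaded key, she can be confident the key was not swapped in transit.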
There are drawbacks to (just) using encryption because Bob’s private key may be compromised. In that case, the key material becomes known, and past messages can be decrypted and read. Because the sender’s signature is attached to each message sent, it also becomes possible to prove the sender’s identity, which defeats the security scheme. Such an attack exposes many incriminating records, including the key material that decrypts data sent over the public Internet and signatures that prove who said what.
There is nothing better than “off-the-record” conversations where Alice and Bob talk in a room, and no one else can hear them. In that case, no one else knows what they say unless Alice or Bob tells them. Furthermore, no one can prove what was said, not even Alice or Bob. Based on this, desirable communication properties are as follows:
- Deniability makes it plausible to deny having sent a message.
- Forward secrecy keeps past messages safe even if key material is compromised.
- Deniable authentication mimics off-the-record conversations: it is possible to be confident of who you are talking to but impossible to prove to a third party what was said.
One technique is to use off-the-record (OTR) messaging.
- Use authenticated Diffie-Hellman (DH) protocol to establish a (short-lived) session key:
Diffie-Hellman is a key agreement protocol that lets two participants (e.g., Alice and Bob) establish a single shared (symmetric) secret key. Alice and Bob agree on values for a prime number p and a generator number g (or base), where 1 < g < p and g is any number agreed upon by both parties that is a generator modulo p. The number g is a generator because, when raised to positive whole-number powers less than p, it never produces the same result for any two such powers. For ease of computation, g is usually chosen small, and the order of g should be prime and approximately p/2. Alice and Bob pick private values x and y, respectively, and each generates a public key and exchanges it publicly: Alice selects x and generates public key a = g^x mod p, and Bob selects y and generates public key b = g^y mod p. For example, if p = 23, g = 9, and the private values for Alice and Bob are respectively 4 and 3, then their public keys are a = 9^4 mod 23 = 6 and b = 9^3 mod 23 = 16, and the shared secret they each compute is 9.
During the (signed) DH key exchange, the only pieces of information exposed to the public (and susceptible to interception by malicious actors) are a = g^x mod p, b = g^y mod p, p, and g. None of these are sufficient to recover Alice’s and Bob’s private values x and y, nor are they enough information to recover the shared (symmetric) secret (SS) cryptographic key. SS can then be used by Alice and Bob to send encrypted messages to each other safely, which is done using a secret-key encryption algorithm, with a hash of SS as the encryption key EK, to transmit ciphertext.
The strength of the scheme comes from the fact that (g^y)^x mod p = (g^x)^y mod p, and recovering this value from only the knowledge of p, g, g^x mod p, and g^y mod p is a one-way problem that takes an extremely long time to compute using any known algorithm.
- Then Alice uses secret-key encryption on message M and sends it across as E_EK(M), authenticating the message using a message authentication code MAC_MK(E_EK(M)), where MK is computed as a hash of EK, H(EK):
- Re-keying is performed using the DH protocol to exchange new private values x′ and y′:
- Publishing the old MK, so that anyone could have forged past MACs, which provides deniability:
Note that OTR is more applicable to interactive communication than email. It provides message authentication, confidentiality, deniability, and forward secrecy; however, there are no practical examples of deniability holding up in practice. OTR is built into Adium and Pidgin, in which case some defaults apply. In particular, logging is enabled by default, and etiquette dictates that you should disable it, as do past instances where people’s activities were discovered using those logs (e.g., Chelsea Manning, who leaked classified information, was identified after the Army reviewed chat logs). Cryptographic OTR is very different from Google Hangouts’ “off the record” feature, which merely stops the conversation from being logged.
An interesting anonymity solution is the protocol behind the Signal app (iPhone, Android), which uses the double ratchet algorithm set forth by Trevor Perrin and Moxie Marlinspike. It provides forward secrecy (i.e., today’s messages are secret, even if the key is compromised tomorrow), future secrecy (i.e., tomorrow’s messages are secret, even if the key is compromised today), deniability (no permanent/transferable evidence of what was said), and usability (i.e., tolerates out-of-order message delivery).
Another interesting idea for anonymity is achieved via plausibly deniable storage. In this case, the goal is to encrypt data stored on your hard drive even though someone can compel you to decrypt it. The idea is to have a “decoy” volume with benign information (e.g., VeraCrypt).
Note that it may be worthwhile to differentiate sender from receiver anonymity. An interesting example of a protocol that achieves sender anonymity is described by David Chaum as the “dining cryptographers problem.”16 In this case, three cryptographers dining together are told that payment will be made anonymously by either one of them or the agency that employs them. As they respect each other’s right to make an anonymous payment but wonder if their agency will be paying, they carry out the protocol described to achieve sender anonymity.
A naive solution to achieve anonymity for browsing is to use VPNs. Organizations providing these may receive court orders asking for information relating to an account, and they will cooperate with law enforcement if they receive a court order. A better approach is to use Tor by downloading the Tor browser bundle or becoming a volunteer in the Tor network. Tor is built on a modified version of Firefox and is a low-latency anonymous communication system that hides metadata (i.e., who is communicating). As noted earlier, you may get in trouble when an encrypted message you are sending is intercepted, and the included metadata are exposed. To avoid this, Tor completely hides the existence of communication (e.g., web connections). Tor operates at the transport layer and makes it possible to establish Transmission Control Protocol (TCP) connections without revealing your IP address. The Tor network relies on many nodes (i.e., onion routers) operated by volunteers and located worldwide. The Tor approach becomes useful if Alice wants to connect to a web server without revealing her IP address. Simply speaking, onion routing (Figure 14.11) generalizes to an arbitrary number of intermediaries (“mixes” or “mix-nets”). Alice ultimately wants to talk to Bob, with the help of HMA, Dan, and Charlie, and as long as any of the mixes is honest, no one can link Alice with Bob.
Tor end-to-end paths are called “circuits,” and Tor almost always uses 3-hop circuits (i.e., k = 3). Tor balances anonymity (i.e., k not too small so as to be traceable) and latency (i.e., k not too large). The last node in a Tor circuit is called an “exit” node. To the outside world, the Tor exit node is initiating connections to destinations. Every peer in the Tor network gets to decide whether to be an exit node or to just relay between other Tor nodes. Tor exit nodes also determine which IP addresses/websites they are willing to exit to. Tor clients learn about other Tor peers by downloading a list of them.17 The list provides information for each one of the Tor routers, such as IP address and hostname, the country where it resides, the uptime, the average throughput, websites it is willing to be an exit node for, and the node’s public key.
Tor’s attack model is more “relaxed” than the models used by typical mix-nets. In particular, Tor does not assume a global, passive attacker. It does assume that a limited subset of the Tor nodes are malicious and that there may be some level of eavesdropping on small portions of the links but not a global view of all traffic. This relaxed attack model, which assumes a less powerful adversary, makes it possible for Tor to achieve better performance. That said, there are a few aspects that Tor does not cover as compared to typical mix-nets: Tor does not batch or delay packets. If only one client were to communicate over Tor, there would be no anonymity. The philosophy behind the Tor relaxed attack model is to assume that if the performance is reasonable enough, users will be more likely to adopt it. The more users adopt it, the more "cover traffic" there will be, making it harder for an attacker to map packets to any one sender. Figure 14.12 summarizes how Tor works at a high level.
Tor implements the following trust protocol:
- Entry node knows that Alice is using Tor, and also knows the identity of the middle node but does not know the identity of the destination.
- Exit node knows that some Tor user is connecting to the destination but does not know which user.
- Destination knows a Tor user is connecting to it via the exit node.
It should be noted that Tor does not provide encryption between the exit node and the destination (e.g., use HTTPS for that). We discussed earlier how senders can hide their identities; similarly, Tor is also a means of allowing destinations to hide their identities. An example was The Silk Road, an eBay-like online store where users could purchase illicit and illegal goods and pay for them using Bitcoin. Running such a website requires a certain degree of anonymity; therefore, it was run as what is known as a “Tor hidden service.” Hidden services have since been renamed onion services.18 Interestingly, onion services can achieve receiver anonymity using techniques that achieve sender anonymity (Figure 14.13).19
Note that there are various known onion routing attacks and other issues as follows:
- Attacks:
- Rubber-hose cryptanalysis of mix operators; as a defense, use mix servers in different countries.
- Adversary operates all of the mixes; as a defense, use lots of mix servers.
- Adversary observes when Alice sends and when Bob receives and links the two together.
- Side-channel attack exploits timing information; as a defense, pad messages or introduce significant delays. (Tor does the former, but note that it is not enough of a defense.)
- Issues include
- impaired performance (i.e., messages may bounce around a lot) and
- traffic leakage (suppose all of your HTTP/HTTPS traffic goes through Tor, but the rest of your traffic does not).
Concerning the traffic leakage problem, an adversary can inspect the logs of a DNS server to see who looked up sensitive.com just before your connection to that web server arrived (because the DNS lookup did not go through Tor). The hard, general problem is that anonymity is often at risk when an adversary can correlate separate sources of information.
To summarize, the “what” (message content) is hidden via TLS/PGP/OTR/Signal, while Tor hides the “who” (metadata) via the onion routing protocol. A messaging system called Pond (Figure 14.14) hides the “when” and “how much” parts as illustrated. Note that Pond is not email; rather, it is a forward-secure, asynchronous messaging system. Pond seeks to prevent leaking traffic information to all but a global passive adversary (i.e., it is forward secure, no spam is allowed, and messages expire automatically after a week).
Zero-Knowledge Proofs
A zero-knowledge proof (ZKP) is a cryptographic system that functions as a useful tool to protect privacy. With ZKPs, users rely on cryptographic algorithms to verify information without accessing the supporting data. Digital signatures based on PKI (e.g., the RSA algorithm or ECC) are examples of ZKPs: the person holding the private key can convince any public key holder that they know the private key without revealing it.
Properties of ZKPs include:
- Completeness: If a statement is true, an honest prover can convince an honest verifier.
- Soundness: If the prover is dishonest, they cannot fool the verifier.
- Zero-knowledge: No private information is revealed to the verifier.
To understand ZKPs, it is important to know about the PCP theorem, which states that every decision problem in the NP complexity class has probabilistically checkable proofs of constant query complexity and logarithmic randomness complexity.
For an error probability of 2^-k, 3k bit positions must be verified (e.g., a soundness error probability of 2^-40 ≈ 10^-12 with 3 × 40 = 120-bit verification).
Computational integrity proofs are probabilistic proof systems based on the PCP theorem, allowing a prover to convince a verifier of the correctness of an arbitrary computation with an exponentially faster efficiency than naive checking of the computation. Computational integrity proofs exhibit the following properties:
- Completeness: If a statement is true, an honest prover can convince an honest verifier.
- Soundness: If the prover is dishonest, they can’t fool the verifier.
- Succinctness: There is exponentially faster verification than naive checking of the computation.
- Zero-knowledge (bonus): No information is revealed to the verifier.
ZKPs rely on a transcript of the original computation expanded into a proof using an error-correcting code (e.g., a Reed-Solomon code or polynomial commitment), which spreads any errors throughout the proof. A low number (e.g., three) of stochastic queries by the verifier is sufficient to prove the correctness of the computation with high probability (Figure 14.15).
The zk-SNARK acronym stands for zero-knowledge succinct non-interactive argument of knowledge. A zk-SNARK is a cryptographic proof that allows one party to prove it possesses certain information without revealing it. The proof is made possible using a secret key created before the transaction occurs. zk-SNARKs are used as part of the cryptocurrency Zcash protocol. In a noninteractive proof, the interactive dialog between the prover and verifier is replaced by randomness: query locations are predefined by randomness, which the prover cannot influence.
zk-SNARKs cannot be applied directly to computational problems. Problems need to be first converted into the right “form.” The form is called a quadratic arithmetic program (QAP), and transforming the code of a function into a QAP is highly nontrivial. It requires turning the computation into an algebraic circuit and transforming it into a rank-1 constraint system (R1CS) that can then be converted into a QAP. In addition to the process for converting the code of a function into a QAP, another process may run alongside it so that, given an input to the code, you can create a corresponding solution (sometimes called a “witness”) to the QAP. Once this is done, another fairly intricate process must be followed to create the actual zero-knowledge proof for the witness, and a separate process is used for verifying a proof that someone else passes along to you. The full machinery behind zk-SNARKs is illustrated in Figure 14.16.
zk-SNARKs are not transparent and require a so-called trusted setup to secure the randomness of the parameters determining the selection of queries in the verification procedure. The random parameters are, for example, generated using a hash function. However, knowledge of the original values fed into the hash function (“toxic waste”) must be kept secret; otherwise, retrieving the random parameters and manipulating the selection of verification queries would be possible. A trusted setup is a multi-party computation (MPC; “ceremony”), which recursively generates random parameters. If at least one party is honest and forgets its input values, the randomness of the trusted setup is safe. Universal trusted setups can be generated once and reused for other applications; nonuniversal ones are tied to a specific circuit. The difficult part is to generate a zk-SNARK with reasonable resources for real-world applications.
A zero-knowledge scalable transparent argument of knowledge (zk-STARK) is a type of ZKP in which one party can prove to another that a given statement is true without revealing any information other than the fact that the statement is true. Attributes of the zk-STARK concept are as follows:
- zero-knowledge (refers to privacy preservation)
- scalability (indicates that verification time is substantially less than the time taken for naive computations)
- transparency (reflects the lack of a trusted setup requirement)
- argument and knowledge (related to the security and robustness of the cryptographic scheme).
Zero-knowledge STARKs work by leveraging leaner cryptography, specifically collision-resistant hash functions, to validate the truth of a statement without sharing the details behind it. Unlike zk-SNARKs, which rely on an initial trusted setup and are theoretically vulnerable to quantum computer attacks, zk-STARKs eliminate these issues. That said, this leaner approach comes with a significant disadvantage: zk-STARKs generate proofs that are typically 10 to 100 times larger than those created by zk-SNARKs, making them more expensive and potentially less practical for certain applications.
The trade-offs between different properties of different noninteractive and transparent proof systems amount to differences in verification time (between 2 ms and 250 ms), prover time (between 1 s and 100 s), and proof size (between 200 B and 250 kB).
ZKPs rely on the following cryptographic primitives:
- Collision-resistant hash function (quantum secure): STARK, Fractal, Aurora
- Elliptic curve cryptography: Bulletproofs, Halo
- Knowledge of exponent/pairing groups: Groth16, Sonic, Marlin, PLONK
- Groups of unknown order: Supersonic
- Lattice-based cryptography (quantum secure): under development
There are myriad recent applications of ZKPs to blockchain technology for privacy and scalability improvements. A related topic is verifiable delay functions (VDFs), which emerged in June 2018. A VDF is a function f : X → Y that takes a prescribed minimum time to compute (even on a parallel computer); however, once computed, anyone can quickly verify the output. VDFs can prevent fraud or frontrunning on exchanges, online auctions, games, or prediction markets. Another related topic is multi-party computations (MPCs), which are methods for parties to jointly compute a function over their inputs while keeping those inputs private. MPCs differ from traditional cryptographic tasks, which use cryptography to ensure the security and integrity of communication or storage and assume that the adversary is outside the system of participants; in MPC, cryptography protects participants’ privacy from each other. An example of an application is a trusted setup ceremony, which is a secure MPC.
An important related topic is fully homomorphic encryption (FHE), the “holy grail of cryptography.” FHE allows arbitrary mathematical operations on encrypted data (i.e., for every function f, given an encryption of x, one can compute an encryption of f(x) without ever decrypting).
Open Problems in Cryptography
The ongoing attacks and the need to defend crypto schemes require staying on top of best practices. Ideally, developers should write code that can be changed easily. Also, remember not to develop your own cryptographic mechanisms. Go through peer review and apply Kerckhoffs’s principle. Do not even implement the underlying crypto yourself, and do not misuse existing crypto.
Information about a particular implementation could leak (e.g., power consumption, electromagnetic radiation, timing, errors). Attacks based on this are referred to as “side-channel attacks.” As an example, simple power analysis (SPA) may be used to interpret power traces during a cryptographic operation. Simple power analysis (Figure 14.17) can reveal the sequence of instructions that have been executed.
Assuming the program execution path depends on the inputs (i.e., key/data), SPA can be used to reveal the keys. Different from SPA, which visually inspects a single run, differential power analysis (DPA) can operate interactively and reactively across multiple samples. Using this approach, DPA can produce new plain text messages that can be passed as inputs repeatedly.
In order to counter these types of attacks, it is necessary to hide information by making sure that execution paths do not depend heavily on the inputs. This may require dropping optimizations that depend on specific bit values in keys. In the past, Chinese remainder theorem (CRT) optimizations allowed remote timing attacks on Secure Socket Layer (SSL) servers. In general, cryptosystems should be designed to resist information leaks.
A different type of side-channel attack happens when keys are safely stored in memory and attackers do not have software access to the machine. In that case, if the attacker can access the physical machine and reboot it into an OS they control, it becomes possible for the attacker to look at the memory contents. While memory loses its state without power, it does so much more slowly at very cold temperatures. Therefore, an attacker could cool down the memory, shut down the machine, move the memory to a different machine, and boot it into a different OS. All that is left to do then is to scan the memory image for keys, which is difficult but feasible, assuming the keys have a format that is easy to detect. A couple of techniques can be used to counter these types of attacks. One solution is to encrypt all the memory, which requires additional CPU power. Another solution, which is used on the Xbox, relies on a trusted platform module (TPM) to store keys in hardware, making them very difficult to extract. Some TPMs self-destruct when tampered with or keep keys in memory for a limited time (e.g., the keys are removed from memory when going to sleep mode).
There are new mechanisms to permit new types of interactions. A style of interaction that has been getting a lot of attention is the following:
- Alice has proprietary data.
- Bob has proprietary code (or computational resources).
- The goal is for Bob to run his code on Alice’s data without learning her input or the output.
There are problems introduced earlier that still require usable solutions:
- Secure multiparty computation: For example, Alice and Bob both have data and want to know the output of a function over their private data without having to reveal their data to each other (e.g., “which of us has more money” without having to reveal exactly how much either has).20 Communication overhead and vulnerability to attacks from colluding parties are the main challenges. While there are techniques to solve these problems, they usually come with higher computational costs.
- Fully homomorphic encryption: Homomorphic encryption (HE) can perform computations on encrypted data without first decrypting it with a secret key. It then encrypts the computation results, and only the owner of the private key can decrypt them.21 Partial HE systems have been around since the 1970s, and a fully HE scheme that makes it possible to apply arbitrary mathematical operations to encrypted data was first developed by Craig Gentry in 2009. However, fully homomorphic encryption in its current form is impractically slow.
Traditional Software Solutions Security
To protect systems, software solutions architects and software developers must consider security as a property of the systems they build. Software security is the means by which software safeguards system resources, including data, so that access is provided only to authorized users. Security should be part of the software design process to take a proactive approach to cyber threats and risks. This includes prioritizing security in software requirements, programming, testing, implementation, and maintenance. Practices that should be part of software security include threat modeling and vulnerability management.
Generally, to implement security in the software development process, software architects and developers should follow these steps:
- Make software security a priority and focus of the development process.
- Identify security risks that should be addressed with software.
- Identify vulnerabilities.
- Use the appropriate standards, best practices, and frameworks to guide the development process.
- Review and analyze the code extensively with an emphasis on cybersecurity.
- Implement penetration testing, which includes the following steps:
- Plan the test by gathering information about the system and defining testing goals.
- Scan the system with tools to learn how the system responds to threats.
- Execute attacks and make every effort to gain access to the system, revealing its weaknesses.
- If access is gained, make every effort to maintain that access without detection.
- Analyze test results and make system changes before testing again.
- Repeat this process over and over to identify system weaknesses.
Software Security
Software solutions architects and developers must consider security as a property of the systems they build. Many attacks begin by exploiting a vulnerability. In this case, a vulnerability is a software defect that yields an undesired behavior (i.e., the code does not behave correctly). Software defects arise due to flaws in the design or bugs in the implementation. Unfortunately, software can’t be completely bug-free, and fixing every known bug may be too expensive. In general, the focus is to fix what is likely to affect normal users and not focus on bugs that normal users never see or avoid. Because attackers are not normal users, they look for bugs and flaws and try to exploit them. Therefore, to achieve software security, it is necessary to eliminate bugs and design flaws and/or make them harder to exploit. Doing so requires thinking like attackers and developing a foundation for deeply understanding the systems built and used.
Most (interesting) software takes inputs from various sources, and any of these inputs may be malicious, such as the following:
- direct user interaction (e.g., user interfaces with software via a command line interface or opens a document)
- third-party libraries that are linked to the software
- future code updates
Securing software in this context should result in correct operation despite malicious inputs. In order to study how to secure software, we will focus on what should be done to secure software written in the C programming language and investigate program control flow hijacking via buffer overflows, code injection, and other memory safety vulnerabilities. This is motivated by the fact that the C language remains in wide use and many mission-critical systems are written in C (e.g., most operating system kernels, such as Linux; high-performance servers, such as Microsoft SQL Server; and many embedded systems, such as the Mars rover). Furthermore, the same techniques apply more broadly.
Buffer Overflow Attacks and Defenses
Many buffer overflow attacks have been perpetrated over the years against software programs written in C. Buffer overflows are prevalent and constitute a significant percentage of all vulnerabilities.22 In 1988, for example, Robert Morris sent a special string via a buffer overflow attack to the fingerd daemon on VAX machines running BSD Unix, causing them to execute code that created a worm copy that propagated over the network and also affected Sun machines running BSD. This resulted in $100M worth of damages as well as probation and community service for Morris, who subsequently became a professor at MIT. In 2001, the Code Red worm leveraged a buffer overflow error in the MS-IIS server; as a result, 300,000 machines were infected within 14 hours. In 2003, the SQL Slammer worm leveraged a similar buffer overflow error in the MS-SQL server; as a result, 75,000 machines were infected within ten minutes. In 2008–2009, the Conficker worm exploited a buffer overflow in Windows RPC, which infected more than ten million machines. In 2009–2010, Stuxnet exploited several buffer overflows in the Windows print spooler service, LNK shortcut display, and task scheduler, plus the same RPC buffer overflow as Conficker; it is widely regarded as an instance of actual cyber warfare. Between 2010 and 2012, Flame, a cyber-espionage virus, exploited the same print spooler and LNK buffer overflows as Stuxnet. On January 8, 2014, a 23-year-old discovered an X11 server stack buffer overflow vulnerability dating from 1991 (scanf was used when loading early-1990s BDF bitmap fonts). The GHOST glibc vulnerability was introduced in 2000 but was only discovered many years later. One last example is the Syslog logging infrastructure daemon bug in macOS and iOS: running programs would issue log messages, and Syslog would handle storing and disseminating them, but the buffer Syslog used to propagate these messages was not large enough, and Syslog would sometimes write beyond the end of the buffer.
Based on this, understanding how C programs buffer overflow attacks work and how to defend against them is critical and requires knowledge (refer to Chapter 4 Linguistic Realization of Algorithms: Low-Level Programming Languages and Chapter 5 Hardware Realizations of Algorithms: Computer Systems Design) of the C software compiler, the operating system on which the program is run, and the computer system architecture on which the operating system runs—in other words, a whole-systems view. As a refresher, the stack layout on a 32-bit (Intel IA32) computer when calling a sample C function (Figure 14.18) is shown. Note that in this case, there are two 4-B values between the arguments and the local variables.
The function func can access variable loc2, using the stack frame pointer %ebp (Figure 14.19). Note that the same would apply on a 64-bit computer by using a 64-bit memory layout and changing the names of the registers accordingly (e.g., %rbp vs. %ebp).
Figure 14.20 illustrates how to properly return from a call to the function func.
In summary, the steps to follow in order to call and return from a function are as follows:
Calling function:
- Push arguments onto the stack (in reverse).
- Push return address onto the stack (i.e., the address of the instruction that needs to be run once control returns to the calling program: %eip + something).
- Jump to the function’s address.
Called function:
- Push old frame pointer onto the stack: %ebp.
- Set frame pointer %ebp to where the end of the stack is right at this time: %esp.
- Push local variables onto the stack; access them as offsets from %ebp.
Returning function:
- Reset previous stack frame: %ebp = (%ebp) /* copy it off first */.
- Jump back to return address: %eip = 4(%ebp) /* use the copy */.
Let us investigate a buffer overflow example based on understanding the stack memory layout when calling and returning from functions. Buffers are commonly used in C to store sets of values of a given data type; for example, strings are buffers of characters in C. A buffer overflow occurs when more values are put into the buffer than it can hold. Let us consider this buffer overflow example (Figure 14.21) right before the “strcpy(buffer, arg1);” statement is executed in the function func.
Once the “strcpy(buffer, arg1);” statement is executed in the function func (Figure 14.22), it overwrites the stack memory location where %ebp was stored, which causes a segmentation violation when the function returns.
Using the GNU debugger (i.e., gdb) is useful to debug programs that run into buffer overflows. It enables users to show information about the current frame and registers, examine bytes of memory starting at a given address, set a breakpoint at a given function address, and step through a call to it.
A safe version of the function func will never cause a buffer overflow. It simply limits the number of characters read from the command line and could easily be adapted to replace the function func:
void nooverflow()
{
    char buflimit[100];
    /* fgets reads at most sizeof(buflimit) - 1 characters
       and always null-terminates the buffer */
    fgets(buflimit, sizeof(buflimit), stdin);
}
Note that strcpy lets you write as many characters as you want until it reads an end-of-string character (i.e., the null character “\0” in a C string); therefore, the problem could get worse than just overwriting %ebp. Figure 14.23 illustrates a different type of buffer overflow that would occur if the function func were to execute the code provided. In that case, the input writes from low to high addresses.
Code Injection Attacks and Defenses
The example of buffer overflow shown earlier uses a string provided by the program itself but, in general, inputs could come from different sources (e.g., text input, network packets, environment variables, file inputs). Therefore, the existence of a buffer overflow bug in a program can lead to a code injection attack (Figure 14.24).
Pulling off this type of attack requires overcoming a few challenges, which we will discuss in detail. In general, making it really hard to overcome the various challenges is key to defending programs against code injection attacks. The challenges are as follows:
- Loading the code into memory:
The code that is loaded into memory must be machine code that is ready to run. It should not contain any all-zero bytes, because string-handling library functions (e.g., sprintf, gets, scanf) stop copying at a null byte. The loader cannot be used because the code is injected. Finally, the injected code cannot use the stack because it is designed to smash it.
The best type of code for this purpose is general-purpose shellcode that launches a shell (Figure 14.25). There are many examples of such code, and there is considerable competition to write the smallest one. A way to make the injected code most effective is to attempt privilege escalation and go from guest (or nonuser) to root.
Concerning privilege escalation, the idea is to exploit knowledge of permissions on the targeted operating system. In the case of Linux, files have read/write/execute permissions for owner, group, and others. Permissions are defined per userid and groupid, and the root userid is 0. The command passwd may be used as part of an attack because any user can execute that command, not just its owner (i.e., root). The idea is to have a root-owned process run setuid(0) or seteuid(0) in order to gain root permissions. While root owns passwd, users can run it, and getuid() will return the userid of the person who ran it. Executing seteuid(0) next will set the effective userid to root, which is allowed because root is the process owner.
- Getting injected code to run:
Because the attacker can only write data into a memory buffer, the program’s own running code must be made to jump to the injected code. The typical approach is to hijack the saved %eip and change it to point to the address of the injected code (Figure 14.26).
However, getting to know the address of the saved %eip is a challenge. Furthermore, if the %eip is wrong and points to data, the CPU will fault when it attempts to execute an invalid instruction.
- Finding the return address:
Because the attacker cannot inspect the victim’s memory, there is no way to know where the buffer starts based on the saved %ebp. One possibility is to try many different values; the worst-case situation for a 64-bit memory space involves trying 2^64 possible answers. If address space layout randomization (ASLR) is disabled (which you cannot count on today), the stack always starts from the same fixed address and then grows; still, it does not usually grow very deep unless the code is heavily recursive.
Another approach consists of using nop sleds. Because nop is a single-byte instruction that simply moves execution to the next instruction, a long run of nops placed before the injected code improves the chances that a guessed %eip lands somewhere that eventually reaches it (Figure 14.27).
Finally, putting it all together, the recipe for code injection is shown in Figure 14.28.
A typical way to protect a program against code injection is to prevent data execution by marking memory pages as nonexecutable, or to detect overflows with canaries. Using a canary amounts to placing a known value in memory just past the end of the buffer and aborting program execution if the value at that location changes (Figure 14.29).
There are a few possibilities for canary values as indicated:
- Terminator canaries (e.g., CR, LF, NULL, -1) leverage the fact that scanf and similar functions do not allow these values.
- Random canaries write a new random value when each process starts and saves the real value somewhere in memory; it is necessary to write-protect the stored value in that case.
- Random XOR canaries work the same way as random canaries but store “canary XOR <some control info>” instead.
Integer Overflow Attacks and Defenses
Programmers have a tendency to think about integers as mathematical integers. It is, therefore, easy for them to write C code that causes integer overflows (e.g., assigning larger types to smaller types, arithmetic overflow). For example, the following multiplication of 32-bit integers in two’s complement notation causes an arithmetic overflow that yields a negative result: 15000000 * 500 = -1089934592. Knowing this, attackers may simply control the value of an integer and cause software to behave unexpectedly. Defending against integer overflow requires using appropriate types (e.g., using size_t in the C language).
Format String Vulnerability and Defenses
A format function is a special kind of ANSI C function used as a conversion function to represent primitive C data types as human-readable strings. Format functions are used in most C programs to output information, print error messages, and process strings. A format string vulnerability occurs when an attacker can provide the format string to an ANSI C format function in part or as a whole. If the attacker can do so, the behavior of the format function is changed, and the attacker may get control over the target application.
For example, by calling the following function using a command line parameter:
void func (char *user)
{
    printf (user);
}
an attacker gains control over the format string passed to printf (i.e., the part that contains text and format parameters). To avoid this problem, the function should be written as follows:
void func (char *user)
{
    printf ("%s", user);
}
This kind of vulnerability is more dangerous than the common buffer overflow vulnerability.23
Heap Control Data Vulnerability
The heap is managed by the malloc() function, which requests pages of memory from the operating system, manages free chunks, and allocates memory for programs. Attackers can use the malloc function to overwrite heap metadata and abuse it to exploit buffer overflows by injecting control data in malloc space.24
Code Reuse Attacks and Return-Oriented Programming (ROP)
We have discussed earlier ways to prevent an attacker from executing injected code using canaries. While attackers may attempt to bypass stack canaries, they may also simply focus on bypassing data execution prevention (DEP) measures by executing existing code such as the program code itself, dynamic libraries, or libc. In particular, libc contains valuable functions such as system (runs a shell command) or mprotect (changes the memory protection on a region of memory). Rather than returning to shellcode, an attacker may decide to return to a standard library function like system and then to exit. Another alternative is to return to mprotect, inject code, and make it executable. An attacker may also chain two functions together. In fact, attackers need not limit themselves to whole functions: an alternative is to encode arbitrary computation, including conditionals and loops, by returning to sequences of code ending in ret. This last approach is referred to as return-oriented programming (ROP)25 and relies on the following steps:
- Disassemble code (i.e., library or program).
- Identify useful code sequences (e.g., sequences usually ending in ret).
- Assemble useful sequences into reusable gadgets.
- Assemble gadgets into desired shellcode.
Time of Check/Time of Use Problem
Figure 14.30 illustrates the time-of-check to time-of-use (TOCTOU) problem.
The code should be modified as in Figure 14.31 to avoid the TOCTOU problem.
Playing Cat and Mouse to Secure Software
The following illustrates how securing software is like playing cat and mouse:
- Defense: Make stack or heap nonexecutable to prevent code injection.
- Attack response: Return to libc.
- Defense: Hide the address of the desired libc code or the return address using ASLR.
- Attack response: Perform a brute-force search (for 32- and 64-bit systems) or exploit an information leak (e.g., a format string vulnerability).
- Defense: Avoid using libc code entirely and use code in the program text instead.
- Attack response: Construct needed functionality using return-oriented programming (ROP).
Common Cyber Threat Defenses
Common cyber threats were introduced earlier along with various examples of malware. Protection against malware involves the use of an intrusion detection system (IDS) as well as an intrusion prevention system (IPS). An IDS may be host- or network-based (i.e., HIDS or NIDS); in either case, detection happens after the attack (i.e., the memory is already corrupted by a buffer overflow attack). A preventive measure must stop the attack before it reaches the system (e.g., a shield that performs packet filtering). Some tools support both IDS and IPS (e.g., Snort).
Malware Detection Methods
In general, some types of malware rely on a delay or trigger to run (e.g., time bombs, logic bombs), and they may include a backdoor or hold resources for ransom. Other types of malware piggyback on other pieces of code. For example, viruses run when users initiate a task (e.g., run a program, open an attachment, boot the machine), whereas worms run while other programs are running and do not require user intervention. A virus therefore propagates by ensuring it is eventually executed, relying on user action. Once executed, the virus creates a new separate instance of itself and typically infects by altering stored code. A worm self-propagates by making sure it is immediately executed without user intervention. Once executed, the worm creates a new separate instance of itself and typically infects by altering running code. There is a fine line between viruses and worms, and some malware uses both techniques.
Detecting self-propagating malware (e.g., viruses or worms) is challenging. While antivirus software attempts to detect viruses, virus writers strive to evade human response and avoid detection for as long as possible. In the case of worms, the virus writer wants to spread the worm and hit many machines as quickly as possible to outpace the human response. Viruses have been around since the 1970s. They are opportunistic and eventually run as a result of a user action. Two orthogonal aspects define a virus: the way it propagates and what it does (i.e., the “payload”). A general infection strategy consists of altering existing code to incorporate the virus, share it, and expect users to (unwittingly) re-share it. Viruses infect other programs by taking over their entry point so the virus is run when executing these programs. They infect documents, boot sectors, or run as memory resident code. They increase their chances of running by attaching malicious code to a program a user is likely to run (e.g., email attachments). Once viruses run, they also look for an opportunity to infect other systems (e.g., proactive creation of emails).
An obvious method for detecting viruses is signature-based detection, which consists of looking for bytes corresponding to injected virus code and protecting other systems by installing a recognizer for the known virus on them. This approach requires fast scanning algorithms and has created a multi-billion-dollar antivirus market in which vendors compete on the signatures they recognize. To combat this detection method, virus writers give viruses harder signatures to match by creating polymorphic viruses. In this case, the virus generates a semantically equivalent but syntactically different version of its code every time it propagates. While the higher-level semantics of the virus remain the same, the actual execution code differs (e.g., the machine code instructions are different, different algorithms achieve the same purpose, the code uses different registers, or different constants are used). This can be accomplished by including a code rewriter with the virus or adding complex code that never runs to evade detection attempts. Instead of appending the virus to the program, virus writers may surround the program with virus code or overwrite uncommonly used parts of the program in order to confuse virus scanners. They also change the virus code so scanners cannot pin down a signature. Code changes can be mechanized so the code looks different every time it is injected. To do so, they use public key encryption (and the fact that it is nondeterministic) to generate different virus bytes each time they encrypt the virus, while decryption always produces the same virus code (Figure 14.32). Virus writers also iteratively obfuscate the code (i.e., encrypt + jmp + …) using different encryption algorithms until the obfuscated code is fully undetectable.
Scanning alone is insufficient to detect metamorphic viruses; proper detection requires analysis of the code’s execution behavior. Two general approaches can facilitate detection, and both must be conducted in a safe environment (e.g., gdb or a virtual machine). One approach used by antivirus companies focuses on analyzing a new virus to derive a behavioral signature. Another focuses on analyzing suspicious code to see if it matches such a signature. In general, attackers have the upper hand because antivirus systems share signatures, which gives attackers insight they can use to react. Attackers change virus behavior slowly to make it hard to create a matching behavioral signature, or they make the virus start acting differently to avoid detection. To detect polymorphic viruses, antivirus writers can write narrow signatures that catch the associated decrypters. Because these signatures are often very small, this approach can result in many false positives. To counter it, attackers may spread small pieces of decrypter code around and use jmp instructions to reach the virus code. Another approach to detecting polymorphic viruses is executing or statically analyzing the suspicious code to see if it decrypts something. The issue with this last approach is that it is hard to differentiate an encrypted virus from a valid program produced by common “packers” that do something similar (e.g., decompression); it also depends on how long the code can be executed without side effects. Virus writers can combat these approaches by changing the decrypter: oligomorphic viruses switch among a fixed set of decrypters, and true polymorphic viruses generate an endless number of decrypters (e.g., using a brute-force key break). While this approach introduces inefficiencies, it makes it extremely difficult for antivirus software to detect the virus.
Today, malware detection is a technological arms race between detection and avoidance. Initially, only a few very clever people were capable of creating viruses. Viruses are now commoditized, and anyone can launch one. Creating novel viruses remains hard, but it is no longer a focus of academic interest; rather, it is driven by economic pursuits (e.g., zero-day markets) and cyberwarfare.
Infection Cleanup
Cleaning up after an infection highly depends on the extent of the damage. It may be necessary to restore and/or repair files; numerous antivirus companies provide this type of service. In some cases, when a virus runs with root privileges, it may be necessary to rebuild the entire system. In this case, recompiling the system may not be sufficient. The malware may have infected the compiler and created a backdoor, such that recompiling will simply reintroduce the malware into the compiler. In that case, it may be necessary to resort to original media and data backups.
Software Solutions Assurance Methodologies
As discussed earlier, software security can be compromised as a result of memory safety attacks that include the following:
- buffer overflows, which may be used to read/write data on stack/heap or to inject code (ultimately via a root shell);
- format string errors, used to read/write stack data;
- integer overflow errors, used to change programs’ control flow; and
- TOCTOU problems, used to raise privileges.
Various methodologies and associated approaches that may be used as part of the software development life cycle to prevent these attacks are described in the following sections.
Defensive Programming
If you think of defensive driving as an analogy, it is about avoiding dependence on anyone but yourself; minimizing trust makes it possible to react better to unexpected events (e.g., to avoid a crash, or worse). Defensive programming works much the same way. Each software module is responsible for checking the validity of all inputs it receives, throwing exceptions or exiting rather than running malicious code, and never blindly trusting inputs, even when they come from callers you know.
Defensive programming benefits from code reviews. Whether real or imagined, the prospect of review forces programmers to organize their code and focus on correctness to address issues that could raise flags. One approach to defensive programming is to provide developers with better languages and libraries that make code less prone to mistakes. For example, the Java runtime checks array bounds automatically, and C++ comes with a safe std::string class. Secure coding relies on practices and rules, as illustrated in the following code.
- Practice: Analyze all inputs, whatever they are26:
char digit_to_char(int i) {
    char convert[] = "0123456789";
    return convert[i];
}
- Think about all potential inputs, no matter how peculiar27:
char digit_to_char(int i) {
    char convert[] = "0123456789";
    if (i < 0 || i > 9) return '?';
    return convert[i];
}
- Enforce rule compliance at runtime.
- Rule: Make use of safe string functions or libraries.
String routines typically included in libraries assume target buffers have sufficient length28:
char str[4];
char buf[10] = "good";
strcpy(str, "hello");        //overflows str
strcat(buf, " day to you"); //overflows buf
Safe versions: check the destination length29:
char str[4];
char buf[10] = "good";
strlcpy(str, "hello", sizeof(str));        //truncates: no overflow
strlcat(buf, " day to you", sizeof(buf)); //truncates: no overflow
Again, you must know your system’s and language’s semantics.
Note that strncpy/strncat do not null-terminate if they run up against the size limit; therefore, it is better to use strlcpy/strlcat. These functions are not “insecure,” but they are commonly misused.
It is actually even better to use safe string libraries, as they are designed to ensure that strings are used safely. The following code illustrates the very secure FTP daemon (vsftpd) string library30:
struct mystr; //impl hidden
void str_alloc_text(struct mystr* p_str, const char* p_src);
void str_append_str(struct mystr* p_str, const struct mystr* p_other);
int str_equal(const struct mystr* p_str1, const struct mystr* p_str2);
int str_contains_space(const struct mystr* p_str);
…
- Rule: Understand pointer arithmetic.
The operator sizeof() returns a number of bytes, but pointer arithmetic advances by multiples of the size of the pointed-to type31:
int buf[SIZE] = { … }; //sizeof(buf) equals SIZE * sizeof(int) bytes
int *buf_ptr = buf;
while (!done() && buf_ptr < (buf + sizeof(buf))) {
    *buf_ptr++ = getnext(); // will overflow
}
So, use the right units:
while (!done() && buf_ptr < (buf + SIZE)) {
    *buf_ptr++ = getnext(); //stays in bounds
}
- Practice: Defend against dangling pointers.
- Rule: Set pointers to NULL after free.
- Practice: Manage memory properly.
Some programmers commonly use goto chains in C to avoid duplicating or omitting cleanup code. This approach is similar to using a try/finally clause in Java. A good coding practice is to always review and confirm the logic’s correctness32:
int foo(int arg1, int arg2) {
    struct foo *pf1, *pf2;
    int retc = -1;
    pf1 = malloc(sizeof(struct foo));
    if (!isok(arg1)) goto DONE;
    …
    pf2 = malloc(sizeof(struct foo));
    if (!isok(arg2)) goto FAIL_ARG2;
    …
    retc = 0;
FAIL_ARG2:
    free(pf2);
    //fallthru
DONE:
    free(pf1);
    return retc;
}
- Rule: Always use a safe allocator.
ASLR makes the base address of libraries unpredictable to defeat exploits. Using the same thinking, and at the cost of reduced performance, addresses returned by calls to malloc should be made unpredictable to avoid heap-based overflows.
- Rule: Favor safe libraries.
Libraries encapsulate well-thought-out design, so take advantage of them. For example, smart pointer libraries (part of the C++11 standard) limit pointers to safe operations and manage lifetimes appropriately. Networking libraries such as Google Protocol Buffers and Apache Thrift are good for dealing with network-transmitted data: they are efficient, and they ensure inputs are handled securely (e.g., validation, parsing).
Secure Software Implementation
The trusted computing base (TCB) of a system may include the monitor, compiler, OS, CPU, memory, keyboard, and other peripherals. Basic security assumes a correct, complete, and secure TCB. A good TCB is small and separates privileges. A small and simple TCB means fewer components must work correctly to ensure security, and those components are less susceptible to compromise. As security software in the TCB grows and becomes more complex (e.g., operating system kernels used to enforce security often include a large amount of code), it becomes vulnerable and may be bypassed. Because device drivers are a common source of compromise, it is best to reduce the size of the operating system kernel by creating microkernels that run device drivers outside the kernel. Least privilege, a privilege separation approach, should also be applied to keep privileged modules as small as possible. It is important to give a task only the level of privilege it needs; there is no reason to grant more. For example, a web server daemon needs root privileges only to bind to port 80; it should drop those privileges once the port is bound rather than continue running as root. Similarly, email editors should not make it possible to access a shell. Remember that trust is transitive: trusting something means trusting what it trusts, which can lead to trouble.
Reasoning carefully about code is critical to ensure its safety and correctness. Code modularity is important because it helps build confidence in code function by function and module by module. It is necessary to verify that pre- and post-conditions hold before and after a function is called, respectively. This helps define contracts for using modules (e.g., a given statement’s post-condition needs to correspond to the following statement’s pre-condition). Pre- and post-conditions help document code and facilitate reasoning about it. Invariants set conditions that are always true within parts of a function. All the aforementioned defensive programming techniques make it possible to verify functions based on code and associated annotations every time the code is invoked. Defensive programming allows reasoning about a function’s safety each time it is called, and pre-conditions act as constraints that users must satisfy each time they use the function.
The following code illustrates the preconditions that are required to ensure safety. The approach consists of identifying each memory access, annotating it with the preconditions it requires, and propagating the requirements up.
In this example, the memory access led to the following annotations33:
/* requires: a != NULL */
/* requires: 0 <= i */
/* requires: i < size(a) */
The second annotation is taken care of by declaring size_t i, which ensures that 0 <= i always holds. The two other annotations were not guaranteed by the function code itself, so they were moved up as preconditions to ensure that n <= size(a).
Here is another example of the pre- and post-condition checks that are needed when dereferencing or creating a pointer, respectively, to ensure safety34:
/* requires: p != NULL (and p is a valid pointer) */
/* ensures: retval is the first four bytes p pointed to */
int deref(int *p) {
return *p;
}
/* ensures: retval != NULL (and a valid pointer) */
void *myalloc(size_t n) {
void *p = malloc(n);
if (!p) {
perror("malloc");
exit(1);
}
return p;
}
Testing
The goal of testing software quality is to ensure that a program’s specification and implementation match. Testing assumes that the specification is correct, but it does not assume that the implementation is correct.
Developers should not be end-to-end testers. A developer should focus on the implementation and unit testing while a tester focuses on the specification, which avoids related mistakes at both levels.
Testing approaches may be classified as illustrated in Figure 14.33.
As illustrated, there are various ways to conduct testing. Automated testing involves writing scripts or using testing frameworks to simulate user interactions with a software application (e.g., clicking buttons, entering data, and verifying outcomes). It enhances efficiency by automating repetitive testing tasks, saving valuable time and resources. Automated testing also improves accuracy by minimizing human errors, and it enhances test coverage by enabling regression testing as well as continuous testing and delivery via CI/CD pipelines. Manual testing depends on the creation of efficient test suites to provide optimal test coverage and may struggle to achieve comprehensive coverage and scalability due to time and resource constraints. Black-box testing does not require analyzing code, which works well for code that cannot be modified or is in a format that makes it difficult to analyze (e.g., obfuscated, managed, or binary code). White-box testing uses knowledge of the source code to craft detailed, targeted tests.
Test suites must be sized properly. Too few tests cannot identify all the defects, and too many tests slow down testing and make tests harder to maintain due to bloat and redundancy. For example, the SQLite library (version 3.20.0) included approximately 125.4 thousand source lines of code (KSLOC), while its test code and scripts amounted to about 730 times that (i.e., 91,616.0 KSLOC). Note that KSLOC figures exclude blank lines and comments.
Code coverage is a metric used to quantify the extent to which program code is exercised by a given test suite. Function coverage focuses on which functions are called, statement coverage on which statements are executed, and branch coverage on which branches are taken. Coverage is computed as the percentage of a program’s testable aspects exercised by a given test suite. Practically, testing 100% of the paths in a program is impossible. Cyclomatic complexity refers to the number of paths that exist in a program and that should, in theory, be tested; some code may not even be reachable, and even if full path testing were possible, it could take an essentially unbounded amount of time. Safety-critical applications do, however, require 100% coverage; SQLite, as an example, has 100% branch coverage.
In manual white-box testing, tests are written by hand using full knowledge of the source code/deployment/infrastructure. They can be automated (e.g., run on all saves or commits).
In manual black-box testing, the tester interacts with the system in a black-box fashion and crafts ill-formed inputs, tests them, and records how the system reacts.
Automated testing techniques include:
- Code analysis
- Static: Evaluating the source code can identify many of the bugs we have discussed.
- Dynamic: Run in a VM and look for invalid writes (e.g., Valgrind).
- Fuzz testing
- Generate many random inputs and see if the program fails.
- Typically, it involves many inputs.
- There are various possible kinds of fuzzing:
- Black-box: The tool knows nothing about the program or its input; it is easy to use and get started, but it will explore only shallow states unless it gets lucky.
- Grammar-based: The tool generates input informed by grammar; more work is required to use it and to produce the grammar, but it can go deeper into the state space.
- White-box: The tool generates new inputs at least partially informed by the code of the program being fuzzed; it is often easy to use but computationally expensive.
- Fuzzing inputs may be provided in different ways:
- Mutation: Take a legal input and mutate it, using that as input; the legal input might be human-produced or automated (e.g., from a grammar or SMT solver query); mutation might also be forced to adhere to grammar.
- Generational: Generate input from scratch (e.g., from a grammar).
- Combinations: Generate initial input, mutate, generate new inputs, and generate mutations according to grammar.
- File-based fuzzing mutates or generates inputs and then runs the target program with them to see what happens; an example is Radamsa,35 a mutation-based, black-box fuzzer, which mutates inputs that are given and passes them along36:
echo "1 + (2 + (3 + 4))" | radamsa --seed 12 -n 4
5!++ (3 + -5))
1 + (3 + 41907596644)
1 + (-4 + (3 + 4))
1 + (2 + (3 + 4
echo … | radamsa --seed 12 -n 4 | bc -l
Another example is Blab, which generates inputs according to grammar (i.e., it is grammar-based), specified as regexps and CFGs37:
blab -e '(([wrstp][aeiouy]{1,2}){1,4} 32){5} 10'
soty wypisi tisyro to patu
- Network-based fuzzing can act as half of a communicating pair; inputs could be produced by replaying previously recorded interaction, and altering it, or producing it from scratch (e.g., from a protocol grammar). It can also act as a “man-in-the-middle” by mutating inputs exchanged between parties (perhaps informed by grammar).
- There are many fuzzers out there, such as American Fuzzy Lop (a mutation-based fuzzer guided by compile-time instrumentation), SPIKE (a library for creating network-based fuzzers), Burp Intruder (automates customized attacks against web apps), BFF, and Sulley. Fuzzers help find the root cause of a crash by answering questions such as: Is there a smaller input that crashes in the same spot (making it easier to understand)? Are there multiple crashes that point back to the same bug? Can you determine whether a crash represents an exploitable vulnerability (in particular, is there a buffer overrun)?
- Fuzzing may help find memory errors:
First, compile the program with AddressSanitizer (ASan), which instruments array accesses to check for overflows and use-after-free errors; then fuzz it and check whether the program crashed with an ASan-signaled error. If it did, assess exploitability. Similarly, you can compile with other sorts of error checkers for testing (e.g., Valgrind memcheck).
- Automated black-box testing uses fuzzing components as explained to generate test cases, execute the applications, and perform detection and logging.
- In automated white-box testing, tests are created automatically/dynamically. Tools exist to perform this type of testing; they may record a trace of the tested program on well-formed inputs and perform symbolic execution to capture constraints on inputs. Automated testing may then negate a constraint and use a constraint solver to derive a new input and run on that input. When using American Fuzzy Lop, compile-time instrumentation is provided, and the instrumentation guides genetic algorithms.
- Penetration testing
Another testing technique is penetration (pen) testing. Fuzz testing is a form of pen testing. Pen testing assesses security by actively trying to find exploitable vulnerabilities, which is useful for both attackers and defenders. Pen testing is useful at many different levels (e.g., for testing programs, testing applications, testing a network, testing a server).
Reverse Engineering
Reverse engineering (RE) is the process of discovering the technological principles of a program through analysis of its structure, function, and operation. It corresponds to the solution development life cycle run backward. Reverse engineering is useful for malware analysis, vulnerability or exploit research, checking for copyright/patent violations, interoperability assessment (e.g., understanding a file or protocol format), and copy protection removal. The legality of RE is a gray area, and it usually breaches the end-user license agreement (EULA) software contract. Additionally, the Digital Millennium Copyright Act (DMCA) governs reverse engineering in the United States: you “may circumvent a technological measure . . . solely for the purpose of enabling interoperability of an independently created computer program.”
There are two techniques used for RE, and a combination of the two works best in general:
- Static code analysis focuses on the code structure and uses a disassembler.
- Dynamic code analysis focuses on the code operation and uses tracing, hooking, and debuggers.
Disassembling code is difficult and often imperfect due to benign optimizations (e.g., constant folding, dead code elimination, inline expansion) and intentional obfuscation (e.g., packing, no-op instructions). Malware relies heavily on packing; overall, roughly 90% of malware code is packed.
Dynamic analysis takes advantage of debuggers’ features (e.g., trace every instruction a program executes via single stepping; let the program execute normally until an exception; at every step or exception, observe/modify instructions, the stack, the heap, or the register set; inject exceptions at arbitrary code locations; use the INT3 instruction to generate a breakpoint exception). Debugging has many benefits: it is sometimes easier to see what the code does, and packed code can be allowed to unpack itself and then be debugged as normal. Most debuggers have built-in disassemblers anyway, and it is always possible to combine static and dynamic analysis. However, debugging potentially malicious code brings its own difficulties (it should be executed in an isolated virtual machine). The attacker may have used anti-debugging methods to detect the debugger and change the program behavior so that it runs differently than when not being debugged (e.g., IsDebuggerPresent(), INT3 scanning, timing, VM detection, the pop ss trick). Anti-anti-debugging can be tedious.
A common way of evasion is to detect evidence of monitoring systems (e.g., fingerprint a machine/look for fingerprints) or hide real malicious intents if necessary as follows:
IF VM_PRESENT() or DEBUGGER_PRESENT()
    Terminate() // hide real intent
ELSE
    Malicious_Behavior() // real intent
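As a concrete (and assumed, illustrative) version of the pseudocode above: on Linux, a process can detect an attached tracer such as a debugger by reading the TracerPid field of /proc/self/status. The function name and the printed strings are hypothetical.

```python
def debugger_present() -> bool:
    """Return True if a tracer (e.g., a debugger) is attached (Linux only)."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("TracerPid:"):
                    # 0 means no tracer; any other value is the tracer's PID
                    return int(line.split()[1]) != 0
    except OSError:
        pass  # not Linux, or /proc unavailable: assume no debugger
    return False

if debugger_present():
    print("terminating (hide real intent)")
else:
    print("malicious behavior would run here (real intent)")
```

Windows malware typically uses IsDebuggerPresent() or timing checks instead; the control-flow pattern is the same either way.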
The general taxonomy of malware evasion is illustrated in Table 14.5.
Difficulty | Layer of Abstraction | Examples
---|---|---
Easiest | Application | Installation, execution
Easy | Hardware | Device name, driver
Somewhat difficult | Environment | Memory, execution artifacts
More difficult | Behavior | Timing
In general, evasion is prevalent: about 40% of malware samples exhibit fewer malicious events when a debugger is attached.
Internet Solutions Cybersecurity
The Internet is a network of networks, that is, an interconnected set of nodes. Nodes at the edge of the network are called (end-)hosts, while nodes within the core of the network are routers. The network uses IP addresses to name the nodes, while humans use more easily memorable host names that point to the corresponding IP addresses. The Dynamic Host Configuration Protocol (DHCP) assigns IP addresses to hosts as they connect to the network. The Domain Name System (DNS) maps domain names to corresponding routable IP addresses (Figure 14.34).
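The name-to-address mapping can be exercised directly from Python's standard library; the loopback name below resolves without touching the network, so it is a safe first experiment.

```python
import socket

# Resolve a host name to an IPv4 address, which is exactly the question
# a stub resolver puts to DNS: "what address should I use for this name?"
ip = socket.gethostbyname("localhost")
print(ip)  # usually 127.0.0.1
```

For real remote names, `socket.getaddrinfo` is the more general call (it also returns IPv6 addresses and service ports).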
Web Infrastructure Assurance
The World Wide Web (the Web) is an organizational system for information that is accessible by using the Internet. The Web's infrastructure is provided by the TCP/IP network stack. In particular, review the roles of the various layers of the Internet TCP/IP network stack, the protocols associated with each layer, and how packets are created and exchanged over the Internet. It is important to know what each layer is responsible for and what the predominant protocols are at each layer. Finally, experimenting with existing network protocol analyzers (e.g., Wireshark) to study packets and communication is recommended. The overall design principles of the TCP/IP network stack have been critical to making an Internet that can evolve with changing needs (at least for the most part), but the details really matter. In the following, we will dig into specific protocols to understand the kinds of attacks that can happen at the networking layer and how to protect against them.
As noted earlier, DHCP assigns IP addresses dynamically and associates them with hosts as they connect to the network. The DNS maps domain names to corresponding routable IP addresses (Figure 14.35).
Because DHCP requests are initially broadcast to all neighboring nodes, attackers on the same subnet can hear a new host’s request and race the actual DHCP server to substitute their own configuration. The attacker can replace the DNS server (i.e., redirect any of the host’s lookups, such as “what IP address should I use when trying to connect to google.com?,” to a machine of the attacker’s choice) or the gateway, to which the host sends all of its outgoing traffic so that it does not have to figure out routes by itself. By making a machine of the attacker’s choice the gateway, the attacker can act as the MitM and gain access to all of the traffic to and from the user’s machine. So, how can a user detect such an attack?
The DNS service divides the domain name namespace into zones for administrative reasons. Subdomains do not need to be in the same zone, which allows the owner of one zone (e.g., nyu.edu) to delegate responsibility to another (e.g., cs.nyu.edu). The name server is the piece of code that answers queries of the form “What is the IP address for cs.nyu.edu?” Every zone must run at least two name servers. Caching is central to the success of the DNS service. Unfortunately, it is also central to attacks such as cache poisoning, which consists of filling a victim’s cache with false information (Figure 14.36).
In the diagram, the recursive name server is the name server that does the heavy lifting; it issues DNS queries on behalf of the client resolver (i.e., the host that asks DNS queries on behalf of the client) until an authoritative answer returns. Because the local resolver has many incoming/outgoing queries at any point in time, it determines which response maps to which query by using a query ID (a 16-bit field in the DNS header, shown as 16322 in the diagram). The requester sets the query ID to whatever it wants, and the responder must provide the same value in its response. For a cache poisoning attack to work, the attacker must guess the query ID of an outstanding query and beat the legitimate response with a forged one. A partial defense is to randomize query IDs, but the ID space is small, and the attacker can trigger many queries. Besides the query ID, the attacker must also match the source port number, which is typically constant for a given server (often simply 53). Note that if the answer is already in the cache, the resolver will not issue a query in the first place, and the attack cannot proceed.
The same cache poisoning approach may be used to poison more than one record (Figure 14.37).
Note that randomizing the query ID is not sufficient in itself because it provides only 16 bits of entropy. The source port should be randomized as well; there is no reason for it to stay constant, and randomizing it yields roughly another 16 bits of entropy. Another solution is to use Domain Name System Security Extensions (DNSSEC). If everyone has deployed it, and if you know the root’s keys, then DNSSEC prevents spoofed responses. DNSSEC uses a public key infrastructure (PKI) to secure communication between the DNS servers in the various zones, and the authoritative answer is signed. However, if one or more name servers in the resolution chain have not deployed DNSSEC (which is the common case during incremental deployment), then DNSSEC is not very useful. It is possible to ignore responses from name servers without DNSSEC; this would improve security, but it prevents the user from connecting to a number of hosts.
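The arithmetic behind this defense is simple: combining query-ID and source-port randomization lowers an off-path attacker's odds of matching a single outstanding query from 1 in 2^16 to 1 in 2^32.

```python
# Chance that one spoofed response matches one outstanding DNS query.
query_id_bits = 16   # the 16-bit query ID field in the DNS header
port_bits = 16       # approximate entropy added by source-port randomization

p_id_only = 1 / 2**query_id_bits                  # ID randomization alone
p_id_and_port = 1 / 2**(query_id_bits + port_bits)  # ID + port randomization

print(p_id_only)      # 1 in 65,536
print(p_id_and_port)  # 1 in 4,294,967,296
```

An attacker can still send many forged responses per query, so these are per-guess odds; the defense raises the cost of the race rather than eliminating it, which is why DNSSEC remains the stronger answer.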
Now, let us focus on the networking protocols and study possible TCP/IP attacks and defenses. In particular, let us look at the (inter)network layer, which works across different link technologies, bridges multiple “subnets” to provide end-to-end Internet connectivity between nodes, and provides global addressing (IP addresses). Note that the network layer provides only best-effort delivery of data (i.e., no retransmissions); reliability, if desired, must come from the transport layer (e.g., TCP). The IPv4 packet header used by the IP protocol is 20 bytes long, and one of the header fields is the source IP address. Nothing in the IP protocol enforces that the source IP address is yours. Furthermore, the IP protocol does not protect the payload or headers. Source spoofing exploits this (Figure 14.38).
Source spoofing may be used to send many emails from one computer (i.e., email spamming). The recipient may, in return, block emails from a given (source) IP address but the attacker could spoof the source IP address as a countermeasure. So, does a packet you receive have a spoofed source?
Because the Internet operates via destination-based routing, the response to a spoofed-source message sent by an attacker goes to the spoofed source rather than the attacker (i.e., pkt (spoofed source) -> destination: pkt -> spoofed source). Therefore, to know whether a packet you receive has a spoofed source, you can send a challenge packet to the possibly spoofed source (e.g., a difficult-to-guess, random number used once [nonce]). If the recipient can answer the challenge, then it is likely that the source was not spoofed. The problem with this approach is that you have to do this for every packet (i.e., every packet should have something difficult to guess). This is analogous to the easily predicted query IDs in the DNS query poisoning attacks that facilitated Kaminsky’s attack.
Source spoofing may also be used for denial of service (DoS) attacks. The idea is to generate as much traffic as possible to congest the victim’s network. An easy defense is to block all traffic from a given source near the edge of your network; an easy countermeasure for the attacker is to spoof the source address. Challenges will not help here because the damage has been done by the time the packets reach the core of your network. So, ideally, you would need to detect such spoofing near the source, and egress filtering does exactly that. The point (router/switch) at which traffic enters your network is the ingress point, and the point (router/switch) at which traffic leaves your network is the egress point. While you do not know who owns all IP addresses worldwide, you do know who in your network gets what IP addresses. Therefore, your egress point can drop any packets whose source IP address does not match the IP address your network assigned to that machine. This egress filtering approach is not often deployed because your egress point bears the costs but your own network gains no benefit.
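The egress-filtering check itself is a one-line prefix-membership test; a minimal sketch using Python's ipaddress module follows. The prefix is an illustrative documentation range (RFC 5737), not a real assignment, and the function name is hypothetical.

```python
import ipaddress

# Prefixes this network actually owns (illustrative documentation range).
OWNED_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]

def egress_allows(src_ip: str) -> bool:
    """Pass an outbound packet only if its source address is one of ours."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in prefix for prefix in OWNED_PREFIXES)

egress_allows("192.0.2.10")    # True: legitimate internal source
egress_allows("198.51.100.7")  # False: spoofed source, drop at the egress point
```

In production, this logic lives in router ACLs or unicast reverse-path forwarding (uRPF) checks rather than application code, but the membership test is the same.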
The defense methods suggested earlier to counter eavesdropping/tampering with IP headers are clearly not bulletproof. Because no security is built into IP, a better protection method is to tunnel IP securely over IP. This is done by using a virtual private network (VPN). The goal of a VPN is to allow a client to connect to a trusted network from within an untrusted network. For example, you could use a VPN to connect to your company’s network for payroll file access while visiting a competitor’s office. In that case, a VPN client and server create an end-to-end encrypted/authenticated channel, as illustrated in Figure 14.39. A predominant way of achieving this is to use Internet Protocol Security (IPSec) to secure IP datagrams (instead of using TLS or secure shell at the application layer). This was considered a good idea circa 1992–1993, as it would secure all traffic (not just TCP/UDP) and automatically secure applications (without requiring changes). It also provides built-in firewalling/access control. Initial proposed standards were published in 1998, and a revision (Internet Key Exchange version 2) was approved in 2005.
The Internet Key Exchange addressed anonymity issues and DoS prevention. There have been many implementations of IPSec, and nearly all deployments are in VPN settings. People ended up switching over to SSL/VPN, but that was not how SSL was intended to be used. IPSec is regarded today as a semifailure as it is complex, hard to use, and exhibits design flaws. IPsec did not get the usage model right, but SSL/TLS and SSH (discussed later in this subsection) got it right.
IPSec operates in a few different modes:
- Transport mode: Encrypts the payload but not the headers
- Tunnel mode: Encrypts the payload and the headers
The corresponding packet formats are illustrated in Figure 14.40.
For routing to work when tunnel mode encrypts the headers, IPSec encrypts the entire IP packet and makes it the payload of another IP packet.
Figure 14.41 illustrates using IPSec in tunnel mode. In this case, the VPN server decrypts and then sends the payload (itself a full IP packet) as if it had just received it from the network. From the client/server’s perspective, it looks like the client is physically connected to the network.
Now, let us focus on the transport layer of the TCP/IP stack and study possible TCP/IP attacks and defenses. The transport layer ensures end-to-end communication between processes. It provides different services, including UDP (unreliable datagrams) and TCP (reliable byte stream). Reliable means that it keeps track of which data was received and retransmits packets as necessary. Given the network layer's best-effort delivery, TCP's first job is to ensure reliability: all data is delivered to applications unmodified and in order (with reasonably high probability), and TCP must robustly detect and retransmit corrupt or lost data. TCP’s second job is flow and congestion control. The idea is to use as much of the network as is safe (not adversely affecting others’ performance) and efficient (using network capacity). The TCP solution is to dynamically adapt how quickly it sends packets based on the network path’s capacity; when an ACK does not return, the network may be beyond capacity, so TCP slows down. TCP is a connection-driven protocol, and various TCP flags in the TCP header are used to manage connections as indicated:
- SYN: Used for setting up a connection
- ACK: Acknowledgments for data and “control” packets
- FIN: Used for shutting down a connection (two-way) using FIN and FIN+ACK
- RST: Used as a shutdown notification (says “delete all your local state, because I do not know what you are talking about”)
Various attacks are known to take advantage of transport layer vulnerabilities. For example, SYN flooding takes advantage of a vulnerability in TCP’s connection setup (the three-way handshake), as illustrated in Figure 14.42.
If B does not receive an acknowledgment, it holds onto this local state and retransmits SYN+ACK until it hears back or times out (up to 63 s). Because it is easy for the victim to detect many incomplete handshakes from a single IP address, the attacker spoofs the source IP address, as illustrated in Figure 14.43 (as described earlier, it is just a field in a header that the attacker “C” can set to whatever it likes). A possible problem is that the host who owns the spoofed IP address may respond to the SYN+ACK with a RST, deleting the local state at the victim. Therefore, an attacker should spoof the IP address of a host that it knows will not respond.
A defense against SYN flooding is to use SYN cookies, as illustrated in Figure 14.44.
The SYN cookie format is illustrated in Figure 14.45.
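The idea of a SYN cookie is that the server encodes the half-open connection's state into the 32-bit initial sequence number itself, so it stores nothing until the final ACK arrives and echoes the cookie back. The sketch below assumes the commonly described layout (5 bits of coarse time, 3 bits of encoded MSS, 24-bit keyed hash); the field widths, secret, and hash choice are illustrative, not the exact Linux implementation.

```python
import hashlib
import struct

SECRET = b"per-boot server secret"  # illustrative keying material

def syn_cookie(src_ip: int, src_port: int, dst_ip: int, dst_port: int,
               t: int, mss_code: int) -> int:
    """Pack connection state into a 32-bit initial sequence number."""
    material = SECRET + struct.pack(">IHIHI", src_ip, src_port, dst_ip, dst_port, t)
    mac24 = int.from_bytes(hashlib.sha256(material).digest()[:3], "big")
    # 5 bits of time counter | 3 bits of encoded MSS | 24-bit keyed MAC
    return ((t % 32) << 27) | ((mss_code & 0x7) << 24) | mac24

def check_cookie(cookie: int, src_ip: int, src_port: int,
                 dst_ip: int, dst_port: int, t: int) -> bool:
    """On the final ACK, recompute the cookie; no per-connection state kept."""
    mss_code = (cookie >> 24) & 0x7
    return cookie == syn_cookie(src_ip, src_port, dst_ip, dst_port, t, mss_code)

cookie = syn_cookie(0xC0000201, 40000, 0xC0000202, 80, t=7, mss_code=2)
assert 0 <= cookie < 2**32
assert check_cookie(cookie, 0xC0000201, 40000, 0xC0000202, 80, t=7)
```

Because the server keeps no state for half-open connections, a SYN flood no longer exhausts its memory; the trade-off is that options not encoded in the cookie (beyond the coarse MSS) are lost.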
Injection attacks take advantage of having a node on the path between source and destination. In that case, injecting packets with the correct sequence number is trivial. If the node is not on the path, it needs to guess the sequence number, which is difficult. Initial sequence numbers used to be deterministic, and it was easy to wreak havoc by sending RSTs, injecting data packets into an existing connection (i.e., TCP veto attacks), or initiating and using an entire connection without ever hearing from the other end. Figure 14.46 illustrates one type of attack known as the Mitnick attack.
A typical defense is to ensure that the initial sequence number is difficult to predict.
OPT-ACK attacks take advantage of the fact that TCP uses ACKs not only for reliability but also for congestion control (i.e., the more ACKs come back, the faster it can send), as illustrated in Figure 14.47.
An attacker can exploit this as illustrated in Figure 14.48.
The actual attack scheme is illustrated in Figure 14.49.
The big deal with this attack is its amplification factor: the attacker sends a few bytes of ACKs, causing the victim to send many more bytes of data in response. There are examples of such amplification attacks on NTP and DNSSEC. The attack is amplified in TCP due to its support for cumulative ACKs (i.e., “ACK x” says “I’ve seen all bytes up to but not including x”). Figure 14.50 illustrates the maximum number of bytes that can be sent by a victim per ACK.
Figure 14.51 shows the maximum number of ACKs that an attacker can send per second.
Therefore, the amount of damage depends on the maximum window size and the MSS (e.g., default max window size: 65,536; default MSS: 536). In that case:
- Default amp factor: 65536 * (1/536 + 1/54) ~ 1336x
- Window scaling lets you increase this by a factor of 2^14
- Window scaling amp factor: ~1336 * 2^14 ~ 22M
- Using a minimum MSS of 88: ~ 32M
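These figures follow directly from the arithmetic: the 1/MSS term counts the victim's data packets per window, and the 1/54 term counts the attacker's on-the-wire bytes per ACK (the 54 matching the text's formula).

```python
# Opt-ack amplification, reproducing the figures in the text.
window = 65536          # default maximum window size (bytes)
mss = 536               # default maximum segment size (bytes)
ack_wire_bytes = 54     # size of one ACK on the wire, per the 1/54 term

amp = window * (1 / mss + 1 / ack_wire_bytes)
print(round(amp))        # ~1336x

scaled = amp * 2**14     # window scaling multiplies this by up to 2^14
print(round(scaled / 1e6))   # ~22 (million)

# With the minimum MSS of 88, more (smaller) packets per window:
min_mss_amp = window * 2**14 * (1 / 88 + 1 / ack_wire_bytes)
print(round(min_mss_amp / 1e6))  # ~32 (million)
```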
A challenge is to find a solution to defend against OPT-ACK in a way that is still compatible with existing implementations of TCP. Also, note that an essential goal in networking is incremental deployment. Ideally, we should be able to benefit from a system/modification when even a subset of hosts deploy it.
Now, let us focus on the application layer of the TCP/IP stack and study possible attacks and defenses. The Secure Socket Layer (SSL) protocol was originally a Netscape proprietary protocol that targeted e-commerce applications (i.e., what people thought the Web was for in 1994). The objective was to address outcomes such as “send my credit card to Amazon securely.” The basic principle (circa 1994) was to authenticate the server (via a certificate) and let the client access the server unauthenticated. SSLv1 was designed by Kipp Hickman and had serious security flaws. He addressed some of the flaws in SSLv2, which still had security issues but was widely deployed. SSLv3 fixed these problems. The Transport Layer Security (TLS) 1.0 protocol was the first standardized version of SSL, with some improvements for key derivation (refer to ietf.org RFC 2246). TLS 1.1 (RFC 4346) addressed some security flaws, and TLS 1.2 (RFC 5246) added more flexibility in using hash functions. TLS 1.3 brought significant changes (e.g., removal of the RSA key exchange in favor of forward secrecy, authenticated encryption modes only, and zero round-trip time (0-RTT) resumption handshakes). As explained earlier, a trusted CA may vouch that a certain public key belongs to a particular site and issue a TLS certificate that abides by the X.509 format.
Web applications use HTTP over SSL/TLS (HTTPS), in which case the client knows that the server expects HTTPS (because it is specified in the URL and supported on a separate port on the server). Furthermore, the server certificate has its domain name. HTTP is stateless, and the lifetime of an HTTP session is as follows:
- The client connects to the server.
- The client issues a request.
- The server responds.
- The client issues a request for something in the response.
- The interaction continues until the client has received all the information it needs.
- The client disconnects.
Because asymmetric cryptography (private key operations) is expensive, and HTTPS tends to involve many SSL/TLS connections, caching pays off. Each handshake establishes a session, and clients can resume the session with the same keying material, skipping the key exchange. If the client and server do not know each other’s capabilities, they can discover them and automatically upgrade to TLS. This, however, may allow downgrade attacks.
DoS attacks on SSL/TLS rely on the fact that an SSL/TLS connection requires a TCP handshake, and TCP connections are easy to attack with a DoS. Protection against these types of attacks needs to be at a lower layer and is provided by Datagram TLS (DTLS, RFC 4347). DTLS is a slight modification of TLS that provides reliability for the handshake and ensures that data records are independent.
Tatu Ylonen originally designed the Secure Shell (SSH) Protocol as a replacement for rsh. It is now the standard tool for secure remote login, and it provides many features, such as remote X, file transfer, and port forwarding. The transport protocol used by SSH looks a lot like TLS. However, SSH does not use certificates; the server just has a raw public key, which it provides when the client connects. The client stores the server’s key on the first connection, and any change in the key results in an error. The key can be authenticated out of band (i.e., the server operator tells the client the key fingerprint/hash over the phone; only the most concerned people do this). This SSH “leap of faith” authentication was considered extreme initially but is now considered clever. SSH client authentication first requires server authentication and then authenticates the client using various negotiated mechanisms (e.g., raw password, challenge-response, public key, GSS-API, Kerberos). SSH provides port forwarding/tunneling features. SSH port forwarding redirects network traffic to a particular port/IP address so that a remote host is made directly accessible by applications on the local host. The destination may be on the remote SSH server, or that server may be configured to forward to another remote host. SSH tunnels are powerful tools for IT administrators and malicious actors alike because they can transit an enterprise firewall undetected. As a result, tools are available to prevent unauthorized use of SSH tunnels through a corporate firewall. Figure 14.52 illustrates how an X11 remote connection can be established using SSH.
Finally, SSH is backward compatible with rsh, so other applications can be securely remoted without port forwarding. This is also useful for applications that would otherwise rely on insecure remote access.
The various application layer protocols we discussed are subject to a variety of attacks, including attack vectors against CAs and browsers, site design attacks, UI-based attacks, PKI attacks, implementation attacks (e.g., null termination attacks, Goto Fail, Heartbleed, the BERserk attack, Logjam, Cloudbleed), and client-side HTTPS interception.
Attack vectors target the weakest certificate authority or browser implementations. Attackers may also find and exploit a key generation library bug that lets them recover all the private keys generated with that library. Attacking the cryptographic primitives themselves is also possible, although more difficult to achieve.
SSLStrip attacks are examples of attacks that go after site design by proxying the content without HTTPS. The defense is to default to HTTPS for all websites. You can also use HSTS (HTTP Strict Transport Security), which is enforced by browsers; the header states to always expect HTTPS. HTTPS can also be forced everywhere using a browser extension. Some site design attacks use mixed content: a page loads over HTTPS but contains content served over HTTP (e.g., JavaScript). An active attacker can tamper with the HTTP content to hijack the session. The defense is to issue browser warnings (e.g., "This page contains insecure content"), but the use of these warnings is inconsistent, and the warnings are often ignored.
UI-based attacks exploit invalid certificates (i.e., expired, misidentified URL, or unknown CA such as a self-signed certificate). The defense is to issue browser warnings and require users to go through an anti-usability page to continue. Another type of UI-based attack is the picture-in-picture attack, which spoofs the user interface (i.e., the attacker's page draws a fake browser window with a lock icon). In this case, the defense is to display an individualized image.
PKI attacks compromise CAs. An example of such an attack occurred in 2011 against a Dutch CA named DigiNotar, which issued a *.google.com certificate to an attacker who subsequently used it to orchestrate MitM attacks in Iran. Nobody noticed the attack until someone found the certificate in the wild. DigiNotar later admitted that dozens of fraudulent certificates had been created. Google, Microsoft, Apple, and Mozilla all revoked the DigiNotar root certificate. The Dutch government took over DigiNotar, and the company subsequently went bankrupt and closed. In general, MD5 and SHA-1 are known to be broken, and collisions can be generated. In 2008, researchers showed they could create a rogue CA certificate using an MD5 collision. The attack consisted of colliding messages, A and B, with the same MD5 hash as follows:
- A: Site certificate: "cn=attack.com, pubkey=...."
- B: Delegated CA certificate: "pubkey=.... is allowed to sign certs for *"
- Get the CA to sign A; the signature is Sign(MD5(message)).
- The signature is also valid for B (same hash).
- The attacker is now a CA!
- The attacker can make a certificate for any site, and browsers will accept it.
MD5 CA certificates still exist, but CAs have stopped signing certificates with them. SHA-1 should not be used either.
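The collision attack above works because the CA's signature covers only the hash of the certificate body. A toy hash-then-sign sketch makes the dependence explicit; the function name, key, and "signature" construction are illustrative stand-ins, not any real CA's scheme.

```python
import hashlib

def toy_sign(cert_body: bytes, ca_key: bytes = b"toy-ca-key") -> bytes:
    """Sign(MD5(message)): the 'signature' depends only on the MD5 digest."""
    digest = hashlib.md5(cert_body).digest()
    # Stand-in for the CA's RSA signature over the digest.
    return hashlib.sha256(ca_key + digest).digest()

# If an attacker finds bodies A and B with MD5(A) == MD5(B), the signature
# the CA issued over the benign body A is byte-for-byte valid for the rogue
# body B, because signing never looks past the digest.
benign = b'cn=attack.com, pubkey=AAAA'
signature = toy_sign(benign)
```

This is exactly why migrating the signed-hash algorithm (MD5, then SHA-1) matters even when the signature algorithm itself is sound.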
There are numerous other types of attacks:
- Goto Fail (Feb. 2014) was an Apple SSL bug that resulted in skipping the certificate check for almost a year.
- Heartbleed (April 2014) was an OpenSSL bug that leaked data, possibly including private keys.
- The Mozilla BERserk vulnerability (Oct. 2014) was a bug in verifying certificate signatures that allowed spoofing certificates.
- Logjam (2015) took advantage of a TLS vulnerability to mount man-in-the-middle “downgrade” attacks.
- Cloudbleed (Feb. 2017) affected Cloudflare, one of the most popular content delivery networks, which acts as the SSL endpoint for many servers; a buffer overflow bug caused it to leak HTTPS data.
- Client-side HTTPS interception leverages the fact that much antivirus software intercepts HTTPS traffic, often with poor implementations that introduce new vulnerabilities.
Web/Mobile Applications Frameworks Assurance
The most typical web/mobile application frameworks are web servers and web browsers. Other web/mobile application frameworks (e.g., application servers, business process management suites) operate using web browsers and extended web servers and use the same security mechanisms. This subsection focuses only on web servers and web browsers. Various types of attacks affect web information and interactions, including SQL injection, cross-site scripting (XSS), path (directory) traversal, cross-site request forgery, remote file inclusion (RFI), phishing, clickjacking, authentication/authorization attacks, buffer errors, web browser attacks, information leaks/disclosure, and web server attacks.
Here are several security risks we try to protect web servers (and web browsers) against:
- Risk 1: We want data stored on a web server to be protected from unauthorized access.
- Risk 2: We do not want malicious (or compromised) sites to be able to trash files/programs on user computers.
- Risk 3: We do not want a malicious site to be able to spy on or tamper with information or interactions with other websites.
The Federal Communications Commission has identified cybersecurity tips to help organizations protect against cyber threats when using the Internet. These include:
- Ensure employees are trained regarding cybersecurity, including the organization’s security principles.
- Keep the system updated with the latest security software and frequently run antivirus software.
- Install and maintain a firewall, which is a collection of programs designed to prevent hackers and other unauthorized users from accessing a system.
- Ensure that mobile devices and laptops are included in the security plan and secure all such devices.
- Back up important files and data and store the backed-up information separately.
- Ensure all employees have passwords and other credentials required to access the system and provide access on an as-needed basis.
- Ensure that Wi-Fi networks are hidden and secured through encryption.
The following design and implementation guidelines can be accessed online through a browser search:
- OMG cybersecurity initiatives (and related standards, guidelines, best practices, and other resources)
- NIST cybersecurity standards (in particular NIST SP 800-53, SP 800-171, CSF, SP 1800 Series)
- Other global IT security frameworks and standards: ISO 27000 Series, COBIT, CIS Controls, HITRUST Common Security Framework, GDPR, COSO
- Industry IT security standards: HIPAA, PCI DSS, Sarbanes-Oxley (SOX), GLBA
- CyBok
- SAFECode
- Open Source Security Testing Methodology Manual (OSSTMM)
- Open Web Application Security Project (OWASP)
- Web Application Security Consortium Threat Classification (WASC-TC)
- Penetration Testing Execution Standard (PTES)
- Information Systems Security Assessment Framework (ISSAF)
Cloud-Centric Solutions Cybersecurity
The need to reduce costs and make IT more responsive to business changes is driving more and more Internet solutions (e.g., web/mobile information systems) to various cloud platforms. Numerous obstacles make it difficult for end users/organizations to adopt the cloud from a TRM assurance standpoint. Providing security for cloud environments that matches the levels found in commercial internal data centers is essential to helping modern organizations compete and to allowing cloud service providers (CSPs) to meet their end users’ needs.
Managing risk in the cloud requires that users fully consider exposure to threats and vulnerabilities, not only during procurement but also as an ongoing process. Security in the cloud is a constant process, and cloud users should continually monitor their cloud resources and work to improve their security posture. Threat actors in the cloud may target the same types of weaknesses as those found in traditional system architectures. However, when organizations use the cloud, they face additional cyber threats, including the following:
- Malicious CSP administrators: They can leverage privileged credentials or positions to access, modify, or destroy information stored on the cloud platform. They can also leverage privileged credentials or positions to modify the cloud platform itself and gain access to networks connected to or consuming cloud resources.
- Cyber criminals and/or nation-state-sponsored actors: They can leverage a cloud architecture or configuration weakness to obtain sensitive data or consume cloud resources at the victim’s expense. They may exploit weak cloud-based authentication mechanisms to obtain user credentials (e.g., password spray attacks). They may leverage compromised credentials or incorrect access privileges to access cloud resources. They may gain privileged access to the cloud environment to compromise tenant resources. They may leverage the trust relationship between an end user's or organization’s networks and cloud resources to pivot from clouds into protected networks or vice versa.
- Untrained or neglectful customer cloud administrators: They may expose sensitive data or cloud resources unintentionally.
Cloud Infrastructure Assurance
To match the levels of security that end users experience on premises, CSPs must make the proper investments in providing, proving, and assuring appropriate levels of security over time. This requires building security and trust architectures that can assure that each end user’s applications and data are isolated and secured from those of others. Before moving mission-critical information systems to the cloud, end users require robust cybersecurity, trustworthy cybersecurity assurance, and cloud governance as follows:
- Robust security requires moving beyond a traditional perimeter-based approach to a layered cloud security architecture and an approach that assures the proper isolation of data, even in a shared, multitenant cloud. This includes content protection at different layers in the cloud infrastructure, such as the storage, hypervisor, virtual machine, and database layers. It also requires mechanisms to assure confidentiality and access control. These may include encryption, obfuscation, and key management as well as isolation and containment, robust log management, and an audit infrastructure. The security architecture provides the isolation, confidentiality, and access control required to protect end users’ data and applications.
- Trustworthy cybersecurity assurance requires that the end users have confidence in the integrity of the complete cloud environment. This ranges from the physical data centers to the hardware and software, as well as the people and processes, employed by the CSP. This requires establishing an evidence-based trust architecture and control of the cloud environment provided by the CSP. It requires that the CSP provide adequate monitoring and reporting capabilities to assure the end user of transparency around security vulnerabilities and events. This should include audit trails that help the end user meet internal or external demands for provable security. A CSP should also deliver automated notification and alerts that support the end user’s existing problem or incident management protocols so they can manage their total security profile most easily. All of these collectively help assure the end user of the operational quality and security of the CSP.
- Cloud governance requires the CSP to offer utilities that allow the end user to monitor their environment for security and other key performance indicators (KPIs) such as performance and reliability almost as well as they could in their own on-premises environment (or data center).
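The audit infrastructure described above must itself be trustworthy: an audit trail only supports provable security if tampering with past entries is detectable. A minimal sketch of a tamper-evident log uses a hash chain, in which each entry commits to the digest of the one before it (the same idea used, at much larger scale, by blockchain ledgers). The event strings and record layout below are illustrative, not any CSP’s actual format.

```python
import hashlib
import json

def append_entry(log, event):
    """Append an event, chaining it to the previous entry's digest."""
    prev = log[-1]["digest"] if log else "0" * 64
    payload = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    log.append({"event": event, "prev": prev,
                "digest": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log):
    """Recompute every digest; modifying any past entry breaks the chain."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps({"event": entry["event"], "prev": prev}, sort_keys=True)
        if entry["prev"] != prev or \
           hashlib.sha256(payload.encode()).hexdigest() != entry["digest"]:
            return False
        prev = entry["digest"]
    return True

log = []
append_entry(log, "vm-42 started")
append_entry(log, "admin read tenant-7 bucket")
print(verify_chain(log))           # True: intact chain
log[0]["event"] = "nothing happened"   # simulated tampering
print(verify_chain(log))           # False: tampering detected
```

In practice, the chain head would also be periodically anchored somewhere the CSP cannot silently rewrite (e.g., delivered to the end user), so that truncation as well as modification is detectable.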
Think It Through
Cybersecurity on the Internet vs. the Cloud
Think about cybersecurity for systems that use the Internet and compare that to cybersecurity for systems that access the cloud. Note that moving systems to the cloud requires robust cybersecurity, trustworthy cybersecurity assurance, and cloud governance as explained earlier in this section. Evaluate how these requirements differ from those set forth for systems that use the Internet to determine whether cybersecurity on the cloud can be handled in the same way as security on the Web.
Cloud Services Assurance
Because end users leverage various service types such as Infrastructure as a Service (IaaS) vs. Platform as a Service (PaaS) to create solutions in the cloud, CSPs and cloud end users share unique and overlapping responsibilities to ensure the security of services and sensitive data stored in public clouds. Shared responsibility considerations include threat detection, incident response, and patching/updating. An example of a PaaS service provided by cloud platforms today is IoT PaaS.
As noted earlier, IoT cloud platforms face many challenges that affect the TRM security and integrity/privacy assurance qualities. IoT devices and data are vulnerable to various threats, such as cyberattacks, data breaches, unauthorized access, and malicious manipulation. These threats can compromise the functionality, integrity, and confidentiality of IoT systems, as well as expose sensitive and personal information of customers and users. IoT cloud platforms also need to adhere to the evolving regulations and standards that govern the collection, storage, and use of IoT data, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). This is why it is important to have cloud security and trust architectures as well as cloud governance that provide robust security and privacy measures, such as encryption, authentication, authorization, monitoring, and compliance, that can protect IoT devices and data from end to end.
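As a small illustration of the authentication measure mentioned above, an IoT platform can require each device to tag its telemetry with an HMAC computed from a per-device secret, so that forged or altered messages are rejected on arrival. This is a hedged sketch, not any vendor’s actual protocol; the device registry and message format are assumptions (real platforms typically use certificates or rotating tokens on top of TLS).

```python
import hashlib
import hmac

# Hypothetical registry of per-device secrets provisioned at manufacture time
DEVICE_KEYS = {"sensor-01": b"per-device-secret"}

def sign_message(device_id, payload: bytes) -> str:
    """Device side: tag the payload so the platform can verify its origin."""
    return hmac.new(DEVICE_KEYS[device_id], payload, hashlib.sha256).hexdigest()

def verify_message(device_id, payload: bytes, tag: str) -> bool:
    """Platform side: constant-time comparison rejects forged or altered telemetry."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False  # unknown device
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

tag = sign_message("sensor-01", b'{"temp": 21.5}')
print(verify_message("sensor-01", b'{"temp": 21.5}', tag))   # True
print(verify_message("sensor-01", b'{"temp": 99.9}', tag))   # False: altered payload
```

The constant-time `hmac.compare_digest` matters: a naive `==` comparison can leak timing information that helps an attacker forge tags byte by byte.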
Another example of a PaaS service provided by cloud platforms today is big data analytics PaaS. As noted earlier, and from a TRM assurance/security/privacy quality standpoint, security is clearly one of the major concerns with big data analytics. Hacking and various attacks on cloud infrastructure do happen and may affect multiple clients even if only one site is attacked. To make the most sense of big data, organizations need to integrate parts of their sensitive data into the bigger data set. To do this, companies need to establish security policies that are self-configurable. These policies must leverage existing trust relationships and promote data and resource sharing within the organizations, while ensuring that data analytics are optimized and not limited because of such policies. This is why it is important to have cloud security and trust architectures as well as cloud governance to mitigate risks using security applications, encrypted file systems, data loss prevention software, and security hardware that tracks unusual behavior across servers.
One last example of a PaaS service provided by cloud platforms today is cloud robotics PaaS. From a security standpoint, when robots are connected to the cloud, they are susceptible to hacking and cyberattacks. This can pose a serious risk both to the safety of robots and to the privacy of the data that they collect. This is again why it is important to have cloud security and trust architectures as well as cloud governance to help companies develop cloud-connected robots and invest in robust cloud security measures.
Cloud Applications Frameworks Assurance
Cloud application frameworks are fully managed by CSPs or designed to leverage IaaS/PaaS services on secure cloud platforms. A cloud platform (e.g., AWS, GCP, Azure, IBM Cloud) can replace traditional application frameworks at the cost of migrating the traditional applications that used these frameworks to the cloud, which is a costly and time-consuming proposition. For that reason, big tech application frameworks are now available on secure cloud platforms (e.g., IBM WebSphere Hybrid Edition, Oracle WebLogic Server for Oracle Cloud Infrastructure). Traditional database management systems/frameworks are easier to migrate to the cloud. For example, traditional database management systems such as MySQL, PostgreSQL, or SQL Server can be migrated to Google Cloud SQL. Google also provides cloud-based NoSQL database systems (e.g., Firestore document database, Bigtable key-value database, Memorystore in-memory database). Similar cloud database systems/frameworks are available on Microsoft Azure (e.g., Azure SQL Database) and other cloud platforms. Cloud support for application frameworks is not limited to database systems/frameworks. For example, Azure App Service is a cloud platform framework that may be used to securely host web applications, REST APIs, and mobile applications. More recently, VMware developed a cloud application server called Tanzu, previously Cloud Foundry, that fully operates on the cloud and leverages the latest container management and cloud cybersecurity technology. IBM Bluemix is also derived from Cloud Foundry. Furthermore, all social media platforms (e.g., Facebook, Twitter, TikTok) can be considered examples of secure cloud application frameworks. Because there are no standards for using/implementing secure application frameworks on the cloud, end users need to consult the various CSPs’ websites and stay up-to-date regarding the availability of such application frameworks.
Cloud Applications Assurance
Cloud applications are typically implemented on cloud servers, and developers take full advantage of established cloud security/trust architectures as well as cloud governance processes. Big tech organizations typically provide security best practices, models, and patterns to facilitate the creation of secure cloud applications on their own platform.
Cloud Information Assurance
In cloud computing, many of the large and critical databases are under the control of CSPs. These resources are located away from the end user’s physical location, often in physical locations unknown to the end user. The possibility, or even likelihood, of data being stored in other regions and countries also requires meeting those regions’ legal and regulatory requirements for data protection. All this makes it more challenging to create trustworthy controls for the monitoring, governance, and auditing of the CSP environment. Therefore, as explained earlier, it is necessary to develop cloud security/trust architectures as well as cloud governance processes prior to developing information systems on the cloud or migrating traditional/legacy information systems and their data to the cloud.
Cloud Assurance Methodologies
Various organizations provide cloud security best practices and cloud cybersecurity assessment methodologies. In particular, the Cloud Security Alliance (CSA) and the European Union Agency for Cybersecurity (ENISA) promote best practices developed for providing security assurance within cloud computing. The CSA Security Guidance provides fourteen domains of cloud security best practices. It is built on dedicated research and public participation, incorporating advances in cloud, security, and supporting technologies. The Security Assurance Methodology (SECAM) is a security assurance framework developed by the 3rd Generation Partnership Project (3GPP) specifically for network products used in mobile communications. Big tech organizations typically provide risk assessment methodologies geared toward using their cloud platforms. Microsoft, for example, publishes a risk assessment guide for Azure. Other CSPs provide similar guides.
Think It Through
Cloud Computing vs. Privacy
You are a software engineer, and you work for a company that provides open access to mapping software for realtors. Your company merges with a real estate company that provides online services, such as sales and appraisals, that your company did not previously offer. As part of this new company, your boss wants you to take the customer database from the real estate company and add the personal information of homeowners in the local market to a private software package that lists all the houses in neighborhoods. As part of the new software, you are required to provide one-click access to this information from the mapping software. While creating these new features, you realize that the database contains fields such as social security number, mother’s maiden name, and primary email addresses. The mapping software and its database will be stored in the cloud and shared with ten offices. All employees in the company will have access to the new software.
- What are some of your concerns?
- What are some security concerns?
- What recommendations can you make to help ensure the security of the PII?
Hardware Crisis
In October 2020, a warehouse fire severely damaged the Asahi Kasei Microdevices (AKM) semiconductor plant in Miyazaki, Japan. At the time, this was one of two RAM manufacturers worldwide. The global crisis for computer memory commenced shortly after this disaster. Hardware manufacturers had to slow the production of computers because memory was not available, and the cost of existing memory escalated. Japan’s other chip manufacturing plants could not meet worldwide demand. Today, three manufacturing plants account for more than 90% of the world’s RAM production. It seems the world has not learned its lesson and diversified the production of this precious resource.
The other side of this story is that software advancement also slowed. New software packages are typically developed to incorporate newer technology features, so when computer manufacturing slowed down, the release of innovative new software packages slowed down as well. The same was not true for cloud technologies. Because cloud resources are distributed and less dependent on any single piece of hardware, the cloud environment continued to thrive, and cloud usage has only increased since 2020.
Industry 4.0 Metaverse Smart Ecosystems Cybersecurity
While the metaverse and Web3 offer organizations new frontiers for customer engagement and business growth, these also create the potential for new cybersecurity risks that could lead to financial losses, brand and reputational damage, and legal challenges. These threats include system outages and disruptions because of data overload and threats that apply to other areas of computing, including ransomware and bots.
Smart Ecosystems Platforms Assurance
As noted earlier, Industry 4.0 smart ecosystems combine various platforms/services (e.g., 3-D Modeling, AR/VR, Edge Computing, Blockchain, AI/ML, and 3-D/4-D printing) and are typically deployed on top of a hybrid cloud/blockchain environment today. For example, they may use AI/ML platforms/services to support the rendering and management of realistic models of a 3-D world or digital twins. The various platforms/services operate within secure cloud/blockchain platforms that are managed using established cloud/blockchain security/trust architectures as well as cloud governance processes. Because the platforms/services can be assembled as mashups by combining offerings from multiple cloud platforms, it is necessary to consider the cybersecurity mechanisms discussed in Cloud Applications Frameworks Assurance in order to understand how to best secure cloud mashups. In this subsection, we briefly discuss the security vulnerabilities and defenses required when leveraging specific smart ecosystems platforms/services.
As artificial intelligence and other advanced technologies become more prominent and create supersocieties, we also face additional cybersecurity threats. For example, cybercriminals can use AI to launch more sophisticated cyberattacks. At the same time, AI can be a tool against cybercrime, providing organizations with sophisticated technology to handle security tasks such as detecting suspicious activity in the system. In addition, AI can be used in testing, such as simulating system attacks to help cybersecurity professionals identify areas of risk and vulnerabilities that should be addressed. According to IBM, AI can help protect data in hybrid cloud environments with tools such as shadow data identification and monitoring for data abnormalities. AI can also create incident summaries and automate responses to these incidents, improving investigations and outcomes. AI’s ability to analyze login attempts and verify users can reduce fraud costs by as much as 90%.39
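A toy version of the login-monitoring idea: rather than a full ML model, flag any day whose failed-login count exceeds the historical mean by three standard deviations. Production AI systems use far richer features (device, geography, typing cadence, and so on), but this statistical baseline conveys the principle; the data and threshold below are illustrative.

```python
from statistics import mean, stdev

def flag_anomalies(counts, k=3.0):
    """Return indices of values exceeding mean + k standard deviations."""
    mu, sigma = mean(counts), stdev(counts)
    threshold = mu + k * sigma
    return [i for i, c in enumerate(counts) if c > threshold]

# 13 ordinary days of failed-login counts, then a spike on day 13
daily_failed_logins = [12, 9, 11, 10, 13, 8, 12, 11, 10, 9, 12, 11, 10, 95]
print(flag_anomalies(daily_failed_logins))  # → [13]
```

Even this crude detector catches the spray-like spike; the value of AI-based approaches is in reducing false positives and spotting subtler deviations that a single global threshold misses.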
Technology in Everyday Life
Using AI in Cybersecurity
AI creates new cyber threats but also provides additional tools to improve cybersecurity. Think about how AI is used in everyday life and consider AI as both a threat and a tool in cybersecurity.
Provide a few scenarios illustrating the benefits and drawbacks of AI-driven cybersecurity (e.g., monitoring and analyzing behavior patterns, preventing bad actions and outcomes) and explain your opinion.
Metaverse Smart Ecosystems Platform Cybersecurity Assurance Methodologies
To protect against cybersecurity threats in the metaverse, organizations should use many of the same security measures to protect against cyber threats on the Internet. They also need to address new cybersecurity risks that could lead to financial losses, brand and reputational damage, and legal challenges.
Figure 14.53 illustrates the typical four layers of metaverse platforms along with eight major threats in the virtual world of the metaverse.
Industry 5.0 Supersociety Solutions Cybersecurity
In addition to the smart ecosystem services mentioned in the previous subsection, Industry 5.0 supersociety solutions combine various platforms and technologies such as autonomous systems platforms, advanced robotics platforms, nanotechnology, supercompute, and autonomous supersystems platforms. These solutions are typically deployed on top of a hybrid cloud/blockchain environment today. Supersociety platforms enable AI-powered robots, and supersociety technologies (e.g., nanotechnology, supercompute) make it possible to improve existing services to further enable smart ecosystems services. The various supersociety platforms operate on top of secure cloud/blockchain platforms that are managed using established cloud/blockchain security/trust architectures as well as cloud governance processes. In general, because the underlying services that are part of these platforms can be assembled as mashups by combining services from multiple cloud platforms, it is necessary to consider the cybersecurity mechanisms discussed in Cloud Applications Frameworks Assurance in order to understand how to best secure cloud mashups. In this subsection, we briefly discuss the security vulnerabilities and defenses required when leveraging specific supersociety platforms and technologies.
Supersociety Autonomous Systems Platform Assurance
Autonomous systems platforms are an essential component of the future of artificial intelligence. They provide the tools and frameworks for building, testing, and deploying autonomous systems (e.g., self-driving cars, drones) that can operate in a variety of environments. However, there are various data security/privacy concerns, regulatory challenges, and ethical issues associated with autonomous systems platforms. The fact that autonomous systems rely on large amounts of data raises concerns around data security and privacy. As was the case for the smart ecosystems AI/ML platforms discussed earlier, which face the same type of issues, it may not always be possible to simply rely on the establishment of cloud security/trust architectures and cloud governance processes. Using third-party tools to address the lack of scalability of a CSP’s cloud platform may introduce vulnerabilities and requires additional defense mechanisms. The deployment of autonomous systems is also subject to various regulations and standards that are not typically covered by cloud security/trust architectures and governance processes. Therefore, additional security architecture components and processes will need to be researched and provided based on the domain of application of the autonomous platform. Finally, autonomous systems raise ethical concerns around issues such as accountability, transparency, and bias. These aspects should also be covered in the cloud security/trust architectures and cloud governance processes.
Supersociety Advanced Robotics Platform Assurance
Supersociety advanced robotics platforms provide the tools and frameworks for building, testing, and deploying AI-powered robots (e.g., cyborgs, swarmbots) that can work alongside humans. The technical challenges associated with securing advanced robotics platforms are analogous to those of securing autonomous systems platforms, which were covered in Supersociety Autonomous Systems Platform Assurance. Refer to that discussion to review the security challenges and associated defenses that must be put in place. In addition to these technical challenges, there are also allied social, legal, and ethical issues for seamless integration of humanoids into our societies. There is a lot of research focused on this aspect today. One question is whether there should be special laws to govern robots.
Supersociety Nanotechnology Platform Assurance
Nanotechnology is one of the supersociety technologies that make it possible to improve existing services to further enable smart ecosystems services and support supersociety platforms. The emergence of nanotechnology presents an entirely new set of potential risks, as well as potential solutions to cybersecurity. Because nanotechnology involves the manipulation of matter on an atomic or molecular scale, it is not too far-fetched to think that it could enable the development of “smart” materials that could detect and react to malicious software or threats. Nanotechnology could also enable the creation of tiny sensors that could detect unauthorized access to networks or data. The use of nanotechnology in cybersecurity could provide users with a greater level of privacy and security. For example, nanomaterials could be used to create encryption keys that are much more difficult to crack than current methods. In addition, the use of nanotechnology could make it easier to detect and prevent data breaches. However, the use of nanotechnology also presents some potential risks. For example, the use of nanomaterials could create new vulnerabilities that could be exploited by malicious actors. Additionally, the use of nanotechnology could lead to the creation of devices or systems that are too complex for humans to understand or control.
One of the most promising applications of advanced materials in cybersecurity today is the use of graphene. This two-dimensional material, which is composed of a single layer of carbon atoms, has many properties that make it ideal for security applications. It is highly conductive, strong, and lightweight, and is impermeable to many substances. Graphene has already been used in a variety of devices, including computer chips, touchscreens, and RFID tags, and its potential for cybersecurity applications is vast. Another application of advanced materials in cybersecurity is the use of nanomaterials. Nanomaterials, such as nanotubes and nanowires, are incredibly small, making them difficult to detect. This makes them ideal for use in encryption and authentication systems, where the smallest of details can make all the difference. Additionally, nanomaterials can be used to develop new types of sensors that can detect intrusions and unauthorized access attempts.
Overall, nanotechnology has the potential to revolutionize the world of cybersecurity, both in terms of the solutions it offers and the risks it creates. As this technology continues to develop, it is important that the security industry works to ensure that the benefits of nanotechnology are maximized while minimizing the risks. This appears to be the only way to implement supersociety nanotechnology platforms assurance.
Supersociety Supercompute Platform Assurance
Supercompute is yet another supersociety technology that makes it possible to improve existing services to further enable smart ecosystems services and support supersociety platforms. It is also a technology that has the potential to revolutionize the world of cybersecurity, both in terms of the solutions it offers and the risks it creates. With respect to risks, the vastly increased processing speed associated with the use of quantum or neuromorphic computers will definitely have an impact on some of the cryptography algorithms that are in use today. In particular, while symmetric key encryption and collision-resistant hash functions are considered to be relatively secure against attacks by quantum computers, signature schemes based on the integer factorization problem, the discrete logarithm problem, or the elliptic curve discrete logarithm problem can be broken using Shor’s algorithm on a sufficiently powerful quantum computer. On the positive side, noninteractive ZKPs that rely only on collision-resistant hash functions are plausibly post-quantum secure and can be used to replace traditional signature schemes based on public-key cryptography that are not quantum-resistant. Supercompute will also improve the speed of cryptographic computations that are needed to operate secure platforms such as blockchain and further optimize the verification of transactions. Again, as this technology continues to develop, it is important that the security industry works to ensure that the benefits of supercompute from a cybersecurity standpoint are maximized while minimizing the risks. This appears to be the only way to implement supersociety supercompute platform assurance.
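The hash-based approach can be made concrete with a Lamport one-time signature, a textbook construction that relies only on a preimage-resistant hash function and is therefore plausibly post-quantum secure. The sketch below is illustrative, not a production scheme (real deployments use many-time variants such as XMSS or SPHINCS+), and each key pair must sign at most one message, since every signature reveals half of the private key.

```python
import hashlib
import secrets

H = lambda b: hashlib.sha256(b).digest()

def keygen():
    """Private key: 2 x 256 random 32-byte values; public key: their hashes."""
    sk = [[secrets.token_bytes(32) for _ in range(256)] for _ in range(2)]
    pk = [[H(x) for x in row] for row in sk]
    return sk, pk

def bits(msg: bytes):
    """The 256 bits of the message digest, least significant first."""
    digest = int.from_bytes(H(msg), "big")
    return [(digest >> i) & 1 for i in range(256)]

def sign(sk, msg: bytes):
    """Reveal one secret preimage per digest bit (one-time use only!)."""
    return [sk[b][i] for i, b in enumerate(bits(msg))]

def verify(pk, msg: bytes, sig):
    """Check every revealed preimage against the matching public hash."""
    return all(H(s) == pk[b][i] for (i, b), s in zip(enumerate(bits(msg)), sig))

sk, pk = keygen()
sig = sign(sk, b"transfer 10 coins")
print(verify(pk, b"transfer 10 coins", sig))  # True
print(verify(pk, b"transfer 99 coins", sig))  # False: digest bits differ
```

Security rests entirely on the hash function: forging a signature requires finding preimages of the unrevealed public-key hashes, a problem against which quantum computers offer only the modest quadratic speedup of Grover’s algorithm, not the exponential speedup Shor’s algorithm gives over factoring.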
Supersociety Autonomous Supersystems Platform Assurance
The technical challenges associated with securing autonomous supersystems are analogous to those of securing autonomous systems platforms, which were covered in Supersociety Autonomous Systems Platform Assurance. Refer to that discussion to review the security challenges and associated defenses that must be put in place. The only difference is that supersystems will make use of new supersociety technologies that are emerging, such as nanotechnology and supercompute. Given the ethical and social usability challenges (from a TRM quality standpoint), there are growing concerns that the combined use of the various supersociety technologies described earlier to power autonomous supersystems could pose threats to humanity and future civilizations. It will therefore be important for the security industry to ensure that the benefits of these technologies are maximized while minimizing the risks. This appears to be the only way to implement supersociety autonomous supersystems platform assurance.
Footnotes
- 12IBM. 2023. “Cost of a Data Breach Report 2023.” https://www.ibm.com/reports/data-breach.
- 13Morgan, Steve. 2024. “Top 10 Cybersecurity Predictions and Statistics for 2024.” Cybercrime Magazine. https://cybersecurityventures.com/top-5-cybersecurity-facts-figures-predictions-and-statistics-for-2021-to-2025/.
- 14Kizzee, Ken. 2024. “Cyber Attack Statistics to Know,” Parachute. https://parachute.cloud/cyber-attack-statistics-data-and-trends/.
- 15Cybersecurity & Infrastructure Security Agency. No Date. “Use Strong Passwords.” https://www.cisa.gov/secure-our-world/use-strong-passwords.
- 16Chaum, David. 1988. “The Dining Cryptographers Problem: Unconditional Sender and Recipient Untraceability.” Journal of Cryptology, vol. 1, 65–75. https://chaum.com/wp-content/uploads/2022/01/chaum-dc.pdf.
- 17A list of active Tor peers (i.e., Tor routers) is available at https://torstatus.rueckgr.at.
- 18To read more about how onion services work, visit https://community.torproject.org/onion-services/overview/, and to read more about how to set them up, visit https://community.torproject.org/onion-services/setup/.
- 19For more information on Tor, how it is used, and where it is being censored, check out the Tor metrics site at https://metrics.torproject.org/.
- 20For a recent assessment of the state of the practice of this technology, see https://research.aimultiple.com/secure-multi-party-computation/
- 21For a recent assessment of the state of the practice of this technology, see https://research.aimultiple.com/homomorphic-encryption/
- 22Search for “buffer overflow” in the national vulnerability database at https://nvd.nist.gov/vuln/search?adv_search=true&cves=on&cwe_id=CWE-119 and look for MITRE’s top-25 most dangerous software errors for 2011 as an example at https://cwe.mitre.org/top25/
- 23See https://julianor.tripod.com/bc/formatstring-1.2.pdf for more details.
- 24See https://web.archive.org/web/20220911001330/http://www.phrack.org/issues/57/8.html for more details on this vulnerability.
- 25See https://hovav.net/ucsd/dist/geometry.pdf for more details on ROP.
- 26Code reproduced with permission from Dave Levin
- 27Code reproduced with permission from Dave Levin
- 28Code reproduced with permission from Dave Levin
- 29Code reproduced with permission from Dave Levin
- 30Code reproduced with permission from Dave Levin
- 31Code reproduced with permission from Dave Levin
- 32Code reproduced with permission from Dave Levin
- 33Code reproduced with permission from Dave Levin
- 34Code reproduced with permission from Dave Levin
- 35See https://gitlab.com/akihe/radamsa
- 36Code reproduced with permission from Dave Levin
- 37Code reproduced with permission from Dave Levin
- 38Federal Communications Commission. No Date. “Cybersecurity for Small Businesses.” https://www.fcc.gov/communications-business-opportunities/cybersecurity-small-businesses#:.
- 39IBM. 2024. “Artificial Intelligence (AI) Cybersecurity.” https://www.ibm.com/ai-cybersecurity#:.