CWE-20: Improper Input Validation

ID CWE-20

Abstraction Class

Structure Simple

Status Stable

Number of CVEs 10834

The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

Input validation is a frequently used technique for checking potentially dangerous inputs to ensure that they are safe for processing within the code, or when communicating with other components. When software does not validate input properly, an attacker can craft the input in a form that the rest of the application does not expect. Parts of the system then receive unintended input, which may result in altered control flow, arbitrary control of a resource, or arbitrary code execution.

Input validation is not the only technique for processing input, however. Other techniques attempt to transform potentially dangerous input into something safe, such as filtering (CWE-790), which attempts to remove dangerous inputs, or encoding/escaping (CWE-116), which attempts to ensure that the input is not misinterpreted when it is included in output to another component. Other techniques exist as well (see CWE-138 for more examples).
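The difference between these three techniques can be illustrated with a minimal Python sketch. The username rule and the function names are hypothetical and chosen only for illustration; `html.escape` is the standard-library encoder for HTML contexts.

```python
import html
import re

# Hypothetical rule: 3-20 ASCII letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(value: str) -> bool:
    """Validation: accept or reject the input as a whole, unchanged."""
    return USERNAME_RE.fullmatch(value) is not None


def filter_username(value: str) -> str:
    """Filtering: silently remove every character outside the allowed set."""
    return re.sub(r"[^A-Za-z0-9_]", "", value)


def encode_for_html(value: str) -> str:
    """Encoding: keep all characters, but make them inert in an HTML context."""
    return html.escape(value, quote=True)


sample = "<b>O'Reilly</b>"
print(is_valid_username(sample))  # False
print(filter_username(sample))    # bOReillyb
print(encode_for_html(sample))    # &lt;b&gt;O&#x27;Reilly&lt;/b&gt;
```

Validation leaves the decision to the caller, filtering changes the data (here it mangles the name), and encoding preserves the data but is only correct for the one output context it targets.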

Input validation can be applied to:

raw data - strings, numbers, parameters, file contents, etc.
metadata - information about the raw data, such as headers or size

Data can be simple or structured. Structured data can consist of many nested layers, each combining metadata and raw data with other simple or structured data.

Many properties of raw data or metadata may need to be validated upon entry into the code, such as:

specified quantities such as size, length, frequency, price, rate, number of operations, time, etc.
implied or derived quantities, such as the actual size of a file instead of a specified size
indexes, offsets, or positions into more complex data structures
symbolic keys or other elements into hash tables, associative arrays, etc.
well-formedness, i.e. syntactic correctness - compliance with expected syntax
lexical token correctness - compliance with rules for what is treated as a token
specified or derived type - the actual type of the input (or what the input appears to be)
consistency - between individual data elements, between raw data and metadata, between references, etc.
conformance to domain-specific rules, e.g. business logic
equivalence - ensuring that equivalent inputs are treated the same
authenticity, ownership, or other attestations about the input, e.g. a cryptographic signature to prove the source of the data

Implied or derived properties of data must often be calculated or inferred by the code itself. Errors in deriving properties may be considered a contributing factor to improper input validation.
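A few of the properties listed above (a specified quantity, consistency between metadata and raw data, and an index) could be checked as in the following sketch. The limit `MAX_QUANTITY`, the exception type, and the function names are assumptions made for illustration, not part of the CWE entry.

```python
MAX_QUANTITY = 1000  # hypothetical business limit, not from the CWE entry


class ValidationError(ValueError):
    """Raised when an input fails a validation check."""


def parse_quantity(raw: str) -> int:
    """Check well-formedness, type, and range of a specified quantity."""
    text = raw.strip()
    # Plain ASCII digits only: rejects signs, decimals, exponents, and
    # non-ASCII numerals. The length cap keeps int() cheap on huge inputs.
    if not (text.isascii() and text.isdigit() and len(text) <= 6):
        raise ValidationError(f"not a small non-negative integer: {raw[:20]!r}")
    value = int(text)
    if not 1 <= value <= MAX_QUANTITY:
        raise ValidationError(f"quantity out of range: {value}")
    return value


def check_declared_length(declared: int, payload: bytes) -> bytes:
    """Compare a specified size (metadata) with the derived, actual size."""
    actual = len(payload)  # derived by the code itself, never taken on trust
    if declared != actual:
        raise ValidationError(f"declared {declared} bytes, received {actual}")
    return payload


def item_at(items: list, index: int):
    """Validate an index before use; negative values are rejected explicitly."""
    if not 0 <= index < len(items):
        raise ValidationError(f"index out of bounds: {index}")
    return items[index]
```

Note that `item_at` rejects negative indexes on purpose: Python would otherwise accept them and silently read from the end of the list, which is rarely what an externally supplied offset should be allowed to do.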

Note that "input validation" has very different meanings to different people, or within different classification schemes. Caution must be used when referencing this CWE entry or mapping to it. For example, some weaknesses might involve inadvertently giving control to an attacker over an input when they should not be able to provide an input at all, but sometimes this is referred to as input validation.

Finally, it is important to emphasize that the distinction between input validation and output escaping is often blurred, and developers must be careful to understand the difference. Input validation is not always sufficient to prevent vulnerabilities, especially when less stringent data types, such as free-form text, must be supported. Consider a SQL injection scenario in which a person's last name is inserted into a query. The name "O'Reilly" would likely pass the validation step, since it is a common last name in the English language. However, this valid name cannot be inserted directly into the query text because it contains the apostrophe character ('), which would need to be escaped or otherwise transformed. In this case, removing the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong name would be recorded.
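One common way to handle the "O'Reilly" case is parameter binding rather than escaping or stripping: the value is passed to the database separately from the SQL text. The sketch below uses Python's standard `sqlite3` module; the `people` table, the length limit, and the function name are hypothetical.

```python
import sqlite3


def insert_last_name(conn: sqlite3.Connection, last_name: str) -> None:
    # Validation: a loose domain check only. Apostrophes are legitimate
    # in names, so they are deliberately not rejected or removed here.
    if not last_name or len(last_name) > 100:
        raise ValueError("last name must be 1-100 characters")
    # The "?" placeholder keeps the value out of the SQL text, so the
    # apostrophe is stored as data and never parsed as SQL syntax.
    conn.execute("INSERT INTO people (last_name) VALUES (?)", (last_name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT)")
insert_last_name(conn, "O'Reilly")
stored = conn.execute("SELECT last_name FROM people").fetchall()
print(stored)  # [("O'Reilly",)]
```

The name is recorded unchanged, which is the behavior that stripping the apostrophe would have broken.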

Modes of Introduction

Phase	Note
Architecture and Design
Implementation	REALIZATION: This weakness is caused during implementation of an architectural security tactic. If a programmer believes that an attacker cannot modify certain inputs, then the programmer might not perform any input validation at all. For example, in web applications, many programmers believe that cookies and hidden form fields cannot be modified from a web browser (CWE-472), although they can be altered using a proxy or a custom program. In a client-server architecture, the programmer might assume that client-side security checks cannot be bypassed, even when a custom client could be written that skips those checks (CWE-602).
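The hidden-form-field assumption can be avoided by treating every submitted field as attacker-controlled and re-deriving sensitive values on the server. The following is a minimal sketch; the `CATALOG` mapping, the field names, and `order_total` are hypothetical.

```python
CATALOG = {"sku-1": 1999, "sku-2": 500}  # hypothetical server-side prices, in cents


def order_total(form: dict) -> int:
    """Compute an order total from a submitted form without trusting it."""
    sku = form.get("sku")
    if not isinstance(sku, str) or sku not in CATALOG:
        raise ValueError("unknown product")
    quantity_text = str(form.get("quantity", ""))
    # Re-validate on the server even if client-side script already checked it.
    if not (quantity_text.isascii() and quantity_text.isdigit()
            and len(quantity_text) <= 3):
        raise ValueError("quantity must be a whole number")
    quantity = int(quantity_text)
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    # Any client-supplied "price" field (for example a hidden input) is
    # ignored; the price always comes from the trusted catalog.
    return CATALOG[sku] * quantity


# A tampered request that tries to set its own price:
print(order_total({"sku": "sku-1", "quantity": "2", "price": "1"}))  # 3998
```

The design choice is to not validate the tampered field at all but to make it irrelevant, which removes the need to reason about what a modified client might send in it.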

Applicable Platforms

Type	Class	Name	Prevalence
Language	Not Language-Specific

Relationships

View ID	View Name	View Status	Weakness ID	Weakness Name	Abstraction	Structure	Status
CWE-1000	Research Concepts	Draft	CWE-707	Improper Neutralization	Pillar	Simple	Incomplete
CWE-1000	Research Concepts	Draft	CWE-345	Insufficient Verification of Data Authenticity	Class	Simple	Draft
CWE-1000	Research Concepts	Draft	CWE-22	Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')	Base	Simple	Stable
CWE-1000	Research Concepts	Draft	CWE-41	Improper Resolution of Path Equivalence	Base	Simple	Incomplete
CWE-1000	Research Concepts	Draft	CWE-74	Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')	Class	Simple	Incomplete
CWE-1000	Research Concepts	Draft	CWE-119	Improper Restriction of Operations within the Bounds of a Memory Buffer	Class	Simple	Stable
CWE-1000	Research Concepts	Draft	CWE-770	Allocation of Resources Without Limits or Throttling	Base	Simple	Incomplete

Common Attack Pattern Enumeration and Classification (CAPEC)

The Common Attack Pattern Enumeration and Classification (CAPEC™) effort provides a publicly available catalog of common attack patterns that helps users understand how adversaries exploit weaknesses in applications and other cyber-enabled capabilities.

ID	Name	Weaknesses
CAPEC-3	Using Leading 'Ghost' Character Sequences to Bypass Input Filters	CWE-20
CAPEC-7	Blind SQL Injection	CWE-20
CAPEC-8	Buffer Overflow in an API Call	CWE-20
CAPEC-9	Buffer Overflow in Local Command-Line Utilities	CWE-20
CAPEC-10	Buffer Overflow via Environment Variables	CWE-20
CAPEC-13	Subverting Environment Variable Values	CWE-20
CAPEC-14	Client-side Injection-induced Buffer Overflow	CWE-20
CAPEC-22	Exploiting Trust in Client	CWE-20
CAPEC-23	File Content Injection	CWE-20
CAPEC-24	Filter Failure through Buffer Overflow	CWE-20
CAPEC-28	Fuzzing	CWE-20
CAPEC-31	Accessing/Intercepting/Modifying HTTP Cookies	CWE-20
CAPEC-42	MIME Conversion	CWE-20
CAPEC-43	Exploiting Multiple Input Interpretation Layers	CWE-20
CAPEC-45	Buffer Overflow via Symbolic Links	CWE-20
CAPEC-46	Overflow Variables and Tags	CWE-20
CAPEC-47	Buffer Overflow via Parameter Expansion	CWE-20
CAPEC-52	Embedding NULL Bytes	CWE-20
CAPEC-53	Postfix, Null Terminate, and Backslash	CWE-20
CAPEC-63	Cross-Site Scripting (XSS)	CWE-20
CAPEC-64	Using Slashes and URL Encoding Combined to Bypass Validation Logic	CWE-20
CAPEC-67	String Format Overflow in syslog()	CWE-20
CAPEC-71	Using Unicode Encoding to Bypass Validation Logic	CWE-20
CAPEC-72	URL Encoding	CWE-20
CAPEC-73	User-Controlled Filename	CWE-20
CAPEC-78	Using Escaped Slashes in Alternate Encoding	CWE-20
CAPEC-79	Using Slashes in Alternate Encoding	CWE-20
CAPEC-80	Using UTF-8 Encoding to Bypass Validation Logic	CWE-20
CAPEC-81	Web Server Logs Tampering	CWE-20
CAPEC-83	XPath Injection	CWE-20
CAPEC-85	AJAX Footprinting	CWE-20
CAPEC-88	OS Command Injection	CWE-20
CAPEC-101	Server Side Include (SSI) Injection	CWE-20
CAPEC-104	Cross Zone Scripting	CWE-20
CAPEC-108	Command Line Execution through SQL Injection	CWE-20
CAPEC-109	Object Relational Mapping Injection	CWE-20
CAPEC-110	SQL Injection through SOAP Parameter Tampering	CWE-20
CAPEC-120	Double Encoding	CWE-20
CAPEC-135	Format String Injection	CWE-20
CAPEC-136	LDAP Injection	CWE-20
CAPEC-153	Input Data Manipulation	CWE-20
CAPEC-182	Flash Injection	CWE-20
CAPEC-209	XSS Using MIME Type Mismatch	CWE-20
CAPEC-230	Serialized Data with Nested Payloads	CWE-20
CAPEC-231	Oversized Serialized Data Payloads	CWE-20
CAPEC-250	XML Injection	CWE-20
CAPEC-261	Fuzzing for garnering other adjacent user/sensitive data	CWE-20
CAPEC-267	Leverage Alternate Encoding	CWE-20
CAPEC-473	Signature Spoof	CWE-20
CAPEC-588	DOM-Based XSS	CWE-20
CAPEC-664	Server Side Request Forgery	CWE-20

CVEs Published

[Charts not reproduced: CVEs by published date, CVSS severity, CVSS severity by year, CVSS base score]

CVE	Description	CVSS	EPSS	EPSS Trend (30 days)	Affected Products	Weaknesses	Security Advisories	Exploits	PoC	Publication Date	Modification Date