CAPEC-71 - Using Unicode Encoding to Bypass Validation Logic

An attacker may provide a Unicode string to a system component that is not Unicode aware, using it to circumvent a filter or to cause the classifying mechanism to misinterpret the request. This may allow the attacker to slip malicious data past the content filter and/or cause the application to route the request incorrectly.
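
The mismatch described above can be illustrated with a minimal sketch. It assumes a filter that only looks for the ASCII form of a tag and a later component that applies Unicode compatibility normalization (NFKC). The names `naive_filter` and `downstream_component` are hypothetical and stand in for whatever components a real system uses.

```python
import unicodedata


def naive_filter(value: str) -> bool:
    """Return True if the value looks safe to a filter that only knows ASCII."""
    return "<script" not in value.lower()


def downstream_component(value: str) -> str:
    """Hypothetical later stage that folds compatibility characters (NFKC)."""
    return unicodedata.normalize("NFKC", value)


# Fullwidth forms U+FF1C and U+FF1E stand in for '<' and '>'.
payload = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"

passed_filter = naive_filter(payload)          # the filter sees no ASCII '<script'
final_value = downstream_component(payload)    # later stage restores the ASCII tag

print(passed_filter, final_value)
```

The filter and the later component disagree about what the string means, which is the condition this attack pattern relies on.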

Severity

Likelihood

Confidentiality

Integrity

Availability

  • Attack Methods (3)
      • Modification of Resources
      • API Abuse
      • Injection
  • Purposes (1)
      • Penetration
  • Scopes (4)
      • Bypass protection mechanism
          • Authorization
          • Access_Control
          • Confidentiality
      • Execute unauthorized code or commands
          • Availability
          • Integrity
          • Confidentiality
      • Modify application data
          • Integrity
      • DoS: crash / exit / restart
          • Availability

Medium level: An attacker needs to understand Unicode encodings and have an idea (or be able to find out) what system components may not be Unicode aware.

Filtering is performed on data that has not been properly canonicalized.

Step 1 - Survey the application for user-controllable inputs

Using a browser or an automated tool, an attacker follows all public links and actions on a web site, recording all the links, forms, resources accessed, and other potential entry points for the web application.

Technique ID: 1 - Environment(s) env-Web

Use a spidering tool to follow and record all links and analyze the web pages to find entry points. Make special note of any links that include parameters in the URL.
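
As a rough sketch of what such a tool records for a single page, the following uses only the Python standard library. The class `EntryPointParser`, the helper `links_with_parameters`, the sample markup, and the host `example.test` are all illustrative assumptions, not part of any particular spidering product.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlsplit


class EntryPointParser(HTMLParser):
    """Collect links and form fields from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.forms = []
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "form":
            self._current_form = {
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "get").upper(),
                "fields": [],
            }
            self.forms.append(self._current_form)
        elif tag in ("input", "textarea", "select"):
            # Only record fields that sit inside an open <form>.
            if self._current_form is not None and attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current_form = None


def links_with_parameters(links):
    """Map each link that carries a query string to its parameter names."""
    found = {}
    for url in links:
        query = urlsplit(url).query
        if query:
            found[url] = sorted(parse_qs(query, keep_blank_values=True))
    return found


SAMPLE_PAGE = """
<a href="/about">About</a>
<a href="/search?q=test&amp;page=2">Search</a>
<form action="/login" method="post">
  <input name="user"><input name="password" type="password">
</form>
"""

parser = EntryPointParser("https://example.test/")
parser.feed(SAMPLE_PAGE)
print(links_with_parameters(parser.links))
print(parser.forms)
```

A real spider would also fetch each discovered link and repeat the process; that loop is omitted here to keep the sketch self-contained.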

Technique ID: 2 - Environment(s) env-Web

Use a proxy tool to record all user input entry points visited during a manual traversal of the web application.

Technique ID: 3 - Environment(s) env-Web

Use a browser to manually explore the website and analyze how it is constructed. Many browser plugins are available to facilitate the analysis or automate the discovery.

Indicator ID: 1 - Environment(s) env-Web

Type: Positive

Inputs are used by the application or the browser (DOM)

Indicator ID: 2 - Environment(s) env-Web

Type: Inconclusive

Using URL rewriting, parameters may be part of the URL path.

Indicator ID: 3 - Environment(s) env-Web

Type: Inconclusive

No parameters appear to be used on the current page. Even though none appear, the web application may still use them if they are provided.

Indicator ID: 4 - Environment(s) env-Web

Type: Negative

Applications that have only static pages or that simply present information without accepting input are unlikely to be susceptible.


Security Control ID: 1

Type: Detective

Monitor velocity of page fetching in web logs. Humans who view a page and select a link from it will click far slower and far less regularly than tools. Tools make requests very quickly and the requests are typically spaced apart regularly (e.g. 0.8 seconds between them).
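
One way to approximate this check is to look at the gaps between a client's requests and flag clients whose gaps are both short and unusually uniform. The sketch below assumes timestamps (in seconds) have already been extracted from the web log for a single client. The function name `looks_automated` and the thresholds `max_mean_gap` and `max_cv` are illustrative assumptions and would need tuning against real traffic.

```python
from statistics import mean, pstdev


def looks_automated(timestamps, max_mean_gap=2.0, max_cv=0.1):
    """Flag a client whose requests are fast and evenly spaced.

    timestamps: request times in seconds for one client.
    max_mean_gap: assumed upper bound on the average gap for a tool.
    max_cv: assumed upper bound on the coefficient of variation of the gaps.
    """
    if len(timestamps) < 5:
        return False  # too little data to judge
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average <= 0:
        return True  # every request arrived at the same instant
    variation = pstdev(gaps) / average
    return average <= max_mean_gap and variation <= max_cv


tool_like = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
human_like = [0.0, 12.0, 15.0, 47.0, 110.0, 118.0]
print(looks_automated(tool_like), looks_automated(human_like))
```

A tool that randomizes its delays would evade this heuristic, so it is best treated as one signal among several.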

Security Control ID: 2

Type: Detective

Create links on some pages that are hidden from human visitors. Using iframes, images, or other HTML techniques, the links can be made invisible to people browsing the site while remaining visible to spiders and programs. A request for such a linked page then becomes a good predictor of an automated tool probing the application.

Security Control ID: 3

Type: Preventative

Use CAPTCHA to prevent the use of the application by an automated tool.

Security Control ID: 4

Type: Preventative

Actively monitor the application and either deny or redirect requests from origins that appear to be automated.


Outcome ID: 1

Type: Success

A list of URLs, with their corresponding parameters (POST, GET, COOKIE, etc.) is created by the attacker.

Outcome ID: 2

Type: Success

A list of application user interface entry fields is created by the attacker.

Outcome ID: 3

Type: Success

A list of resources accessed by the application is created by the attacker.



Step 1 - Probe entry points to locate vulnerabilities

The attacker uses the entry points gathered in the "Explore" phase as a target list and injects various Unicode-encoded payloads to determine whether an entry point actually represents a vulnerability with insufficient validation logic, and to characterize the extent to which the vulnerability can be exploited.

Technique ID: 1 - Environment(s) env-Web

Try to use Unicode encoding of content in Scripts in order to bypass validation routines.

Technique ID: 2 - Environment(s) env-Web

Try to use Unicode encoding of content in HTML in order to bypass validation routines.

Technique ID: 3 - Environment(s) env-Web

Try to use Unicode encoding of content in CSS in order to bypass validation routines.
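
For testers checking their own validation routines, the three techniques above correspond to three escape syntaxes: `\uXXXX` escapes in scripts, numeric character references in HTML, and backslash-hex escapes in CSS. The helper names below are hypothetical, and the script helper assumes characters in the Basic Multilingual Plane only.

```python
import html


def js_unicode_escape(text: str) -> str:
    """Encode each character as a JavaScript \\uXXXX escape (BMP only)."""
    return "".join(f"\\u{ord(ch):04x}" for ch in text)


def html_numeric_reference(text: str) -> str:
    """Encode each character as an HTML hexadecimal character reference."""
    return "".join(f"&#x{ord(ch):x};" for ch in text)


def css_escape(text: str) -> str:
    """Encode each character as a CSS hex escape, terminated by a space."""
    return "".join(f"\\{ord(ch):x} " for ch in text)


print(js_unicode_escape("a"))        # script form of 'a'
print(html_numeric_reference("<"))   # HTML form of '<'
print(css_escape("e"))               # CSS form of 'e'

# An HTML-aware consumer turns the references back into the original text.
decoded_back = html.unescape(html_numeric_reference("<script>"))
```

A validation routine that compares input against literal strings will not recognize any of these forms unless it decodes them first.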

Indicator ID: 1 - Environment(s) env-Web

Type: Positive

The application accepts user-controllable input.


Security Control ID: 1

Type: Preventative

Implement input validation routines that filter or transcode for Unicode content.

Security Control ID: 2

Type: Preventative

Specify the charset of the HTTP transaction/content.
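
As an illustration only, assuming a Python WSGI application, declaring the charset means stating it in the `Content-Type` response header and encoding the body to match. The application name `app` and the sample body are hypothetical.

```python
def app(environ, start_response):
    """Minimal WSGI application that declares its charset explicitly."""
    body = "<p>caf\u00e9</p>".encode("utf-8")
    start_response(
        "200 OK",
        [
            # The declared charset must match the encoding used for the body.
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

With the charset stated, the client does not have to guess the encoding from the content, which removes one source of disagreement between the filter and the consumer of the data.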

Security Control ID: 3

Type: Detective

Monitor inputs to web servers. Alert on unusual charset and/or characters.
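
A detective check along these lines could look for a few encodings that rarely appear in legitimate requests. The sketch below is an assumption about what "unusual" might mean in practice: non-standard `%uXXXX` escapes, percent-encoded overlong UTF-8 lead bytes, and compatibility characters that fold to ASCII under NFKC. The function name `suspicious_reasons` and the reason strings are hypothetical.

```python
import re
import unicodedata

# %uXXXX is a non-standard escape that some servers have historically decoded.
PERCENT_U = re.compile(r"%u[0-9a-fA-F]{4}")
# Lead bytes 0xC0 and 0xC1 only occur in overlong two-byte UTF-8 sequences.
OVERLONG = re.compile(r"%c[01]%[89ab][0-9a-f]", re.IGNORECASE)


def suspicious_reasons(raw):
    """Return a list of reasons why a raw request string deserves an alert."""
    reasons = []
    if PERCENT_U.search(raw):
        reasons.append("non-standard %uXXXX escape")
    if OVERLONG.search(raw):
        reasons.append("overlong UTF-8 sequence")
    if any(
        ord(ch) > 127 and unicodedata.normalize("NFKC", ch).isascii()
        for ch in raw
    ):
        reasons.append("compatibility character that folds to ASCII")
    return reasons


print(suspicious_reasons("/search?q=hello"))
print(suspicious_reasons("/a?q=%u003cscript"))
print(suspicious_reasons("/..%c0%af../etc"))
```

Ordinary accented text such as "café" is not flagged, because its characters do not fold to ASCII under NFKC.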

Security Control ID: 4

Type: Preventative

Actively monitor the application and either deny or redirect requests from origins that appear to be attack attempts.


Outcome ID: 1

Type: Success

The attacker's Unicode-encoded payload is processed and acted on by the application without filtering or transcoding.

Outcome ID: 2

Type: Failure

The application decodes the charset and filters the inputs.



Canonicalize data prior to performing any validation or filtering on it. Be aware of alternate encodings.
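
A minimal sketch of this ordering, assuming the alternate encodings of concern are repeated percent-encoding and Unicode compatibility forms: decode until the value stops changing, normalize with NFKC, and only then apply the filter. The names `canonicalize` and `is_acceptable`, the round limit, and the simple substring check are illustrative assumptions.

```python
import unicodedata
from urllib.parse import unquote


def canonicalize(raw: str, max_rounds: int = 5) -> str:
    """Percent-decode until stable, then apply NFKC normalization."""
    value = raw
    for _ in range(max_rounds):
        decoded = unquote(value, errors="strict")
        if decoded == value:
            break  # nothing left to decode
        value = decoded
    else:
        # Still changing after max_rounds: treat as hostile rather than guess.
        raise ValueError("input is encoded too many times")
    return unicodedata.normalize("NFKC", value)


def is_acceptable(raw: str) -> bool:
    """Apply a stand-in filter to the canonical form, never to the raw input."""
    return "<script" not in canonicalize(raw).lower()


print(canonicalize("%253Cscript%253E"))
print(is_acceptable("%EF%BC%9Cscript%EF%BC%9E"))
```

The same canonical value should then be the one passed on to the rest of the application, so that the filter and later components act on identical data.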

Ensure that the system is Unicode aware and can properly process Unicode data. Do not make an assumption that data will be in ASCII.
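
One concrete form of Unicode awareness is to decode incoming bytes with a strict UTF-8 decoder and reject anything malformed, rather than treating the bytes as ASCII or silently replacing bad sequences. The sketch below assumes the input is expected to be UTF-8. The function name `decode_request_bytes` is hypothetical.

```python
def decode_request_bytes(data: bytes) -> str:
    """Decode bytes as UTF-8, rejecting malformed or overlong sequences."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"rejected malformed UTF-8 at byte {exc.start}"
        ) from exc


print(decode_request_bytes("caf\u00e9".encode("utf-8")))

try:
    # 0xC0 0xAF is an overlong encoding of '/', which strict decoders refuse.
    decode_request_bytes(b"..\xc0\xaf..")
except ValueError as error:
    print(error)
```

Rejecting the input outright avoids the situation where one component sees an invalid byte sequence and another interprets it as a path separator.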

Ensure that filtering or input validation is applied to canonical data.

Assume all input is malicious. Create a white list that defines all valid input to the software system based on the requirements specifications. Input that does not match against the white list should not be permitted to enter into the system.
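
As a small illustration of the white list approach, assuming a hypothetical username field whose requirements allow only ASCII letters, digits, and underscores, the check can be expressed as a full-match against an explicit character class. The pattern, the length limits, and the name `is_valid_username` are assumptions made for the example.

```python
import re

# Explicit ASCII class: in Python 3, \w would also match non-ASCII letters.
USERNAME = re.compile(r"[A-Za-z0-9_]{3,32}")


def is_valid_username(value: str) -> bool:
    """Accept only values that match the white list in full."""
    return USERNAME.fullmatch(value) is not None


print(is_valid_username("alice_01"))
print(is_valid_username("alice\uff1c"))
```

Because the pattern names what is allowed rather than what is forbidden, look-alike and fullwidth characters are rejected without needing to be listed.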