The magic behind Google Search


Have you ever wondered how Google performs a fast search across a wide variety of file types? For example, how is Google able to suggest a list of search expressions while you are still typing your keywords?

Another example is Google image search: you upload an image and Google finds similar photos for you in no time.

The key to this magic is SimHash. SimHash is an algorithm invented by Moses Charikar (see the full patent). The name comes from combining Similarity and Hash: instead of comparing the objects with each other to measure similarity, we convert each object to an N-bit number that represents it (its hash) and compare those numbers. In other words, if we maintain a number that represents each object, we can compare those numbers to estimate how similar the two objects are.

The basics of SimHash are as below:

  1. Convert the object to a hash value (in my experience, an unsigned integer works best).
  2. Compare the bits of the two hash values. For example, is bit 1 the same in both hashes? Is bit 2 the same?
  3. Count the positions at which the bits differ. Depending on the size of the hash value (number of bits), you will get a number between 0 and N, where N is the length of the hash value. This is called the Hamming distance, introduced by Richard Hamming in 1950 (see here).
  4. The number you obtain must be normalized and finally represented as a value such as a percentage. To do so, we can use a simple formula such as Similarity = (HashSize - HammingDistance) / HashSize.

Since a hash value can represent any kind of data, such as text or an image file, this technique can be used to perform a fast search on almost any file type.

To calculate the hash value, we have to decide on the hash size, which is normally 32 or 64 bits. As I said before, an unsigned value works better. We also need to choose a chunk size. The chunk size is used to break the data down into small pieces, called shingles. For example, if we decide to convert a string such as “Hello World” to a hash value and the chunk size is 3, our chunks would be:

1-Hel

2-ell

3-llo

Etc.

To convert binary data to a hash value, you will have to break it down into chunks of bits, e.g. take every K bits. Google's work on near-duplicate detection recommends N = 64; the commonly cited k = 3 there is usually described as the Hamming-distance threshold for treating two fingerprints as near-duplicates.

To calculate the hash value in a SimHash manner, we have to take the following steps:

  1. Tokenize the data. To do this, break it down into small chunks as mentioned above and store the chunks in an array.
  2. Create an array (called a vector) of size N, where N is the size of the hash (let’s call this array V).
  3. Loop over the array of tokens (assume that i is the index of each token).
  4. Loop over the bits of each token’s hash (assume that j is the index of each bit).
  5. If bit j of Token[i] is 1, then add 1 to V[j]; otherwise subtract 1 from V[j].
  6. Assume that the fingerprint is an unsigned value (32 or 64 bits) named F.
  7. Once the loops finish, go through the array V; if V[i] is greater than 0, set bit i of F to 1, otherwise to 0.
  8. Return F as the fingerprint.

Here is the code:

private int DoCalculateSimHash(string input)
{
    ITokeniser tokeniser = new Tokeniser();
    var hashedtokens = DoHashTokens(tokeniser.Tokenise(input));
    var vector = new int[HashSize];
    for (var i = 0; i < HashSize; i++)
    {
        vector[i] = 0;
    }

    // For every token hash, add 1 to the vector slot if the bit is set, otherwise subtract 1.
    foreach (var value in hashedtokens)
    {
        for (var j = 0; j < HashSize; j++)
        {
            if (IsBitSet(value, j))
            {
                vector[j] += 1;
            }
            else
            {
                vector[j] -= 1;
            }
        }
    }

    // Positive vector slots become 1-bits of the fingerprint.
    var fingerprint = 0;
    for (var i = 0; i < HashSize; i++)
    {
        if (vector[i] > 0)
        {
            fingerprint += 1 << i;
        }
    }

    return fingerprint;
}

And the code to calculate the Hamming distance is as below:

private static int GetHammingDistance(int firstValue, int secondValue)
{
    var hammingBits = firstValue ^ secondValue;
    var hammingValue = 0;
    for (int i = 0; i < 32; i++)
    {
        if (IsBitSet(hammingBits, i))
        {
            hammingValue += 1;
        }
    }
    return hammingValue;
}
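The IsBitSet helper used in both listings is not shown above. A minimal version might look like the sketch below, together with a GetSimilarity method (a name I made up for illustration) that applies the normalisation formula from step 4 using the same HashSize field as the code above:

private static bool IsBitSet(int value, int position)
{
    // True if the bit at the given (0-based) position is 1.
    return (value & (1 << position)) != 0;
}

private double GetSimilarity(string first, string second)
{
    var distance = GetHammingDistance(DoCalculateSimHash(first), DoCalculateSimHash(second));

    // Similarity = (HashSize - HammingDistance) / HashSize, as a fraction between 0 and 1.
    return (HashSize - distance) / (double)HashSize;
}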

You may use different ways to tokenize the given data. For example, you may break a string down into words, n-letter chunks, or n-letter overlapping pieces. From my experience, if N is the size of each chunk and M is the number of overlapping characters, N = 4 and M = 3 are the best choices.
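For illustration, an overlapping tokeniser along those lines could look like the sketch below. The Tokenise signature is an assumption about the ITokeniser interface used earlier, so adjust it to the actual interface in the source:

public class OverlappingStringTokeniser : ITokeniser
{
    private const int ChunkSize = 4;  // N: size of each chunk
    private const int Overlap = 3;    // M: number of overlapping characters

    public IEnumerable<string> Tokenise(string input)
    {
        // Slide the window forward by N - M characters each time.
        var step = ChunkSize - Overlap;
        for (var i = 0; i + ChunkSize <= input.Length; i += step)
        {
            yield return input.Substring(i, ChunkSize);
        }
    }
}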

You may download the full source code of SimHash from SimHash.CodePlex.com. Bear in mind that SimHash is patented by Google!

Single Sign On with WCF and Asp.NET Custom Membership Provider


I was recently involved in designing enterprise software which contained several ASP.NET web sites and desktop applications. Since the same users would use this software system, a single sign-on feature was necessary. Thus I took advantage of Windows Identity Foundation, which is based on claims-based authentication. Also, as the applications would be used by internal users, I used Active Directory Federation Services to authenticate users against an existing Active Directory.

Later I thought about what the solution would be if claims-based authentication did not fit. So I decided to design and implement a simple single sign-on (SSO) authentication service with WCF. This SSO has the following specifications:

  • It only performs authentication. It does not contain any role management capability, because roles and access rights are defined in the scope of each application (or sub-system), so that is left to the applications.
  • It is based on web services, so it can be consumed by any technology that understands web services (e.g. Java apps).
  • It can support many kinds of user storage, such as Active Directory, ASP.NET Application Services (ASPNETDB), custom authentication, etc.
  • It can be used by web, desktop and mobile applications.

Figure 1, Components of SSO

As seen in figure 1, user information can be retrieved from Active Directory, ASP.NET Application Services, a custom database or 3rd-party web services. Components that encapsulate the details of each user store or service are called Federations. This term is used by Microsoft in Windows Identity Foundation, so we keep using it here!

The single sign-on service relies on federation services. Each federation service is simply a .NET class library that contains a class implementing an interface shared between itself and the SSO service. A federation module is plugged into the SSO service via a configuration file. The Visual Studio solution that is downloadable at the bottom of this post includes a custom federation module that uses SQL Server to store the user information.

Client applications can consume the web service directly to perform sign-in, sign-out, authentication and other related operations. However, this solution also includes a custom ASP.NET membership provider which allows ASP.NET applications to consume the SSO with no hassle. It also enables existing ASP.NET applications to use this SSO service with only a small configuration change.

Figure 2, Package diagram of SSO

Classes and Interfaces

The key classes and interfaces are as below:

  • ISSOFederation interface: is implemented by the federation classes.
  • AuthenticatedUser: is used by the federation classes and the SSO service to represent a user. This type is also emitted by the service to the clients.

Figure 3, Key types of Common package

  • CustomFederation: implements ISSOFederation and encapsulates the details of authentication and user storage.
  • SSOFederationHelper (SignInServices package): provides a service to load the nominated federation module. Only one instance of the federation object exists (a singleton), so to plug in another federation module the service must be restarted (e.g. restart the IIS web site).

Figure 4, SSOFederationHelper class
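To give a feel for the shape of these types, here is an illustrative sketch only; the member names below are assumptions and the actual contract is in the downloadable solution. FindUsersByEmail is the member used in the example further down:

// Illustrative sketch; the real members live in the SignInServices.Common project.
public class AuthenticatedUser
{
    public string UserName { get; set; }
    public string Email { get; set; }
    public bool IsAuthenticated { get; set; }
}

public interface ISSOFederation
{
    AuthenticatedUser SignIn(string userName, string password);
    void SignOut(string userName);
    IEnumerable<AuthenticatedUser> FindUsersByEmail(string emailAddress);
}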

A federation object is plugged into the service using reflection. To do so, first the fully qualified name of the federation type is placed in the Web.config file:

<appSettings>
  <add key="federationType"
       value="SignInServices.CustomFederation.SSOCustomFederation, SignInServices.CustomFederation" />
</appSettings>

 

 

This type is instantiated by reflection and provided to consumers through the SSOFederationHelper.FederationObject property. For example:

 

public List<AuthenticatedUser> FindUsersByEmail(string EmailAddress)
{
    return SSOFederationHelper.FederationObject.FindUsersByEmail(EmailAddress).ToList();
}
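For completeness, a minimal sketch of how SSOFederationHelper.FederationObject could build that singleton from the federationType setting; the real class may differ in details such as thread safety, and ConfigurationManager comes from System.Configuration:

public static class SSOFederationHelper
{
    private static ISSOFederation _federationObject;
    private static readonly object _lock = new object();

    public static ISSOFederation FederationObject
    {
        get
        {
            if (_federationObject == null)
            {
                lock (_lock)
                {
                    if (_federationObject == null)
                    {
                        // Read the fully qualified type name from Web.config and instantiate it.
                        var typeName = ConfigurationManager.AppSettings["federationType"];
                        _federationObject = (ISSOFederation)Activator.CreateInstance(Type.GetType(typeName));
                    }
                }
            }
            return _federationObject;
        }
    }
}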

 

  • SSOMembershipProvider (SSOClientServices package) is a custom ASP.NET membership provider. This class enables ASP.NET applications to take advantage of the SSO service without being dependent on it. The key point here is that all the ASP.NET applications that use the SSO service must give the same name to the ASP.NET forms authentication cookie. Example:

<authentication mode="Forms">
  <forms loginUrl="~/Account/Login.aspx" timeout="2880" name=".SSOV1Auth" />
</authentication>

 

The custom membership provider has its own app.config file. This configuration file includes the WCF client configuration (e.g. the binding configuration). However, the URI of the service is configured in the Web.config file of the ASP.NET client. For example, an ASP.NET client configures the membership provider as below:

<membership defaultProvider="SSOMembershipProvider">
  <providers>
    <clear />
    <add name="SSOMembershipProvider"
         type="SSOClientServices.SSOMembershipProvider, SSOClientServices"
         endPointUri="http://localhost/SSO/service/sso.svc"
         enablePasswordRetrieval="false"
         enablePasswordReset="true"
         requiresQuestionAndAnswer="false"
         requiresUniqueEmail="false" />
  </providers>
</membership>

 

The value of the endPointUri entry is used to configure the service proxy.
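As an illustration of what the provider can do with that value (this is a sketch, not the code shipped with the solution; ISSOService and _service are made-up names, and System.ServiceModel plus System.Collections.Specialized are assumed), the provider's Initialize override could read endPointUri and build the WCF channel:

public override void Initialize(string name, NameValueCollection config)
{
    base.Initialize(name, config);

    // endPointUri comes from the provider entry in the client's Web.config.
    var endPointUri = config["endPointUri"];

    // Binding details normally come from the provider's own app.config;
    // a default binding is used here purely for illustration.
    var factory = new ChannelFactory<ISSOService>(new BasicHttpBinding(),
                                                  new EndpointAddress(endPointUri));
    _service = factory.CreateChannel();
}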

 

The Visual Studio solution

The source code provided here has been developed using Visual Studio 2010 and requires the following components to be installed:

  1. Visual Studio 2010 Express Edition
  2. Entity Framework 4.0
  3. WCF
  4. C# 4.0
  5. IIS 7
  6. SQL Server 2008 or 2008 Express

The following Visual Studio (C#) projects are included:

  1. SignInServices.Common: Includes common types
  2. SignInServices.CustomFederation: Is a sample Federation provider
  3. SignInServices: The Single Sign On WCF Service
  4. SSOClientServices: Includes a custom ASP.NET membership provider
  5. TestWebSite: An ASP.NET web site to test the SSO

How to download the source code?

The source code has been uploaded to CodePlex. Please go to http://wcfsso.codeplex.com and download the latest change set.

 

How to deploy?

In order to deploy the solution take the following actions:

  1. Restore the SQL Server 2008 Database in a SQL Server 2008 Server under the name of Framework
  2. Create a Windows login in SQL Server 2008 and grant access to the restored database (e.g. Domain\SSOUser).
  3. Launch IIS
  4. Create an application pool that targets .NET 4 and uses “Integrated” mode. Name this application pool SSO.
  5. Set the identity account of the newly created application pool. This must be the Windows account that you added to SQL Server (e.g. Domain\SSOUser). This is required because the supplied custom federation project uses Windows authentication to access the database.
  6. Open the solution file in Visual Studio 2010
  7. Publish SignInServices application to a folder
  8. Go to the folder
  9. Open web.config file and configure the following entries:
    1. Under <appSettings> set SessionTimeOutMinutes. This value indicates how long a sign-in ticket is valid.
    2. Under <appSettings> set federationType to any other federation type that you may want to use. If you want to use the custom federation type shipped with this sample, leave the current value as is.
    3. Under <connectionStrings> update the connection strings if you have restored the database under any name other than “Framework”, or if you want to use SQL Server authentication rather than Windows authentication (see the sample connection string after this list).
  10. Build the SignInServices.CustomFederation project and copy the .DLL file to the \BIN folder of SignInServices.
  11. Build the SSOClientServices project and copy the .DLL file to the \BIN folder of SignInServices.
  12. Go back to IIS
  13. Under IIS create a new Web Application that uses SSO as its Application Pool and points to the folder to which you published the SignInService application.
  14. Publish TestWebSite to IIS or simply run it in VS 2010.
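For reference, a typical <connectionStrings> entry for the restored Framework database using Windows authentication might look like the following; the entry name and server instance are placeholders, so match them to what the custom federation project expects:

<connectionStrings>
  <add name="FrameworkDb"
       connectionString="Data Source=.\SQLEXPRESS;Initial Catalog=Framework;Integrated Security=True"
       providerName="System.Data.SqlClient" />
</connectionStrings>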

 

Configuring the custom federation

The custom federation uses SMTP to send a new password to users once a password reset is requested. The SMTP server configuration is in the Web.config file of the WCF application (SignInServices). You must configure SMTP in order to reset passwords.

<system.net>
  <mailSettings>
    <smtp deliveryMethod="Network" from="name@domain.com">
      <network host="smtp.mail.com"
               userName="name@domain.com"
               password="password of sender"
               port="25" />
    </smtp>
  </mailSettings>
</system.net>

 

Important notice: The solution uses a custom database to store the user information. To add new users, simply install the SignInServices application under IIS and then navigate to http://service-url/Admin/ManageUsers.aspx. For example, if the SSO WCF service is deployed to http://127.0.0.1/SSO/SSO.SVC, you may access the user management page via http://127.0.0.1/SSO/SSO.SVC/Admin/ManageUsers.aspx.

(The admin page still needs to be completed, as I focused on developing the SSO service rather than implementing the admin web site.)

Important notice: None of the applications or the WCF service is protected for simplicity purposes. If you deploy this solution to a production environment, you must protect them using ASP.NET authentication or WCF security practices.

Important notice: Once you launch the TestWebSite, you’ll see a login screen. Enter the following default credentials to sign in:

User: aref.karimi

Password: 123

 

 

What is a Software Architecture Document and how would you build it?


Howdy!

After working as a senior designer and/or a software architect on three sub-continents, I came across a kind of phenomenon in Australia! I call it a phenomenon because, first of all, terms such as ‘solution architect’, ‘software architect’ and ‘enterprise architect’ are used interchangeably and sometimes incorrectly. Second, architecture is often ignored, and contractors or consultants usually start doing the detailed design as soon as they receive a requirements document.


This leaves the client (the owner of the project) with a whole bunch of documents which are not understandable to them, so they have to hand them over to a development team without even knowing whether the design is what they really wanted.

This happens because such a crucial role is assigned to a senior developer or a designer who thinks purely technically, whilst an architect must be able to look at the problem from different aspects. (It also happens because in Australia titles are given away for free; just ask for one!)

What is Architecture?

Architecture is the fundamental organization of a system embodied in its components, their relationships to each other, and to the environment, and the principles guiding its design and evolution (IEEE 1471).

The definition suggested by IEEE (above) refers to a solution architect and/or software architect. However, as Microsoft suggests there are other kinds of architects such as a Business Strategy Architect.

There are basically six types of architects:

  • Business Strategy Architect

The purpose of this role is to change the business focus and define the enterprise’s to-be status. This role is about the long view and about forecasting.

  • Business Architect

The mission of business architects is to improve the functionality of the business. Their job isn’t to architect software but to architect the business itself and the way it is run.

  • Solution Architect

Solution architect is a relatively new term, and it should refer to an equally new concept. Sometimes, however, it doesn’t; it tends to be used as a synonym for application architect.

  • Software Architect

Software architecture is about architecting software meant to support, automate, or even totally change the business and the business architecture.

  • Infrastructure Architect

The technical infrastructure exists for deployment of the solutions of the solution architect, which means that the solution architect and the infrastructure architect should work together to ensure safe and productive deployment and operation of the system.

  • Enterprise Architect

Enterprise architecture is the practice of applying a comprehensive and rigorous method for describing a current and/or future structure and behaviour for an organization’s processes, information systems, personnel and organizational subunits, so that they align with the organization’s core goals and strategic direction. Although often associated strictly with information technology, it relates more broadly to the practice of business optimization in that it addresses business architecture, performance management, and process architecture as well (Wikipedia).

Solution Architect

As we are techies, let’s focus on the Solution Architect role:

It tends to be used as a synonym for application architect. In an application-centric world, each application is created to solve a specific business problem, or a specific set of business problems. All parts of the application are tightly knit together, each integral part being owned by the application. An application architect designs the structure of the entire application, a job that’s rather technical in nature. Typically, the application architect doesn’t create, or even help create, the application requirements; other people, often called business analysts, less technically and more business-oriented than the typical application architect, do that.

So if you are asked to come on board and architect a system based on a whole bunch of requirements, you are very likely being asked to do solution architecture.

How to do that?

A while back, a person who did not have a technical background (but had money, so he was the boss) was lecturing that in an ideal world no team member has to talk to other team members. At the time I was thinking that in my ideal world, which is very close to the Agile world, everybody can (or should) speak to everybody else. This points out that how you architect a system is strongly tied to your methodology. It does not really make a big difference which methodology you follow as long as you stick to the correct concepts. Likewise, he was saying that the Software Architecture Document is part of the BRD (Business Requirements Document), because if it were technical a business person (e.g. the stakeholders) would not understand it. And I was thinking to myself: mate! There are different views analyzed in a SAD; some of them are technical, some of them are not.

What the above story points out to me is that solution architecture is the art of mapping the business stuff to technical stuff; in other words, it is actually speaking about technical things in a language which is understandable to business people.

A very good way to do this is to put yourself in the stakeholders’ shoes. There are several types of stakeholders in each project, each with their own views and their own concerns. This is the biggest difference between design and architecture: a designer thinks very technically, while an architect can think broadly and can look at a problem from different views. Designers usually make a huge mistake, which happens a lot in Australia: they put everything in one document. Where I am doing a solution architecture job now, I was given a 21-megabyte MS Word document which included everything, from requirements to detailed class and database design. Such a document is very unlikely to be understandable by the stakeholders and is very hard for developers to use. I reckon this happens firstly because designers don’t consider the separation of stakeholders’ and developers’ concerns, and secondly because it’s easier to write everything down in one document. But this is wrong, as the SAD and the design document (e.g. TSD) are built for different purposes and for different audiences (and in different phases if you are following a phase-based methodology such as RUP). If you put everything in one document, it’s like cooking dinner by putting the ingredients along with the utensils in a pot and boiling them!

A very good approach for looking at the problem from the stakeholders’ point of view is the 4+1 model. In this model, scenarios (or use cases) are the base, and we look at them from a Logical view (what are the building blocks of the system), a Process view (processes such as asynchronous operations), a Development (aka Implementation) view and a Physical (aka Deployment) view. There are also optional views, such as a Data view, that you can use if you need to. Some of the views are technical and some of them are not; however, they must match and there must be consistency in the architecture, so that technical views cover business views (e.g. demonstrating a business process with a UML activity diagram and/or state diagram).

I believe that each software project is like a spectrum of which each stakeholder sees a limited part. The role of an architect is to see the entire spectrum. A good approach to do so (that I use a lot) is to include a business vision (this might not be the best term) in your SAD. It can be a bulleted list, a diagram or both, which shows what the application looks like from a business perspective. Label each part of the business vision with a letter or a number. Then add an architectural overview and map it to the items of the business vision, indicating which part of the architecture is meant to address which part of the business vision.

In a nutshell, architecture is the set of early design decisions; it is not the design itself.

What to put in an SAD?

There are a whole bunch of SAD templates on the internet, such as the template offered by RUP. However, the following items seem to be necessary for every architecture document:

  • Introduction. This can include Purpose, Glossary, Background of the project, Assumptions, References, etc. I personally suggest that you explain what kind of methodology you are following; this will avoid lots of debates, I promise!

It is very important to make the scope of the document clear. Without a clear scope, not only will you never know when you are finished, you also won’t be able to convince the stakeholders that the architecture is comprehensive enough and addresses all their needs.

  • Architectural goals and constraints: This can include the goals, as well as your business and architectural visions. Also explain the constraints (e.g. if the business has decided to develop the software system with Microsoft .NET, that is a constraint). I would suggest that you mention the components (or modules) of the system when you mention your architectural vision. For example, say that it will include Identity Management, Reporting, etc., and explain what your strategy to address them is. As this section is intended to help the business people understand your architecture, try to include clear and well-organised diagrams.

A very important item that you want to mention is the architectural principles that you are following. This is even more important when the client organization maintains a set of architectural principles.

    • Quality of service requirements: These address the quality attributes of the system, such as performance, scalability, security, etc. These items must not be written in a technical language and must not contain implementation details (e.g. the use of Microsoft Enterprise Library 5).
    • Use Case view: Views basically come from the 4+1 model, so if you follow a different model you might not have this one. However, it is very important that you identify the key scenarios (or use cases) and describe them at a high level. Again, diagrams, such as a use case diagram, help.
    • Logical view: The logical view demonstrates the logical decomposition of the system, such as the packages that build it. It will help the business people and the designers understand the system better.
    • Process view: Use activity diagrams as well as state diagrams (if necessary) to explain the key processes of the system (e.g. the process of approving a leave request).
    • Deployment view: The deployment view demonstrates how the system will work in a real production environment. I suggest that you include two types of diagrams: a (normal) human-understandable diagram, such as a Visio diagram that shows the network, firewall, application server, database, etc., and a UML deployment diagram that demonstrates the nodes and dependencies. This again helps the business and technical people have the same understanding of the physical structure of the system.
    • Implementation view: This part is the most interesting section for the techies. I like to include the implementation options (e.g. Java and .NET) and provide a list of pros and cons for each of them. Again, technical pros and cons don’t make much sense to business people; they are mostly interested in cost of ownership, availability of resources and so on. If you suggest a technology, or if it has already been selected, list the products and services that are needed in a production environment (e.g. IIS 7, SQL Server 2008). It is also good to include a very high-level diagram of the system.

I also like to explain the architectural patterns that I’m going to use. If you include this in the Implementation view, explain the patterns well enough that a business person can broadly understand what each pattern is for. For instance, if you are using the Lazy Loading pattern, explain what problem it solves and why you are using it.

Needless to say, you also have to decide which kind of architectural style you are suggesting, such as 3-tier, N-tier, client-server, etc. Once you have declared that, explain the components of the system (layers, tiers and their relationships) with diagrams.

This part must also include your implementation strategy for addressing the quality of service requirements, such as how you will address scaling out.

  • Data view: If the application is data-centric, explain the overall solution for data management (never put a database design in this part), your backup and restore strategy, and your disaster recovery strategy.

Be iterative

It is suggested that the architecture (and, as a result, the Software Architecture Document) be developed through two or more iterations. It is impossible to build a comprehensive architecture document in one iteration: not only does the architecture have an impact on the requirements, but it also begins at an early stage, when many of the scenarios are still likely to change.

How to prove that?

Now that, after a lot of effort, you have prepared your SAD, how will you prove it to the stakeholders? I assume that many business people do not have any idea about the content and structure of an SAD or the amount of information that you must include in it.

A good approach is to prepare a presentation about the mission of the system, its scope, goals, visions and your approach. Invite the stakeholders to a meeting, present the architecture to them and explain how it covers their business needs. If they are not satisfied, your architecture is very likely to be incomplete.


ASP.NET MVC 3 Sample Application


Here is an ASP.NET MVC 3 sample web site to let ASP.NET MVC learners see how a real application is developed and how it works. This is the first version of the app; more versions are lined up to follow.

This ASP.NET MVC application is intended to be as simple as possible, so bear in mind that it is not a comprehensive commercial application. The database, user interface and code snippets are intended to be concise and clear. The next versions will cover the use of Ajax and ASP.NET MVC UI controls.

What is this application about?

WebAdvert is an online advertisement web site. Basically, users can sign up and post their own adverts. They will also be able to manage (view/edit/delete) their ads. Anonymous users can browse the existing ads. Finally, administrators can view/create/delete members and assign them to the “Admins” role if necessary.

WebAdvert uses ASP.NET forms authentication in an MVC fashion. This means that the application includes the AspnetDB database. It also has a SQL Server Express database file named WebAdvert, which contains only one table, named “Adverts”. Ads are stored in the Adverts table. The structure of this table is as below:

Figure 1: Structure of Adverts table

Use the following credentials to login:

User name: admin

Password: password)_

 

Figure 2: The browsing page

 

Prerequisites

  • Visual Studio 2010
  • Visual C# Express Edition 2010
  • Visual Web Developer 2010
  • ASP.NET MVC 3
  • Entity Framework 4
  • SQL Server 2008 Express Edition

 

Download

Download WebAdvert ASP.NET MVC 3.0 source code from webadvert.codeplex.com

Speed up your ASP.NET site with compression


Loading speed is a crucial issue for every professional web site. If your website takes too long to load, your visitors, who are your potential customers, will go away and never come back, no matter how much effort you have put into writing professional code or designing a charming user interface. There are several techniques to speed up your website: pre-creation of pages, caching … and compression.

Server level and page level compression

Compression can be performed at two levels: IIS and each individual web site. If you enable compression on IIS (6.0 and later), as described here, all web sites and virtual directories are affected. Therefore, this technique is recommended only if you have full control over the server and you own all the web sites on it. You must also make sure that none of the web sites on the server has a problem with being compressed. If there are web sites on the server that you do not own or administer, you had better not try to compress them.

Page-level compression lets you compress your own web site, or even only specific pages or file types. For example, we can set our compression mechanism to compress .aspx files only, and not HTML files, or to skip pages that use ASP.NET Ajax controls.

What do I need?

If you are using ASP.NET 2.0 or above (not 1.1), you do not need any 3rd-party compression module or tool. All you need is provided by the .NET Framework.

How to do it?

All we need to do is compress the response output. This can easily be done with the GZipStream or DeflateStream classes. Both of them use the same algorithm; the only difference is that GZipStream stores extra header information in the compressed file/stream so that it can be used by 3rd-party zip tools, which makes DeflateStream slightly more lightweight. Before we compress the response output, we should make sure that the client requesting the page supports compression; otherwise, the user’s browser might not be able to decompress and display the page.

There are two ways of enabling response compression: first via an HttpHandler and the Web.config file, and second by adding some code to the global.asax file. Using an HttpHandler lets you change your compression module later without recompiling and re-deploying your web application.

To compress the response output, all you need to do is wrap the response stream in a compression stream and assign it to Response.Filter. Response.Filter allows you to wrap the response output before it is transmitted to the web browser.

context.Response.Filter = new DeflateStream(context.Response.Filter, CompressionMode.Compress);

To create an HttpHandler, add a new class library project to your solution, containing a public class which implements the IHttpHandler interface. Then implement it as in the following code:

public class CompressMe : IHttpHandler
{
    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context)
    {
        // Only compress if the browser declares gzip support in its Accept-Encoding header.
        string pageEncoding = context.Request.Headers["Accept-Encoding"];
        if (!string.IsNullOrEmpty(pageEncoding))
        {
            if (pageEncoding.ToLower().Contains("gzip"))
            {
                context.Response.AppendHeader("Content-encoding", "gzip");
                context.Response.Filter = new GZipStream(context.Response.Filter, CompressionMode.Compress);
            }
        }
    }
}

As seen in the above code, we first check the request headers to see whether compression is supported. If it is, we create a new instance of GZipStream (part of .NET Framework 2.0+) and pass Response.Filter to it; this is the stream we want compressed. The resulting stream is assigned back to Response.Filter, meaning that a compressed copy of the response output is transmitted to the browser.

The context.Response.AppendHeader(“Content-encoding”, “gzip”) call tells the browser that it must decompress the page before displaying it. Today’s browsers are usually smart enough to figure this out, but better safe than sorry.

Now build the class library and copy it into the /bin folder of your web site, or reference it from within your web application. Suppose your class library (assembly) is called CompressionModule and it includes a namespace called CompressionModule with a class named CompressMe (which implements the IHttpHandler interface). Now add the following line to the web.config file, under the <httpHandlers> tag:

<add verb="*" path="*.aspx" type="CompressionModule.CompressMe, CompressionModule" />

This means that all requests for .aspx files, with both GET and POST methods, are routed through the CompressMe handler before the response is sent to the browser.

NOTICE: YOU MUST RUN YOUR WEB SITE UNDER IIS AND NOT VISUAL STUDIO DEVELOPMENT SERVER. OTHERWISE YOUR PAGE SIZE WILL BE ZERO BYTES.
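One note from my side: if your IIS 7 application pool runs in Integrated mode, handlers registered under <system.web><httpHandlers> are ignored, and the equivalent registration goes under <system.webServer> instead (the name attribute below is arbitrary):

<system.webServer>
  <handlers>
    <add name="CompressMe" verb="*" path="*.aspx"
         type="CompressionModule.CompressMe, CompressionModule" />
  </handlers>
</system.webServer>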

Another way to activate compression is to implement Application_PreRequestHandlerExecute in the global.asax file. To do so, add the event handler to global.asax, cast the sender argument to HttpApplication and use its Response property exactly as you did in the code above:

void Application_PreRequestHandlerExecute(object sender, EventArgs e)
{
    HttpApplication app = sender as HttpApplication;

    // Check app.Request.Headers["Accept-Encoding"] and wrap app.Response.Filter
    // in a GZipStream, exactly as in ProcessRequest above.
    // ............
}

More benefits of compression in ASP.NET applications

Compression can be of even more help in your ASP.NET web site. For example, you can compress uploaded files, especially if you store them in a database, since database storage is expensive and accessing it is time- and resource-consuming.

The following code demonstrates how to compress an uploaded file and save it:

As the code shows, we access the uploaded file via FileUpload.PostedFile.InputStream. We create a FileStream for the target .zip file and wrap it in a GZipStream. Afterwards, we read 2048-byte chunks from the uploaded stream and write them into the GZipStream. That’s all!


private void DoUpload(FileUpload fileUpload)
{
    const int bufferSize = 2048;

    string fileName = Path.ChangeExtension(
        Path.Combine(Server.MapPath(Request.Url.AbsolutePath), fileUpload.PostedFile.FileName), ".zip");

    // The compressed data is written into the target .zip file through the GZipStream.
    FileStream fs = new FileStream(fileName, FileMode.Create);
    GZipStream zip = new GZipStream(fs, CompressionMode.Compress);

    try
    {
        byte[] buffer = new byte[bufferSize];

        // Read the uploaded stream chunk by chunk and write each chunk into the GZipStream.
        int count = fileUpload.PostedFile.InputStream.Read(buffer, 0, bufferSize);
        while (count > 0)
        {
            zip.Write(buffer, 0, count);
            count = fileUpload.PostedFile.InputStream.Read(buffer, 0, bufferSize);
        }
    }
    finally
    {
        // Close the GZipStream first so the gzip footer is flushed to the file.
        zip.Close();
        fs.Close();
    }
}

You can download the sample projects from here.

Some MyCaptcha Improvements


Hi All,

I recently received a review of the MyCaptcha control on CodePlex that proposed a couple of improvements to the source code. The review indicates that:

  1. Thread.Sleep is not a good way to ensure that unique characters are selected for the Captcha text.
  2. Sending the Captcha text to the rendering component (.ashx file) is insecure.

Based on the above objections, I modified the source code and re-uploaded it to Codeplex. The modifications are as follows:

First, I removed the Thread.Sleep call from MyCaptcha.ascx.cs and replaced it with a do { .. } while ( … ) loop to ensure that no letter is duplicated:

char newChar = (char)0;
do
{
    newChar = Char.ToUpper(Valichars[new Random(DateTime.Now.Millisecond).Next(Valichars.Count() - 1)]);
}
while (Captcha.Contains(newChar));
Captcha += newChar;
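A further tweak worth considering (my own suggestion, not part of the uploaded change set): creating a new Random seeded with the current millisecond inside the loop can produce the same character on consecutive iterations, so reusing a single instance is safer:

private static readonly Random _random = new Random();

// ... inside the character-selection loop:
char newChar;
do
{
    // Next(n) returns 0..n-1, so passing Count() covers the whole Valichars range.
    newChar = Char.ToUpper(Valichars[_random.Next(Valichars.Count())]);
}
while (Captcha.Contains(newChar));
Captcha += newChar;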

To improve security when passing the Captcha text to the rendering component (GetImgText.ashx), two ways come to my mind:

  • To include text generation logic into .ashx file
  • To encrypt captcha text before passing it to .ashx file.

As I mentioned before, generating random text should not depend on the way we display it, so I don’t like mingling the two pieces of code together. Instead, I added some code that uses a symmetric algorithm to encrypt/decrypt the Captcha text.

Thus, I added a new static class called SecurityHelper which includes an encryption method and a decryption method. The symmetric algorithm used here requires a 16-byte key, so I added a private property to return the key. The key itself is arbitrary and you can change it to whatever you like:

private static byte[] SymmetricKey
{
    get
    {
        return Encoding.UTF8.GetBytes("1B2c3D4e5F6g7H81");
    }
}

This post is too short to include the bodies of the EncryptString and DecryptString methods; you can download the source code and see them for yourself.
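For readers who only want the idea without downloading the project, a pair of methods along these lines would do the job. This is an illustrative sketch, not necessarily the exact code in the control; it assumes AES with the 16-byte key above and the usual System.Security.Cryptography, System.Linq and System.Text namespaces:

public static string EncryptString(string plainText)
{
    using (var aes = Aes.Create())
    {
        aes.Key = SymmetricKey;
        aes.GenerateIV();
        using (var encryptor = aes.CreateEncryptor())
        {
            var plainBytes = Encoding.UTF8.GetBytes(plainText);
            var cipherBytes = encryptor.TransformFinalBlock(plainBytes, 0, plainBytes.Length);

            // Prepend the IV so DecryptString can recover it later.
            return Convert.ToBase64String(aes.IV.Concat(cipherBytes).ToArray());
        }
    }
}

public static string DecryptString(byte[] cipherWithIv)
{
    using (var aes = Aes.Create())
    {
        aes.Key = SymmetricKey;
        aes.IV = cipherWithIv.Take(16).ToArray();
        using (var decryptor = aes.CreateDecryptor())
        {
            var cipherBytes = cipherWithIv.Skip(16).ToArray();
            return Encoding.UTF8.GetString(decryptor.TransformFinalBlock(cipherBytes, 0, cipherBytes.Length));
        }
    }
}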

Back in MyCaptcha.ascx.cs, we need to use SecurityHelper.EncryptString to encrypt the Captcha text before passing it to the GetImgText.ashx generic handler:

ImgCaptcha.ImageUrl = "GetImgText.ashx?CaptchaText=" + SecurityHelper.EncryptString(Captcha);

We also need to change GetImgText.ashx.cs so that it decrypts the query string value:

var CaptchaText = SecurityHelper.DecryptString(
    Convert.FromBase64String(context.Request.QueryString["CaptchaText"]));

This will make sure that nobody can sniff the Captcha text. If you look at the Captcha image properties in your browser, the URL will look like this:

http://localhost:51093/MyCaptcha/GetImgText.ashx?CaptchaText=V1rX8fZjL%2f8ghm1ZkCCsRoTryWDde8tJ7sejoIjoJKA%3d

I also tried to make the text harder to read and a bit prettier, so I changed the GetImgText.ashx file to draw an image background for the Captcha. Background images come from the /images/captcha folder, so you need to create such a folder in your web site or modify the source code to point to an appropriate location. I have also added three sample background images, such as a Valentine’s Day one:

vday

Try the other background images and decide which one you like the most.

Please visit CodePlex page of this project and download the source code.

Pluggable modules for ASP.NET


When you design a modular ASP.NET application, sooner or later you will need to think about adding extensibility features to your project so that it will be possible to add new modules at runtime. There are a few architectures and designs that let you develop an extensible application, like ASP.NET MVP; however, many of them add a lot of complexity to almost everything, and you have to learn many concepts before you can use them. Therefore, it’s a good idea to use a simpler but still effective method like the one I will explain below.

The method I am going to describe lets you develop an ASP.NET application and add more modules to it later at runtime. In a nutshell, it has the following benefits:

  1. It allows adding new pages to an existing web application at runtime and does not need any recompilation.
  2. It allows adding new web parts to an existing content management system (portal) at runtime.
  3. Several developers can develop different parts of an application.
  4. It is very easy to understand, develop and use.
  5. It does not use any 3rd-party library, so nothing is needed except Visual Studio.

And it has the following drawbacks:

  1. One should know the exact structure of the existing ASP.NET application, such as its folder hierarchy.
  2. It may not cover all possible scenarios (actually, I have not thought about many scenarios).

How to implement it?

The design I am going to explain is possible only if you develop an ASP.NET Web Application Project rather than an ASP.NET Web Site. As far as I remember, Visual Studio 2005 does not let you create a web application project; if I am right, you need to use Visual Studio 2008. There are two main parts that we need to develop:

  • A web application project that includes the main modules and main pages, loads plugged-in modules, checks licensing, performs security tasks, etc.
  • Plugged-in modules, which add more pages, web parts and functionality.

The main application and the modules must match. This means they must have the same structure (i.e. folders), use the same master pages and follow the same rules.

The main reason that I used a Web Application Project, rather than a Web Site, is the benefits a Web Application Project brings to a plug-in based web site. After building a web application project, there is one assembly plus several .aspx, .ascx, .ashx … files. After the web application is published, it is still possible to add more pages and files to it. Therefore, if at a later time we add several .aspx pages along with their .dll files, the web application will be able to work with those pages with no problem.

When developing the main application, you should consider a well-formed directory structure, language-specific content, master pages, etc. For example, your application should have a master page with a general name, like Site.Master. It also needs to maintain each module’s pages in a separate folder, so that new modules can follow the same rule and avoid naming conflicts.

To develop the main application, follow the steps below:

  1. Create an empty solution in VS 2008.
  2. Add a new ASP.NET Web Application Project (not a web site) to the solution.
  3. Add any required folders, like App_Themes, and implement any required authentication, authorization and personalization mechanisms. Your web application must be complete and working.
  4. Add a master page to the web application project and name it Site.Master or another general name.
  5. Add a new Class Library project and call it Framework (e.g. MyCompany.MyProject.Framework), Common, or whatever name indicates that this class library will be shared between the main application and the dynamic modules.
  6. Add a new interface to the mentioned class library and call it IModuleInfo. This interface will be implemented by a class inside each pluggable module and will return the root menu items that must be added to the main application’s menu (or items to be added to a site navigation). It could also return a list of WebParts describing the web parts that exist inside the module.

public interface IModuleInfo
{
    List<MenuItem> GetRootMenuItems(string[] UserRoles);
}

The UserRoles parameter is not mandatory. It shows that you can pass parameters to the method that returns a module’s main menu items; in this example, it indicates which roles the current user has, so that the menu items can be filtered properly.

  7. Add a new ASP.NET Web Application project to the solution and name it SampleModule.
  8. Add a folder called SampleModule and, if necessary, add more sub-folders.
  9. Add a web.config file to the SampleModule folder and define which users/roles can access which folder.
  10. Add a master page named Site.Master. It must have the same name as the master page in the main application.
  11. Add a public class with any name (I call it ModulePresenter) that implements IModuleInfo (the interface added to the Common or Framework library).

The ModulePresenter class returns a list of menu items to the main application, which then adds those menu items as root items to its main menu. I will not include detailed code for the part where a module creates these items, as it depends on your project (see the illustrative sketch after the listing below).

public class ModulePresenter : IModuleInfo
{
    #region IModuleInfo Members

    public List<System.Web.UI.WebControls.MenuItem> GetRootMenuItems(string[] UserRoles)
    {
        List<MenuItem> items = new List<MenuItem>();
        //:
        //:
        return items;
    }

    #endregion
}
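Purely as an illustration of what that elided part (the //: lines) might do, a filled-in version of GetRootMenuItems could look like this; the page names and role name are made up, and the target URLs are relative so they resolve under the main application:

public List<MenuItem> GetRootMenuItems(string[] UserRoles)
{
    var items = new List<MenuItem>();

    var root = new MenuItem("Sample Module");
    root.ChildItems.Add(new MenuItem("Sample Page") { NavigateUrl = "SampleModule/SamplePage.aspx" });

    // Only show the settings page to administrators (role name is an example).
    if (UserRoles != null && UserRoles.Contains("Admin"))
    {
        root.ChildItems.Add(new MenuItem("Settings") { NavigateUrl = "SampleModule/Settings.aspx" });
    }

    items.Add(root);
    return items;
}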

  12. Compile this application and go back to the main application.
  13. Add an XML file and call it PluggedModules.xml. This file maintains the qualified type name of each module that must be loaded. A qualified type name includes the assembly, namespace and class name:

<?xml version="1.0" encoding="utf-8" ?>
<modules>
  <module name="SampleModule" type="SampleModule.ModulePresenter, SampleModule"></module>
</modules>

  14. Write code to query PluggedModules.xml, get the menu items and attach them to the main menu:

public static void LoadModules(Menu menuControl, string[] userRoles, string xmlName)
{
    XDocument document = XDocument.Load(HttpContext.Current.Server.MapPath(string.Format("~/{0}", xmlName)));
    var allModules = document.Elements("modules");

    foreach (XElement module in allModules.Elements())
    {
        string type = module.Attribute("type").Value;
        IModuleInfo moduleInfo = Activator.CreateInstance(Type.GetType(type)) as IModuleInfo;
        List<MenuItem> allItems = moduleInfo.GetRootMenuItems(userRoles);

        foreach (MenuItem item in allItems)
        {
            menuControl.Items.Add(item);
        }
    }
}

As seen in the above code, we query the PluggedModules.xml file, extract the registered type names and create an instance of each one using the Activator.CreateInstance method. We then use the IModuleInfo implementation, calling GetRootMenuItems to get the module’s menu items, and add them to the main menu.

After doing all the above steps, copy the module’s .dll file (generated when you build the project) to the main application’s \bin folder and copy its content folder (SampleModule) to the main application’s root folder. It will work fine as long as all naming matches (for example, both use master pages with the same name) and the target URLs of the menu items point to relative paths, e.g. SampleModule/MyPage.aspx.

Please download the sample code from here.