Monday, August 3, 2009

N-Tier Architecture

Dev Palmistry

What is n-Tier Architecture?

This is a very important topic to consider when developing an application. Many elements need to be considered when deciding on the architecture of the application, such as performance, scalability and future development issues. When you are deciding on which architecture to use, first decide on which of the three aforementioned elements you think is most valuable -- as some choices you make will impact on others. For example, some choices that boost performance will impact on the scalability or future development of your design, etc.

Here we will talk generally about what n-Tier architecture is, and then look at the different n-Tier architectures you can use to develop ASP.NET applications, along with the performance, scalability and future development issues that arise for each one.

Firstly, what is n-Tier architecture? N-Tier architecture refers to the architecture of an application that has at least 3 "logical" layers -- or parts -- that are separate. Each layer interacts only with the layer directly below it, and has a specific function that it is responsible for.

Why use n-Tier architecture? Because each layer can be moved to a physically separate server with only minor code changes, the application can scale out and handle more load. Also, what each layer does internally is hidden from the other layers, which makes it possible to change or update one layer without modifying or recompiling the others.

This is a very powerful feature of n-Tier architecture, as additional features or changes to a layer can be made without redeploying the whole application. For example, by separating the data access code from the business logic code, when the database servers change you only need to change the data access code. Because the business logic code stays the same, it does not need to be modified or recompiled.

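[Note] In this article, "tier" and "layer" are used interchangeably. Where the difference between a logical and a physical separation matters, it is called out explicitly. [End Note]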

An n-Tier application usually has three tiers, and they are called the presentation tier, the business tier and the data tier. Let's have a look at what each tier is responsible for.

Presentation Layer
Presentation Layer is the layer responsible for displaying the user interface and "driving" that interface using business tier classes and objects. In ASP.NET it includes ASPX pages, user controls, server controls and sometimes security-related classes and objects.

Business Tier
Business Tier is the layer responsible for accessing the data tier to retrieve, modify and delete data, and for sending the results to the presentation tier. This layer is also responsible for processing the retrieved data before it is sent to the presentation layer.

In ASP.NET it includes using SqlClient or OleDb objects to retrieve, update and delete data from SQL Server or Access databases, and also passing the data retrieved to the presentation layer in a DataReader or DataSet object, or a custom collection object. It might also include the sending of just an integer, but the integer would have been calculated using the data in the data tier such as the number of records a table has.

BLL and DAL
Often this layer is divided into two sub-layers: the Business Logic Layer (BLL) and the Data Access Layer (DAL). The BLL sits above the DAL, meaning the BLL uses DAL classes and objects. The DAL is responsible for accessing data and forwarding it to the BLL.

In ASP.NET the DAL might use SqlClient or OleDb to retrieve the data and send it to the BLL in the form of a DataSet or DataReader. The BLL is responsible for preparing or processing the retrieved data and sending it to the presentation layer. In ASP.NET it might use the DataSet and DataReader objects to fill a custom collection, or process them to come up with a value, and then send the result to the Presentation Layer. The BLL sometimes works as just a transparent layer, for example when you want to pass a DataSet or DataReader object directly to the presentation layer.
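The BLL/DAL split described above can be sketched roughly as follows. This is only an illustration: the class names CustomerDal and CustomerBll, the Customer type and the in-memory rows are all made up for this example, and a real DAL would query the database through SqlClient or OleDb instead of building the table by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical custom object handed to the presentation layer.
public class Customer
{
    public int Id;
    public string FirstName;
}

// Hypothetical DAL: returns raw rows. The table is built in memory here;
// a real DAL would fill it from the database.
public static class CustomerDal
{
    public static DataTable GetCustomers()
    {
        DataTable table = new DataTable("Customers");
        table.Columns.Add("CustomerID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");
        return table;
    }
}

// Hypothetical BLL: uses the DAL and turns raw rows into a custom
// collection, or into a single calculated value.
public static class CustomerBll
{
    public static List<Customer> GetCustomers()
    {
        List<Customer> customers = new List<Customer>();
        foreach (DataRow row in CustomerDal.GetCustomers().Rows)
        {
            Customer c = new Customer();
            c.Id = (int)row["CustomerID"];
            c.FirstName = (string)row["FirstName"];
            customers.Add(c);
        }
        return customers;
    }

    // "Just an integer", but calculated from data in the data tier.
    public static int CountCustomers()
    {
        return CustomerDal.GetCustomers().Rows.Count;
    }
}
```

The presentation layer would only ever call CustomerBll, so swapping the in-memory table for a real query should not touch any page code.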

Data Tier
Data tier is the database or the source of the data itself. Often in .NET it's a SQL Server or Access database, but it's not limited to just those. It could also be Oracle, MySQL or even XML. In this article we will focus on SQL Server, as it is the database the SqlClient provider is built for and the one most commonly paired with .NET applications.

Logical Layers vs. Physical Layers (Distributed)
Logical Layers and Physical Layers are often confused. A logical layer means that the layers are separate in terms of assemblies or sets of classes, but are still hosted on the same server. A physical layer means that those assemblies or sets of classes are hosted on different servers, with some additional code (e.g. remoting or web services) to handle the communication between the layers.

Deciding to separate the layers physically or not is very important. It really depends on the load your application expects to get. I think it's worth mentioning some of the facts that might affect your decision.

Please DO note that separating the layers physically WILL slow your application down due to the delay in communicating between the servers over the network, so if you are using the physical layer approach, make sure the scalability you gain is worth the performance you lose.

Hopefully you would have designed your application using the n-Tier approach. If this is the case, then note that you can separate the layers in the future.

Cost for deploying and maintaining physically separated applications is much greater. First of all, you will need more servers. You also need network hardware connecting them. At this point, deploying the application becomes more complex too! So decide if these things will be worth it or not.

Another fact that might affect your decision is how each of the tiers in the application is going to be used. You will probably want to host a tier on a separate server if more than one service depends on it. For example, you might want to host the business logic somewhere else if you have multiple presentation layers for different clients. You might also want a separate SQL Server machine if you have other applications using the same data.


#####################################################################################


3-Tier Architecture

As the 3-Tier Architecture is one of the most commonly used architectures, I will start my blogging on this topic.

I have noticed in the past 6 years of writing software that many developers ignore this software engineering paradigm for many reasons. Some include ...

* Return on Investment
* Software Lifecycle Turnaround time
* Knowledgeable Resources
* and the list goes on

I will discuss these reasons in more detail in another blog entry, so back to the topic of 3-Tier architecture.

A 3-Tier architecture uses the Divide and Conquer strategy and is broken down into 3 logical layers.

* Presentation Layer (PL)
* Business Logic Layer (BLL)
* Data Access Layer (DAL)

Ideally, each layer specializes in one or a handful of functions that serve the layer above it. Each of the three layers should be designed so that the layer above it does not need to know the implementation details of any of the layers below it. This is accomplished by providing well-defined interfaces that the upper layers use. The advantage of "Programming to the Interface" is that you can change the implementation details and still have the application work as defined. One caveat is that if the interfaces change, it will take more effort and time to update the layer above. Therefore, when designing an application, it's important to define the interfaces properly.

Here is an example:

public interface IDataSource
{
    Customer GetCustomer(int id);
}

public class TextDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // reads data from a text file (implementation omitted)
    }
}

public class SQLDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // uses ADO.NET to read data from a SQL Server database (implementation omitted)
    }
}

public class MyProgram
{
    public static void Main(string[] args)
    {
        // the DataSourceFactory will create a data source depending on some settings
        // and return the appropriate implementation of the data source
        IDataSource ds = DataSourceFactory.GetInstance().GetDataSource();
        Console.WriteLine(ds.GetCustomer(15).FirstName);
    }
}

As you can see from the example, MyProgram does not need to know which data source it is querying. All it needs to know is the interface it must use to retrieve a customer record. If we have numerous implementations, then by changing only the configuration, which is declarative and lives outside of the compiled code, we can change the data source the application uses to retrieve data without changing the application logic itself.
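The example leaves DataSourceFactory undefined, so here is one way it could look. Treat it as a sketch under assumptions: the DATA_SOURCE environment variable, the "sql" setting value and the stand-in return values are all invented for this illustration, and a real application would more likely read the setting from its configuration file.

```csharp
using System;

public class Customer
{
    public string FirstName;
}

public interface IDataSource
{
    Customer GetCustomer(int id);
}

public class TextDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // Stand-in for reading a text file.
        Customer c = new Customer();
        c.FirstName = "FromText";
        return c;
    }
}

public class SQLDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // Stand-in for an ADO.NET query against SQL Server.
        Customer c = new Customer();
        c.FirstName = "FromSql";
        return c;
    }
}

// Singleton factory: the only place that knows which implementation is in use.
public sealed class DataSourceFactory
{
    private static readonly DataSourceFactory instance = new DataSourceFactory();

    private DataSourceFactory() { }

    public static DataSourceFactory GetInstance()
    {
        return instance;
    }

    // Reads the (hypothetical) DATA_SOURCE setting from outside the compiled code.
    public IDataSource GetDataSource()
    {
        return GetDataSource(Environment.GetEnvironmentVariable("DATA_SOURCE"));
    }

    // "sql" selects SQL Server; anything else, including null, falls back to text.
    public IDataSource GetDataSource(string kind)
    {
        if (string.Equals(kind, "sql", StringComparison.OrdinalIgnoreCase))
            return new SQLDataSource();
        return new TextDataSource();
    }
}
```

With this in place, switching MyProgram from the text file to SQL Server would be a matter of changing the setting, not the calling code.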

Now let's see how we can use the "Programming to the Interface" paradigm to create a 3-Tier architecture.

The Presentation Layer is responsible for rendering the data retrieved by the BLL (with the help of the DAL). The only logic that is necessary in this layer is how to manipulate the data and display it to the user in an easy to consume manner. Along with rendering the content, it should be responsible for rudimentary data validation such as missing fields, regular expression matching for emails and other content, numeric validation, range validations, etc.

In .NET, there are a slew of UI-specific controls that one may use to render the data. Some of these are the DataList, DataGrid, Label and TextBox, and of course custom controls for advanced developers. There are also pre-built validation controls bundled with the .NET Framework. I'll post some links later with examples of how to use these controls.

Depending on the application you are building, the presentation layer may be one or more of the following types of applications: web, Windows, Windows service, smart client or console. By properly defining the responsibilities of each layer, the only logic developers need to write for each of these is the presentation logic itself. Since retrieving a customer record is the same throughout the application (through the BLL), you can abstract out those details and simplify your software.

On the same note, your application may also expose Web services and Remoting services, so it is essential to centralize the code. Otherwise the logic for retrieving a customer (which may include authentication and authorization, data validation, pre-processing and post-processing) will need to be duplicated in many places. Code duplication may seem like a viable solution in the early stages of the software lifecycle, but such software is extremely hard to maintain.
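To make the point about centralizing concrete, here is a rough sketch of two front ends sharing one BLL entry point. Everything in it is hypothetical: the CustomerService, ConsoleFrontEnd and WebFrontEnd names, the "staff" role check and the in-memory dictionary standing in for the DAL.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public int Id;
    public string FirstName;
}

// Hypothetical BLL entry point: authorization and validation live here once.
public static class CustomerService
{
    // In-memory stand-in for the Data Access Layer.
    private static readonly Dictionary<int, Customer> store = CreateStore();

    private static Dictionary<int, Customer> CreateStore()
    {
        Dictionary<int, Customer> d = new Dictionary<int, Customer>();
        Customer ann = new Customer();
        ann.Id = 15;
        ann.FirstName = "Ann";
        d.Add(ann.Id, ann);
        return d;
    }

    public static Customer GetCustomer(string role, int id)
    {
        if (role != "staff")
            throw new UnauthorizedAccessException("Role may not read customers.");
        if (id <= 0)
            throw new ArgumentOutOfRangeException("id");

        Customer c;
        return store.TryGetValue(id, out c) ? c : null;
    }
}

// Two presentation layers: each one only decides how to display the result.
public static class ConsoleFrontEnd
{
    public static string Render(string role, int id)
    {
        Customer c = CustomerService.GetCustomer(role, id);
        return c == null ? "Not found" : "Customer: " + c.FirstName;
    }
}

public static class WebFrontEnd
{
    public static string Render(string role, int id)
    {
        Customer c = CustomerService.GetCustomer(role, id);
        return c == null ? "<p>Not found</p>" : "<p>" + c.FirstName + "</p>";
    }
}
```

If the authorization rule changes, only CustomerService changes; neither front end needs to be touched.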

The Business Logic Layer is the kernel of your application. It should be the component that performs all the business logic: validating data, applying business rules, running application processes such as sending emails, and retrieving and persisting data through the Data Access Layer.

Although validation was performed in the presentation layer, it is imperative that you revalidate the data here: requests can be spoofed, older browsers may ignore some of the client-side validation entirely, and the developers working on the presentation layer may not have validated the data properly.
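A minimal sketch of what revalidating in the BLL might look like is shown below. The RegistrationRules class, the age range and the email pattern are assumptions made up for this example; in particular, the pattern is deliberately simple and not a complete email check.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical BLL rule set: runs on the server regardless of what
// the presentation layer has already checked.
public static class RegistrationRules
{
    // Deliberately simple pattern, for illustration only.
    private static readonly Regex EmailPattern =
        new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$");

    public static void Validate(string email, int age)
    {
        // Missing field check.
        if (string.IsNullOrEmpty(email))
            throw new ArgumentException("Email is required.", "email");

        // Regular expression check.
        if (!EmailPattern.IsMatch(email))
            throw new ArgumentException("Email is not well formed.", "email");

        // Range check (the bounds are arbitrary for this sketch).
        if (age < 18 || age > 120)
            throw new ArgumentOutOfRangeException("age", "Age must be between 18 and 120.");
    }
}
```

A BLL method would call RegistrationRules.Validate before handing anything to the Data Access Layer, so bad input is rejected even if a client skipped the UI checks.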

Depending on the complexity of your application, the business logic code may not reside on the same server or in a centralized location, so there are ways of executing such logic remotely. .NET simplifies this process considerably. One option is .NET Remoting, an advanced topic that I'll skip for now. The other is the buzzword that many have heard: Web Services.

Both writing and using a Web Service are once again simplified by Microsoft. Visual Studio .NET 2003 and Visual Studio 2005 can download the WSDL and generate the proxies for you, so you can invoke the service's functions as you would those of any business object in your application. If you don't have Visual Studio, you may use the "wsdl.exe" command-line utility that ships with the .NET Framework SDK.

If you have business logic on legacy systems that were built with Microsoft technologies such as COM+ and cannot be rewritten for whatever reason, not to worry. You can use the COM interop wrappers provided by the .NET Framework to communicate with the legacy systems.

The Data Access Layer is responsible for accessing and manipulating data from data sources such as SQL Server, Microsoft Access, Oracle, MySQL, etc. Many applications on the Internet today rely heavily on data found in many databases and it is important to centralize the access to this data. Some reasons are ...

* Security
* Code Maintenance and Reuse
* Scalability

Databases contain confidential information about people, and it is not necessary for everyone in your organizational hierarchy to have access to such data. For example, credit card information stored on Amazon.com shouldn't be available to an entry-level employee working in a warehouse. By centralizing access to the database, we are able to authenticate and authorize the users requesting and manipulating the data.

Since our economy is constantly in flux, it is never safe to assume that once you have created your data model, it will not change for a decade. In fact, that data model may change tomorrow, a week from now or in a year, but it will change, and as software architects it is our responsibility to foresee such events and design systems that can change with time. If the code isn't centralized, a database change as simple as adding a new column may result in days of changes to many systems, regression testing and the deployment of many applications. Is this really necessary?

By creating simple reusable components, developers can keep the details of creating connections, handling errors, invoking the appropriate stored procedures or executing Transact-SQL or PL/SQL code, retrieving the data and closing the connection out of the Business Logic Layer.

Typical code (using ADO.NET) to retrieve a customer record may look as the following:

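SqlConnection connection = null;

try
{
    connection = new SqlConnection(mySqlConnection);
    connection.Open(); // the connection must be open before a command can run
    SqlCommand cmd = connection.CreateCommand();
    // pass the ID as a parameter instead of concatenating it into the SQL text
    cmd.CommandText = "SELECT * FROM Customers WHERE CustomerID = @cid";
    cmd.Parameters.AddWithValue("@cid", cid);
    return CustomerFactory.GetInstance().GetCustomer(cmd.ExecuteReader());
}
catch (Exception e)
{
    Console.WriteLine("Exception :: " + e.Message);
    throw; // rethrow so the caller knows the lookup failed
}
finally
{
    if (connection != null && connection.State != ConnectionState.Closed)
        connection.Close();
}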

So what's the big deal about writing a few lines of code? Well, imagine repeating these lines in 5 different places and, 2 weeks later, making a change and having to remember every piece of code to update. Instead, I suggest the following:

Customer customer = CustomerDAO.GetInstance().GetCustomer(customerID);

Here the ADO.NET code is abstracted away by the GetCustomer(int) function. Making a change to the function now takes minutes, and the change affects every piece of code that depends on retrieving customer records. The examples above use Design Patterns, specifically the Factory Pattern and the Singleton Pattern, which you may read about further at your leisure.
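For the curious, here is a rough idea of what could sit behind CustomerDAO. It is only a sketch under assumptions: the ConnectionFactory hook, the column names and the Customer fields are invented for this example. It is written against the provider-neutral System.Data interfaces so the same class could sit in front of SqlClient or another provider; the "@id" parameter style is the one SqlClient uses.

```csharp
using System;
using System.Data;

public class Customer
{
    public int Id;
    public string FirstName;
}

// Singleton DAO: the one place that knows how a customer row is fetched.
public sealed class CustomerDAO
{
    private static readonly CustomerDAO instance = new CustomerDAO();

    // Hypothetical hook that supplies an unopened connection,
    // e.g. () => new SqlConnection(connectionString).
    public static Func<IDbConnection> ConnectionFactory;

    private CustomerDAO() { }

    public static CustomerDAO GetInstance()
    {
        return instance;
    }

    public Customer GetCustomer(int customerId)
    {
        if (ConnectionFactory == null)
            throw new InvalidOperationException("ConnectionFactory has not been configured.");

        // "using" disposes (and so closes) each resource, even on error.
        using (IDbConnection connection = ConnectionFactory())
        {
            connection.Open();
            using (IDbCommand cmd = connection.CreateCommand())
            {
                cmd.CommandText =
                    "SELECT CustomerID, FirstName FROM Customers WHERE CustomerID = @id";

                // Parameterized to keep user input out of the SQL text.
                IDbDataParameter p = cmd.CreateParameter();
                p.ParameterName = "@id";
                p.Value = customerId;
                cmd.Parameters.Add(p);

                using (IDataReader reader = cmd.ExecuteReader())
                {
                    if (!reader.Read())
                        return null; // no such customer

                    Customer c = new Customer();
                    c.Id = reader.GetInt32(0);
                    c.FirstName = reader.GetString(1);
                    return c;
                }
            }
        }
    }
}
```

If a column is renamed or the query moves to a stored procedure, GetCustomer is the only method that has to change.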

Scalability matters a great deal to enterprise-level applications that depend on time-critical data. It's beyond the scope of this article, so I will leave it out for the time being.

