Wednesday, May 05, 2010

Lessons from Designing a Two Billion Page View Mobile Application

In his guest post on High Scalability, Jamie Hall, CTO of MocoSpace, detailed some key architectural lessons that can serve as a good guide when designing large-scale enterprise and web applications. Below are my views on some of them.

1. Make your boxes/servers sweat.

In my experience most enterprises have their server resources underutilized. This is mostly due to improper or non-existent capacity planning, over-sizing, and too little emphasis on monitoring utilization. Surprisingly, in spite of the low resource utilization, they still fail to meet their performance SLAs. This brings us to the second point that Jamie mentions...

2. Understand where your bottlenecks are in each tier

There is often limited understanding of the application and its technologies... for example, whether an application is CPU, memory or IO intensive. Enterprises going for COTS applications like SAP very rarely understand the application architecture and internals. Without this knowledge they have to depend blindly on the vendors to do the initial sizing for them, and are in no position to understand the application behaviour themselves during the application life-cycle and resolve the associated issues. There is also limited load and performance testing carried out in house... All this results in extra processing power in machines that actually need more memory, and vice versa!
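To make that point concrete, here is a minimal sketch (using the standard JMX MXBeans) of the kind of in-JVM sampling that helps you see whether an application is CPU-bound or memory-bound. The class name and sampling interval are my own illustrations, not something from Jamie's post.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Minimal sketch: sample CPU load and heap usage from inside the JVM to get a
// first feel for whether an application is CPU- or memory-bound.
public class ResourceSnapshot {
    public static void main(String[] args) throws InterruptedException {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();

        for (int i = 0; i < 10; i++) {
            MemoryUsage heap = mem.getHeapMemoryUsage();
            double load = os.getSystemLoadAverage();   // -1 on platforms that do not report it
            System.out.printf("load=%.2f heapUsed=%dMB heapMax=%dMB%n",
                    load, heap.getUsed() >> 20, heap.getMax() >> 20);
            Thread.sleep(5000);                        // sample every 5 seconds
        }
    }
}
```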

3. Profile the database religiously.

The database is normally the most critical component of any business application, and most performance-related issues can generally be traced back to it. When optimizing for performance, apart from database profiling, the focus should also be on caching read-only data at the application layer, database sharding and alternative data stores (NoSQL key-value stores).
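As a rough illustration of caching read-only data at the application layer, here is a minimal sketch of a read-through cache; the loader function stands in for a real DAO call, and all the names are illustrative rather than taken from any particular framework.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal sketch of a read-through cache for read-only reference data,
// so repeated lookups never hit the database. The loader function is a
// placeholder for the real DAO call.
public class ReferenceDataCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // e.g. key -> dao.findById(key)

    public ReferenceDataCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // computeIfAbsent loads from the database only on the first miss
        return cache.computeIfAbsent(key, loader);
    }
}
```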

4. Design to disable.

Hot deployment and the ability to disable rolled-out features through configuration are critical for application life-cycle management. That's where evolving languages like Erlang, which support hot deployment of code, are very promising, in spite of the fact that there is still some way to go for their enterprise adoption.

5. Communicate synchronously only when absolutely necessary

Communicating asynchronously is the key to identifying and containing failures and error conditions, and thereby to managing distributed applications easily. Yet I still see people finding ways to implement synchronous interfaces between applications.
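To illustrate the asynchronous style, here is a minimal sketch in which the caller drops a message on a queue and returns immediately. A real system would use JMS or another message broker; the in-memory queue and all the names here are only assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal sketch: the caller hands a message to a queue and moves on instead of
// blocking on the downstream system. The in-memory queue only illustrates the
// decoupling; a broker would add durability and delivery guarantees.
public class AsyncOrderSubmitter {
    private final BlockingQueue<String> outbox = new LinkedBlockingQueue<>();

    public AsyncOrderSubmitter() {
        Thread sender = new Thread(() -> {
            while (true) {
                try {
                    String order = outbox.take();      // waits for the next message
                    deliverToDownstreamSystem(order);  // slow or unreliable call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        sender.setDaemon(true);
        sender.start();
    }

    public void submit(String order) {
        outbox.offer(order);   // returns immediately; caller is not held up
    }

    private void deliverToDownstreamSystem(String order) {
        // placeholder for the actual remote call
        System.out.println("delivered: " + order);
    }
}
```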

6. Think about monitoring during design, not after.

Do design your applications for monitoring, and identify the KPIs that need to be monitored. Otherwise you will have no way to troubleshoot when you end up with issues in production.
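One simple way to bake monitoring in from the start is to expose KPIs through JMX. A minimal sketch, with an assumed "orders processed" KPI and illustrative names:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Minimal sketch: expose a KPI (orders processed) as a JMX MBean so that
// production monitoring tools can read it without later code changes.
public class OrderKpi implements OrderKpiMBean {
    private final AtomicLong ordersProcessed = new AtomicLong();

    public void recordOrder() {
        ordersProcessed.incrementAndGet();
    }

    @Override
    public long getOrdersProcessed() {
        return ordersProcessed.get();
    }

    public static OrderKpi register() throws Exception {
        OrderKpi kpi = new OrderKpi();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(kpi, new ObjectName("app.kpi:type=OrderKpi"));
        return kpi;
    }
}

// JMX standard MBeans require the management interface to follow the *MBean naming convention.
interface OrderKpiMBean {
    long getOrdersProcessed();
}
```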

7. Distributed sessions can be a lot of overhead.  

Although distributed session management using technologies like application clustering is a common feature of server-side applications and tools, it is invariably a bottleneck for scalability, particularly when you want to scale out. If you can design applications with stateless sessions, or with the session information stored on the client and passed with every request, life becomes much easier. Jamie also advises using sticky sessions, which are nowadays available with all load-balancing appliances.
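Here is a minimal sketch of the client-held session idea: the session state travels in an HMAC-signed token that any server can validate, so nothing needs to be replicated across the cluster. The key handling and token format are simplified assumptions, not a production recipe.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Minimal sketch: keep the session state (here just a user id) in a signed
// token the client sends with every request. Any server can validate it,
// so no server-side session replication is needed.
public class StatelessSessionToken {
    private static final byte[] KEY = "change-me-secret-key".getBytes(StandardCharsets.UTF_8);

    public static String issue(String userId) throws Exception {
        return userId + "." + sign(userId);
    }

    public static String validate(String token) throws Exception {
        int dot = token.lastIndexOf('.');
        if (dot < 0) {
            throw new SecurityException("malformed session token");
        }
        String userId = token.substring(0, dot);
        String signature = token.substring(dot + 1);
        if (!sign(userId).equals(signature)) {
            throw new SecurityException("tampered session token");
        }
        return userId;
    }

    private static String sign(String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(KEY, "HmacSHA256"));
        byte[] raw = mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }
}
```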

8. N+1 design.

Use an N+1 design rather than clustering and local failover for the web and app servers.

Finally, a few other things that could be important are..


a. Keep it simple... Use the best tool/framework for your requirement that has a low CPU and memory footprint... You can compromise on some of the unrealistic non-functional requirements to keep it simple.

b. Design your application to manage failure rather than to avoid failure (see the sketch after this list).

c. Try to leverage client-side processing as much as possible, keeping in mind the browser or other client capabilities and the client devices to be used... Of course, for mobile applications client-side processing should be kept to a minimum.
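For point (b), a minimal sketch of designing to manage failure: retry a call a few times and fall back to a degraded answer instead of failing the user. The limits and names are illustrative.

```java
import java.util.concurrent.Callable;

// Minimal sketch of designing for failure rather than assuming it away:
// retry a call a few times and fall back to a degraded answer instead of
// propagating the error to the user.
public class RetryWithFallback {

    public static <T> T call(Callable<T> primary, T fallback, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return primary.call();
            } catch (Exception e) {
                // back off briefly before the next attempt (logging omitted)
                try {
                    Thread.sleep(200L * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return fallback;   // degraded but predictable behaviour
    }
}
```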

Thursday, March 04, 2010

Cynicism is not always bad...

“The Paradoxical Success of Aspect-Oriented Programming” by Friedrich Steimann includes a fantastic quote and graphic from an IEEE editorial by James Bezdek in IEEE Transactions on Fuzzy Systems.

Every new technology begins with naive euphoria—its inventor(s) are usually submersed in the ideas themselves; it is their immediate colleagues that experience most of the wild enthusiasm. Most technologies are overpromised, more often than not simply to generate funds to continue the work, for funding is an integral part of scientific development; without it, only the most imaginative and revolutionary ideas make it beyond the embryonic stage. Hype is a natural handmaiden to overpromise, and most technologies build rapidly to a peak of hype. Following this, there is almost always an overreaction to ideas that are not fully developed, and this inevitably leads to a crash of sorts, followed by a period of wallowing in the depths of cynicism. Many new technologies evolve to this point, and then fade away. The ones that survive do so because someone finds a good use (= true user benefit) for the basic ideas.

How true... Without cynicism, the true potential of a technological innovation cannot be discovered...

Note: I ended up on this while reading a post from Dennis Forbes in defence of SQL/RDBMS.

Wednesday, February 17, 2010

Where not to Cloud?

Latest buzz... Intel's cloud chip. It will have 48 cores and, as quoted by Intel, will deliver 10-20 times the power of today's chips...

Many more cloudy things are popping up every day... However, it would be good to know the shortcomings of the cloud and related technologies before jumping onto it like every other chap round the corner...

As rightly pointed out by Gojko Adzic in his excellent post, with cloud platforms all you have is "a bunch of cheap servers with poor IO".

He also mentions the key constraints of cloud deployment:

- All cloud servers are equally unreliable
- All servers will be equally impacted by network and IO constraints
- The network is fundamentally unreliable
- There is no fast shared storage

He goes on to give some fundamental guidelines for cloud deployments:

Partition, partition, partition: avoid funnels or single points of failure. Remember that all you have is a bunch of cheap web servers with poor IO. This will prevent bottlenecks and scoring an own-goal by designing a denial of service attack in the system yourself.

Plan on resources not being there for short periods of time. Break the system apart into pieces that work together, but can keep working in isolation at least for several minutes. This will help make the system resilient to networking issues and help with deployment.

Plan on any machine going down at any time. Build in mechanisms for automated recovery and reconfiguration of the cluster. We accept failure in hardware as a fact of life – that’s why people buy database servers with redundant disks and power supplies, and buy them in pairs. Designing applications for cloud deployment simply makes us accept this as a fact with software as well.
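As a rough sketch of that last guideline (any machine can go down, so build in automated recovery), here is a toy health checker that drops unreachable nodes from a pool so callers only ever see live servers. The probe, port and hostnames are my own assumptions, not from Gojko's post.

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal sketch of automated recovery/reconfiguration: probe each node and
// remove unreachable ones from the pool.
public class NodePool {
    private final List<String> liveNodes = new CopyOnWriteArrayList<>();

    public NodePool(List<String> nodes) {
        liveNodes.addAll(nodes);
    }

    public void healthCheck(int port) {
        for (String host : liveNodes) {
            if (!reachable(host, port)) {
                liveNodes.remove(host);   // automated removal of a failed machine
                System.out.println("removed failed node: " + host);
            }
        }
    }

    private boolean reachable(String host, int port) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 1000);  // 1s timeout
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public List<String> nodes() {
        return liveNodes;
    }
}
```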

Tuesday, October 27, 2009

Application Design or Hosting Strategy.. What should be addressed first?

Larry O'Brien recently interviewed three of the Gang of Four (GoF) on the applicability of design patterns to application design after 15 years. The consensus among the authors was that these patterns are more or less associated with object-oriented languages like C++, Java, Smalltalk and C#. Some of the current languages have different ways of solving the same problems (for example, functional languages have a different set of design principles/patterns). It makes a lot of sense to understand the different ways to solve a problem within your constraints before jumping onto something. Constraints can be of any nature (the language of choice, deployment options, computing resources available, etc.).

I am currently working on a solution (a transformation project) where the vendors' packaged applications and their technologies more or less decide the deployment architecture, sizing and infrastructure requirements. There are cases where server virtualization can add up to 50% overhead on the server infrastructure. So the question is: do you decide on the deployment/hosting strategy first (where and how you want to deploy your application) before designing it, or do you design the application and then decide the deployment strategy and infrastructure requirements?

With new paradigms in computing emerging every day (e.g. cloud, grid and space-based architectures, REST), applications can now be designed based on how you plan to host them (i.e. the most cost-effective way of deploying them). However, you are bound to fixed application designs when you are using packaged applications (most business application vendors like SAP and Oracle are still largely in the standard client-server or three-tiered architecture space) and cannot do much about it, as in my current project.

Normally infrastructure and operations are an afterthought, with no consideration given to them during application design. However, the trend is towards letting the available infrastructure options and operational requirements help drive the application design, thereby closing the gap between apps and ops in an organization.

Sunday, September 06, 2009

Amazon Virtual Private Cloud - A Silver Lining in the Cloud!

Cloud as a technology is gathering momentum. It is quite an onerous job to keep track of the developments every day, with cloud service providers mushrooming by the minute and lots of venture capital being thrown at the space. It is not uncommon for skeptics to expect a 'cloud burst' in the times to come.

Who does not want to be at the center of attention? Every vendor has put a substantial amount of their R&D budget into cloud offerings and research. There have been efforts by a number of organizations to 'standardize the cloud' with their own versions of standardization requirements around cloud resource definition, cloud federation, cloud interoperability et al. There are also a number of ongoing efforts, including by the US Government, to create communities and de-facto standards for cloud computing.

In spite of all the hype around the technology, many vendors have been working to make the cloud a feasible alternative for enterprises. In my opinion Amazon's latest effort around the Virtual Private Cloud (VPC), which allows customers to seamlessly extend their IT infrastructure into the cloud while maintaining the levels of isolation required for their enterprise management tools to do their work, is a step in the right direction.

Elasticity and pay-as-you-go are the two key requirements for any cloud platform. Until cloud platforms can truly prove themselves as extensions of an enterprise's existing data centers, leveraging the existing investments in tools and technologies, every IT decision maker has the difficult task of selling them to all stakeholders. Amazon CTO Werner Vogels has a good post introducing Amazon VPC.

Introducing Amazon Virtual Private Cloud

We have developed Amazon Virtual Private Cloud (Amazon VPC) to allow our customers to seamlessly extend their IT infrastructure into the cloud while maintaining the levels of isolation required for their enterprise management tools to do their work.

With Amazon VPC you can:

  • Create a Virtual Private Cloud and assign an IP address block to the VPC. The address block needs to be CIDR block such that it will be easy for your internal networking to route traffic to and from the VPC instance. These are addresses you own and control, most likely as part of your current datacenter addressing practice.
  • Divide the VPC addressing up into subnets in a manner that is convenient for managing the applications and services you want run in the VPC.
  • Create a VPN connection between the VPN Gateway that is part of the VPC instance and an IPSec-based VPN router on your own premises. Configure your internal routers such that traffic for the VPC address block will flow over the VPN.
  • Start adding AWS cloud resources to your VPC. These resources are fully isolated and can only communicate to other resources in the same VPC and with those resources accessible via the VPN router. Accessibility of other resources, including those on the public internet, is subject to the standard enterprise routing and firewall policies

Amazon VPC offers customers the best of both the cloud and the enterprise managed data center:

  • Full flexibility in creating a network layout in the cloud that complies with the manner in which IT resources are managed in your own infrastructure.
  • Isolating resources allocated in the cloud by only making them accessible through industry standard IPSec VPNs.
  • Familiar cloud paradigm to acquire and release resources on demand within your VPC, making sure that you only use those resources you really need.
  • Only pay for what you use. The resources that you place within a VPC are metered and billed using the familiar pay-as-you-go approach at the standard pricing levels published for all cloud customers. The creation of VPCs, subnets and VPN gateways is free of charge. VPN usage and VPN traffic are also priced at the familiar usage based structure
  • All the benefits from the cloud with respect to scalability and reliability, freeing up your engineers to work on things that really matter to your business.

Friday, May 08, 2009

Cloud Ecosystem - US Federal View

Peter Mell and Tim Grance of the National Institute of Standards and Technology, Information Technology Laboratory, have put forward the following definition of cloud computing in the draft NIST definition of Cloud Computing. This is the most exhaustive cloud definition I have seen to date.

Definition of Cloud Computing:

Cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is comprised of five key characteristics, three delivery models, and four deployment models.

Key Characteristics:

· On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed without requiring human interaction with each service’s provider.
· Ubiquitous network access. Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs).
· Location independent resource pooling. The provider’s computing resources are pooled to serve all consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. The customer generally has no control or knowledge over the exact location of the provided resources. Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.
· Rapid elasticity. Capabilities can be rapidly and elastically provisioned to quickly scale up and rapidly released to quickly scale down. To the consumer, the capabilities available for rent often appear to be infinite and can be purchased in any quantity at any time.
· Pay per use. Capabilities are charged using a metered, fee-for-service, or advertising based billing model to promote optimization of resource use. Examples are measuring the storage, bandwidth, and computing resources consumed and charging for the number of active user accounts per month. Clouds within an organization accrue cost between business units and may or may not use actual currency.
· Note: Cloud software takes full advantage of the cloud paradigm by being service oriented with a focus on statelessness, low coupling, modularity, and semantic interoperability.

Delivery Models:

· Cloud Software as a Service (SaaS). The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure and accessible from various client devices through a thin client interface such as a Web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure, network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.
· Cloud Platform as a Service (PaaS). The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created applications using programming languages and tools supported by the provider (e.g., java, python, .Net). The consumer does not manage or control the underlying cloud infrastructure, network, servers, operating systems, or storage, but the consumer has control over the deployed applications and possibly application hosting environment configurations.
· Cloud Infrastructure as a Service (IaaS). The capability provided to the consumer is to rent processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly select networking components (e.g., firewalls, load balancers).

Deployment Models:

· Private cloud. The cloud infrastructure is owned or leased by a single organization and is operated solely for that organization.
· Community cloud. The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).
· Public cloud. The cloud infrastructure is owned by an organization selling cloud services to the general public or to a large industry group.
· Hybrid cloud. The cloud infrastructure is a composition of two or more clouds (internal, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting).

Each deployment model instance has one of two types: internal or external. Internal clouds reside within an organizations network security perimeter and external clouds reside outside the same perimeter.

Tuesday, April 28, 2009

Rising from the ashes!

This is no spiritual awakening post, nor an essay on how to rebuild the world from the economic meltdown.

There have been numerous obituaries for SOA in the blogosphere in the recent past, some lamenting the untimely demise of an iconic superstar and pointing fingers at the economic recession. There have been many discussions on the topic, with views diverging from clear rebuttal to acceptance of the fact that the hype is over, at least the hype created by the vendors!

It is quite natural for things to fail and for the rebuilding process to drive newer things. This keeps the wheel moving. There have been instances where a new idea has been rejected outright, only to resurface after some time or lead to something new and disruptive.

It is a fact that not many IT decision makers are now interested in listening to a SOA eulogy and how it can bring transformational change to the business, aka business agility. In my view, rather than debating whether SOA is dead or alive, the industry should now focus on the learnings and the technologies that can help it move forward.

SOA has many offspring (e.g. cloud computing, RESTful services, mashups) that can change the way IT delivers value to business. I strongly believe that the focus and evangelism around SOA have helped build distributed solutions with the web as the 'common gateway'.

Distributed computing is going through a metamorphosis, with many new concepts like cloud computing, MapReduce, distributed file systems and non-relational distributed databases driving some of the new solutions and offerings. Others, like multicore computing, functional programming languages and software appliances, are also catching the imagination.

Cloud computing is going through a hype cycle similar to SOA's, with many terming it the next big thing while others reject it as just another fad. There have been many acronyms and offerings around the cloud (IaaS, PaaS, SaaS), and some vendors now have private cloud offerings for enterprises. A recent McKinsey report has done well to differentiate the cloud from cloud services and has some very good observations on its usefulness to an enterprise.

- Many cloud services are confused with cloud.

A true cloud has to comply with three key requirements.

a. It has to abstract the underlying hardware from the buyer

b. Be elastic in scaling to demand

c. Bill buyers on a pay-per-use basis

A cloud service complies with two key requirements.

a. It is a service where the underlying infrastructure is abstracted and can scale elastically

b. It could run on top of a cloud, although it is not required to (e.g., SaaS)

- Cloud offerings are most attractive for small and medium-sized enterprises and not cost-effective for larger enterprises.

- Larger enterprises can achieve server utilization rates similar to those of cloud providers by focusing on data center best practices (virtualization, service catalogs, etc.)

Tuesday, August 12, 2008

Who owns the Cloud???

Finally some good news... The USPTO (US Patent and Trademark Office) has withdrawn the notice of allowance initially issued to Dell for the 'cloud computing' trademark. This is one of the most absurd attempts to claim ownership of an idea I have heard of since US-based Bikram Choudhury's move to copyright his method of teaching yoga.

Of late many organizations have been jumping onto the cloud computing bandwagon, be it Sun, Microsoft, IBM (Blue Cloud) or the latest entrants Yahoo and AT&T. Last week, Intel, Yahoo, HP, and an international trio of research institutions announced a joint cloud-computing research initiative. The ambitious six-site project is aimed at developing an Internet-based computing infrastructure stable enough to host companies' most critical data-processing tasks. Many startups have also jumped into the fray to cash in on the latest buzz.

There are many offerings around cloud infrastructure (e.g. Amazon EC2 & S3), hosted applications (e.g. Salesforce), application integration (e.g. the Boomi on Demand platform) and so on. There are also open source offerings like Eucalyptus, an open-source software infrastructure for implementing cloud computing.

Nevertheless, there are still doubts about how cloud computing is different from the grid, or whether it is just old wine in a new bottle. The doubts are not misplaced, as many vendors are now rebranding their grid and virtualization offerings as cloud offerings.

Clustering, Grid and Cloud Computing have some overlapping concepts but are yet different from each other.

Clustering is a fault-tolerant server technology, similar to redundant servers, except that each server takes part in processing the services requested.

Grid computing is another load-balanced parallel means of massive computation, similar to clusters, but implemented with loosely coupled systems that may join and leave the grid randomly.

Cloud computing is the most recent successor to grid computing, utility computing, virtualization and clustering. Cloud computing overlaps all these concepts, but has its own meaning: the ability to connect to infrastructure, software and data on the web (the cloud) instead of on your hard drive or local network.

Grid computing typically involves a small number of users requesting big chunks of resources from a homogeneous environment. Cloud computing involves a large number of users with relatively low resource requirements drawing on a heterogeneous environment. Grid is about doing more with less, whereas cloud is about doing more with more.

As this space gets more crowded, and with all the money being pumped in, it remains to be seen whether cloud computing survives the initial hype and proves to be a classic disruptive technology, as predicted by many industry analysts.

Thursday, June 05, 2008

My Tryst with Erlang

Of late I have been trying to learn some new dynamic/functional languages. With all the noise about RoR (Ruby on Rails) in the last few years and the changes it has brought to web application development, it was natural to try my hand at Ruby/Rails. Being a REST enthusiast, and with Rails 2.0 providing support for RESTful services, I looked at implementing some services using Rails.

In recent times there have been many debates about the scalability of Rails applications, and Ruby's concurrency support is in general fairly weak, so I looked in other directions for a better language to adopt. The ones that have caught my attention in the last few months are Erlang and Scala. Having been a Java developer for the last several years, it would have been easy for me to go for Scala. Scala is an OO and functional hybrid that runs on the JVM and has access to Java libraries. It is like having the best of both worlds!

But I admit to having been smitten by Erlang for its simplicity, and for being a language written with the basic non-functional requirements of an enterprise system in mind (reliability, availability, concurrency and scalability). Scala does implement some of Erlang's key features (e.g. actor libraries for concurrency), albeit in different ways, but it is still not in the same league. In the same breath I have to admit that Scala would definitely be a good replacement for Java. This article from Yariv Sadan, an Erlang developer with ErlyWeb, Twoorl etc. to his credit, is a great read on the differences between Erlang and Scala. Steve Vinoski, an Erlang convert, has also put down some of his thoughts on his blog.

Coming to my experiences with Erlang, I have been playing with it for a few weeks, trying to understand the basic concepts, and have written a few sample programs. I am quite impressed by its simplicity when it comes to concurrency, error handling and so on. Having been in EAI consulting for quite some time, I have also looked at some of the integration frameworks written in Erlang, like RabbitMQ.

Last week I was trying to install Erlang on my MacBook and it was a really tough job. There are no direct OS X distributions of Erlang available. I had to install it using MacPorts, and it took me some time to understand the concepts and install Erlang, Yaws (a web server written in Erlang) and ErlyWeb (a Rails-type web framework written in Erlang). Initially I had issues (compiler issue: cannot create executables) with the Xcode version (I had Xcode 2.5, which is supposed to be compatible with both Tiger and Leopard), so I upgraded it to 3.0. After developing my first web application, I upgraded my OS to OS X 10.5.3 (the new Leopard version). I have not been able to start Erlang on my MacBook since; it gives me this silly "Bus Error".

I have seen this issue reported in the MacPorts forum. I hope they fix it quickly so that I can venture into the beautiful world of Erlang again.

Monday, January 07, 2008

Control without Controlling!

An informative article by Steve Vinoski on serendipitous reuse, and how REST and similar architectural styles nurture it.

He concludes by emphasizing that well-constrained architectural styles can be the right recipe for EA success (control without controlling!).


It's highly ironic that many enterprise architects seek to impose centralized control over their distributed organizations. In many cases, such centralization is a sure recipe for failure. A proven framework based on a well-constrained architectural style like REST allows for decentralized development that, because of the architectural constraints, still yields consistency. The Web itself is proof that this form of "control without controlling" works. In the long run, this approach is far more likely to achieve what architects seek than trying to enforce collections of ad hoc governance rules.

Wednesday, January 02, 2008

Advice from Bruce..

Bruce Eckel has an excellent blog post based on his commencement address for Neumont University.

Some interesting points mentioned...

50-80% of programming projects fail. These numbers are so broad because people don't brag about their failures, so we have to guess. In any event, this makes the world sound pretty unreliable

5% of programmers are 20x more productive than the other 95%

You must learn continuously and teach yourself new technologies

Code is read much more than it is written. If people can't read your story, they can't improve it or fix it. Unreadable code has a real cost, and we call it 'technical debt'.

Code reviews are the most effective ways to find software defects, and yet we usually 'don't have time for them'.

Here are some more which many people have seriously believed (myths):
- Companies don't have to make a profit anymore. It's the new economy.
- Real estate always goes up, even if salaries don't.
- Or even: A university must be a traditional campus and not an office building.

Wednesday, September 19, 2007

Common Information Model, Bane for Service Re-engineering!

In this brilliant article titled "Classic SOA", Dan North discusses a technology-agnostic way of designing services. He is spot on in his remark that the vendors are making SOA look a lot more complex than it actually is in order to sell their products and solutions.

He writes

"Naturally it is in the vendors’ interest to emphasize the complexity of SOA and then provide a timely and profitable solution. This leads many systems architects into a technology-centric view of SOA, when, in fact, the most important criteria for a service-oriented architect — before tackling the technology — should be a keen understanding of the business.

I am also quite impressed by his explanation of how a single domain model makes little sense for both the consumer and the provider of the services, and how re-engineering becomes difficult because of the tight domain model coupling.

He emphasises the use of "business concepts", effectively a higher-level, ubiquitous language that ties together all of the finer-grained domain models behind each service.

He goes on to add

"The service contract is then expressed in terms of enterprise-level business concepts, such as a vacation or a dispatch or a sales order, which again decouples the service consumer from the service provider and allows them to evolve independently, while still able to communicate in a common language. The mistake that enterprise information architects (or people with similarly named roles) make is trying to define what the business concept means to each of the people using it"

Thursday, September 06, 2007

Sun Java CAPS and OpenESB

As expected, future versions of Java CAPS will include the "OpenESB framework and components". I am quite interested to see what the Java CAPS 5.2 version will look like. I have been a strong supporter of JBI and have been using Apache ServiceMix extensively, though I have to admit that I have not used OpenESB to the same extent. It would be nice to see how the OpenESB JBI container and other JBI components will be integrated with existing components like eGate Integrator, eInsight etc. (from the SeeBeyond acquisition). This will also help the existing SeeBeyond customers, who are quite anxious about the future roadmap of Java CAPS.

In this post Sun's Fred Aabedi writes about the value add this will provide to the new and existing customers.

In his words..

"The merger of the OpenESB framework and some of JBI components into CAPS 5.2 brings exciting new possibilities to both our existing and new customers in the integration, composite applications, and SOA domains. A consolidated runtime environment based on the world class Glassfish platform allows the interoperability of our classic Java EE based components and new JBI based components. This combination is quite powerful and provides a lot of new options to our customers to solve their integration problems and build killer composite applications. Customers realize this ability to leverage existing proven solutions along with leading edge technologies by taking advantage of the Bi-directional Re-use features (JBI Bridge) that allow interoperability between the Java EE and JBI based components. In addition, standardization on the NetBeans 6.0 platform for all of the Java CAPS tooling gives developers a proven and effective platform on which to develop enterprise solutions."

Thursday, August 16, 2007

What's new in SOA Testing?

SOA Testing.. How different is this from the Testing we have been doing all along?

SOA testing is understood to be testing the building blocks of SOA, the services. This area is evolving day by day as SOA gets accepted as the mantra for business agility. There are many products in the market that help automate the testing of services. It would be fair to say that these products/tools are more or less web service centric.
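At its simplest, the kind of check these tools automate looks roughly like the sketch below: call a service endpoint over HTTP and assert on the status code and payload. The URL and the expected content are placeholders, not from any particular product.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Minimal sketch of an automated service smoke test over plain HTTP.
public class OrderServiceSmokeTest {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8080/services/orders/12345");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(5000);

        if (conn.getResponseCode() != 200) {
            throw new AssertionError("expected HTTP 200, got " + conn.getResponseCode());
        }
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String body = in.readLine();
            if (body == null || !body.contains("orderNumber")) {
                throw new AssertionError("response did not contain an order number");
            }
        }
    }
}
```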

In an article titled Adjusting Testing for SOA, David Linthicum talks about the changes in testing approach that SOA brings to the table.

He concentrates on testing services (testing the core) and mentions the complexities that service security and governance can bring to testing, without elaborating on them.

In his words...

"Considering that, when testing services (for example, Web services, Java EE, etc.) you have to think about a few things, including autonomy, integration, granularity, stability and performance, in the particular order of your requirements"

Miko Matsumura, in his article SOA Testing Hubub, extends the concept to SOA governance.

He adds..

"As such, the testing group is a party concerned with the concept of quality. Therefore thier ability to create policy assertions that define their concerns and expectations around quality creates their participation in governance. Now the “enforcement point” for testing may be a quality system such as the Mindreefs, iTKO, Parasoft, PushToTest, or Solstice type system"

Wednesday, August 08, 2007

Mashups and EAI

In an interesting article, Gregor Hohpe (Google architect, of EIP fame) describes how EAI patterns and concepts can be used while building mashups.

This to some extent explains why major vendors in the EAI space (e.g. TIBCO, with its GI acquisition) are looking for offerings in this space.

It is not simple to differentiate mashups from composite apps. Conceptually they are the same; the only difference is in their scope. Mashups are often built ad hoc and then integrated using simple protocols like RSS and Atom, and they are used in the context of Web 2.0. Mashups pull data from different sources, then aggregate and transform that data to be used in different contexts (a small sketch of this pull-and-aggregate pattern follows the comparison below).

Mashups: REST/XML, JSON, ad hoc, bottom-up, easy to change, low expectations, built by user

Composite Apps: SOA/WS-*, planned, top-down, more static, (too) high expectations, built by IT
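Here is the promised sketch of the pull-and-aggregate pattern: fetch two RSS feeds, extract the item titles, and merge them into one list a page could render. The feed URLs are placeholders and error handling is omitted.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Minimal mashup sketch: pull RSS from two sources and aggregate the titles.
public class FeedMashup {
    public static void main(String[] args) throws Exception {
        List<String> headlines = new ArrayList<>();
        headlines.addAll(titles("http://example.com/news/rss"));
        headlines.addAll(titles("http://example.org/blog/rss"));
        headlines.forEach(System.out::println);   // aggregated view
    }

    static List<String> titles(String feedUrl) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new URL(feedUrl).openStream());
        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");      // RSS 2.0 items
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            NodeList title = item.getElementsByTagName("title");
            if (title.getLength() > 0) {
                titles.add(title.item(0).getTextContent());
            }
        }
        return titles;
    }
}
```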

If you want to know more about Mashups, have a look at this tutorial from the same author.

Heard of Yahoo Pipes? Experiment with it and have fun.

Thursday, May 03, 2007

Service Harvesting

SOA is expected to give corporates, which have a love-hate relationship with their monolithic legacy applications, a new lease of life. On one hand, management of these legacy applications, mainly on mainframes, has become the single largest outlay of IT funds; on the other hand, the amoebic growth of these applications over time (in terms of functionality, business processes and data) has made it nearly impossible to replace them. To complicate matters, organizations are also losing the people who are the so-called "knowledge centers" for these applications. Organizations also face numerous challenges in terms of business agility, which can be traced back to the inflexible nature of these applications and their embedded business logic.

The best possible way out for these organizations lies in creating services out of these applications. These services can then be reused to develop new business processes or modify existing business processes to cater to changing business needs.

But how does one go about this? Is there any standard methodology or approach? I have discussed the top-down and bottom-up approaches in one of my earlier posts. Obviously both of them have their limitations and may make the whole initiative bite the dust. No effort to identify and create services can succeed without understanding the existing applications and their routines.

In his article Finding Services in the Mainframe, Mike Oara, CTO of Relativity Technologies, discusses a "meet-in-the-middle" approach for service identification in mainframe applications. I am sure this can be applied to any kind of legacy application (mainframe or non-mainframe). He also talks about identifying potential services and harvesting them.

According to him...

"In this approach the service modeling team and the mainframe application experts work together to identify potential services that are both useful and feasible, given the existing legacy constraints"

He defines potential services as ...

"Application artifacts which alone or combined have all the characteristics of services"

He also goes on to suggest a few ways to dig up functionality from these applications, and delves into debatable topics like "service composition" and "service granularity". Finally he talks about the importance of interactions and negotiations between the "mainframe" and "distributed applications" communities.

IMHO, there are still quite a few gray areas around service design. There are two schools of thought advocating different approaches:

- Business Transaction Approach
- Logical Data View Approach

In the logical data view approach, a service ends up being defined with mostly CRUD-type operations. I am not convinced that this is the right approach for designing services.

My preference would be to design services based on business transactions rather than logical data views. But within the constraints of legacy applications, this option may prove to be a roadblock.

When developing new applications, on the other hand, it would be appropriate to go for the business transaction approach and keep each service's data close to the service. Talking about services and data, I am reminded of the blog post "SOA Question: should we carve service independence into the database?" by Nick Malik. He mentioned that...

"If the services are designed well, there should be no cause for a single transaction that adds data under two different services."

This is only possible if we provide for data redundancy across services and synchronize them.
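A minimal sketch of what that could look like: each service owns its data, and the redundant copies are kept in sync through business events rather than a shared database. All the class and event names are illustrative, not from Nick Malik's post.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A business event carrying the change that other services need to know about.
class CustomerAddressChanged {
    final String customerId;
    final String newAddress;
    CustomerAddressChanged(String customerId, String newAddress) {
        this.customerId = customerId;
        this.newAddress = newAddress;
    }
}

class CustomerService {
    private final List<ShippingService> subscribers = new ArrayList<>();

    void subscribe(ShippingService s) { subscribers.add(s); }

    void changeAddress(String customerId, String newAddress) {
        // 1. update this service's own store (omitted)
        // 2. publish the business event so other services can sync their copies
        CustomerAddressChanged event = new CustomerAddressChanged(customerId, newAddress);
        for (ShippingService s : subscribers) {
            s.on(event);
        }
    }
}

class ShippingService {
    // redundant, locally-owned copy of the data this service needs
    private final Map<String, String> addressesByCustomer = new HashMap<>();

    void on(CustomerAddressChanged event) {
        addressesByCustomer.put(event.customerId, event.newAddress);
    }
}
```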

Friday, April 27, 2007

Mashups and Unified Desktops

A few days back Todd Biske wrote this nice blog post on composite applications, mashups, widgets and gadgets. I think the industry is now somewhat in agreement that composite apps and mashups mean the same thing, but 'mashup' is used more in the context of web-based presentation of composite applications. A lot is being said about enterprise mashups at present. I think the simplicity of the technology (it's all about JavaScript, DHTML and XML) will slowly eat away at the portal technology market.

There seems to be a growing demand in organizations for creating a "unified desktop" for employees. The requirement is to create a single user interface to access the basic features of all kinds of applications in the enterprise. Some time back I was dealing with such a requirement from a leading online trading firm: build a "unified agent desktop" for the customer care representatives, with case management workflows along with access to the multiple applications they use to manage the cases.

Dashboard gadgets and widgets should hopefully make this simpler in the future. Application vendors should provide such application widgets/gadgets to their customers along with the applications, or, given their simplicity, organizations can develop them themselves.

Wednesday, April 25, 2007

Mule jBPM Connector

Mule 1.4 comes with a BPM connector. This connector can integrate with BPM engines that provide a Java API. If there is a requirement to integrate with a BPEL engine, standard web services over the SOAP transport (Axis/XFire) can be used.

jBPM is the first BPM engine that comes out of the box with Mule 1.4.

It looks pretty cool. You can easily write your long-running integration processes in jPDL.

I hope to see integration support for the JBoss Rules (Drools) engine in the future.

Tuesday, April 24, 2007

Are you looking at implementing SOA as part of your project?

Recently, while going through a set of questions my colleagues had asked a customer regarding an RFP, I saw this question. This was an RFP for a packaged software implementation, and the question was asked in the context of integration requirements with the other applications in the enterprise.

I bring this up to point out how product vendors and system integrators confuse customers with questions like this. Understanding of the concept is of paramount importance for its adoption. Dave Linthicum has clearly emphasized this in his blog, where he talks about the need for the vendors (I would add the system integrators) to go to SOA school.

http://weblog.infoworld.com/realworldsoa/archives/2007/03/soa_vendors_nee.html

There are other industry leaders who have rightly pointed out how SOA has become a "goal" rather than a "means" to achieve business agility. There are certain organizations who talk about the ROI of their SOA initiative, etc... IMHO, rather than starting SOA initiatives, which have become a new name for implementing web services and a service registry (or maybe an ESB), organizations should try to inculcate the style across the organization. Microsoft enterprise architect Nick Malik has written a very good post (http://blogs.msdn.com/nickmalik/archive/2007/01/16/your-soa-is-jabows-just-a-bunch-of-web-services-and-i-can-prove-it.aspx), in which he points out where the maximum benefits of SOA lie in an enterprise.

A few lines from his post..

"IT projects provide tools for business processes. They automate parts of a business process or collect information or reduce errors. The point is… which processes? In the past, traditional IT only succeeded with the processes that changed rarely"

"The problem is that there is a long list of business processes that occur frequently but that are more difficult to automate because they change frequently"

"That is the SOA sweet spot"

In most cases we do not try to understand what the customer requires and what their pain points are before suggesting a solution. Therefore, "SOA" has become a "magic wand" for the sales guys and consultants.

Tuesday, April 10, 2007

Business Activity Monitoring (BAM) .. Key components

With the amount of industry buzz around SOA and technologies like MDM, BPM and BAM, it would be difficult to find an IT leader who does not want the benefits of BAM in their organization. There are quite a few products in this segment that will catch your eye. I will discuss the key components of a BAM tool/framework in this post.

The key for any enterprise is to understand its business events (external and internal) and treat them accordingly. Certain business events help you gauge the health of the business; others, taken together, can tell you about performance, trends and so on. This is nothing but business intelligence, in fact real-time business intelligence.

Ideally a BAM tool would have three components:

1. Event Absorption Layer also known as the Data Collector Layer
2. Event Processing Layer
3. Delivery Layer

The event absorption layer is the one that collects business events from across the enterprise. If you look at the application landscape of any organization, you will find a number of applications (legacy, custom developed, packaged apps, etc.), and in many cases business processes and business logic lie embedded in these applications. Data collection can use either a push or a pull model: applications pushing data, or the BAM framework pulling data from these sources. As it is difficult to collect data from such diverse sources, integration platforms and technologies play a vital role here. Due care should be taken that the event collection is as non-invasive as possible, so as not to affect the performance of the business applications.

The event processing layer has the job of analyzing and correlating these events based on rules and assumptions. Key Performance Indicators (KPIs) are nothing but event data filtered and correlated according to some rules. This layer consists of a rules engine, an analytics engine and a predictive (fingerprinting) engine; the predictive engine uses the analyzed and correlated business event data to make predictions.

Finally, the delivery layer delivers the results to the end users. There should be a notification/alert engine to notify users over different channels when required, and a portal (preferably web-based) where users can look at the filtered, correlated and analyzed data.
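To tie the three layers together, here is a toy, in-memory sketch: applications push events onto a queue (absorption), a processor turns them into a simple count-based KPI (processing), and a threshold triggers an alert (delivery). The event names and threshold are illustrative; a real BAM framework would of course sit on proper integration, rules and analytics infrastructure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy end-to-end BAM pipeline: absorb -> process/correlate -> deliver.
public class MiniBam {
    // 1. Event absorption layer: applications just drop events here.
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    // 2. Event processing layer: a count per event type acts as a toy KPI.
    private final Map<String, Integer> kpis = new HashMap<>();

    public void publish(String eventType) {
        events.offer(eventType);
    }

    public void processOnce() throws InterruptedException {
        String type = events.take();
        int count = kpis.merge(type, 1, Integer::sum);
        // 3. Delivery layer: alert when a KPI crosses a threshold.
        if (type.equals("ORDER_FAILED") && count >= 5) {
            System.out.println("ALERT: " + count + " failed orders observed");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MiniBam bam = new MiniBam();
        for (int i = 0; i < 5; i++) bam.publish("ORDER_FAILED");
        for (int i = 0; i < 5; i++) bam.processOnce();
    }
}
```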