Tuesday, October 21, 2008

Business Rules Engine - BRE

Let's first see what a business rule is, and then we can get to know the Business Rules Engine.

Business rules are the operations, definitions and constraints that apply to an organization in achieving its goals. Simply put, business rules represent the decision-making policies of an organization, for example: "All customers that spend more than $100 at one time will receive a 10% discount."
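As a toy illustration (plain Java; the method name is invented, the numbers come straight from the example policy above), that discount rule is just a condition and an action:

```java
public class DiscountRule {
    // Policy: orders over $100 receive a 10% discount.
    static final double THRESHOLD = 100.0;
    static final double DISCOUNT = 0.10;

    // Hypothetical helper: amount payable after the rule is applied.
    static double applyDiscount(double orderTotal) {
        if (orderTotal > THRESHOLD) {           // condition
            return orderTotal * (1 - DISCOUNT); // action
        }
        return orderTotal;
    }

    public static void main(String[] args) {
        System.out.println(applyDiscount(250.0)); // discounted
        System.out.println(applyDiscount(80.0));  // unchanged
    }
}
```

The point of a BRMS is that a policy like this lives outside the application code, so the threshold or percentage can change without recompiling anything.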

Business rules represent a natural step in the application of computer technology aimed at enhancing productivity in the workplace. Automated business processes that have business logic embedded inside often take substantial time to change, and such changes can be prone to errors. And in a world where the life cycle of business models has greatly shortened, it has become increasingly critical to be able to adapt to changes in external environments promptly. These needs are addressed by a business rules approach.

Business rules enhance business agility. And the manageability of business processes also increases as rules become more accessible.

A BRMS or Business Rule Management System is a software system used to define, deploy, execute, monitor and maintain the variety and complexity of decision logic that is used by operational systems within an organization or enterprise. This logic, also referred to as business rules, includes policies, requirements, and conditional statements that are used to determine the tactical actions that take place in applications and systems.


A BRMS includes, at minimum:

* A repository, allowing decision logic to be externalized from core application code
* Tools, allowing both technical developers and business experts to define and manage decision logic
* A runtime environment, allowing applications to invoke decision logic managed within the BRMS and execute it using a business rules engine

The top benefits of a BRMS include:

* Reduced or removed reliance on IT departments for changes in live systems
* Increased control over implemented decision logic for compliance and better business management
* The ability to express decision logic with increased precision, using a business vocabulary syntax and graphical rule representations (decision tables, trees, scorecards and flows)
* Improved efficiency of processes through increased decision automation
* Rules are externalized and easily shared among multiple applications

Then how is it designed???

Many organizations' rules efforts combine aspects of what is generally considered work-flow design with traditional rule design. This failure to separate the two approaches can lead to problems with the ability to re-use and control both business rules and workflows. Design approaches that avoid this quandary separate the role of business rules and work flows as follows:

Business rules produce knowledge (~=data) ; work flows perform business work (~=function). Concretely, that means that a business rule may do things like detect that a business situation has occurred and raise a business event (typically carried via a messaging infrastructure) or create higher level business knowledge (e.g., evaluating the series of organizational, product, and regulatory-based rules concerning whether or not a loan meets underwriting criteria). On the other hand, a work flow would respond to an event that indicated something such as the overloading of a routing point by initiating a series of activities.
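A minimal sketch of that division of labor, in plain Java with invented names (the "RouterOverloaded" event and the in-memory queue stand in for a real messaging infrastructure):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the separation described above: the rule only *detects* a
// situation and raises a business event; a workflow decides what work to do.
public class RuleVsWorkflow {
    static final BlockingQueue<String> eventBus = new ArrayBlockingQueue<>(16);

    // Business rule: detect a situation and raise an event. No work done here.
    static void routerLoadRule(int queuedMessages) throws InterruptedException {
        if (queuedMessages > 1000) {
            eventBus.put("RouterOverloaded");
        }
    }

    // Workflow: reacts to the event by initiating activities.
    static String workflowStep() throws InterruptedException {
        String event = eventBus.take();
        if ("RouterOverloaded".equals(event)) {
            return "rerouting traffic"; // stands in for a series of activities
        }
        return "no action";
    }

    public static void main(String[] args) throws InterruptedException {
        routerLoadRule(1500);
        System.out.println(workflowStep()); // prints: rerouting traffic
    }
}
```

Because the rule publishes only the event, any number of different workflows could subscribe to "RouterOverloaded" without the rule changing.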

This separation is important because the same business judgment (mortgage meets underwriting criteria) or business event (router is overloaded) can be reacted to by many different work flows. Embedding the work done in response to rule-driven knowledge creation into the rule itself greatly reduces the ability of business rules to be reused across an organization because it makes them work-flow specific.

To deliver this type of architecture it is essential to establish the integration between a BPM (Business Process Management) and BRM (Business Rules Management) platform that is based upon processes responding to events or examining business judgments that are defined by business rules. There are some products in the marketplace that provide this integration natively. In other situations this type of abstraction and integration will have to be developed within a particular project or organization.

Most Java-based rules engines provide a technical call-level interface, based on the JSR-94 application programming interface (API) standard, in order to allow for integration with different applications, and many rule engines allow for service-oriented integrations through Web-based standards such as WSDL and SOAP.

Most rule engines supply the ability to develop a data abstraction that represents the business entities and relationships that rules should be written against. This business entity model can typically be populated from a variety of sources including XML, POJOs, flat files, etc. There is no standard language for writing the rules themselves. Many engines use a Java-like syntax, while some allow the definition of custom business friendly languages.

Most rules engines function as a callable library. However, it is becoming more popular for them to run as a generic process, akin to the way that RDBMSs behave. Most engines treat rules as configuration to be loaded into their process instance, although some are actually code generators for the whole rule execution instance, and others allow the user to choose.

Types of rule engines

There are two different classes of rule engines, both of which are usually forward chaining. The first class processes so-called production/inference rules. These types of rules are used to represent behaviors of the type IF condition THEN action. For example, such a rule could answer the question: "Should this customer be allowed a mortgage?" by executing rules of the form "IF some-condition THEN allow-customer-a-mortgage".
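A stateless production rule of that shape can be sketched in a few lines of plain Java; the fact names and thresholds here are invented for illustration:

```java
import java.util.Map;

// Minimal sketch of "IF some-condition THEN allow-customer-a-mortgage".
// The facts map and field names are illustrative, not a real engine's API.
public class MortgageRules {
    static boolean allowMortgage(Map<String, Integer> facts) {
        // IF creditScore >= 650 AND debtRatio <= 40 THEN allow
        return facts.get("creditScore") >= 650 && facts.get("debtRatio") <= 40;
    }

    public static void main(String[] args) {
        System.out.println(allowMortgage(Map.of("creditScore", 700, "debtRatio", 30))); // allowed
        System.out.println(allowMortgage(Map.of("creditScore", 600, "debtRatio", 30))); // denied
    }
}
```

A real production rule engine externalizes such conditions from the code, but the IF-condition-THEN-action shape is exactly this.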

The other type of rule engine processes so-called reaction/Event Condition Action rules. The reactive rule engines detect and react to incoming events and process event patterns. For example, a reactive rule engine could be used to alert a manager when certain items are out of stock.

The biggest difference between these types is that production rule engines execute when a user or application invokes them, usually in a stateless manner, while a reactive rule engine reacts automatically when events occur, usually in a stateful manner. Many (and indeed most) popular commercial rule engines have both production and reaction rule capabilities, although they might emphasize one class over the other. For example, most business rules engines are primarily production rules engines, whereas Complex Event Processing rules engines emphasize reaction rules.

In computer science, and specifically the branches of knowledge engineering and artificial intelligence, an inference engine is a computer program that tries to derive answers from a knowledge base. It is the "brain" that expert systems use to reason about the information in the knowledge base for the ultimate purpose of formulating new conclusions. Inference engines are considered to be a special case of reasoning engines, which can use more general methods of reasoning.

Inference Engine - Architecture

The separation of inference engines as a distinct software component stems from the typical production system architecture. This architecture relies on a data store, or working memory, serving as a global database of symbols representing facts or assertions about the problem; on a set of rules which constitute the program, stored in a rule memory or production memory; and on an inference engine, required to execute the rules. (Executing rules is also referred to as firing rules.) The inference engine must determine which rules are relevant to a given data store configuration and choose which one(s) to apply. The control strategy used to select rules is often called conflict resolution.

An inference engine has three main elements. They are:

1. An interpreter. The interpreter executes the chosen agenda items by applying the corresponding base rules.
2. A scheduler. The scheduler maintains control over the agenda by estimating the effects of applying inference rules in light of item priorities or other criteria on the agenda.
3. A consistency enforcer. The consistency enforcer attempts to maintain a consistent representation of the emerging solution.

The recognize-act cycle

The inference engine can be described as a form of finite state machine with a cycle consisting of three action states: match rules, select rules, and execute rules.

In the first state, match rules, the inference engine finds all of the rules that are satisfied by the current contents of the data store. When rules are in the typical condition-action form, this means testing the conditions against the working memory. The rule matchings that are found are all candidates for execution: they are collectively referred to as the conflict set. Note that the same rule may appear several times in the conflict set if it matches different subsets of data items. The pair of a rule and a subset of matching data items is called an instantiation of the rule.

In many applications, where large volumes of data are concerned and/or when performance time considerations are critical, the computation of the conflict set is a non-trivial problem. Earlier research work on inference engines focused on better algorithms for matching rules to data. The Rete algorithm, developed by Charles Forgy, is an example of such a matching algorithm; it was used in the OPS series of production system languages. Daniel P. Miranker later improved on Rete with another algorithm, TREAT, which combined it with optimization techniques derived from relational database systems.

The inference engine then passes the conflict set along to the second state, select rules. In this state, the inference engine applies some selection strategy to determine which rules will actually be executed. The selection strategy can be hard-coded into the engine or may be specified as part of the model. In the larger context of AI, these selection strategies are often referred to as heuristics, following Allen Newell's Unified Theories of Cognition.

In OPS5, for instance, a choice of two conflict resolution strategies is presented to the programmer. The LEX strategy orders instantiations on the basis of recency of the time tags attached to their data items. Instantiations with data items having recently matched rules in previous cycles are considered with higher priority. Within this ordering, instantiations are further sorted on the complexity of the conditions in the rule. The other strategy, MEA, puts special emphasis on the recency of working memory elements that match the first condition of the rule. (The latter heuristic is heavily used in means-ends analysis.)
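A rough sketch of LEX-style ordering in Java (this simplifies the real OPS5 strategy considerably: here an instantiation carries only its newest time tag and a condition count):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy conflict resolution: prefer instantiations whose matched data items
// carry the most recent time tags, breaking ties by rule complexity
// (number of conditions). The rule names are invented.
public class LexOrdering {
    record Instantiation(String rule, long newestTimeTag, int conditionCount) {}

    static List<Instantiation> order(List<Instantiation> conflictSet) {
        List<Instantiation> sorted = new ArrayList<>(conflictSet);
        sorted.sort(Comparator
            .comparingLong(Instantiation::newestTimeTag).reversed()
            .thenComparing(Comparator.comparingInt(Instantiation::conditionCount).reversed()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Instantiation> cs = List.of(
            new Instantiation("r1", 5, 2),
            new Instantiation("r2", 9, 1),
            new Instantiation("r3", 9, 3));
        // r3 wins: it shares the most recent time tag with r2 but has more conditions.
        System.out.println(order(cs).get(0).rule()); // prints: r3
    }
}
```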

Finally the selected instantiations are passed over to the third state, execute rules. The inference engine executes or fires the selected rules, with the instantiation's data items as parameters. Usually the actions in the right-hand side of a rule change the data store, but they may also trigger further processing outside of the inference engine (interacting with users through a graphical user interface or calling local or remote programs, for instance). Since the data store is usually updated by firing rules, a different set of rules will match during the next cycle after these actions are performed.

The inference engine then cycles back to the first state and is ready to start over again. This control mechanism is referred to as the recognize-act cycle. The inference engine stops either on a given number of cycles, controlled by the operator, or on a quiescent state of the data store when no rules match the data.
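The match/select/execute loop just described can be sketched as a toy engine in plain Java; the rules and facts are invented, and selection is simply "first match" rather than a real conflict-resolution strategy:

```java
import java.util.*;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Minimal recognize-act loop over a working memory of string facts.
public class RecognizeAct {
    record Rule(String name, Predicate<Set<String>> condition, Consumer<Set<String>> action) {}

    static void run(List<Rule> rules, Set<String> memory, int maxCycles) {
        for (int cycle = 0; cycle < maxCycles; cycle++) {
            // 1. match: every satisfied rule forms the conflict set
            List<Rule> conflictSet = rules.stream()
                .filter(r -> r.condition().test(memory)).toList();
            if (conflictSet.isEmpty()) return; // quiescent state: stop
            // 2. select: a real engine applies conflict resolution here
            Rule chosen = conflictSet.get(0);
            // 3. execute: firing the rule may change the data store
            chosen.action().accept(memory);
        }
    }

    public static void main(String[] args) {
        Set<String> memory = new HashSet<>(Set.of("order-received"));
        List<Rule> rules = List.of(
            new Rule("check-stock",
                m -> m.contains("order-received") && !m.contains("stock-checked"),
                m -> m.add("stock-checked")),
            new Rule("ship",
                m -> m.contains("stock-checked") && !m.contains("shipped"),
                m -> m.add("shipped")));
        run(rules, memory, 10);
        System.out.println(memory.contains("shipped")); // prints: true
    }
}
```

Note how both stopping conditions from the text appear: the `maxCycles` bound set by the operator, and the quiescent state when no rules match.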

Data-driven computation versus procedural control

The inference engine control is based on the frequent reevaluation of the data store states, not on any static control structure of the program. The computation is often qualified as data-driven or pattern-directed in contrast to the more traditional procedural control. Rules can communicate with one another only by way of the data, whereas in traditional programming languages procedures and functions explicitly call one another. Unlike instructions, rules are not executed sequentially and it is not always possible to determine through inspection of a set of rules which rule will be executed first or cause the inference engine to terminate.

In contrast to a procedural computation, in which knowledge about the problem domain is mixed in with instructions about the flow of control—although object-oriented programming languages mitigate this entanglement—the inference engine model allows a more complete separation of the knowledge (in the rules) from the control (the inference engine).

** All the content on this blog is extracted from various sources. I am just trying to understand it and present it in my own fashion. Sometimes, if I feel the original itself is good, I reproduce it as it is. Hence all credit goes to the original content writers...

Business Process Management - BPM

What is a Business Process ??

A business process is a collection of interrelated tasks which accomplish a particular goal for an organization. There are three types of business processes:
1. Management processes: These are processes that govern the operations of a system. Typical management processes include "Corporate Governance" and "Strategic Management".

2. Operational processes: These are the processes that constitute the core business and create the primary value stream. Typical operational processes are Purchasing, Manufacturing, Marketing, and Sales.

3. Supporting processes: These are the processes to support the core processes.
Examples include Accounting, Recruitment, Technical support.

A business process can be decomposed into several sub-processes, which have their own attributes, but also contribute to achieving the goal of the super-process. The analysis of business processes typically includes the mapping of processes and sub-processes down to activity level. Activity is the smallest level of individual work that can be done in an organization.

Business Processes are designed to add value for the customer and should not include unnecessary activities. The outcome of a well designed business process is increased effectiveness (value for the customer) and increased efficiency (less costs for the company).

A business process begins with a customer’s need and ends with a customer’s need fulfillment.


So then what is a BPMS??
Yaaa.. it is to business processes what a DBMS is to a database :-) .
Don't worry..... a BPMS is the software wrapper around these processes to manage them properly.

oooh Then how does it manage them??
It's a 5-step process (let's call it a life cycle):
design, modeling, execution, monitoring, and optimization.

Let's see what each step does individually.
Design
Process Design encompasses both the identification of existing processes and designing the "to-be" process. Areas of focus include: representation of the process flow, the actors within it, alerts & notifications, escalations, Standard Operating Procedures, Service Level Agreements, and task hand-over mechanisms.
Good design reduces the number of problems over the lifetime of the process. Whether or not existing processes are considered, the aim of this step is to ensure that a correct and efficient theoretical design is prepared.

Modeling
Determines how the process we considered in the Design phase might operate under different circumstances: What if I have 75% of the resources to do the same task? What if I want to do the same job for 80% of the current cost? Etc.

Execution
Automating a process definition requires a flexible and comprehensive infrastructure, which typically rules out implementing these systems in a legacy IT environment. Here a BRE seems to be a good option, because business rules have long been used by systems to provide definitions for governing behavior, and a business rules engine can be used to drive process execution and resolution.

Monitoring
Monitoring encompasses the tracking of individual processes, so that information on their state can be easily seen and statistics on the performance of one or more processes provided. An example of such tracking is being able to determine the state of a customer order (e.g. order arrived, awaiting delivery, invoice paid) so that problems in its operation can be identified and corrected.

Process mining is an interesting area to look forward here. It is something like applying Data mining techniques to Process data. The aim of process mining is to analyze event logs extracted through process monitoring and to compare them with an 'a priori' process model. Process mining allows process analysts to detect discrepancies between the actual process execution and the a priori model as well as to analyze bottlenecks.
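As a toy illustration of the idea (not a real process-mining algorithm), comparing an observed event log against an a-priori model reduced to a single expected trace might look like:

```java
import java.util.List;

// Toy conformance check in the spirit of process mining: compare an observed
// event log against an a-priori model (here just an expected ordered trace)
// and report the position of the first discrepancy. The activity names are invented.
public class ConformanceCheck {
    static int firstDiscrepancy(List<String> model, List<String> log) {
        int n = Math.min(model.size(), log.size());
        for (int i = 0; i < n; i++) {
            if (!model.get(i).equals(log.get(i))) return i;
        }
        return model.size() == log.size() ? -1 : n; // -1 means the log conforms
    }

    public static void main(String[] args) {
        List<String> model = List.of("order", "check-stock", "ship", "invoice");
        List<String> log   = List.of("order", "ship", "check-stock", "invoice");
        // prints 1: "ship" was observed where "check-stock" was expected
        System.out.println(firstDiscrepancy(model, log));
    }
}
```

Real process-mining tools work with far richer models (Petri nets, partial orders, many traces), but the core idea of diffing observed behavior against expected behavior is the same.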

Optimization
Process optimization includes retrieving process performance information from the modeling or monitoring phase, identifying potential or actual bottlenecks and opportunities for cost savings or other improvements, and then applying those enhancements to the design of the process, thus continuing the value cycle of business process management.


Thursday, October 16, 2008

Intro 2 PEGA PRPC







Hi,
I am going to present something I know about PEGA PRPC (PegaRULES Process Commander).

The first piece of good news for all non-IT folks is: "This is the product for you." To become a PEGA PRPC developer, you don't need to know any CSE stuff other than some OO concepts and a little bit of Java.

For all IT and research folks, this is an area to consider as a next-gen programming paradigm.

The core concept of PEGA PRPC revolves around the idea that change in software over time is inevitable, and PEGA PRPC's job is to cut down the costs involved in those changes.

Most products spend 70% of their lifetime in maintenance, adjusting to the changing requirements of the client / end user. To make this phase easier and cheaper, we have to follow various software development methodologies like XP, industry best practices, coding conventions, knowledge transfers....... blah blah..........blah.

PEGA's (www.pega.com) tagline itself says how it fits this scenario --- "BUILT FOR CHANGE". The products/services built using PEGA are resilient to change. PEGA achieves this by separating business policies from application code. As time passes, business policies change. Until now, a change in policies required a change in application code as well. But here the two are separated, and a change in one module doesn't require touching any part of the other.

Let's get into some technical jargon....

Pegasystems is the leader in developing Business Process Management applications.

For starters, BPM (Business Process Management) software usually consists of tools that let business users directly interact with the system to set their own business rules and manage/change other aspects of the system dynamically when required. In short, a BPM tool tries to eliminate the need for developers and software coding, so business users can directly make the changes they need, like setting up business rules and policy execution flows, once a platform has been built and adopted in the enterprise. Pega Systems delivers fast and immediate business benefits, including improved revenue growth of 30% and more, cost reductions of 40% and more thanks to work automation, and a 5-point increase in customer retention.

To add strength to my arguments let me compare PEGA PRPC based development with Java.



In a Java IDE environment, adding a new process, changing an existing process, or discontinuing an existing process requires code changes, database changes, and recompilation. By contrast, we found modifications to be very easy with Process Commander: all that was needed was to make changes to Visio flow diagrams, which automatically drive changes to the rules in Process Commander. In fact, the ease of changing and updating the application makes it possible for business analysts/users to make changes without directly involving IT resources. We find this capability to be a core strength of Process Commander.

We exercised this by introducing new logic (adding a new property to an existing class), updating and discontinuing an existing logic flow, and measuring the time it took to complete these tasks. Visio-based flow diagrams and HTML forms support the work, and the appropriate rule form supports each step.

Process Commander also provides wizards for web services development, so it is possible to consume a web service in a few easy steps without any manual configuration. Using the Process Commander Web Services Generator, we pointed to the web service URL; Process Commander read and parsed the WSDL and generated the appropriate class structures, methods, and properties needed to invoke the web service. Developers can likewise expose their own methods over SOAP by building a WSDL, and an easy-to-use wizard performs the WSDL generation. Process Commander also allows the developer to use other interfaces, such as EJB, SOAP, and IBM's MQSeries, with minimal development effort.

o Personalization through rules resolution
o Interfaces to Crystal Reports, MS Word and Excel
o Customization of the development environment

Deployment Features:
o Push or pull changes to production with advanced deployment support through rule set management and versioning
o Distribution support for applications developed with Process Commander

Application Changes and Updates:
o Changes to business logic via changes to business rules in real time
o Changes to business process flows via changes to Visio flow diagrams in real time
o Built-in Version Management
o Robust enterprise versioning enabling scaling to 100,000 rules and processes
o No database schema changes required to add properties or objects

Web Services:
o Wizard-based Web Services set up; three times faster than Java IDE
o Support for SOAP, IBM’s MQ Series, COM and XML

Security:
o Role-based security


Productivity will increase even more dramatically when Process Commander is used in an environment that requires frequent real-time changes to rules and process flows. This is often the case in complex BPM scenarios.


Having said all this, there is no doubt that getting the chance to work on PEGA products is a great opportunity. Here we expect most of the projects to move from the Java & J2EE space to the PEGA space, and this is a sector any service-oriented company can look into.

But as a developer/programmer, I am a little biased towards coding work, which you will find very minimal in PRPC-based development. I personally feel that without coding we can't improve our analysis and problem-solving skills. Of course, it's my personal opinion only.

Luckily, we have a division here which will take part in PEGA product development itself, working in tight integration with the PEGA India team.

Let's hope I get into that team, so that I can take responsibility for doing the magic behind the scenes rather than re-using, again and again, the features developed by someone else. Why should I miss the fun........ :-)

As a developer I don't recommend working as a PEGA PRPC developer, but from the organization's and clients' profit point of view it is highly recommended. If you want money, go and get trained on PRPC; it is a skill in high demand. If you want coding satisfaction, look into the Java/J2EE space.

Right now I don't have any choice... I have to work in accordance with my organization's goals. ;-)
The biggest hurdle in front of me is to master this technology. I personally feel there are few resources available for it on the web, and PEGA's PDN is not sufficient. Until now I have found it very difficult to learn. The manuals and online help PEGA provides give one procedure to achieve something; but if anything goes wrong in that procedure, we don't know what went wrong or how to fix it, simply because we don't know what is happening behind the scenes. Understanding the internals is not expected of every PRPC developer, and even if you want to, there is no way you can do it unless you have worked in a PEGA product development environment. Right now I am collecting bits and pieces to get past the hurdles I am facing, with the help of seniors, and in turn I want to share them globally.

So I have decided to publish some of the "technical how-tos" the way I understand them.

Keep watching this space.....


Monday, October 13, 2008

Reading Excel sheet using java

Hi all,

As part of my current project requirements, I have to read an Excel file and store its contents in a database.

Let's see how I did it; it may save you some time.

The first thing that comes into the picture is how to read Excel without using any third-party tools, since licensing is a big problem. So we have to achieve it using native JDK support. We can do this by treating the Excel document as a database file. To connect to this "database", we need the Microsoft-provided Excel ODBC driver. (We can find it in Control Panel --> Administrative Tools --> ODBC Data Source Administrator: click Add and find the Microsoft Excel Driver (*.xls).)

Create a DSN (Data Source Name) for it (say, ExcelDNS) and select the workbook (say, ExcelFile).

Everything else is similar to connecting to other databases.

Here is the sample code for it:

package japp;

import java.sql.*;

/**
 * Reads an Excel sheet through the JDBC-ODBC bridge.
 * Requires a DSN named "ExcelDNS" configured with the Microsoft Excel driver.
 *
 * @author TBalakrishna
 */
public class ExcelToJava
{
    public static void main(String[] args)
    {
        try
        {
            Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
            Connection con = DriverManager.getConnection("jdbc:odbc:ExcelDNS");
            Statement st = con.createStatement();
            // Each worksheet is addressed as a "table" named <sheet name>$
            ResultSet rs = st.executeQuery("Select * from [Sheet1$]");
            ResultSetMetaData rsmd = rs.getMetaData();
            int numberOfColumns = rsmd.getColumnCount();

            // Remember: the first ROW of the sheet is treated as the column
            // headings, so it will not appear in the result set.
            while (rs.next())
            {
                for (int i = 1; i <= numberOfColumns; i++)
                {
                    if (i > 1)
                        System.out.print(", ");
                    System.out.print(rs.getString(i));
                }
                System.out.println();
            }
            rs.close();
            st.close();
            con.close();
        }
        catch (Exception ex)
        {
            System.err.println("Exception: " + ex.getMessage());
        }
    }
}

----------------------------------------------------------------------------------------------

Up to now everything seems fine. But it has its own downsides... it can only be deployed on Windows servers (we need the Microsoft Excel drivers, right? ;-)

So I looked at some other APIs that can be used without the MS Windows dependency. Here are some of those APIs:



Library / package | License | Description
Actuate Spreadsheet Engine | Commercial, 30-day trial available | Write Excel (XLS) files.
ExcelAccessor | ? | Read and write Excel (XLS) files.
ExcelReader | ? | JavaWorld article on how to read Microsoft Excel (XLS) files, including code. Requires an installed Microsoft ODBC driver for Excel files, and Sun's ODBC-JDBC driver.
ExtenXLS | Commercial, 30-day trial available | Read and write Microsoft Excel (XLS) files.
JACOB project | LGPL | Java COM bridge with examples to interface Excel.
Java Excel API | LGPL | Read Excel (XLS) 97, 98 and 2000 files.
Java to Excel conversion | ? | Write SYLK files, which Excel can import and export.
JExcel | Commercial | Swing component to integrate Excel via JNI.
jXLS | LGPL | Create Excel files using XLS templates.
POI | Apache Software License 1.1 | Read and write Microsoft OLE 2 compound document format files, including MS Office files (DOC, XLS, PPT) written with Office versions released after 1997.
Snowbound | Commercial | Read Excel files.
SoftArtisans OfficeWriter | Commercial | Write Word and Excel files.
Vista JDBC | Commercial, 15-day trial available | JDBC driver to access MS Excel (XLS) files.
xlSQL | GPL | JDBC driver to access MS Excel (XLS) and CSV files.


Most of these APIs are built on the fact that Microsoft documents are saved to disk using the Microsoft OLE 2 compound document file format.

Starting with Office 95, all MS Office applications store their documents in an archive called the OLE2 Compound Document Format (OLE2CDF). It's a bit like the old FAT filesystem: it promotes fragmentation, doesn't support compression, and isn't linear (which would make streaming easier). Microsoft Foundation Classes allow applications to serialize to this format, so if you need interoperability with legacy Windows proprietary file formats or Office documents, you have to deal with OLE2CDF.

Each OLE2CDF file on disk contains an entire filesystem, laid out using nested Directory Entries, which contain Entries. We are interested in Entry elements of the Document Entry type: a Document Entry contains application-specific (e.g. Excel) data structures.
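One concrete, easily checked detail: every OLE2CDF file starts with a fixed 8-byte signature (D0 CF 11 E0 A1 B1 1A E1), so a quick sniff can tell whether a file is a compound document at all before handing it to a parser. A sketch, not a full header parse:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Checks the fixed OLE2 compound document signature at the start of a stream.
public class Ole2Sniffer {
    static final byte[] MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    static boolean looksLikeOle2(InputStream in) throws IOException {
        byte[] header = in.readNBytes(8); // first 8 bytes of the file
        return Arrays.equals(header, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // A fabricated stream carrying the signature, for demonstration only.
        byte[] fake = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                       (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1, 0, 0};
        System.out.println(looksLikeOle2(new ByteArrayInputStream(fake)));           // true
        System.out.println(looksLikeOle2(new ByteArrayInputStream(new byte[]{1})));  // false
    }
}
```

Libraries like POI do this check themselves; the sketch just shows that "is this an OLE2 file?" is answerable from the first few bytes.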


The Microsoft Excel 97 file format is also known as "BIFF8." Recent versions of Excel have changed very little about this file format, and writing out the new intricacies would serve no purpose other than to make everyone upgrade. So when we say Excel 97 format, we mean Excel 97-to-XP format.

The following is the top-level structure of an Excel workbook in this file system:

Example.xls {
    OLE2CDF headers
    "Workbook" stream {
        Workbook {
            Static String Table record
            Sheet names and pointers
        }
        Sheet {
            ROW
            ROW
            ...
            NUMBER record (cell)
            LABELSST record (cell)
            ...
        }
        Sheet
    }
}
... images, macros, etc.
Document Summary
Summary

The Java Excel API, hosted on SourceForge.net (my favorite open-source resource), looks to be an option. You can find out about it at http://jexcelapi.sourceforge.net/

Download jxl.jar and add it to your classpath. Here is the code to read an Excel file named ExcelSheet.xls:

import java.io.File;
import java.io.IOException;

import jxl.Cell;
import jxl.CellType;
import jxl.Sheet;
import jxl.Workbook;
import jxl.read.biff.BiffException;

public class ReadExcel
{
    private String inputFile;

    public void setInputFile(String inputFile)
    {
        this.inputFile = inputFile;
    }

    public void read() throws IOException
    {
        File inputWorkbook = new File(inputFile);
        try
        {
            Workbook w = Workbook.getWorkbook(inputWorkbook);

            // To get the first sheet use 0, else whatever number you want.
            Sheet sheet = w.getSheet(0);

            // Column-wise reading for all rows
            for (int j = 0; j < sheet.getColumns(); j++)
            {
                for (int i = 0; i < sheet.getRows(); i++)
                {
                    Cell cell = sheet.getCell(j, i);
                    CellType type = cell.getType();

                    // If it's a LABEL
                    if (type == CellType.LABEL)
                    {
                        System.out.println("I got a label " + cell.getContents());
                    }
                    // If it's a NUMBER
                    if (type == CellType.NUMBER)
                    {
                        System.out.println("I got a number " + cell.getContents());
                    }

                    // There are many more cell types; better to use a switch here.
                }
            }
            w.close();
        }
        catch (BiffException e)
        {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException
    {
        ReadExcel test = new ReadExcel();
        test.setInputFile("c:/temp/ExcelSheet.xls");
        test.read();
    }
}


------------------------------------------------------------------------------------------------

The next option is to use Apache POI.

Apache's POI (Poor Obfuscation Implementation, the name that seemed to describe the format best) is a high-quality library that can read and write Excel and other MS-format files right from inside your Java application.

POI consists of various parts that fit together to deliver the data in an MS file format to the Java application. At the lowest level is the POIFS (POI FileSystem) API, which contains the basic logic to process any OLE2CDF file. Above that sit the various components that process the application data: HSSF (Horrible SpreadSheet Format) understands the Excel structures, while HDF (Horrible Document Format) understands the Microsoft Word structures. POIFS is what you use to read or write a raw OLE2CDF file; on top of it, HSSF and HDF handle the application formats, HPSF (Horrible Property Sheet Format) reads (and eventually will write) the document property information available through File->Properties, and an HSSF Cocoon serializer can serialize XML to an Excel file.
Microsoft's OLE 2 Compound Document format once prevented Java programmers from reading and writing Office and MFC-generated file formats from pure Java. Java programmers often had to resort to native bridges which limited them to Microsoft Operating Systems. The Jakarta POI Project opens up new worlds to Java developers by allowing them to write to OLE2CDF-based file formats with pure Java -- even on UNIX. This article explained how to work with the underlying OLE 2 Compound Document Format. In the next article, we'll explain how to read, write, and Modify Excel files with HSSF. The final article will cover the HSSFSerializer for Cocoon, as well as HPSF and HDF.


HSSF has two APIs for reading: usermodel and eventusermodel. The former is more familiar; the latter is more cryptic but far more efficient. They live primarily in the org.apache.poi.hssf.usermodel and org.apache.poi.hssf.eventusermodel packages respectively (in earlier versions of HSSF the latter was called eventmodel).

The usermodel package maps the file into familiar structures such as Workbook, Sheet, Row, and Cell, and stores the entire structure in memory as a set of objects. The eventusermodel package requires you to become more familiar with the actual low-level structures of the file format. It operates in a manner similar to XML's SAX APIs or the AWT event model (the origin of the name) and can be trickier to use. It is also read-only, so you cannot modify files using the eventusermodel.
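The difference between the two styles can be sketched in plain Java, without POI itself. In this sketch the CellValue record and the comma-split "parsing" are hypothetical stand-ins for POI's real classes; the point is only the contrast between building the whole model in memory (usermodel) and pushing each cell to a SAX-style callback as it is read (eventusermodel):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ReadingStyles {

    /** Hypothetical stand-in for a parsed cell; not a POI class. */
    record CellValue(int row, int col, String text) {}

    // "usermodel" style: parse the whole sheet into memory, then walk it.
    static List<CellValue> parseIntoMemory(String[] rawRows) {
        List<CellValue> all = new ArrayList<>();
        for (int r = 0; r < rawRows.length; r++) {
            String[] cols = rawRows[r].split(",");
            for (int c = 0; c < cols.length; c++) {
                all.add(new CellValue(r, c, cols[c]));
            }
        }
        return all;
    }

    // "eventusermodel" style: never hold the whole sheet; fire a callback per cell.
    static void parseWithEvents(String[] rawRows, Consumer<CellValue> listener) {
        for (int r = 0; r < rawRows.length; r++) {
            String[] cols = rawRows[r].split(",");
            for (int c = 0; c < cols.length; c++) {
                listener.accept(new CellValue(r, c, cols[c]));
            }
        }
    }

    public static void main(String[] args) {
        String[] sheet = { "name,qty", "widget,3" };

        // usermodel: everything lives in memory at once
        System.out.println("in-memory cells: " + parseIntoMemory(sheet).size());

        // eventusermodel: each cell is handled and then forgotten
        parseWithEvents(sheet, cell ->
                System.out.println("event: " + cell.row() + "," + cell.col() + " = " + cell.text()));
    }
}
```

This is why the event style is so much more memory-efficient on large files: nothing accumulates unless your listener chooses to keep it.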
Reading an Excel sheet using Apache POI

import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;

import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

/**
 * A simple POI example of opening an Excel spreadsheet
 * and writing its contents to the command line.
 * @author Tony Sintes
 */
public class POIExample {

    public static void main(String[] args) {
        try {
            InputStream input = POIExample.class.getResourceAsStream("qa.xls");
            POIFSFileSystem fs = new POIFSFileSystem(input);
            HSSFWorkbook wb = new HSSFWorkbook(fs);
            HSSFSheet sheet = wb.getSheetAt(0);

            // Iterate over each row in the sheet
            Iterator rows = sheet.rowIterator();
            while (rows.hasNext()) {
                HSSFRow row = (HSSFRow) rows.next();
                System.out.println("Row #" + row.getRowNum());

                // Iterate over each cell in the row and print out the cell's content
                Iterator cells = row.cellIterator();
                while (cells.hasNext()) {
                    HSSFCell cell = (HSSFCell) cells.next();
                    System.out.println("Cell #" + cell.getCellNum());

                    switch (cell.getCellType()) {
                        case HSSFCell.CELL_TYPE_NUMERIC:
                            System.out.println(cell.getNumericCellValue());
                            break;
                        case HSSFCell.CELL_TYPE_STRING:
                            System.out.println(cell.getStringCellValue());
                            break;
                        default:
                            System.out.println("unsupported cell type");
                            break;
                    }
                }
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}


------------------------------------------------------------------------------------------------
Remember

getPhysicalNumberOfRows() returns the number of physically defined rows, which in a sparse sheet can differ from the logical row count implied by the last row index. The same applies to getPhysicalNumberOfCells().

You should check for nulls when fetching HSSFRow and HSSFCell objects: getRow() and getCell() return null for rows and cells that were never created.

Remember that Excel tables are often sparsely populated, so choose your data structures accordingly. POI accesses the data sheet by sheet, while JExcelAPI lets you directly access the data in any row and column.
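To illustrate the "choose your data structures accordingly" point: rather than a 2-D array with a slot for every (row, column) position, a map keyed by packed coordinates stores only the cells that actually exist. SparseSheet below is a hypothetical structure for the values you pull out of POI or JExcelAPI, not part of either library:

```java
import java.util.HashMap;
import java.util.Map;

public class SparseSheet {

    private final Map<Long, String> cells = new HashMap<>();

    // Pack (row, col) into one long so it can serve as a single map key.
    private static long key(int row, int col) {
        return ((long) row << 32) | (col & 0xFFFFFFFFL);
    }

    public void put(int row, int col, String value) {
        cells.put(key(row, col), value);
    }

    /** Returns null for a cell that was never set, mirroring POI's getRow()/getCell(). */
    public String get(int row, int col) {
        return cells.get(key(row, col));
    }

    public int cellCount() {
        return cells.size();
    }

    public static void main(String[] args) {
        SparseSheet s = new SparseSheet();
        s.put(0, 0, "header");
        s.put(50_000, 200, "lonely value");   // a far-away cell costs one map entry, not a huge grid
        System.out.println(s.get(50_000, 200)); // lonely value
        System.out.println(s.get(1, 1));        // null
        System.out.println(s.cellCount());      // 2
    }
}
```

The same null-on-missing convention as POI means calling code needs no special case for sparse regions.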

-------------------------------------------------------------------------------------------------
Conclusion


Comparison of JExcelAPI with Jakarta-POI (HSSF)


1. In these tests, JExcelAPI proved unsuitable for important data: it failed to read several files outright, and even when it could open a file it failed on individual cells for unknown reasons. In short, JExcelAPI is hard to recommend for enterprise use.

2. HSSF is the POI project's pure Java implementation of the Excel '97(-2002) file format. It is a mature product that correctly and effortlessly read Excel data generated from various sources, including non-Microsoft products such as OpenOffice, and across various versions of Excel. It is very robust and well featured. Highly recommended.