Saturday, November 17, 2007

Memory Management in C++

by Nathan C. Myers

Memory usage in C++ is as the sea come to land:
a tide rolls in, and sweeps out again,
leaving only puddles and stranded fish.
At intervals, a wave crashes ashore; but the
ripples never cease.


Introduction

Many programs have little need for memory management; they use a fixed amount of memory, or simply consume it until they exit. The best that can be done for such programs is to stay out of their way. Other programs, including most C++ programs, are much less deterministic, and their performance can be profoundly affected by the memory management policy they run under. Unfortunately, the memory management facilities provided by many system vendors have failed to keep pace with growth in program size and dynamic memory usage.

Because C++ code is naturally organized by class, a common response to this failure is to overload member operator new for individual classes. In addition to being tedious to implement and maintain, however, this piece-meal approach can actually hurt performance in large systems. For example, applied to a tree-node class, it forces nodes of each tree to share pages with nodes of other (probably unrelated) trees, rather than with related data. Furthermore, it tends to fragment memory by keeping large, mostly empty blocks dedicated to each class. The result can be a quick new/delete cycle that accidentally causes virtual memory thrashing. At best, the approach interferes with system-wide tuning efforts.

Thus, while detailed knowledge of the memory usage patterns of individual classes can be helpful, it is best applied by tuning memory usage for a whole program or major subsystem. The first half of this article describes an interface which can ease such tuning in C++ programs. Before tuning a particular program, however, it pays to improve performance for all programs, by improving the global memory manager. The second half of this article covers the design of a global memory manager that is as fast and space-efficient as per-class allocators.

But raw speed and efficiency are only a beginning. A memory management library written in C++ can be an organizational tool in its own right. Even as we confront the traditional problems involving large data structures, progress in operating systems is yielding different kinds of memory -- shared memory, memory-mapped files, persistent storage -- which must be managed as well. With a common interface to all types of memory, most classes need not know the difference. This makes quite a contrast with systems of classes hard-wired to use only regular memory.

Global Operator New

In C++, the only way to organize memory management on a larger scale than the class is by overloading the global operator new. To select a memory management policy requires adding a placement argument, in this case a reference to a class which implements the policy:

extern void* operator new(size_t, class Heap&);

When we overload the operator new in this way, we recognize that the regular operator new is implementing a policy of its own, and we would like to tune it as well. That is, it makes sense to offer the same choices for the regular operator new as for the placement version.

In fact, one cannot provide an interesting placement operator new without also replacing the regular operator new. The global operator delete can take no user parameters, so it must be able to tell what to do just by looking at the memory being freed. This means that the operator delete and all operators new must agree on a memory management architecture.

For example, if our global operators new were to be built on top of malloc(), we would need to store extra data in each block so that the global operator delete would know what to do with it. Adding a word of overhead for each object to malloc()'s own overhead (a total of 16 bytes, on most RISCs), would seem a crazy way to improve memory management. Fortunately, all this space overhead can be eliminated by bypassing malloc(), as will be seen later.

The need to replace the global operators new and delete when adding a placement operator new has profound effects on memory management system design. It means that it is impossible to integrate different memory management architectures. Therefore, the top-level memory management architecture must be totally general, so that it can support any policy we might want to apply. Total generality, in turn, requires absolute simplicity.

An Interface

How simple can we get? Let us consider some declarations. Heap is an abstract class:

class Heap {
  protected:
    virtual ~Heap();
  public:
    virtual void* allocate(size_t) = 0;
    static Heap& whatHeap(void*);
};

(The static member function whatHeap(void*) is discussed later.) Heap's abstract interface is simple enough. Given a global Heap pointer, the regular global operator new can use it:


extern Heap* __global_heap;

inline void*
operator new(size_t sz)
{ return ::__global_heap->allocate(sz); }

Inline dispatching makes it fast. It's general too; we can use the Heap interface to implement the placement operator new, providing access to any private heap:

inline void*
operator new(size_t size, Heap& heap)
{ return heap.allocate(size); }

What kind of implementations might we define for the Heap interface? Of course the first must be a general purpose memory allocator, class HeapAny. (HeapAny is the memory manager described in detail in the second half of this article.) The global heap pointer, used by the regular operator new defined above, is initialized to refer to an instance of class HeapAny:

extern class HeapAny __THE_global_heap;
Heap* __global_heap = &__THE_global_heap;

Users, too, can instantiate class HeapAny to make a private heap:

HeapAny& myheap = *new HeapAny;

and allocate storage from it, using the placement operator new:

MyType* mine = new(myheap) MyType;

As promised, deletion is the same as always:

delete mine;

Now we have the basis for a memory management architecture. It seems that all we need to do is provide an appropriate implementation of class Heap for any policy we might want. As usual, life is not so simple.

Complications

What happens if MyType's constructor itself needs to allocate memory? That memory should come from the same heap, too. We could pass a heap reference to the constructor:

mine = new(myheap) MyType(myheap);

and store it in the object for use later, if needed. However, in practice this approach leads to a massive proliferation of Heap& arguments -- in constructors, in functions that call constructors, everywhere! -- which penetrates from the top of the system (where the heaps are managed) to the bottom (where they are used). Ultimately, almost every function needs a Heap& argument. Applied earnestly, the result can be horrendous. Even at best, such an approach makes it difficult to integrate other libraries into a system.

One way to reduce the proliferation of Heap arguments is to provide a function to call to discover what heap an object is on. That is the purpose of the Heap::whatHeap() static member function. For example, here's a MyType member function that allocates some buffer storage:

char* MyType::make_buffer()
{
    Heap& aHeap = Heap::whatHeap(this);
    return new(aHeap) char[BUFSIZ];
}

(If "this" points into the stack or static space, whatHeap() returns a reference to the default global heap.)

Another way to reduce Heap argument proliferation is to substitute a private heap to be used by the global operator new. Such a global resource calls for gingerly handling. Class HeapStackTop's constructor replaces the default heap with its argument, but retains the old default so it can be restored by the destructor:

class HeapStackTop {
    Heap* old_;
  public:
    HeapStackTop(Heap& h);
    ~HeapStackTop();
};

We might use this as follows:

{ HeapStackTop top = myheap;
mine = new MyType;
}

Now space for the MyType object, and any secondary store allocated by its constructor, comes from myheap. At the closing bracket, the destructor ~HeapStackTop() restores the previous default global heap. If one of MyType's member functions might later want to allocate more space from the same heap, it can use whatHeap(); or the constructor can save a pointer to the current global heap before returning.
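The constructor and destructor are not shown above; a minimal sketch, assuming the __global_heap pointer introduced earlier, might be:

inline HeapStackTop::HeapStackTop(Heap& h)
    : old_(__global_heap)        // remember the current default heap
{ __global_heap = &h; }          // install the new default

inline HeapStackTop::~HeapStackTop()
{ __global_heap = old_; }        // restore the previous default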

Creating a HeapStackTop object is a very clean way to install any global memory management mechanism: a HeapStackTop object created in main() quietly slips a new memory allocator under the whole program.

Some classes must allocate storage from the top-level global heap regardless of the current default. Any object can force itself to be allocated there by defining a member operator new, and can control where its secondary storage comes from by the same techniques described above.
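For illustration only, such a member operator new might simply forward to the top-level heap; the class name Pinned is hypothetical, and __THE_global_heap is the instance defined earlier:

class Pinned {
  public:
    void* operator new(size_t size)
        { return __THE_global_heap.allocate(size); }   // always the top-level heap
    void operator delete(void* p)
        { ::operator delete(p); }                      // segment dispatch frees it correctly
};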

With HeapStackTop, many classes need not know about Heap at all; this can make a big difference when integrating libraries from various sources. On the other hand, the meaning of Heap::whatHeap() (or a Heap& member or argument) is easier to grasp; it is clearer, and therefore safer. While neither approach is wholly satisfactory, a careful mix of the two can reduce the proliferation of Heap& arguments to a reasonable level.

Uses for Private Heaps

But what can private heaps do for us? We have hinted that improved locality of reference leads to better performance in a virtual memory environment, and that a uniform interface helps when using special types of memory.

One obvious use for private heaps is as a sort of poor man's garbage collection:

Heap* myheap = new HeapTrash;
... // lots of calls to new(*myheap)
delete myheap;

Instead of deleting objects, we discard the whole data structure at one throw. The approach is sometimes called "lifetime management". Since the destructors are never called, you must carefully control what kind of objects are put in the heap; it would be hazardous ever to install such a heap as the default (with HeapStackTop) because many classes, including iostream, allocate space at unpredictable times. Dangling pointers to objects in the deleted heap must be prevented, which can be tricky if any objects secretly share storage among themselves. Objects whose destructors do more than just delete other objects require special handling; the heap may need to maintain a registry of objects that require "finalization".

But private heaps have many other uses that don't violate C++ language semantics. Perhaps the quietest one is simply to get better performance than your vendor's malloc() offers. In many large systems, member operator new is defined for many classes just so they may call the global operator new less often. When the global operator new is fast enough, such code can be deleted, yielding easier maintenance, often with a net gain in performance from better locality and reduced fragmentation.

An idea that strikes many people is that a private heap could be written that is optimized to work well with a particular algorithm. Because it need not field requests from the rest of the program, it can concentrate on the needs of that algorithm. The simplest example is a heap that allocates objects of only one size; as we will see later, however, the default heap can be made fast enough that this is no great advantage. A mark/release mechanism is optimal in some contexts (such as parsing), if it can be used for only part of the associated data structure.
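As an illustration of the mark/release idea, here is a minimal sketch of an arena-style heap. It is not part of the article's design; the name ArenaHeap, the fixed capacity, and the 8-byte alignment are assumptions:

#include <cstddef>

class ArenaHeap {
    char*       base_;
    std::size_t used_;
    std::size_t capacity_;
  public:
    explicit ArenaHeap(std::size_t capacity)
        : base_(static_cast<char*>(::operator new(capacity))),
          used_(0), capacity_(capacity) {}
    ~ArenaHeap() { ::operator delete(base_); }

    void* allocate(std::size_t size) {
        size = (size + 7) & ~std::size_t(7);       // keep blocks 8-byte aligned
        if (used_ + size > capacity_) return 0;    // a real heap would grow or throw
        void* p = base_ + used_;
        used_ += size;
        return p;
    }

    std::size_t mark() const    { return used_; }  // remember the current position
    void release(std::size_t m) { used_ = m; }     // discard everything allocated after it
};

A parser, for example, can take a mark before processing a subexpression and release back to it once the result has been copied out.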

When shared memory is used for interprocess communication, it is usually allocated by the operating system in blocks larger than the objects that you want to share. For this case a heap that manages a shared memory region can offer the same benefits that regular operator new does for private memory. If the interface is the same as for non-shared memory, objects may not need to know they are in shared memory. Similarly, if you are constrained to implement your system on an architecture with a tiny address space, you may need to swap memory segments in and out. If a private heap knows how to handle these segments, objects that don't even know about swapping can be allocated in them.

In general, whenever a chunk of memory is to be carved up and made into various objects, a Heap-like interface is called for. If that interface is the same for the whole system, then other objects need not know where the chunk came from. As a result, objects written without the particular use in mind may safely be instantiated in very peculiar places.

In a multi-threaded program, the global operator new must carefully exclude other threads while it operates on its data structures. The time spent just getting and releasing the lock can itself become a bottleneck in some systems. If each thread is given a private heap which maintains a cache of memory available without locking, the threads need not synchronize except when the cache becomes empty (or too full). Of course, the operator delete must be able to accept blocks allocated by any thread, but it need not synchronize if the block being freed came from the heap owned by the thread that is releasing it.
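A rough sketch of that idea follows; it is not the article's design. The class names are invented, std::mutex stands in for whatever lock the platform provides, the cache handles a single block size, and that size is assumed to be at least sizeof(void*):

#include <cstddef>
#include <mutex>

// The contended slow path: a shared heap protected by a lock.
class SharedHeap {
    std::mutex lock_;
  public:
    void* allocate(std::size_t size) {
        std::lock_guard<std::mutex> guard(lock_);
        return ::operator new(size);
    }
    void deallocate(void* p) {
        std::lock_guard<std::mutex> guard(lock_);
        ::operator delete(p);
    }
};

// The lock-free fast path: a per-thread cache of freed blocks of one size.
class ThreadCacheHeap {
    struct Node { Node* next; };
    Node*       cache_;          // touched only by the owning thread
    std::size_t size_;
    SharedHeap& backing_;
  public:
    ThreadCacheHeap(std::size_t size, SharedHeap& backing)
        : cache_(0), size_(size), backing_(backing) {}

    void* allocate() {
        if (cache_) {                        // fast path: no synchronization
            Node* n = cache_;
            cache_ = n->next;
            return n;
        }
        return backing_.allocate(size_);     // slow path: take the lock
    }
    void deallocate(void* p) {               // only for blocks freed by the owning thread
        Node* n = static_cast<Node*>(p);
        n->next = cache_;
        cache_ = n;
    }
};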

A heap that remembers details about how, or when, objects in it were created can be very useful when implementing an object- oriented database or remote procedure call mechanism. A heap that segregates small objects by type can allow them to simulate virtual function behavior without the overhead of a virtual function table pointer in each object. A heap that zero-fills blocks on allocation can simplify constructors.
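The zero-filling variant, for instance, can be a thin wrapper over any other heap. A minimal sketch using the Heap interface shown earlier (the name ZeroFillHeap is an assumption):

#include <cstring>

class ZeroFillHeap : public Heap {
    Heap& base_;
  public:
    explicit ZeroFillHeap(Heap& base) : base_(base) {}
    void* allocate(size_t size) {
        void* p = base_.allocate(size);     // delegate to the underlying heap
        if (p) std::memset(p, 0, size);     // hand out zero-filled storage
        return p;
    }
};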

Programs can be instrumented to collect statistics about memory usage (or leakage) by substituting a specialized heap at various places in a program. Use of private heaps allows much finer granularity than the traditional approach of shadowing malloc() at link time.

In the remainder of this article we will explore how to implement HeapAny efficiently, so that malloc(), the global operator new(size_t), the global operator new(size_t, Heap&), and Heap::whatHeap(void*) can be built on it.

A Memory Manager in C++

An optimal memory manager has minimal overhead: space used is but fractionally larger than the total requested, and the new/delete cycle time is small and constant. Many factors work against achieving this optimum.

In many vendor libraries, memory used by the memory manager itself, for bookkeeping, can double the total space used. Fragmentation, where blocks are free but unavailable, can also multiply the space used. Space matters, even today, because virtual memory page faults slow down your program (indeed, your entire computer), and swap space limits can be exceeded just as can real memory.

A memory manager can also waste time in many ways. On allocation, a block of the right size must be found or made. If made, the remainder of the split block must be placed where it can be found. On deallocation, the freed block may need to be coalesced with any neighboring blocks, and the result must be placed where it can be found again. System calls to obtain raw memory can take longer than any other single operation; a page fault that results when idle memory is touched is just a hidden system call. All these operations take time, time spent not computing results.

The effects of wasteful memory management can be hard to see. Time spent thrashing the swap file doesn't show up on profiler output, and is hard to attribute to the responsible code. Often the problem is easily visible only when memory usage exceeds available swap space. Make no mistake: poor memory management can multiply your program's running time, or so bog down a machine that little else can run.

Before buying (or making your customers buy) more memory, it makes sense to see what can be done with a little code.

Principles

A memory manager project is an opportunity to apply principles of good design: Separate the common case from special cases, and make the common case fast and cheap, and other cases tolerable; make the user of a feature bear the cost of its use; use hints; reuse good ideas. [Lampson]

Before delving into detailed design, we must be clear about our goals. We want a memory manager that satisfies the following:

1. Speed: It must be much faster than existing memory managers, especially for small objects. Performance should not suffer under common usage patterns, such as repeatedly allocating and freeing the same block.
2. Low overhead: The total size of headers and other wasted space must be a small percentage of total space used, even when all objects are tiny. Repeated allocation and deallocation of different sizes must not cause memory usage to grow without bound.
3. Small working set: The number of pages touched by the memory manager in satisfying a request must be minimal, to avoid paging delays in virtual memory systems. Unused memory must be returned to the operating system periodically.
4. Robustness: Erroneous programs must have difficulty corrupting the memory manager's data structures. Errors must be flagged as soon as possible, not allowed to accumulate. Out-of-memory events must be handled gracefully.
5. Portability: The memory manager must adapt easily to different machines.
6. Convenience: Users mustn't need to change code to use it.
7. Flexibility: It must be easily customized for unusual needs, without imposing any additional overhead.

Techniques

Optimal memory managers would be common if they were easily built. They are scarce, so you can expect that a variety of subtle techniques are needed even to approach the optimum.

One such technique is to treat different request sizes differently. In most programs, small blocks are requested overwhelmingly more often than large blocks, so both time and space overhead for them is felt disproportionately.

Another technique results from noting that there are only a few different sizes possible for very small blocks, so that each such size may be handled separately. We can even afford to keep a vector of free block lists for those few sizes.

A third is to avoid system call overhead by requesting memory from the operating system in big chunks, and by not touching unused (and possibly paged-out) blocks unnecessarily. This means data structures consulted to find a block to allocate should be stored compactly, apart from the unused blocks they describe.

The final, and most important, technique is to exploit address arithmetic which, while not strictly portable according to language standards, works well on all modern flat-memory architectures. A pointer value can be treated as an integer, and bitwise logical operations may be used on it to yield a new pointer value. In particular, the low bits may be masked off to yield a pointer to a header structure that describes the block pointed to. In this way a block need not contain a pointer to that information. Furthermore, many blocks can share the same header, amortizing its overhead across all. (This technique is familiar in the LISP community, where it is known as "page-tagging".)
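A small illustration of the masking step, written in the same cast-to-integer style used later for operator delete; the constants and the SegmentHeader name are placeholders, and a real implementation would use an integer type as wide as a pointer:

const unsigned long SEGMENT_SIZE = 64 * 1024;            // a power of two
const unsigned long SEGMENT_MASK = ~(SEGMENT_SIZE - 1);

struct SegmentHeader;                                    // whatever describes the segment

inline SegmentHeader* segment_of(void* block)
{
    unsigned long bits = (unsigned long)block;           // treat the pointer as an integer
    return (SegmentHeader*)(bits & SEGMENT_MASK);        // mask off the low bits to find the header
}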

With so many goals, principles, and techniques to keep track of, it should be no surprise that there are plenty of pitfalls to avoid. They will be discussed later.

A Design

The first major feature of the design is suggested by the final two techniques above. We request memory from the operating system in units of a large power of two (e.g. 64K bytes) in size, and place them so they are aligned on such a boundary. We call these units "segments". Any address within the segment may have its low bits masked off, yielding a pointer to the segment header. We can treat this header as an instance of the abstract class HeapSegment:

class HeapSegment {
  public:
    virtual void free(void*) = 0;
    virtual void* realloc(void*) = 0;
    virtual Heap& owned_by(void*) = 0;
};

The second major feature of the design takes advantage of the small number of small-block sizes possible. A segment (with a header of class HeapPageseg) is split up into pages, where each page contains blocks of only one size. A vector of free lists, with one element for each size, allows instant access to a free block of the right size. Deallocation is just as quick; no coalescing is needed. Each page has just one header to record the size of the blocks it contains, and the owning heap. The page header is found by address arithmetic, just like the segment header. In this way, space overhead is limited to a few percent, even for the smallest blocks, and the time to allocate and deallocate the page is amortized over all usage of the blocks in the page.
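A much-simplified sketch of that small-object path follows. It is illustrative only: the names, the page source (plain operator new rather than aligned pages inside segments), the size limits, and the explicit size argument to deallocate (which the real design recovers from the page header) are all assumptions.

#include <cstddef>

const std::size_t PAGE_SIZE = 4096;
const std::size_t ALIGN     = 8;
const std::size_t MAX_SMALL = 256;
const std::size_t NLISTS    = MAX_SMALL / ALIGN;

struct FreeBlock { FreeBlock* next; };

class SmallObjectHeap {
    FreeBlock* free_lists_[NLISTS];                       // one free list per block size
  public:
    SmallObjectHeap() {
        for (std::size_t i = 0; i < NLISTS; ++i)
            free_lists_[i] = 0;
    }
    // Assumes 1 <= size <= MAX_SMALL; larger requests go to the span allocator.
    void* allocate(std::size_t size) {
        std::size_t i = (size + ALIGN - 1) / ALIGN - 1;   // size-class index
        if (!free_lists_[i])
            refill(i);                                    // carve a fresh page into blocks
        FreeBlock* b = free_lists_[i];
        free_lists_[i] = b->next;
        return b;
    }
    void deallocate(void* p, std::size_t size) {
        std::size_t i = (size + ALIGN - 1) / ALIGN - 1;
        FreeBlock* b = static_cast<FreeBlock*>(p);        // no coalescing needed
        b->next = free_lists_[i];
        free_lists_[i] = b;
    }
  private:
    void refill(std::size_t i) {
        std::size_t block = (i + 1) * ALIGN;
        char* page = static_cast<char*>(::operator new(PAGE_SIZE));
        for (std::size_t off = 0; off + block <= PAGE_SIZE; off += block) {
            FreeBlock* b = reinterpret_cast<FreeBlock*>(page + off);
            b->next = free_lists_[i];
            free_lists_[i] = b;
        }
    }
};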

For larger blocks, there are too many sizes to give each a segment; but such blocks may be packed adjacent to one another within a segment, to be coalesced with neighboring free blocks when freed. (We will call such blocks "spans", with a segment header of type HeapSpanseg.) Fragmentation, the proliferation of free blocks too small to use, is the chief danger in span segments, and there are several ways to limit it. Because the common case, small blocks, is handled separately, we have some breathing room: spans may have a large granularity, and we can afford to spend more time managing them. A balanced tree of available sizes is fast enough that we can use several searches to avoid creating tiny unusable spans. The tree can be stored compactly, apart from the free spans, to avoid touching them until they are actually used. Finally, aggressive coalescing helps reclaim small blocks and keep large blocks available.

Blocks too big to fit in a segment are allocated as a contiguous sequence of segments; the header of the first segment in the sequence is of class HeapHugeseg. Memory wasted in the last segment is much less than might be feared; any pages not touched are not even assigned by the operating system, so the average waste for huge blocks is only half a virtual-memory page.

Dispatching for deallocation is simple and quick:

void operator delete(void* ptr)
{
    long header = (long)ptr & MASK;
    ((HeapSegment*)header)->free(ptr);
}

HeapSegment::free() is a virtual function, so each segment type handles deallocation its own way. This allows different Heaps to coexist. If the freed pointer does not point to allocated memory, the program will most likely crash immediately. (This is a feature. Bugs that are allowed to accumulate are extremely difficult to track down.)

The classical C memory management functions, malloc(), calloc(), realloc(), and free() can be implemented on top of HeapAny just as was the global operator new. Only realloc() requires particular support.

The only remaining feature to implement is the function Heap::whatHeap(void* ptr). We cannot assume that ptr refers to heap storage; it may point into the stack, or static storage, or elsewhere. The solution is to keep a bitmap of allocated segments, one bit per segment. On most architectures this takes 2K words to cover the entire address space. If the pointer refers to a managed segment, HeapSegment::owned_by() reports the owning heap; if not, a reference to the default global heap may be returned instead. (In the LISP community, this technique is referred to as BBOP, or "big bag o' pages".)
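A sketch of the bitmap test, assuming a 32-bit address space and 64K segments (which yields the 2K words mentioned above); the names are illustrative:

const unsigned SEGMENT_SHIFT = 16;                     // 64K segments
const unsigned NSEGMENTS     = 1u << (32 - SEGMENT_SHIFT);
static unsigned segment_map[NSEGMENTS / 32];           // one bit per segment: 2K 32-bit words

// The heap sets the bit for each segment it obtains from the operating system.
inline bool is_managed_segment(void* p)
{
    unsigned long seg = (unsigned long)p >> SEGMENT_SHIFT;
    return (segment_map[seg / 32] >> (seg % 32)) & 1;
}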

Pitfalls

Where we depart from the principles of good design mentioned above, we must be careful to avoid the consequences. One example is when we allocate a page to hold a small block: we are investing the time to get that page on behalf of all the blocks that may be allocated in it. If the user frees the block immediately, and we free the page, then the user has paid to allocate and free a page just to use one block in it. In a loop, this could be much slower than expected. To avoid this kind of thrashing, we can add some hysteresis by keeping one empty page for a size if there are no other free blocks of that size. Similar heuristics may be used for other boundary cases.

Another pitfall results from a sad fact of life: programs have bugs. We can expect programs to try to free memory that was not allocated, or that has already been freed, and to clobber memory beyond the bounds of allocated blocks. The best a regular memory manager can do is to throw an exception as early as possible when it finds things amiss. Beyond that, it can try to keep its data structures out of harm's way, so that bugs will tend to clobber users' data and not the memory manager's. This makes debugging much easier.

Initialization, always a problem for libraries, is especially onerous for a portable memory manager. C++ offers no way to control the order in which libraries are initialized, but the memory manager must be available before anything else. The standard iostream library, with a similar problem, gets away by using some magic in its header file (at a sometimes intolerable cost in startup time) but we don't have even this option, because modules that use the global operator new are not obliged to include any header file. The fastest approach is to take advantage of any non-portable static initialization ordering available on the target architecture. (This is usually easy.) Failing that, we can check for initialization on each call to operator new or malloc(). A better portable solution depends on a standard for control of initialization order, which seems (alas!) unlikely to appear.
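The check-on-each-call fallback can be expressed, for example, with a lazily constructed default heap; this is only a sketch under that assumption, not the article's code, and before thread-safe local statics it would need its own guard:

inline Heap& default_heap()
{
    static HeapAny the_heap;    // constructed the first time anyone allocates
    return the_heap;
}

Operator new and malloc() then call default_heap().allocate(...) instead of touching a global pointer that might not be initialized yet.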

Measurements

Tests of vendor-supplied memory managers can yield surprising results. We know of a program that used 250 megabytes of swap space (before crashing when it ran out) when linked with the vendor-supplied library, but only a steady 50 megabytes after it was relinked with our memory manager. We have a simple presentation graphics animation program that uses less than half a megabyte of memory, but draws twice as fast when relinked.

Dividends

The benefits of careful design often go beyond the immediate goals. Indeed, unexpected results of this design include a global memory management interface which allows different memory managers to coexist. For most programs, though, the greatest benefit beyond better performance is that all the ad hoc apparatus intended to compensate for a poor memory manager may be ripped out. This leaves algorithms and data structures unobscured, and allows classes to be used in unanticipated ways.

Thanks to Paul McKenney and Jim Shur for their help in improving this article.

Saturday, November 10, 2007

Tips for memory management!!

1. For C/C++

a. Forgetting to delete an object.

Ex:

int* x = new int();   // (1) allocate
int* y = new int();
// ....
x = y;                // (2) <<<< memory leaked here

After statement (2), the memory allocated at (1) is no longer referenced by anything, so it can never be freed.

b. Creating an array with new[] but freeing it with plain delete, which at best destroys only one object.
Solution: Use delete[] to delete an array instead of delete; see the small example below.
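A tiny illustration (the array size is arbitrary):

int* a = new int[10];   // array allocated with new[]
// ...
delete[] a;             // must be delete[], not delete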

c. Deleting the same object twice:
int *x;
int *y;

x = new int();
y = x;

// ... do something ...
delete x;
// ... do something ...
delete y;   // x and y point to the same object

--> undefined behavior here, typically a crash.

d. Creating an object inside a DLL but deleting it outside the DLL. The correct way is to call a function exported by the DLL that deletes the object; the DLL should provide its own routine for releasing the objects it creates, as sketched below.
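A sketch of that pattern; Widget, the exported function names, and the MSVC-style export keyword are illustrative. The point is only that new and delete happen on the same side of the DLL boundary:

// Inside the DLL:
struct Widget { /* ... */ };

extern "C" __declspec(dllexport) Widget* CreateWidget() { return new Widget; }
extern "C" __declspec(dllexport) void DestroyWidget(Widget* w) { delete w; }

// In the client code:
//   Widget* w = CreateWidget();
//   ...
//   DestroyWidget(w);   // never plain "delete w;" in the client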

e. Always check array bounds before accessing an element, or the program may crash or suffer a buffer overflow.

Saturday, October 13, 2007

Visitor Pattern--How to implement

Today I want to introduce the Visitor pattern.

Goal: Why do we use the Visitor pattern? The answer is that we need to separate an algorithm from the object structure it operates on. To see this clearly, we will work through an example.
The Visitor pattern is also the classic technique for recovering lost type information without resorting to dynamic casts.

In this class diagram, each visitor can be considered a different algorithm.
An ObjectStructure is the object that the algorithm works on. It has a complex internal structure, and we want to separate that structure from the algorithm. To do this, we hand the algorithm each part of the ObjectStructure in turn; each part then does its own task using the algorithm given to it.

Example:

interface Visitor {
void visit(Wheel wheel);
void visit(Engine engine);
void visit(Body body);
void visit(Car car);
}


interface Visitable {
void accept(Visitor visitor);
}

class Wheel implements Visitable {
private String name;
Wheel(String name) {
this.name = name;
}
String getName() {
return this.name;
}
public void accept(Visitor visitor) {
visitor.visit(this);
}
}

class Engine implements Visitable{
public void accept(Visitor visitor) {
visitor.visit(this);
}
}

class Body implements Visitable{
public void accept(Visitor visitor) {
visitor.visit(this);
}
}

class Car implements Visitable {
private Engine engine = new Engine();
private Body body = new Body();
private Wheel[] wheels
= { new Wheel("front left"), new Wheel("front right"),
new Wheel("back left") , new Wheel("back right") };
public Engine getEngine() {
return this.engine;
}
public Body getBody() {
return this.body;
}
public Wheel[] getWheels() {
return this.wheels;
}
public void accept(Visitor visitor) {
visitor.visit(this);
engine.accept(visitor);
body.accept(visitor);
for(int i = 0; i < wheels.length; i++) {
Wheel wheel = wheels[i];
wheel.accept(visitor);
}
}
}

class PrintVisitor implements Visitor {

public void visit(Wheel wheel) {
System.out.println("Visiting "+ wheel.getName()
+ " wheel");
}
public void visit(Engine engine) {
System.out.println("Visiting engine");
}
public void visit(Body body) {
System.out.println("Visiting body");
}
public void visit(Car car) {
System.out.println("Visiting car");
}

}

class DoVisitor implements Visitor {
public void visit(Wheel wheel) {
System.out.println("Steering my wheel");
}
public void visit(Engine engine) {
System.out.println("Starting my engine");
}
public void visit(Body body) {
System.out.println("Moving my body");
}
public void visit(Car car) {
System.out.println("Vroom!");
}
}

public class VisitorDemo {
static public void main(String[] args){
Car car = new Car();
Visitor visitor = new PrintVisitor();
Visitor doVisitor = new DoVisitor();
car.accept(visitor);
car.accept(doVisitor);
}
}
As we can see, a car is composed of several visitable parts: Wheel, Engine, Body, and the Car itself.
The two algorithms, PrintVisitor and DoVisitor, apply different behavior to each of the four parts
and do not care about the internal structure of Car (refer to the main function). This matters when
implementing complex algorithms, and it also helps with maintenance: we can easily add a new
algorithm without affecting the other algorithms or the object structure.

Sunday, September 30, 2007

PHP learning--Part 5

Variables


+ Variable names follow the same rules as other labels in PHP. A valid variable name starts with a letter or underscore, followed by any number of letters, numbers, or underscores. As a regular expression, it would be expressed thus: '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'

Note: For our purposes here, a letter is a-z, A-Z, and the ASCII characters from 127 through 255 (0x7f-0xff).

Variable assignment
+ Assigned by value: when you assign an expression to a variable, the entire value of the original expression is copied into the destination variable. This means, for instance, that after assigning one variable's value to another, changing one of those variables has no effect on the other.
+ Assigned by reference: the new variable simply references (in other words, "becomes an alias for" or "points to") the original variable. Changes to the new variable affect the original, and vice versa.

To assign by reference, simply prepend an ampersand (&) to the beginning of the variable which is being assigned (the source variable)
Example:

$bar = &$foo;

Note:

PHP.Net: Relying on the default value of an uninitialized variable is problematic in the case of including one file into another which uses the same variable name. It is also a major security risk with register_globals turned on. E_NOTICE level error is issued in case of working with uninitialized variables, however not in the case of appending elements to the uninitialized array. isset() language construct can be used to detect if a variable has been already initialized.

Comment: Good practice is to always initialize a variable before using it. This is recommended in every programming language, not only PHP.

Variable scope

The scope of a variable is the context within which it is defined. For the most part all PHP variables only have a single scope. This single scope spans included and required files as well.

Example:

<?php
$a = 1;
include 'b.inc';
?>

Variable $a will be available within the included b.inc script.

+ Any variable used inside a function is by default limited to the local function scope. For example:

<?php
$a = 1; /* global scope */

function Test()
{
    echo $a; /* reference to local scope variable */
}

Test();
?>

This script will not produce any output because the echo statement refers to a local version of the $a variable, and it has not been assigned a value within this scope


To use a global variable inside a function, we use the global keyword.

Example:

<?php
$a = 1;
$b = 2;

function Sum()
{
    global $a, $b;

    $b = $a + $b;
}

Sum();
echo $b;
?>

Variables $a and $b are declared as global inside the function, so all references to either variable refer to the global version.

The second way to access variables from the global scope is to use the special PHP-defined $GLOBALS array. $GLOBALS is a superglobal and can be used in any scope.

Static variables:
A static variable exists only in the local function scope, but it does not lose its value when program execution leaves that scope. Consider the following example:

<?php
function Test()
{
    static $a = 0; // declares the static variable
    echo $a;
    $a++;
}
?>

The first time we call Test(), it outputs 0.
The second time, it outputs 1.

References with global and static variables

Combining references with global or static variables can lead to unexpected behaviour, so it is best avoided.

The next part will come another day.
Thanks to php.net for the manual; this part is what I learned from it.

Saturday, September 29, 2007

PHP learning--part 4

PHP Types:

PHP has four primitive types:
+ boolean
+ integer
+ float
+ string

Two compound types:
+ array
An array in PHP is actually an ordered map. A map is a type that maps values to keys. This type is optimized in several ways, so you can use it as a real array, or a list (vector), hashtable (which is an implementation of a map), dictionary, collection, stack, queue and probably more. Because you can have another PHP array as a value, you can also quite easily simulate trees.
Note: It is very powerful, but also complex and not easy to understand at first.

Example 1: one-dimensional array (or map)
<?php
$arr = array("foo" => "bar", 12 => true);
echo $arr["foo"]; // bar
echo $arr[12];    // 1
?>

Example 2: two-dimensional array

<?php
$arr = array("somearray" => array(6 => 5, 13 => 9, "a" => 42));
echo $arr["somearray"][6];    // 5
echo $arr["somearray"][13];   // 9
echo $arr["somearray"]["a"];  // 42
?>

Array do's and don'ts

Why is $foo[bar] wrong?

If there is no constant named bar defined, PHP automatically converts bar into the string 'bar' and uses that.
If a constant bar is later defined, your code will silently start using its value instead, a mismatch that is hard to track down.


<?php
error_reporting(E_ALL);
ini_set('display_errors', true);
ini_set('html_errors', false);
// Simple array:
$array = array(1, 2);
$count = count($array);
for ($i = 0; $i < $count; $i++) {
    echo "\nChecking $i: \n";
    echo "Bad: " . $array['$i'] . "\n";
    echo "Good: " . $array[$i] . "\n";
    echo "Bad: {$array['$i']}\n";
    echo "Good: {$array[$i]}\n";
}
?>

The above example will output:

Checking 0:
Notice: Undefined index:  $i in /path/to/script.html on line 9
Bad:
Good: 1
Notice: Undefined index:  $i in /path/to/script.html on line 11
Bad:
Good: 1

Checking 1:
Notice: Undefined index:  $i in /path/to/script.html on line 9
Bad:
Good: 2
Notice: Undefined index:  $i in /path/to/script.html on line 11
Bad:
Good: 2

So arrays are flexible, but use them carefully.

+ Object

Object Initialization
<?php
class foo
{
    function do_foo()
    {
        echo "Doing foo.";
    }
}

$bar = new foo;
$bar->do_foo();
?>

Converting to object:

Rule: If an object is converted to an object, it is not modified. If a value of any other type is converted to an object, a new instance of the stdClass built in class is created. If the value was NULL, the new instance will be empty. Array converts to an object with properties named by array keys and with corresponding values. For any other value, a member variable named scalar will contain the value.

Example:

Convert string to object.

<?php
$obj = (object) 'ciao';
echo $obj->scalar; // outputs 'ciao'
?>

Resource

A resource is a special variable, holding a reference to an external resource. Resources are created and used by special functions


Converting to resource

As resource types hold special handlers to opened files, database connections, image canvas areas and the like, you cannot convert any value to a resource.

NULL

The special NULL value represents that a variable has no value

The type of a variable is usually not set by the programmer; rather, it is decided at runtime by PHP depending on the context in which that variable is used. --> A variable's type in PHP is like the variant type found in many other programming languages.

That's all. We will go into more detail another day.

Sunday, September 23, 2007

PHP learning--part 3

PHP syntax:



PHP code has to be placed between the opening and closing tags:
<?php code ?>
Example:
<?php echo 'While this is going to be parsed.'; ?>


There are two layers when a PHP file is processed. First, the server runs the file through the PHP parser, and the output is HTML. Then that HTML is parsed and rendered by the client's browser. It is a fairly complex process behind the scenes.

When PHP parses a file, it looks for opening and closing tags, which tell PHP to start and stop interpreting the code between them. Parsing in this manner allows php to be embedded in all sorts of different documents, as everything outside of a pair of opening and closing tags is ignored by the PHP parser. Most of the time you will see php embedded in HTML documents, as in this example.

<?php
if ($expression) {
   
?>
    <strong>This is true.</strong>
    <?php
} else {
   
?>
    <strong>This is false.</strong>
    <?php
}
?>

How does this work?

I will explain. When the PHP parser meets the first opening tag <?php, it starts interpreting PHP code until it meets the closing tag ?>. So the statement if ($expression) { is handed to the parser, but the if statement is not yet complete, so the parser keeps track of the open block across the intervening HTML. If $expression evaluates to false (an unset variable is treated as false), the block <strong>This is true.</strong> is skipped; when the parser reaches the else part, it outputs <strong>This is false.</strong> instead. The browser then does its job: rendering that HTML for the client.
Hope that is clear.

There are four different pairs of opening and closing tags which can be used in php. Two of those, <?php ?> and <script language="php"> </script>, are always available. The other two are short tags and ASP style tags, and can be turned on and off from the php.ini configuration file. As such, while some people find short tags and ASP style tags convenient, they are less portable, and generally not recommended.

Note: Also note that if you are embedding PHP within XML or XHTML you will need to use the <?php ?> tags to remain compliant with standards.

All types of php opening and closing tags:

1. <?php echo 'if you want to serve XHTML or XML documents, do like this'; ?>

2.  <script language="php">
       
echo 'some editors (like FrontPage) don\'t
              like processing instructions'
;
   
</script>

3.  <? echo 'this is the simplest, an SGML processing instruction'; ?>
    <?= expression ?> This is a shortcut for "<? echo expression ?>"

4.  <% echo 'You may optionally use ASP-style tags'; %>
    <%= $variable; # This is a shortcut for "<% echo . . ." %>

PHP learning -- Part 2

The second thing I want to know: what do we need to develop a PHP program?

1. A web server: Apache (recommended) or IIS (not recommended)
2. A DBMS such as MySQL (recommended) or another database
3. The PHP engine (of course). Download it at php.net

But I recommend WAMP. It wraps all the modules into one package, which makes development easier.


After installing WAMP on port 80, I want to run a first "hello world" to have a look at PHP.

Although we do not yet understand what it all means, it will give us an overview.

First hello world program written in PHP.

File: hello.php


<html>
<head>
<title>PHP Test</title>
</head>
<body>
<?php echo '<p>Hello World</p>'; ?>
</body>
</html>

To run it, start the WAMP server and put this file into the www subfolder of the WAMP folder.
Then open a browser and enter the URL: localhost/hello.php

The example will appear:
Hello World


Cool, we wrote a very simple program. We will start looking inside PHP in the next post.

PHP Learning--Part 1

Today I am learning the PHP language. I know a lot about C/C++, Delphi, and Java, but nothing about web programming (such as PHP), so I will self-study PHP. I know it is difficult when you start learning a new language, but once you have the base knowledge it gets easier.
First glance at PHP:

PHP, which stands for "PHP: Hypertext Preprocessor" is a widely-used Open Source general-purpose scripting language that is especially suited for Web development and can be embedded into HTML. Its syntax draws upon C, Java, and Perl, and is easy to learn. The main goal of the language is to allow web developers to write dynamically generated web pages quickly, but you can do much more with PHP.

That is the definition from PHP.net. But in my opinion, put briefly, PHP is just a script that runs on the server side. That's all.

Now I want to know the key features of PHP. Let's start studying them:
1. Server-side scripting:

This is the most traditional and main target field for PHP. You need three things to make this work. The PHP parser (CGI or server module), a web server and a web browser. You need to run the web server, with a connected PHP installation. You can access the PHP program output with a web browser, viewing the PHP page through the server. All these can run on your home machine if you are just experimenting with PHP programming

2.Command line scripting:

You can make a PHP script to run it without any server or browser. You only need the PHP parser to use it this way. This type of usage is ideal for scripts regularly executed using cron (on *nix or Linux) or Task Scheduler (on Windows). These scripts can also be used for simple text processing tasks
Note: This is like batch processing or shell programming. It is good for system tasks such as generating server logs, backups, and so on.

3.Writing desktop applications. PHP is probably not the very best language to create a desktop application with a graphical user interface, but if you know PHP very well, and would like to use some advanced PHP features in your client-side applications you can also use PHP-GTK to write such programs. You also have the ability to write cross-platform applications this way. PHP-GTK is an extension to PHP, not available in the main distribution. If you are interested in PHP-GTK, visit http://gtk.php.net/.
Note: Cool, I did not know about this; I only knew PHP as a web development language. Maybe I will try it. ^_^

4. Database support:
Adabas D, dBase, Empress, FilePro (read-only), Hyperwave, IBM DB2, Informix, Ingres, InterBase, FrontBase, mSQL, Direct MS-SQL, MySQL, ODBC, Oracle (OCI7 and OCI8), Ovrimos, PostgreSQL, SQLite, Solid, Sybase, Velocis, Unix dbm

We only need a few key database servers like MySQL, Oracle, Innobase, DB2, or SQL Server (the last is not in the list; I will find a way to work around that).

Friday, September 21, 2007

The taming of the thread

A process-driven approach to avoid thread death

Threads can be nasty beasts. This is partly attributed to their delicate nature. Threads can die. If the causes of thread death are not in your code, then MutableThread may keep your code running. This article explores a solution using an object-oriented, problem-solution method.

The problem

The first question to ask in the problem-solution method is, of course: What is the problem? The problem statement is easy to generate in existing systems since the problem generally causes the trouble. In this case, the problem is: The threads die, and the application stops running. No exception is thrown since the cause of thread death is external to the application.
The what and the how

It is extremely important to isolate the objective from the technical aspects of a possible solution. If design concerns influence the requirements early on, creative solutions might be prevented. It is a designer's nature to think about how to do everything, and therefore to be tempted to avoid challenges that might result in a superior solution by declaring them problematic, too ambitious, or even impossible.

The it-would-be-wonderful-if statement is a liberating way to imagine an ideal world. It produces the ideal what without regard to design limitations.

Let's define our what: Objects do not depend upon particular threads. If a thread should fail, the thread is recreated, and the object continues to run. This should be as unobtrusive as possible.
Scenarios can clarify the requirement and save time

Following the what review exercise, run through some scenarios the particular what will do. This first test can be done without spending a dime on design.

To generate some scenarios, simply imagine the potential objects created by the what and consider the possibilities that could happen. Try to think outside the box.

So, let's say we have an object that runs in a thread. We know the thread might die, and we need to deal with the consequences.

We must provide memory management to prevent so-called memory leaks. However, memory management has nothing directly to do with our thread-oriented requirement—it's beyond the scope of the what.

The crucial issue is that we want to maintain thread operation. There are a few possible subscenarios:

1. Sunny day: All is well. Just restart the thread and keep running.
2. Rainy day: Bad things happen. The object is somehow corrupted.
3. Typhoon: Really bad things happen. The JVM may be unstable.

Thinking more carefully, Scenario 2 describing corrupted data and Scenario 3 describing an unstable JVM might not affect this application. For this application, failure is the worst possibility. For others, it may be necessary to validate data or even JVM integrity. But for our simple case, the sunny day scenario suffices. Therefore, this design will not include data checking or object recreation in a new JVM. In a more robust design, the object persistence might be refreshed to validate data. In a worst-case scenario, the entire JVM should shutdown and restart.

The how: Identify objects and create a design

We must identify the objects involved in the design. First, there should be a detection object; we can call this object the watchdog. In our particular design, the watchdog watches all other threads. The watchdog runs in its own thread and has a collection of references to other threads so it can monitor them and make sure they're all alive. Conveniently, the Thread class provides an isAlive() method to determine if a thread is alive. The watchdog uses this method to detect each thread in its collection. If a thread fails, it's the watchdog's responsibility to report it.

For more robustness, this design will include a second "dog," the beta dog (the watchdog is the alpha dog). The beta dog's purpose is simply to check that the alpha dog is alive. The alpha dog also detects the beta dog.
Watchdogs

The ThreadWatchDog is a particular MutableThread instance that monitors threads, and monitored threads must be either MutableThread or Thread (or their descendants). The watchdog runs through the collection of threads and invokes the isAlive() method. When it notices that a thread is dead, it uses reCreate() to recreate the thread if it is a mutable thread. Otherwise, it simply reports the failure.

Here's how this looks (in the ThreadWatchDog test program associated with this article):


if (lTestThread instanceof MutableThread)
{
if (!((MutableThread)lTestThread).isAlive())


This tests the mutable thread to see if it is alive. In the case of a MutableThread, it reports an exception and then attempts to restart the thread, if possible, with the following code:

ReportingExceptionHandler.processException( new ReportingException("Mutable Thread " + lThreadKey + " is dead"));
try {
// Attempt to restart the thread by clearing and restarting
((MutableThread)lTestThread).createThread();
((MutableThread)lTestThread).start();



For a MutableThread, the ThreadKey is the name assigned to the thread when it is created. The application sets up the threads and assigns them to the watchdog on startup. This is done as follows:

TestThread threadOne = new TestThread();
threadOne.setName("threadOne");
TestThread threadTwo = new TestThread();
threadTwo.setName("threadTwo");
TestMutableThread threadMutable = new TestMutableThread();
threadMutable.setName("threadMutable");
ThreadWatchDog.getInstance().put(threadOne);
ThreadWatchDog.getInstance().put(threadTwo);
ThreadWatchDog.getInstance().put(threadMutable);
threadOne.start();
System.out.println("TEST: Thread One started");
threadTwo.start();
System.out.println("TEST: Thread Two started");
threadMutable.start();
MutableThread lWatchDog = ThreadWatchDog.getInstance();
System.out.println("TEST: Starting the watchdog");
lWatchDog.start();



This starts up the watchdog(s), and thread monitoring is now active. Note clearly that the threads should be started up before initializing the watchdogs, or things will get really confusing.

Note that the put() method adds threads to the ThreadWatchDog. This adds the thread to the collection. The put() method is also overloaded with a put(MutableThread mutableThread). This is because MutableThread isn't really a Thread; rather it implements the MutableRunnable interface, much as Thread implements the Runnable interface.

The MutableThread includes a handle to the actual Thread, and this can be recreated, which replaces the thread owned by the mutable thread:

public void createThread() {
mThisThread = new Thread(this, mThreadName);
mThisThread.setPriority(mPriority);
}



Note that the mThisThread is created by passing the this through the thread constructor. That allows the current object to be assigned to the new thread.

The actual thread is encapsulated within the MutableThread and can be recreated and restarted.

The deprecated Thread.stop() method is used in the test program to show what happens when threads die:

System.out.println("TEST: Stopping threadOne");
threadOne.stop();
System.out.println("TEST: Stopping threadMutable");
threadMutable.stop();



Later in the test program, we even kill the alpha watchdog to make sure the failure is detected and reported. The watchdogs are named internally and do not need to be named as do the application threads.
More on MutableThread

The MutableThread class does not use thread groups, but this function can be added. The thread owned by the MutableThread is not accessible by any other class because the thread should not be referenced anywhere else. Users of this class should not be affected if the thread is replaced. The class is abstract because it does not implement the run() method. This must be implemented by any class that wants to be an instance of MutableThread.

Note that the MutableThread implements the Runnable interface, and the internal thread created with the new Thread(Runnable, String) then invokes the run() method within the thread's implementing Runnable class. The string passed is the name. All necessary attributes are retained within the MutableThread object so that the thread name and priority are assigned to a new thread when the createThread() method is invoked.
Explore mutation

That's about it. Download the source code file that accompanies this article and extract various bits to explore the notion of mutable threads and watchdogs on your own.

mutablethread.jar contains all the code for MutableThread and the watchdogs. I've included a PC batch file, run.bat, to help you invoke the test program. Simply type run in the same directory as the mutablethread.jar and run.bat files. The code includes a logger class that logs to the console, but this can be easily modified to a logging system such as log4j. There is also a ReportingException that handles nesting and reports the exception that a thread has died.

Finally, don't forget to summarize requirements and separate the requirements from design concerns. Following this analysis and design approach can be a big advantage in creating more reliable and extensible systems.

Sunday, September 16, 2007

Detect and remove hardware problem with BUGDOCTOR.COM

BUGDOCTOR.COM the antidote to all your computer error woes. New Bug-Detecting and Removal Software Bringing Back Your Computer's Pulse

March 9, 2005 - Are you one of the billions of computer user’s whose PC is infected with a pesky infection? SOFTWARE DOCTOR INC. today announced the worldwide public release of BugDoctor, the most advanced error-removal software on the market. BugDoctor is a new computer tool that seamlessly scans for, detects and deletes a comprehensive range of hard drive bugs. It roots out potentially crash-inducing errors at their source, ferreting deep into PC application sections-such as Active X Com, device drivers, corrupt help files, and Windows startup applications and shortcuts-where these parasites hide. The program is available for complimentary download at BugDoctor.com and an initial error scan is available free of charge.

According to a National Cyber Security study conducted in June 2003, 9 out of 10 PCs connected to the Internet contain spyware. Even more startling, 94% of PCs are infected with spyware-associated bugs and errors that if not repaired properly can put them at extreme risk for system failure. While many adware and spyware removal programs are effective in eradicating the annoying popups that such malicious programs carry with them, they leave behind registry keys that not only slow down system performance but also pose the potential for an outright system crash.

If you are noticing such computer problems as slow startups, troublesome shut-downs, unresponsive programs, or runtime, 404 or msg errors, then your computer could be at high risk for permanent failure. Before tossing thousands of money into a new system, however, you owe it to yourself to give BugDoctor a try. Chris V., a satisfied client, is glad he did: "I was going to invest over $2,500 in a new computer, but I decided to try BugDoctor first and I’m glad I did. Now my PC runs like new again, saving me hundreds of dollars and tons of time."

Not only will BugDoctor repair every type of aforementioned computer error, but it will also boost your system’s speed and performance by as much as 300%. The program is easy and safe to use and regular scanning and deletion will result in a more stable and faster running Windows environment.

To download the new BugDoctor error-removal software and to scan your own PC before it’s too late, visit the company online at www.BugDoctor.com. For more information on how BugDoctor can protect and save your computer from debilitating system bugs, contact Mike Delrue at support@bugdoctor.com.

###

Saturday, September 15, 2007

Builder pattern

Today I want to introduce the Builder pattern.

1. First we need to know the Builder Pattern class diagram:
Class diagram:



Now we will explain the meaning of these classes:

Builder: an interface used to create a complex object.

Concrete Builder: an implementation of Builder. The Builder defines the way to construct the object, and the Concrete Builder shows how to do it in detail.

Director: The Director class is responsible for managing the correct sequence of object creation. It receives a Concrete Builder as a parameter and executes the necessary operations on it.

Explanation:
We have a complex object with many parts. To create the object, we have to create many sub-items inside it. For example, take a pizza: there are many types of pizza, and each type is made its own way.
A Hawaiian pizza is made from a cross dough, mild sauce, and a ham + pineapple topping.
A spicy pizza is made from a pan-baked dough, hot sauce, and a pepperoni + salami topping.


So we design it as below:
+ The Builder defines how to create a pizza. It provides the means to build one:
public void setDough(String dough)
public void setSauce(String sauce)
public void setTopping(String topping)

For each kind of pizza (a concrete builder), we give the pizza the appropriate ingredients.




/** "Product" */
class Pizza
{
private String dough = "";
private String sauce = "";
private String topping = "";

public void setDough(String dough)
{ this.dough = dough; }
public void setSauce(String sauce)
{ this.sauce = sauce; }
public void setTopping(String topping)
{ this.topping = topping; }
}

/** "Abstract Builder" */
abstract class PizzaBuilder
{
protected Pizza pizza;

public Pizza getPizza()
{
return pizza;
}
public void createNewPizzaProduct()
{
pizza = new Pizza();
}

public abstract void buildDough();
public abstract void buildSauce();
public abstract void buildTopping();
}

/** "ConcreteBuilder" */
class HawaiianPizzaBuilder extends PizzaBuilder
{
public void buildDough()
{
pizza.setDough("cross");
}
public void buildSauce()
{
pizza.setSauce("mild");
}
public void buildTopping()
{
pizza.setTopping("ham+pineapple");
}
}

/** "ConcreteBuilder" */
class SpicyPizzaBuilder extends PizzaBuilder
{
public void buildDough()
{
pizza.setDough("pan baked");
}
public void buildSauce()
{
pizza.setSauce("hot");
}
public void buildTopping()
{
pizza.setTopping("pepperoni+salami");
}
}

But we also need to describe the sequence of steps for making the pizza (step by step),
so we implement the Director and give it a Builder as a parameter:


/** "Director" */
class Waiter
{
    private PizzaBuilder pizzaBuilder;

    public void setPizzaBuilder(PizzaBuilder pb)
    {
        pizzaBuilder = pb;
    }
    public Pizza getPizza()
    {
        return pizzaBuilder.getPizza();
    }

    public void constructPizza()
    {
        pizzaBuilder.createNewPizzaProduct();
        pizzaBuilder.buildDough();
        pizzaBuilder.buildSauce();
        pizzaBuilder.buildTopping();
    }
}


Finally, we use it:

class BuilderExample
{
    public static void main(String[] args)
    {
        Waiter waiter = new Waiter();
        PizzaBuilder hawaiianPizzaBuilder = new HawaiianPizzaBuilder();
        PizzaBuilder spicyPizzaBuilder = new SpicyPizzaBuilder();

        waiter.setPizzaBuilder( hawaiianPizzaBuilder );
        waiter.constructPizza();

        Pizza pizza = waiter.getPizza();
    }
}
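
To see the payoff of the pattern, here is a hypothetical extension (not part of the original post): adding a new kind of pizza only requires writing one more concrete builder, while the Waiter and the client code stay exactly the same.

/** A hypothetical extra "ConcreteBuilder" (assumed, for illustration) */
class MargheritaPizzaBuilder extends PizzaBuilder
{
    public void buildDough()
    {
        pizza.setDough("thin crust");
    }
    public void buildSauce()
    {
        pizza.setSauce("tomato");
    }
    public void buildTopping()
    {
        pizza.setTopping("mozzarella+basil");
    }
}

// Usage: only the builder handed to the Waiter changes.
// waiter.setPizzaBuilder( new MargheritaPizzaBuilder() );
// waiter.constructPizza();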

I hope this helps you understand the Builder pattern.

Friday, September 7, 2007

EJB best practices

EJB best practices: Build a better exception-handling framework

Deliver more useful exceptions without sacrificing clean code



Level: Intermediate

Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc.

01 Jan 2003

Enterprise applications are often built with little attention given to exception handling, which can result in over-reliance on low-level exceptions such as java.rmi.RemoteException and javax.naming.NamingException. In this installment of EJB Best Practices, Brett McLaughlin explains why a little attention goes a long way when it comes to exception handling, and shows you two simple techniques that will set you on the path to building more robust and useful exception handling frameworks.

In previous tips in this series, exception handling has been peripheral to our core discussion. One thing you may have picked up, however, is that we've consistently distanced low-level exceptions from the Web tier. Rather than have the Web tier handle exceptions such as java.rmi.RemoteException or javax.naming.NamingException, we've supplied exceptions like ApplicationException and InvalidDataException to the client.

Remote and naming exceptions are system-level exceptions, whereas application and invalid-data exceptions are business-level exceptions, because they deliver more applicable business information. When determining what type of exception to throw, you should always first consider the tier that will handle the reported exception. The Web tier is generally driven by end users performing business tasks, so it's better equipped to handle business-level exceptions. In the EJB layer, however, you're performing system-level tasks such as working with JNDI or databases. While these tasks will eventually be incorporated into business logic, they're best represented by system-level exceptions like RemoteException.

Theoretically, you could have all of your Web tier methods expect and respond to a single application exception, as we did in some of our previous examples. But that approach wouldn't hold up over the long run. A far better exception-handling scheme would be to have your delegate methods throw more specific exceptions, which are ultimately more useful to the receiving client. In this tip, we'll discuss two techniques that will help you create more informative, less generalized exceptions, without generating a lot of unnecessary code.

Nested exceptions

The first thing to think about when designing a solid exception-handling scheme is the abstraction of what I call low-level or system-level exceptions. These are generally core Java exceptions that report errors in network traffic, problems with JNDI or RMI, or other technical problems in an application. RemoteException, EJBException, and NamingException are common examples of low-level exceptions in enterprise Java programming.

These exceptions are fairly meaningless, and can be especially confusing when received by a client in the Web tier. A client that invokes purchase() and receives back a NamingException has little to work with when it comes to resolving the exception. At the same time, your application code may need to access the information within these exceptions, so you can't simply throw out or ignore them.

The answer is to provide a more useful type of exception that also contains a lower-level exception. Listing 1 shows a simple ApplicationException that does just this:


Listing 1. A nested exception
package com.ibm;

import java.io.PrintStream;
import java.io.PrintWriter;

public class ApplicationException extends Exception {

    /** A wrapped Throwable */
    protected Throwable cause;

    public ApplicationException() {
        super("Error occurred in application.");
    }

    public ApplicationException(String message) {
        super(message);
    }

    public ApplicationException(String message, Throwable cause) {
        super(message);
        this.cause = cause;
    }

    // Created to match the JDK 1.4 Throwable method.
    public Throwable initCause(Throwable cause) {
        this.cause = cause;
        return cause;
    }

    public String getMessage() {
        // Get this exception's message.
        String msg = super.getMessage();

        Throwable parent = this;
        Throwable child;

        // Look for nested exceptions.
        while ((child = getNestedException(parent)) != null) {
            // Get the child's message.
            String msg2 = child.getMessage();

            // If we found a message for the child exception,
            // we append it.
            if (msg2 != null) {
                if (msg != null) {
                    msg += ": " + msg2;
                } else {
                    msg = msg2;
                }
            }

            // Any nested ApplicationException will append its own
            // children, so we need to break out of here.
            if (child instanceof ApplicationException) {
                break;
            }
            parent = child;
        }

        // Return the completed message.
        return msg;
    }

    public void printStackTrace() {
        // Print the stack trace for this exception.
        super.printStackTrace();

        Throwable parent = this;
        Throwable child;

        // Print the stack trace for each nested exception.
        while ((child = getNestedException(parent)) != null) {
            if (child != null) {
                System.err.print("Caused by: ");
                child.printStackTrace();

                if (child instanceof ApplicationException) {
                    break;
                }
                parent = child;
            }
        }
    }

    public void printStackTrace(PrintStream s) {
        // Print the stack trace for this exception.
        super.printStackTrace(s);

        Throwable parent = this;
        Throwable child;

        // Print the stack trace for each nested exception.
        while ((child = getNestedException(parent)) != null) {
            if (child != null) {
                s.print("Caused by: ");
                child.printStackTrace(s);

                if (child instanceof ApplicationException) {
                    break;
                }
                parent = child;
            }
        }
    }

    public void printStackTrace(PrintWriter w) {
        // Print the stack trace for this exception.
        super.printStackTrace(w);

        Throwable parent = this;
        Throwable child;

        // Print the stack trace for each nested exception.
        while ((child = getNestedException(parent)) != null) {
            if (child != null) {
                w.print("Caused by: ");
                child.printStackTrace(w);

                if (child instanceof ApplicationException) {
                    break;
                }
                parent = child;
            }
        }
    }

    public Throwable getCause() {
        return cause;
    }
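
    // Note: getNestedException() is called in the methods above but was not
    // included in the original listing. The implementation below is a minimal
    // assumed sketch (not the article's code): it unwraps the cause stored in
    // an ApplicationException and ends the chain for any other Throwable.
    protected static Throwable getNestedException(Throwable parent) {
        if (parent instanceof ApplicationException) {
            return ((ApplicationException) parent).getCause();
        }
        return null;
    }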
}

The code in Listing 1 is fairly straightforward; we've simply chained together multiple exceptions to create a single, nested exception. The real benefit, however, is in using this technique as the starting point to create an application-specific hierarchy of exceptions. An exception hierarchy will let your EJB clients receive both business-specific exceptions and system-specific information, without requiring you to write a lot of extra code.
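
To show how this wrapping happens in practice, here is a minimal sketch (not from the article) of a bean method that converts a system-level exception into the business-level ApplicationException; the method body, surrounding class, and imports are assumptions for illustration only.

// Hypothetical fragment: assumes the usual javax.naming imports.
public void purchase(String itemId) throws ApplicationException {
    try {
        Context ctx = new InitialContext();
        Object obj = ctx.lookup("java:comp/env/ejb/InventoryHome");
        // ... narrow the home interface and carry out the purchase ...
    } catch (NamingException e) {
        // The Web tier receives a business-level exception, while the
        // original low-level cause travels along for logging and debugging.
        throw new ApplicationException("Unable to complete the purchase.", e);
    }
}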





A hierarchy of exceptions

Your exception hierarchy should begin with something fairly robust and generic, like ApplicationException. If you make your top-level exception too specific you'll end up having to restructure your hierarchy later to fit in something more generic.

So, let's say that your application called for a NoSuchBookException, an InsufficientFundsException, and a SystemUnavailableException. Rather than create individual exceptions for each, you could set up each exception to extend ApplicationException, providing only the few additional constructors needed to create a formatted message. Listing 2 is an example of such an exception hierarchy:


Listing 2. An exception hierarchy
package com.ibm.library;

import com.ibm.ApplicationException;

public class NoSuchBookException extends ApplicationException {

    public NoSuchBookException(String bookName, String libraryName) {
        super("The book '" + bookName + "' was not found in the '" +
              libraryName + "' library.");
    }
}

The exception hierarchy makes things much simpler when it comes to writing numerous specialized exceptions. Adding a constructor or two for each exception class rarely takes more than a few minutes per exception. You will also often need to provide subclasses of these more specific exceptions (which are in turn subclasses of the main application exception), providing even more specific exceptions. For example, you might need an InvalidTitleException and a BackorderedException to extend NoSuchBookException.
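
As a sketch of that last point (not shown in the original article), such a subclass can be as small as a single constructor; the constructor arguments here simply reuse the parent's and are an assumption.

package com.ibm.library;

// Hypothetical example: a more specific exception that extends
// NoSuchBookException, which in turn extends ApplicationException.
public class InvalidTitleException extends NoSuchBookException {

    public InvalidTitleException(String bookName, String libraryName) {
        super(bookName, libraryName);
    }
}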

Enterprise applications are often built with almost no attention given to exception handling. While it's easy -- and sometimes tempting -- to rely on low-level exceptions like RemoteException and NamingException, you'll get a lot more mileage out of your application if you start with a solid, well-thought-out exception model. Creating a nested, hierarchical exception framework will improve both your code's readability and its usability.

EJB best practices

EJB best practices: Industrial-strength JNDI optimization

Use caching and a generic factory class to automate JNDI lookups

Level: Intermediate

Brett McLaughlin (brett@newInstance.com), Author and Editor, O'Reilly Media Inc.

01 Sep 2002

Brett McLaughlin continues his EJB best practices with an examination of JNDI lookups, which are an essential and frequent part of almost all EJB interactions. Unfortunately, JNDI operations almost always exact a performance toll. In this tip, Brett shows you how a home-interface factory can reduce the overhead of JNDI lookups in your EJB applications.

Every kind of EJB component (session, entity, and message driven) has a home interface. The home interface is a bean's base of operations; once you've found it, you have access to that bean's functionality. EJB applications rely on JNDI lookups to access their beans' home interfaces. Because EJB apps tend to run multiple beans, and because JNDI lookups are often present in many components, much of an application's performance overhead can be spent on these lookups.

In this tip, we'll look at some of the most common JNDI optimizations. In particular, I'll show you how to combine caching and a generic helper class to create a factory-style solution to JNDI overhead.

Reducing context instances

Listing 1 shows a typical piece of EJB code, requiring multiple JNDI lookups. Study the code for a moment, and then we'll work on optimizing it for better performance.

public boolean buyItems(PaymentInfo paymentInfo, String storeName,
                        List items) {
    // Load up the initial context
    Context ctx = new InitialContext();

    // Look up a bean's home interface
    Object obj = ctx.lookup("java:comp/env/ejb/PurchaseHome");
    PurchaseHome purchaseHome =
        (PurchaseHome)PortableRemoteObject.narrow(obj, PurchaseHome.class);
    Purchase purchase = purchaseHome.create(paymentInfo);

    // Work on the bean
    for (Iterator i = items.iterator(); i.hasNext(); ) {
        purchase.addItem((Item)i.next());
    }

    // Look up another bean (reusing obj; redeclaring it would not compile)
    obj = ctx.lookup("java:comp/env/ejb/InventoryHome");
    InventoryHome inventoryHome =
        (InventoryHome)PortableRemoteObject.narrow(obj, InventoryHome.class);
    Inventory inventory = inventoryHome.findByStoreName(storeName);

    // Work on the bean
    for (Iterator i = items.iterator(); i.hasNext(); ) {
        inventory.markAsSold((Item)i.next());
    }

    // Do some other stuff
}


While this example is somewhat contrived, it does reveal some of the most glaring problems with using JNDI. For starters, you might ask yourself if the new InitialContext object is necessary. It's likely that this context has already been loaded elsewhere in the application code, yet we've created a new one here. Caching the InitialContext instances would result in an immediate performance boost, as shown in Listing 2:

// Assumes a private static Context field in the helper class:
//     private static Context initialContext;
public static Context getInitialContext() throws NamingException {
    if (initialContext == null) {
        initialContext = new InitialContext();
    }

    return initialContext;
}


By using a helper class with the getInitialContext() method instead of instantiating a new InitialContext for every operation, we've cut the number of contexts floating around in our application down to one.

Uh oh -- what about threading?
If you're worried about the effects of threading on the solution proposed here, don't be. It is absolutely possible that two threads could go to work on this method at the same time (thus creating two contexts at once), but this type of error would happen only on the first invocation of the method. Because the problem won't come up more than once, synchronization is unnecessary, and would in fact introduce more complexities than it would resolve.
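
Pulling these pieces together, here is a minimal sketch of such a helper class; the class name JNDIHelper is an assumption, not something named in the article.

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical helper class (name assumed) that caches the InitialContext.
// As the note above explains, the lazy initialization is deliberately left
// unsynchronized: at worst, one extra context gets created on startup.
public class JNDIHelper {

    private static Context initialContext;

    public static Context getInitialContext() throws NamingException {
        if (initialContext == null) {
            initialContext = new InitialContext();
        }
        return initialContext;
    }
}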

Optimizing lookups

Caching the context instances is a step in the right direction, but we're not done optimizing yet. Every time we call the lookup() method it will perform a new lookup, and return a new instance of a bean's home interface. At least, that's the way JNDI lookups are usually coded. But wouldn't it be better to have just one home-interface per bean, shared across components?

Rather than looking up the home interface for PurchaseHome or InventoryHome again and again, we could cache each individual bean reference; that's one solution. But what we really want is a more general mechanism for caching home interfaces in our EJB applications.

The answer is to create a generic helper class to both obtain the initial context and look up the home interface for every bean in the application. In addition, this class should be able to manage each bean's context for various application components. The generic helper class shown in Listing 3 will act as a factory for EJB home interfaces:

package com.ibm.ejb;

import java.util.HashMap;
import java.util.Map;
import javax.ejb.EJBHome;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.rmi.PortableRemoteObject;

public class EJBHomeFactory {

    private static EJBHomeFactory instance;

    private Map homeInterfaces;
    private Context context;

    // This is private, and can't be instantiated directly
    private EJBHomeFactory() throws NamingException {
        homeInterfaces = new HashMap();

        // Get the context for caching purposes
        context = new InitialContext();

        /**
         * In non-J2EE applications, you might need to load up
         * a properties file and get this context manually. I've
         * kept this simple for demonstration purposes.
         */
    }

    public static EJBHomeFactory getInstance() throws NamingException {
        // Not completely thread-safe, but good enough
        // (see note in article)
        if (instance == null) {
            instance = new EJBHomeFactory();
        }
        return instance;
    }

    public EJBHome lookup(String jndiName, Class homeInterfaceClass)
            throws NamingException {

        // See if we already have this interface cached
        EJBHome homeInterface =
            (EJBHome)homeInterfaces.get(homeInterfaceClass);

        // If not, look up with the supplied JNDI name
        if (homeInterface == null) {
            Object obj = context.lookup(jndiName);
            homeInterface =
                (EJBHome)PortableRemoteObject.narrow(obj, homeInterfaceClass);

            // If this is a new ref, save for caching purposes
            homeInterfaces.put(homeInterfaceClass, homeInterface);
        }
        return homeInterface;
    }
}

Inside the EJBHomeFactory class

The key to the home-interface factory is in the homeInterfaces map. The map stores each bean's home interface for reuse; as such, one home-interface instance can be used over and over again. You should also note that the key in the map is not the JNDI name passed into the lookup() method. It's quite common to have the same home interface bound to different JNDI names, but doing so can result in duplicates in your map. By relying on the class itself, you ensure that you won't end up with multiple home interfaces for the same bean.
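
To make that concrete, here is a short hypothetical snippet (the second JNDI name is invented for illustration): even if the same home interface is bound under two names, the class-keyed cache hands back a single instance, and only the first call actually goes to JNDI.

EJBHomeFactory f = EJBHomeFactory.getInstance();

// Both calls return the same cached PurchaseHome, because the cache is
// keyed on PurchaseHome.class rather than on the JNDI name.
PurchaseHome h1 = (PurchaseHome)f.lookup("java:comp/env/ejb/PurchaseHome",
                                         PurchaseHome.class);
PurchaseHome h2 = (PurchaseHome)f.lookup("java:comp/env/ejb/StorePurchaseHome",   // assumed alias
                                         PurchaseHome.class);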

Inserting the new home-interface factory class into the original code from Listing 1 will result in the optimized EJB lookup shown in Listing 4:

public boolean buyItems(PaymentInfo paymentInfo, String storeName,
                        List items) {

    EJBHomeFactory f = EJBHomeFactory.getInstance();

    PurchaseHome purchaseHome =
        (PurchaseHome)f.lookup("java:comp/env/ejb/PurchaseHome",
                               PurchaseHome.class);
    Purchase purchase = purchaseHome.create(paymentInfo);

    // Work on the bean
    for (Iterator i = items.iterator(); i.hasNext(); ) {
        purchase.addItem((Item)i.next());
    }

    InventoryHome inventoryHome =
        (InventoryHome)f.lookup("java:comp/env/ejb/InventoryHome",
                                InventoryHome.class);
    Inventory inventory = inventoryHome.findByStoreName(storeName);

    // Work on the bean
    for (Iterator i = items.iterator(); i.hasNext(); ) {
        inventory.markAsSold((Item)i.next());
    }

    // Do some other stuff
}


In addition to being clearer (at least in my opinion), the factory-optimized EJB lookup above will perform much faster over time. The first time you use the new class, you'll incur all the usual lookup penalties (assuming another portion of the application hasn't already paid them), but all future JNDI lookups should hum right along. It's also worth pointing out that the home-interface factory will not interfere with your container's bean management. Containers manage bean instances, not the home interfaces of those instances. Your container will still be in charge of instance swapping, as well as any other optimizations you want it to perform.

In the next installment of EJB best practices, I'll show you how you can enable administrative access to entity beans, without directly exposing them to your application's Web tier. Until then, I'll see you online.
