11 Reasons Why I Hate XML

… at least in Java.

1 – Namespace and import

XML is only apparently simple. As soon as namespace are used, it immediately gets complicated.  What is the difference between targetNamespace=”…”, xmlns=”…” and xmlns:tns=”…” ? Can I declare several prefixes for the same namespace? Can I change the default namespace from within a document? What happens if I import a schema and rebind it to another namespace? How do I reference an element unambiguously? Ever wondered how to really create a QName correctly? Ever wondered what happens if you have a cycle in your dependencies?

2 – Encoding and CDATA

XML encoding and file encoding are not the same.  This is a huge source of troubles. Both encoding must match, and the XML file should be read and parsed according to the encoding specified in the XML header. Depending on the encoding, characters will be serialized in a different way, again a huge source of confusion. If the reader or writer of an XML document behave incorrectly, the document can be dangerously corrupted and information can be lost. Editors don’t necessary display the characters correctly, while the document may be right. Ever got a ? or ¿ in your text? Ever made a distinction between &amp; and & ? Ever wondered whether a CDATA section was necessary or if using UTF-8 would be ok? Ever realized that < and > can be used as-is in attributes but need an encoding within a tag?

3 – Entities and DOCTYPE

Somehow relates to #2, but not only. XML entities are a generic way to define variables and are declared in the DOCTYPE. You can define custom entities; this is rather unusual but still need to be supported. Entites can be internal or external to your XML document, in which case the entity resolving might differ. Because entities are also used to escape special character, you can not consider this as an advanced feature that you won’t use. XML entities needs to be handled with care and is always a source of trouble. For instance, the tag <my-tag>hello&amp;world</my-tag> will trigger 3 characters(...) events with SAX.

4 – Naming convention

Ever wondered whether it was actually better to name your tag <my-tag/>, <myTag/> or <MyTag/>? The same goes for attributes….

5 Null, empty string and white spaces

Making the difference between null and empty string with XML is always painful. Null would be represented by the absence of the tag or attribute, whereas empty string would be represented with an empty tag or empty attribute. The same problem appears if you want to distinguish empty list and no list at all. If not considered clearly upfront (which is frequently the case), it can be very hard to retrofit clearly this distinction in an application.
Whitespace is another issue on its own. The way tabs, spaces, carriage return, line feeds are processed is always confusing. There are some options to control that, but it’s way too complicated for most of the usage. As a consequence, sometimes these special characters will be encoding in entities, sometimes embedded in CDATA and sometimes stores as-is in the XML.

6 – Normalization

XML encryption and signature look fine on paper. But as soon as you dig in the spec, you realize that it’s not so easy because of the syntactic and semantic equivalence of XML document. Is <my-tag></my-tag> the same as <my-tag/>? To solve this issue, XML normalization was introduced which define the canonical representation of a document. Good luck to understand all the subtleties when considering remarks #1, #2,  #3 and #5.

7 – Too many API and implementations

Even if stuffs improved in this area, there are too many API and implementation available. I wish there was one unified API and one single implementation sometimes…Ever wondered how to select a specific implementation?  Ever got a classloader issue due to an XML library? Ever got confused whether StAX was actually really better than SAX to read XML documents?

8 – Implementation options

Most XML implementations have options or features to deal with the subtleties I just describe. This is especially true for namespace handling. As a consequence, you code may work on one implementation but not on another.  For instance, startDocument should be used to start an XML document and deal with namespace correctly. The strictness of the implementations differs, so don’t take for granted that portability is 100%.

9 – Pretty printing

There are so many API and frameworks that it’s always a mess to deal with pretty printing, if supported by the framework.

10 – Security

XML was not designed for security. Notorious problems are: dangerous framework extension, XML bomb, outbound connection to access remote schema, extensive memory consumption, and many more problems documented in this excellent article from MISC. As a consequence, XML document can be easily abused to disrupt the system.

11 – XPath and XSLT

XPath and XSLT belong to the XML ecosystem and suffer the same problems as XML itself: apparent simplicity but internal complexity. I won’t speak here about everything else that surrounds XML and that forms the big picture of the XML family specifications. I will just say that I recently got a NPE in NetBeans because “/wsa:MessageID” was not ok but using “/wsa:MessageID/.” was just fine.  Got the point?

OpenESB: Invoke an Asynchronous Web Service

I was contacted last week to know if I had actually integrated an asynchronous web service in OpenESB, as promised in a previous post. The NetBeans SOA package is sometimes a bit obscure, though there are some explanation about the examples. I took a bit of time to dig this out, and here is then the promised follow-up (except that I won’t use WS-Addressing). I will use

  • OpenESB bundled with Glassfish
  • NetBeans to author the BPEL process
  • SoapUI to test the process

What we want to get

The BPEL process that will be created is a synchronous BPEL process, which calls an asynchronous web service using a correlation set to “resume” the process when the asynchronous response is received. The scenario is not very realistic – a BPEL process that calls an asynchronous WS will itself be asynchronous most of the time. The asynchronous WS may indeed take arbitrary long to respond; the client of the BPEL process would probably time out in this case.  This example suffices however to show the underlying principles.

  • The BPEL process is synchronous
  • But it calls an asynchronous WS service
  • We use correlation-set for request/response matching

The BPEL process that we want to obtain at the end is shown below:

Create the PartnerLinks

One or two PartnerLinks?

Communication to and from the asynchronous web service can be realized using a single partner link with two ports or using two partner links with one port each.
From point of view of BPEL an asynchronous request/response is indeed no more than a pair of one-way messages. The request/response matching will anyway be done using correlation set.

As a consequence, the messages can come from “anywhere” and there is therefore not need to have one single partner link. I found it easier to have 2 partner links so that all ports on the left side are the one exposed by the process, and all ports on the right side are the one consumed by the process.

WSDL with one-way PartnerLink

PartnerLinks can be defined in the BPEL process or in the WSDL of the web service. NetBeans has a nice feature to create the PartnerLink in a WSDL therefore I chose to define them there.

A one-way web service is a web service which defines only <input> or <output>. I therefore took the WSDL of my previous post and simply removed the <output> tags so that they become one-way web service. (I also removed anything related to WS-Addressing as it’s not used here).

The PartnerLink can then easily be created with NetBeans using the “Partner” view in the WSDL. The two WSDLs then looked like this:

 

Create the BPEL process

Add the PartnerLink

Now that the WSDL files of the asynchronous web services are ready, I create a new BPEL process. I then added the following PartnerLinks:

  • AsynchronousSampleClient from the SOA sample bundled with NetBeans
  • AsyncTestImplService created previously
  • AsyncTestResponseImplService create previously

Wire the request/response

Then I wired the request/response as follows. I relied on NetBeans variable creation for each <invoke> or <receive> activity. I therefore got the following variables:

  • ResponseIn
  • SayHelloIn
  • OperationAIn
  • OperationAOut

Assign the variables

For the purpose of this example I assign the variable between the message like follows. Note that this example make no sense from a business point of view.

 

Define the correlation set

A receive activity within the process should be assigned a correlation set. The BPEL engine is otherwise unable to match the request/response and resume the right process instance.

I defined a correlation set “correlation” which would use the property “correlationProp”.  A correlation property is a value that existing in different message and that can be used to match messages together. The property itself is a “symbolic” name for the value, and the corresponding element in each message is defined using so-called property aliases.
I then added two aliases, one in each WSDL file, and defined how “correlationProp” would map in the “sayHello” and “response” message respectively.

The process can then be built without warnings.

Deployment

The endpoint ports

The process defines 3 ports that can be changed according to your need. In this example the expected endpoints are:

The corresponding WSDL can be obtain by appending  “?wsdl” in the URL.

Note that the address for the asynchronous callback is not passed as parameter from the BPEL process, but should be hard-coded in the web service implementation. It would however be very easy to pass the callback address as an extra parameter so that the asynchronous web service is entirely decoupled of the BPEL process.

Build

Rebuild the process to take the latest change in the port URL.

Create the composite application (CA)

The BPEL process cannot be deployed as-is. You will need to embed the BPEL process into a composite application, which is a deployable unit. This is very easy to do:

  1. Create a new project of type composite application.
  2. Drag and drop the BPEL project onto the Service Assembly
  3. Rebuild the composite application

All that is necessary will be created automatically during the build. After the build is complete, NetBeans will refresh the Service Assembly and it looks then like this:

Deploy

Go in the Glassfish console and deploy the service assembly produced in the previous step.

Import WSDL in SoapUI

Start SoapUI and import the 3 WSDL.

Mock the asynchronous web service

Now that the 3 WSDL have been imported, we will create a mock for the asynchronous web service.  This way we can verify if the BPEL process call the asynchronous web service correctly and we can send the callback response manually.

Select the WSDL “AsyncTestImplPortBinding”,  and right-click “Generate Mock Service”. Make sure to use

  • path = /AsyncTestImplService/AsyncTestImpl?*
  • port = 8888

So that it matches the port that the BPEL process will use.

Make sure to start the Mock, in which case SaopUI displays “running on port 8888” at the top-right of the Mock window. The project looks like this:

Test

1 – Invoke BPEL process

Send the following SOAP message to the BPEL process (located at http://localhost:18182/AsynchronousSampleClient):

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"

xmlns:asy="http://enterprise.netbeans.org/bpel/AsynchronousSampleSchemaNamespace">
<soapenv:Header/>
<soapenv:Body>
<asy:typeA>
<paramA>dummy</paramA>
<id>123</id>
</asy:typeA>
</soapenv:Body>
</soapenv:Envelope>

2 – Receive the asynchronous invocation

When the Mock service the asynchronous message it displays something like “[sayHello] 4ms”. The message can be opened and should look like:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<sayHello xmlns:msgns="http://ewe.org/" xmlns="http://ewe.org/">
<arg0 xmlns="">123</arg0>
</sayHello>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

3 – Send the callback manually

We simulate manually the behavior of the mock service and send the following message to the callback endpoint (http://localhost:18182/AsynchronousSampleClient/response):

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:ewe="http://ewe.org/">
<soapenv:Header/>
<soapenv:Body>
<ewe:response>
<!--Optional:-->
<arg0>123</arg0>
</ewe:response>
</soapenv:Body>
</soapenv:Envelope>

4 – Get the synchronous reply

So far, the SOAP request of step #1 was still waiting to receive the synchronous response.  After the callback has been sent, the BPEL engine should resume the right instance of the BPEL process (using the correlation value “123”), which should then terminate.

SoapUI will display the time taken for the request/response which will then be something like “response time: 5734 ms”. The time will of course depend on how long you took to perform step 2 and 3. (Note that after some time, the request will timeout if you really take too long to do these steps.)
The SOAP response message should look like:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<typeA
xmlns:msgns="http://enterprise.netbeans.org/bpel/AsynchronousSampleClient"
xmlns="http://enterprise.netbeans.org/bpel/AsynchronousSampleSchemaNamespace">
<id xmlns="">123</id>
</typeA>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Conclusion

This example as-is make little sense from a technical and business point of view; I wish I had also used more meaningul names for the various elements. It however shows the principle of asynchronous web service invocation using OpenESB. The adaption of this example for meaningful use cases should be relatively simple. It’s a matter of changing the message types and assignment rules.

Taming Size and Complexity

The only real problem with modern software is size and complexity. If we had a bigger brain able to apprehend and reason about software as a whole without omitting details, we wouldn’t have that many issues. Unfortunately, our mental abilities are limited, and as a consequence, we need to have ways to build software whose complexity is beyond our own analytical power. The same is true for any large scale engineering initiative. Apart from discipline, which is a prerequisite to manage size and complexity, traditional ways to address size & complexity are: abstraction, automation and intuition.

Abstraction

The traditional way to address complexity is to raise the abstraction level. Get rid of details and stay focused on the essential – complexity goes away. You can then reason on different parts at various abstraction levels independently of each other. This is the ground-breaking argument about any modeling effort or methodology. The problem is that the whole is not equal to the sum of its parts. Unforeseen interactions will emerge resulting in a myriad of potential problems.  An other major problem is the traceability of the different parts.

Automation

The traditional way to address size is through automation. A lot of task that we perform are not tedious due to their very nature, but due to the effort they demand. Our concentration is also limited which implies we will make mistakes. Automation leads then not only to higher productivity but higher quality.  There are too many examples of automated task, but code formatting and refactoring fall for instance into this category.  Even though automation is extremely effective for specific task, automation is also impacted by the complexity of the software to produce. State explosion is for instance one of the main problems of a technique such as symbolic execution.

Intuition

Actually, problem solving implies not only strong analytical skills but also some form of intuition. The same goes with software and program understanding. Software exploration and visualization are powerful techniques to reason about abstract information in an intuitive way. Software is intangible and has by consequence no natural representation – this leaves the door open for new visualization metaphors. Examples of interactive visual development technologies are BPEL workflow, DSM, or polymetric views.

How to Upgrade Liferay

I’ve been using Liferay for several years now, starting with version 4.3 and upgrading it each 6 month or so. I recently upgraded from 5.1.2 to 5.2.3. This was so far the most painful upgrade, though everything is now OK.

http://www.liferay.com/web/guest/community/forums/-/message_boards/message/3994376
http://www.liferay.com/web/guest/community/forums/-/message_boards/message/2728671
http://forums.sun.com/thread.jspa?threadID=142232

Here is a short list of points worth considering when upgrading a Liferay deployment. The notes refer to an Apache/Tomcat/MySQL deployment.

Backup and restore

  • Consistency of the deployment. Always backup database plus jackrabbit at the same time.
  • Jackrabbit file names contain special character, notably %, which are sometimes replaced automatically by some tools. There number of files in jackrabbit grows very fast, and is absolutely disproportional to the size of the document library. So better pack your jackrabbit directory first if you need to transfer it over a network, especially winscp.
  • Case-sentitivity of the database. Some tables have uppercase. Depending on MySQL and the OS the system will be case sensitive or not. You can use the trick lower_case_table_names if necessary.
  • Make sure you export and import the database with UTF-8 encoding. If necessary force the encoding with --default-character-set.

Upgrade

  • Read the release notes carefully.
  • Merge catalina startup files (setenv, catalina, startup) if you’added custom stuff there
  • Merge property files (portal-ext, portal-legacy-XX)
  • Update theme(s), especially the supported version number
  • Update built-in theme under /html/themes/classic if you’ve made changes there
  • Update custom layouts in template.xml plus directory layouttpl
  • Merge web.xml
  • Update context.xml, if still used
  • Update static HTML files if they’ve been copied somewhere else for faster delivery
  • Remove any hook, e.g. sevencog hook
  • Change the directory for deploy & jackrabbit, make sure you use the right separator ( “/” or “\” )
  • Update mod_jk URL aths if necessary
  • Adjust the min/max memory of JVM if necessary

Troubles

  • Corrupted XML in database. If Liferay update complains about corrupted XML, export the database as flat SQL file, track the faulty characters (carriage return, or other special character) with a text editor, and restore the database.
  • Theme names in database. If you theme has special character in it ( dot or underscope ) you may need to update it database
  • Default company and omni admin. You can specify the default company in web.xml and it will be the only one for which you can be omni admin. If this doesn’t work, update the company ID in database and ensure the default one is the first one (lowest ID).
  • Indexes. Better rebuild all indexes after an upgrade. Either go to the admin console and/or remove lucene directory.
  • Remove Tomcat temporary files. Remove the work and temp directories of Tomcat after upgrade.

 

Glassfish mysteries #5: transaction recovery

Here are all posts of this serie on Glassfish.

There is little information available on the web about Glassfish transaction recovery. Transaction recovery is indeed something that should be very rare.

Some background

Such a recovery is necessary only if a problem (typically a crash) occurs while the transaction manager is performing the 2-phase commit protocol. If a problem happens before the 2PC protocol starts, the transaction will timeout and be rolled back automatically. If the problem appears during the 2PC protocol, the situation is a bit more complicated: one branch may be prepared the other not, or even worse, one branch may be comitted and the other not. A distributed transaction in such a state is frequently called “in-doubt” in the litterature.

The 2PC is supposed to be a fast operation, so the probability of in-doubt transaction is supposed to be also very low. It nevertheless can happen, and in this case, the distributed transaction must be recovered. This means that the transaction manager will attempt to complete the 2PC protocol based on his own transaction log. In some case, the transaction manager doesn’t know exactly what was done or not, and it must then “heuristically” rollback or commit the pending branches. This is generally really bad as it may leave the system in an inconsistent state, with some operations having been committed in one branch (e.g: the database) and rolled back
in another one (e.g: the JMS broker).

Glassfish admin console

First of all, we’ve never been able to recover any in-doubt transaction from the Admin>Transaction page. The “recover transaction” button didn’t produce any visible effect. We were however able to force the recovery at startup by enabling the appropriate option in the transaction service configuration page.

Oracle transaction recovery

If you are using Oracle, youre database connection will need some advanced privileges to have the recovery working. Glassfish will indeed execute either a “commit force” or “rollback force” on the database, which
is usually performed by a DBA with system rights. The privileges we found were necessary are the following:

GRANT FORCE ANY TRANSACTION TO <db_conn>;
GRANT SELECT ON dba_2pc_pending TO <db_conn>;
GRANT SELECT ON DBA_PENDING_TRANSACTIONS TO <db_conn>;
GRANT SELECT ON SYS.PENDING_TRANS$ TO <db_conn>;
GRANT SELECT ON SYS.DBA_2PC_NEIGHBORS TO <db_conn>;
GRANT EXECUTE ON sys.dbms_system TO <db_conn>;
CREATE PUBLIC SYNONYM dbms_system FOR dbms_system;

Before the recovery,  the view dba_2pc_pending shows one pending transaction, whereas after the recovery the view is empty.

The is also little information about the property oracle-xa-recovery-workaround of the transaction service. It seems like there is a bug with Oracle and the view dba_2pc_pending. This view is sometimes not correctly refreshed by Oracle. The workaround’s purpose is apparently to force the view to be updated so that Glassfish can use it to identify the in-doubt transactions. This is unfortunately only a suppositon as we never found a clear explanation of the exact impact of this property.

Glassfish mysteries #4: IIOP

Here are all posts of this serie on Glassfish.

This last post will be about considerations on usage of IIOP and Glassfish. IIOP is a standard, inter-operable protocol that every J2EE-compliant application server must support. In case of java-to-java communication, IIOP is sometimes a bit overhead and some application server supports alternative protocols in this special case. However, Glassfish does support only IIOP so all remote communication with go through this protocol. Compared to plain RMI, this protocol adds transaction context preparation.

Communication timeout, distributed transactions & tuning

Heavy usage of IIOP is hard to tune. There seem also that there are some bugs in Glassfish with IIOP. We also noticed that the memory consumption was significant when remote calls are frequent. We needed to adapt the TCP ORB settings in a way to avoid communication time out. The best-practice that we’ve identified can be summarized with:

  • Use -server profile for better memory management
  • Tune -Dcom.sun.corba.ee.transport.ORBTCPTimeouts=500:30000:30:999999
  • Check « Allow Non Component Caller » in the data sources
  • Beware RedHat Linux, there seems to be some issue with it.

There are also a few other annoyances:

  • If local glassfish is running, it will always be taken as default even if JNDI specifies a remote instance
  • ProgrammaticLogin doesn’t work from Tomcat to Glassfish

http://forums.java.net/jive/thread.jspa?threadID=42017
http://forums.java.net/jive/message.jspa?messageID=352907

Lookup, load balancing, fail over and host names

EJB lookup with the InitialContext support load-balancing and fail-over.  Nodes can be added/removed to the cluster dynamically; you only need to specify a subset of endpoints in  jndi.properties file.  The lookup mechanism works conceptually like this: (1)  one of the “bootstrapping” endpoint specified in jndi.properties is accessed (2) The endpoint knows about the other nodes in the cluster and one of the node is assigned to the particular InitialContext instance.

At a point in time, the server will answer back to the client providing the address of the endpoint to use for further communication. This address will depend on the configuration of the server. If the ORB was configured to listen on 0.0.0.0 the address is the hostname as resolved on the server-side. The client must be able to contact the server based on the returned address. Depending on the network configuration this may be problematic. For instance, the hostname resolution on server-side may return a hostname that is not visible to the client if they are on different subnets.
The address resolution exist even if no cluster is used and the endpoint specified in jndi.properties is the one to use for remote communication.

http://docs.sun.com/app/docs/doc/820-4341/fxxqs?a=view
https://glassfish.dev.java.net/issues/show_bug.cgi?id=4051

@Resource for injection

As per the EJB spec, the EJB should be injected with @EJB or looked up in JNDI. Remote EJB are bound in the global JNDI and by consequence can also be injected with @Resource. This is however a bad practice and I suspect some stability issue with it. Beware this little mistake and make sure you always inject them with @EJB.

Local vs. remote calls

We’ve conducted some micro-benchmark to measure the difference between local and remote calls. We have tested the following scenario:

  • POJO
  • Local EJB
  • Remote EJB in same EAR
  • Remote EJB in separate EAR

Server-side loop

We call one EJB that performs a loop with 10’000 call to a helper object.

Pojo:0 ms
Local:63 ms
Remote internal:172 ms
Remote external:1735 ms

Client-side loop

We call 10’000x the EJB from on client, and the EJB performs one call to a helper object.

Pojo:8140 ms
Local:6688 ms
Remote internal:6750 ms
Remote external:9062 ms

We conclude that the time taken to perform the call on the server side is neglectable compared to the cost of the client-server remote call.

Mixed loop

We call one 10’000 the EJB from the client, and the EJB performs 100 calls to a helper object.

  expected
Pojo:8640 ms 8140 + 0 = 8140
Local:14031 ms 6688 + 100×63 = 12988
Remote internal:23219 ms 6750 + 100×172 = 23950
Remote external:170641 ms 9062 + 100×1735 = 182562

On the right column is the expected time that can be estimated based on the previous results.
The cost of remote intra-JVM calls (in same EAR or different EAR, but in same JVM), is then relatively neglectable.

Glassfish mysteries #3: JMS

Here are all posts of this serie on Glassfish.

This post is about Glassfish and JMS-related problems. Message-passing is a great architectural style whose main strength are (1) scalability (2) loose coupling. The J2EE stack is a great platform to build message-based applications, notably because of the message-driven beans. These are extremely easy to use and relief the developer form consuming message from the JMS queue directly, which most of the time involves some form of thread management. The Glassfish implementation contains unfortunately several bugs.

Embedded broker is buggy

The JMS broker can be configured to be embedded, local or remote. By default it is embedded. Unfortunately, there are many issues with the embedded mode which basically is not robust enough to be used in practice. Always set the broker as local.

Max delivery attempt is not considered

The property EndpointExceptionRedeliveryAttempts specifies how many times the application server will attempt to deliver the JMS message if exceptions occur. The property was not correctly considered in the early releases of Glassfish v2. Fortunately, the bug was fixed in v2ur2.

Consumption from queue hangs if selectors are used

There seem to be a bug when we consume message from the queue and use selectors at the same time. After a while the system hangs and the call to receive() blocks. I unfortunately don’t remember if the broker was configured as embedded or local.

Non-unique delivery of message

We also experienced a strange case where some messages were delivered twice. It seems like the problem was more frequent when load increased. Again, I don’t remember if the broker was configured as embedded or local.

Non-atomic delivery of JMS messages

When a JMS message is sent in a transaction that also performed some database changes, the message may be delivered before the database changes have been committed for real. Considering that this is a typically usage scenario, I know it will sound very weird. It is however a case that we’ve experience several times, and we need to manually add some locks to ensure the message would be processed after the database changes. I’ve posted a long message on java.net concerning this problem, and apparently this should not happen…But I’m positive about the fact that there is a problem somewhere and that the transaction manager sometimes commits the JMS participant before the database participant in the 2 phase commit protocol.

Reference

http://forums.java.net/jive/thread.jspa?messageID=351867&#351867
http://forums.java.net/jive/thread.jspa?messageID=259642&#259642
http://forums.java.net/jive/message.jspa?messageID=232351#232351
http://forums.java.net/jive/thread.jspa?messageID=252493&#252493

Glassfish mysteries #2: distributed transactions

Here are all posts of this serie on Glassfish.

This second post about Glassfish mysteries will be about transaction management. There is indeed some strange behaviour when usage scenarios differ from traditional Web-EJB-JPA examples.

Transaction is not rolled back

Depending on the way you package your enterprise application, the annotation @ApplicationException(rollback=true) will not be considered. This can be a very serious bug. A detailed explanation about the packaging scenario that fails can be found in the reference at the end of this post. As a workaround, the transaction can be declared in the ejb-jar.xml in which case it will be processed correctly. Lesson learned: always double check the xml generated by Glassfish during deployment (in domains/domain/generated) to verify if is matches the intended behaviour.

UserTransaction must be a singleton

Glassfish supports client-side transaction demarcation. This is part of the “gray” zone of the J2EE specification in the sense that it is not mandatory but most containers support it. The object that is used by the client to control the transaction boundaries is the UserTransaction. The UserTransaction exposes the method begin(), commit() and rollback(). A transaction is implicitly bound to the current thread. The client can not perform multi-threaded transactions, neither suspend/resume the current one.
The JTA specifications are not particularly clear regarding the thread-safety of the UserTransaction object: can the same UserTransaction be used by several threads, or should each to possess its own UserTransaction? In the case of Glassfish, the answer is even more radical: there should be one and only one UserTransaction object used per client JVM. In other words, the UserTransaction must be managed like a singleton. If you have several instances of UserTransaction then you application will apparently work, but the ACID properties of the transactions are not enforced. This means (1) concurrent clients may read uncommitted read (2) rollback will not work properly. You find at the end of this post a reference to this bug I reported on java.net. There is test case attached to the post.

TopLink hangs with client-side transactions demarcations

As I wrote in the previous section, client-side distributed transactions are part of the “gray” zone of the J2EE specification. Glassfish’s transaction manager does support client-side transaction demarcation, but unfortunately TopLink doesn’t. As a consequence, when the client attempts to commit the transaction, the system hangs. This can probably be explained by the fact that TopLink has been developed by Oracle, and the OC4J doesn’t support client-side transaction demarcation at all.  Switching to Hibernate 3 (which is very easy) solves the problem.

Allow non-component callers

We had a very complex scenario in our system and the distributed transaction contained several XA participants including database, JMS, and custom JCA connector. The transaction was started from the client-side. We were experiencing lots of stability issue, with some transaction failing randomly with low-level error messages such as “can not delist participant”, “got -1 from a read call”, etc. We noticed then that enabling the option “allow non-component callers” in the datasource configuration has a significant positive impact. Given that the definition of this option is extremely obscure (see reference at the end), I don’t know when this options should be enabled or not. Maybe it is related to the usage of Hibernate 3 also. However, it seems like that in complex transaction scenario, it definitively helps.

References
http://forums.java.net/jive/thread.jspa?messageID=319223&#319223
http://forums.java.net/jive/thread.jspa?messageID=252496&#252496
http://forums.java.net/jive/message.jspa?messageID=246736
http://docs.sun.com/app/docs/doc/820-4496/gavro?a=view

Glassfish mysteries #1: JavaMail

Here are all posts of this serie on Glassfish.

This serie of post will cover some problems we experienced with Glassfish. Let’s start with a few easy ones related to JavaMail. These one are not blocking but rather annoying.

Lookup from JNDI

There’s a bug in Glassfish v2ur2 that prevent you from getting the JavaMail session directly. You will need to use the following code

Context ic = new InitialContext();
MailConfiguration conf = (MailConfiguration)
ic.lookup("mail/notification");
Session session =
javax.mail.Session.getInstance( conf.getMailProperties(),null );

Custom properties

There are some obscure rules to follow if you plan to add custom properties in your JNDI mail entry. We can read in the Glassfish documentation: “”Every property name must start with a mail- prefix. The Application Server changes the dash (-) character to a period (.) in the name of the property, then saves the property to the MailConfiguration and JavaMail Session objects. If the name of the property doesn’t start with mail-, the property is ignored.”

SMTP + authentication

There are no standard properties to deal with SMTP authentication. If you need to support authentication you will need to rely on custom properties. Here is the code that we’ve been using:


String auth = session.getProperty("mail.smtp.auth"); //
String pwd = session.getProperty("mail.smtp.password");

if ((auth != null) && new Boolean(auth))
{
Transport transport = session.getTransport("smtp");
transport.connect(conf.getMailHost(), conf.getUsername(), pwd);
msg.saveChanges();
transport.sendMessage(msg, msg.getAllRecipients());
transport.close();
}
else
{
Transport.send(msg);
}

References

https://glassfish.dev.java.net/javaee5/docs/AREF/abhaq.html
http://forums.java.net/jive/thread.jspa?messageID=264233

Sub-optimal Pagination with Oracle & Hibernate

There seem to be a bug in Hibernate 3 that results in sub-optimal query if one attempt to fetch one specific portion of the result set, as is the case typcially with pagination.

The best-practice to extract one specific page out of the complete result set is to use the ROWNUM keyword with Oracle. The ROWNUM keyword is used to truncate the size of the result set after N items. Oracle will optimize the usage of ROWNUM which can result in drastical improvement of some queries. The outline of a paginated query looks like following:

SELECT *
FROM (SELECT row_.*, ROWNUM rownum_
FROM (SELECT this_.ID AS id3_0_, this_.VERSION AS version3_0_,
this_.NAME AS name3_0_, this_.TYPE AS type3_0_,
this_.marketstatus AS marketst5_3_0_
FROM customer this_
ORDER BY this_.ID ASC) row_
WHERE ROWNUM <= ?)
WHERE rownum_ > ?

There seem unfortunately that there is a bug in Hibernate, and the query that is generated will be:

SELECT *
FROM (SELECT row_.*, ROWNUM rownum_
FROM (SELECT   this_.ID AS id3_0_, this_.VERSION AS version3_0_,
this_.NAME AS name3_0_, this_.TYPE AS type3_0_,
this_.marketstatus AS marketst5_3_0_
FROM customer this_
ORDER BY this_.ID ASC) row_)
WHERE rownum_ <= ? AND rownum_ > ?

Though they are semantically the same, the second one will not benefit from Oracle’s optimizations.

Here is the corresponding issue I’ve reported. More information can be found about Oracle ROWNUM in AskTom