September 22, 2010 § 5 Comments
This BOF featured Jaroslav Tulach, founder of NetBeans, along with Anton Epple and Zoran Sevarac. It was not really about new technology but about formalizing the approach and terminology for building modular systems. The talk targeted both desktop and server developers. The premise was that OO alone did not deliver on code reuse, hence the need to apply patterns, similar to the GoF patterns, to modules.
The speakers did make it clear that patterns exist within a context, i.e. some patterns might not be applicable to a given language since that language might already have constructs that solve the common problem addressed by the pattern; having said that, the discussion centered exclusively on Java.
Anton defined his “Five Criteria for Modularity Patterns”, even though last I checked there were six (they were still fixing the slides as we entered the room…):
- Maximize re-use
- Minimize coupling
- Deal with change (i.e. a smaller system deals with change more easily than a larger one; JDeveloper, for example, can make changes to its core more easily than, say, Eclipse)
- Ease of maintenance (each release should have a theme, such as release X of NetBeans will support OSGI; this well defined theme should not affect the existing system)
- Ease of extensibility (how powerful and simple is your plug-in architecture?)
- Save resources (your modules should not affect the start-up time, especially true for desktop systems, and the memory footprint should be kept manageable)
Another set of definitions followed to formalize the management of dependency relationships between modules:
- A dependency is direct if Module 2 depends directly on Module 1: M2 -> M1
- A dependency is indirect if Module 3 depends on Module 2, which depends on Module 1: M3 -> M2 -> M1
- Dependencies are cyclical if Module 2 depends directly on Module 1 and Module 1 depends directly on Module 2 (no need to draw a picture)
- Dependencies can be classified as incoming (M1 -> M2 <- M3), which make M2 hard to change, or outgoing (M1 <- M2 -> M3), which make M2 easy to change
- Finally, dependencies can be designed using three classical patterns: the Adapter pattern (an adapter interface between two modules introduces an indirect dependency; it forwards method calls from M3 to M1), the Mediator pattern (sits between two or more modules, aka the Bridge pattern in NetBeans) and the Facade pattern (provides a front interface for a set of two or more modules)
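The Adapter variant of these dependency-shaping patterns can be sketched in plain Java; every class name below (M1Service, RenderPort, M1Adapter, M3Client) is invented for illustration, not from the talk:

```java
// Sketch of the Adapter pattern used to break a direct M3 -> M1 dependency:
// module 3 depends only on the adapter interface, and the adapter forwards
// method calls to module 1.
public class ModuleAdapterDemo {

    // Module 1: the concrete service that M3 should not see directly
    static class M1Service {
        String render(String text) { return "[m1] " + text; }
    }

    // The adapter interface: the only type module 3 is allowed to depend on
    interface RenderPort {
        String render(String text);
    }

    // The adapter: introduces the indirect dependency M3 -> RenderPort -> M1
    static class M1Adapter implements RenderPort {
        private final M1Service delegate = new M1Service();
        public String render(String text) { return delegate.render(text); }
    }

    // Module 3: coded purely against the adapter interface
    static class M3Client {
        private final RenderPort port;
        M3Client(RenderPort port) { this.port = port; }
        String show(String text) { return port.render(text); }
    }

    public static void main(String[] args) {
        M3Client client = new M3Client(new M1Adapter());
        System.out.println(client.show("hello")); // prints "[m1] hello"
    }
}
```

Swapping M1 for another module then only requires a new RenderPort implementation; M3 never recompiles.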
The last portion was the more practical one, as it provided an overview of the existing tools in the Java space to solve the problem of reducing communication dependencies. The problem can simply be stated as follows: given an interface TextFilter, an implementation class UpperCaseTextFilter and a client class Editor, how can Editor get an implementation of TextFilter? Ideally it should know nothing about UpperCaseTextFilter at design time. The ideal run-time solution should provide the following (now we’re getting at the heart of system modularity):
- Register a Service
- Retrieve a Service
- Disable a Service
- Replace a Service
- Order a Service (as in providing some sort of ranking)
- Declarative support for a Service (as in meta data)
- Codeless Service (as in configuration)
- Availability of required Services
- JDK 1.6’s own service loader mechanism: it is declarative (in META-INF/services) and returns an iterable, typed collection of services, but it is way too simple; it is not dynamic (you can’t react to a situation where the client uninstalls a plug-in), it can be dangerous as it loads all services at start-up, and it does not provide factory methods
- NetBeans solution: Uses Lookup and XML files. This one is declarative and dynamic, allows for ordering, for lazy loading, has factory methods and codeless extensions
- OSGI Service Registry: services are registered in code using bundleContext.registerService(); it is dynamic, it has factory methods, it filters services and it is configurable, but all of this happens in code, which means that you now have dependencies on the OSGI framework in your code; eager creation can slow down start-up times and it is not type safe
- OSGI Declarative Services (OSGI II if you prefer): It’s better than Service Registry in the sense that it is declarative (XML configs)
- Dependency Injection: Spring offers an alternative solution using @Autowired and it is declarative in nature (the wiring is specified in your beans.xml) and the framework is usually transparent to your code
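For reference, the JDK service loader variant from the list above looks roughly like this in code; TextFilter is the interface from the Editor example, and the provider counting is just for demonstration:

```java
import java.util.ServiceLoader;

// Sketch of the JDK 1.6 ServiceLoader mechanism using the talk's TextFilter
// example. A provider jar would ship a resource file under META-INF/services,
// named after the interface's binary name, containing the implementation's
// fully qualified class name (e.g. UpperCaseTextFilter).
public class LoaderDemo {

    public interface TextFilter {
        String filter(String input);
    }

    // Iterates whatever providers are registered on the classpath. In this
    // standalone sketch no provider file is deployed, so the count is zero;
    // that is exactly the "Editor knows nothing at design time" property.
    public static int countProviders() {
        int n = 0;
        for (TextFilter f : ServiceLoader.load(TextFilter.class)) n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println("TextFilter providers: " + countProviders());
    }
}
```

Note how the mechanism offers registration and retrieval from the wish list above, but nothing for ordering, disabling or replacing a service at run-time.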
Jaroslav then discussed the hotly debated issue: are Singletons evil? And the answer is “it really depends on the context” (recall the statement earlier in this post). In a Dependency Injection solution there are many contexts: the Application context, the Request context and the Session context. Given all these contexts, and the false sense of security that they provide, Singletons would then be viewed as bad. But at the module level (say a jar) they can be helpful if they are carefully designed:
- The Singleton must be ready for use: No prior initialization code should be required prior to requesting a Singleton
- The Singleton must be injectable: We should be able to inject different Singleton implementations at run-time depending on the context (say DEV v/s PROD)
- Singletons are OK when used with the proper Service Loader/Lookup mechanism
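A minimal sketch of such a well-behaved singleton, assuming a hypothetical FilterHub holder and an identity filter as the default implementation (neither name is from the session):

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Sketch of a module-level singleton that follows the talk's criteria: ready
// for use with no prior initialization, and injectable so a different
// implementation can be supplied per context (say DEV vs PROD, or a test).
public class FilterHub {

    public interface TextFilter {
        String filter(String s);
    }

    private static volatile TextFilter instance;

    // Ready for use: callers never run initialization code themselves.
    public static TextFilter get() {
        TextFilter f = instance;
        if (f == null) {
            // Prefer a provider registered via the service loader mechanism,
            // otherwise fall back to a do-nothing default.
            Iterator<TextFilter> it = ServiceLoader.load(TextFilter.class).iterator();
            f = it.hasNext() ? it.next() : s -> s;
            instance = f;
        }
        return f;
    }

    // Injectable: swap in a different implementation at run-time.
    public static void inject(TextFilter f) { instance = f; }

    public static void main(String[] args) {
        System.out.println(FilterHub.get().filter("as-is"));    // default
        FilterHub.inject(s -> s.toUpperCase());
        System.out.println(FilterHub.get().filter("injected")); // "INJECTED"
    }
}
```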
Finally a few thoughts on performance were given: since modular applications tend to be so large, start-up time becomes critical, so one should obviously avoid calling OSGI’s bundle.start() for each jar, both because it is inefficient and because often the jar is a 3rd-party library that can’t be trusted. You are better off using a declarative registration method (as in an XML config file that gets interpreted and cached), which is why JSR 198 falls short in the performance area: JSR 198 is indeed declarative (XML) but you need to create a handler for each service, which slows down start-up.
Again, this session did not break any new ground, but it helped organize the ideas around modules, what to look for when evaluating different solutions and, last but not least, how to learn from a large system such as NetBeans that has been continuously evolving for the last 13 years.
September 22, 2010 § 6 Comments
The OpenJDK BOF was an informal Q&A session, attendees were free to ask JDK-related questions and Kelly O’Hair, Dalibor Topic and Mark Reinhold were there to answer. I think that this setup was appropriate for such a sensitive topic given the degree of anxiety of the Java community and probably the state of mind of the former Sun employees themselves. I will try to capture the most relevant Questions and Answers.
- Will the JRockit VM get open-sourced? No, the plans are to keep HotSpot open-sourced as OpenJDK 7 and add to it some of JRockit’s unique features by mid-2011.
- Will Oracle keep going through the JCP? The plans are to keep using the JCP for OpenJDK 7 features such as the Lambda project and even for Java SE 8. No other guarantees were made beyond that.
- Comment on JDK 7 v/s Open JDK 7: Sun used to provide the OpenJDK + some of its own proprietary binaries (themselves largely based on the OpenJDK, such as plugins) gratis (as in free beer). Oracle will continue doing so in OpenJDK 7 and at the same time provide JDK 7 which should be 98% identical to OpenJDK 7.
- Will Oracle make some performance-sensitive features (such as sockets, io) part of JDK 7 as opposed to OpenJDK 7? No, because Oracle has little/nothing to gain from such a model. It would only fragment the code base and make merging back into OpenJDK a nightmare. On the other hand, the JRockit Mission Control API will remain proprietary, but some of its features will make it into the OpenJDK, such as pluggable verbose logging and the JMX management tool (it can handle port numbers while the JMX on HotSpot currently can’t).
- Will the deterministic GC of JRockit make it to OpenJDK? No because there are currently paying customers that Oracle would like to keep as such. On the other hand certain aspects of the JRockit GC could make it such as incremental sliding compaction.
- Will the Da Vinci Machine project continue to thrive? Yes, though unfortunately not all of John Rose’s great ideas can/will make it into OpenJDK 7. For now JSR 292 (invokedynamic) will make it. The rest, most notably support for tail recursion, will not.
- What’s the state of Jigsaw? It’s in a state of flux; we’re not yet at a point where we can readily start portions of the JDK, but it is being actively worked on for OpenJDK 7.
- What about Project Verrazano? For now it is a research project (it takes a JAR and a platform spec, modularizes the classes themselves to reduce their code size to an optimal one).
- What about OpenJDK 6? Actually that effort was started after the one for OpenJDK 7, so it did not start with the Java SE 6 code base; rather it started with the OpenJDK 7 code base and engineers removed features to match JDK 6. It slightly differs from JDK 6 depending on the particular repository, but the main features such as CORBA, JAXB, JAXP and JAX-WS, when they differ, do so very little, usually only in terms of exact licensing. Currently only four features differ between JDK 6 and OpenJDK 6: graphics, fonts (the most prominent one), SNMP and color management.
- Speaking of repositories, which distributed SCM do you recommend? The Solaris team opted for Mercurial (Python-based); it’s great. We could have gone with Git but weren’t sure what to do about the added complexity.
So there you have it, the latest on OpenJDK 7. The session started slowly with few questions, but as time went by the audience asked more and more, mostly I think to get reassured about the open-source fate of OpenJDK. The message was indeed a reassuring one, but one thing remains to be seen: what will happen to Java SE post-2012?
September 21, 2010 § 3 Comments
Good presentation about the ESB adoption for a major web site, nfl.com. The two presenters, Earl Nolan and Monal Daxini, were eager to share their pain points during the adoption of Mule as the ESB (Enterprise Service Bus). Unfortunately Mule was the only ESB discussed, but during the Q&A the presenters admitted that Spring Integration would have been considered for a smaller-scale effort. They did not look at Apache ServiceMix because three years ago it was not quite as stable or feature-rich as it is today, so were they to evaluate the offerings today ServiceMix might have been adopted.
They started off by (aptly) saying what an ESB is not: an ESB is not JMS or a messaging middleware platform, and an ESB is not a heavy web services stack. They (also aptly) gave a simple definition of an ESB: a solution for integration problems. Put simply, any time 3 or more applications need to integrate you have the potential for an ESB adoption. The work of Greg Hohpe was cited quite often during the course of the talk.
So why adopt an ESB to solve integration problems? I can think of a couple of cases myself where an ESB is not warranted/desirable: there are quite a few situations where an SOA solution or an OSGI-based solution is more desirable. But it is clear that an ESB is relevant where a large number of applications need to talk to each other and along the way transform/enrich the data in some way. An ESB allows you to decouple the integration logic from the business logic (more on that later, but think for now configuration over coding) and evolve from a hub-and-spoke or point-to-point communication model towards a more distributed model. Finally, an ESB typically has an impressive list of connectors that alone can justify its adoption (we are not talking JCA here…)
So why choose one ESB vendor over another? Here are a few tips from this session:
- Go for Configuration over Coding – actually that’s the whole idea of an ESB, eschew fancy APIs in favor of fancy configurations. Later we’ll talk about when you might want to use the API
- Go for a lightweight ESB (quick start-up time and easier to test)
- Go for an ESB that favors (and properly documents) an incremental adoption of the product – that’s just common sense, but often the documentation offers few clues on how to get started partially with an ESB
- Scalable and distributed (had to mention this one…. one more bullet… but hey, it’s really important)
- Embeddable – what’s meant here is that the ESB should be able to reside (maybe in a first phase) inside your existing process to ease deployment and simplify dealing with the IT ops team
- And finally go for some type of SEDA architecture – we’ll go over this point in detail
What’s SEDA? It stands for Staged Event-Driven Architecture (come to think of it, it’s quite reminiscent of the EDA model I encounter in CEP applications) and it’s a programming paradigm where the application is decomposed into different stages: each stage is fronted by a queue and each stage (except the last) feeds into the next one. An analogy can be made with microprocessor stages and their pipelines: stage 1 would pipeline new inputs into the queue if stage 2 is not done processing yet. This approach avoids complex concurrency problems. Mule has wholeheartedly adopted the SEDA model by providing a Queue Controller at the input of the system; the system is made up of stages, each stage contains an Event Handler, and the developer is only responsible for implementing the event handlers. The thread pool and the Queue Controllers (including the one that provides feedback to slow down/stop incoming messages) are implemented by Mule, which instantiates as many Event Handlers as required. The queues themselves enable modeling and capacity planning.
Next the speakers described the stage itself, since it is at the heart of the SEDA model: each stage is fronted by an inbound (queued) channel and outputs data through a (queued) outbound channel; inside the stage data goes successively through an inbound router, a service component and an outbound router; transformations are handled by the routers. What’s interesting to note is the number of protocols that an ESB such as Mule can handle at the channels: JMS, FTP, File system, HTTP, UDP, TCP, IPoAC, etc…
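The staged pipeline described above can be sketched with plain Java queues and threads; this is only an analogy for the SEDA model, not Mule’s actual machinery, and the two handlers are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.UnaryOperator;

// Minimal two-stage SEDA sketch: each stage takes from its inbound queue,
// applies its handler, and feeds the next stage's queue. The bounded queue
// capacities are the back-pressure / capacity-planning knobs mentioned above.
public class SedaSketch {

    private static final String POISON = "\u0000EOF"; // shutdown marker

    static void stage(BlockingQueue<String> in, BlockingQueue<String> out,
                      UnaryOperator<String> handler) {
        Thread t = new Thread(() -> {
            try {
                for (String msg; !(msg = in.take()).equals(POISON); )
                    out.put(handler.apply(msg)); // blocks if downstream is full
                out.put(POISON);                 // propagate shutdown downstream
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    public static List<String> run(List<String> inputs) {
        BlockingQueue<String> q1 = new LinkedBlockingQueue<>(16);
        BlockingQueue<String> q2 = new LinkedBlockingQueue<>(16);
        BlockingQueue<String> q3 = new LinkedBlockingQueue<>(16);
        stage(q1, q2, String::trim);           // stage 1: normalize
        stage(q2, q3, s -> "<" + s + ">");     // stage 2: transform
        List<String> out = new ArrayList<>();
        try {
            for (String s : inputs) q1.put(s);
            q1.put(POISON);
            for (String s; !(s = q3.take()).equals(POISON); ) out.add(s);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(" a ", " b "))); // prints "[<a>, <b>]"
    }
}
```

Each handler runs single-threaded within its stage, which is exactly why the developer is spared the complex concurrency problems mentioned above.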
Next was the paradigm shift introduced by an ESB; this is actually similar to any mentality shift that a developer must undergo when adopting a new technology, and again Greg Hohpe was quoted: your coffee shop doesn’t use 2-phase commit! Mainly, don’t fall into the Leaky Abstraction trap, and do make use of the ESB components such as splitters (the net effect for the developer is the ability to deal with a single thread).
Next was a series of don’ts:
- An ESB is not a pass-through proxy: what they meant is that, given all the layers involved in a typical ESB, it does not make sense to use the ESB as a simple pass-through if there is no value added (i.e. transformation, projection, etc…) It simply complicates the stack and you will have to think of a myriad of problems such as caching, host down-time, etc… You are better off using a CDN for that.
- ESB is not a glorified CRON job scheduler – use a cron tool for that (this applies to shops with a heavy reliance on batch jobs)
- An ESB is not an application glue; use a Dependency Injection framework instead
Finally a few best-practices were presented such as the separation of validation from transformation (use an ETL plug-in if needed for complex transformations – I am not sure about this advice, ETL left me with a bad taste even when used topically) and enforce data canonicalization – I think that this last point applies to most platforms, whether developing web services, ESB or OSGI. It’s worth repeating it: Time spent defining a canonical model for your data is time well spent and will save you from redundant validation/transformation/exception flows down the line.
About data validation: The actual way to go about it is quite controversial, do you go with a strict (as in XSD schema) model or with a more relaxed model (RelaxNG and Schematron)? I don’t think that there is a clear-cut answer, it really depends on how dynamic your environment is and how many external dependencies you have (for Internet-facing applications the relax model is probably better).
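The strict option can be tried out with nothing but the JDK’s built-in javax.xml.validation API; the tiny schema and documents below are invented for illustration:

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import java.io.StringReader;

// Sketch of the "strict" validation model: payloads are checked against an
// XSD schema before entering the bus. The schema here declares exactly one
// element, so anything else is rejected.
public class StrictValidation {

    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "  <xs:element name='order' type='xs:string'/>" +
        "</xs:schema>";

    public static boolean isValid(String xml) {
        try {
            SchemaFactory sf =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false; // fails validation (or cannot even be parsed)
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order>widgets</order>"));   // true
        System.out.println(isValid("<invoice>widgets</invoice>")); // false
    }
}
```

The relaxed alternatives (RELAX NG, Schematron) trade this all-or-nothing strictness for rules that tolerate unknown content, which is what makes them attractive in fast-changing, Internet-facing environments.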
About your event model: Again two models were discussed, push v/s pull, but here it’s easy to see why a push model would be preferred unless you have very stringent reliability requirements (I can hear the Nirvana folks scream, since they do have buffering/replay capability). The low-latency/low-cycle-consumption features of the push model make it a clear winner, but it is not always enforceable when dealing with 3rd-party data providers.
The session wrapped up with security and deployment issues.
It’s interesting to note that most recommendations are applicable to most development situations, and that’s what’s reassuring about ESB adoption: there is nothing fundamentally awkward about the model, just a formalized way of doing data integration that forces you to modularize your aspects, make the proper architecture choices (validation, push/pull, transactions or not) and decouple the integration logic from the business logic. All in all an entertaining talk, one of many that dealt with ESB, showing that this technology is very much relevant whether you operate in an EDA environment, an SOA stack or a more traditional back-end system.
September 21, 2010 § Leave a Comment
I thought that I should mention an interesting BOF: JAX-WS.Next: Future Directions and Community Input. JAX-WS, as you know, is the worthy successor of JAX-RPC, improving on it in many ways and it has become increasingly important since most app servers are supporting web profiles. It is pretty much the standard way of doing web services in the Java EE/light EE world.
This session presented many ideas being explored by Sun/Oracle engineers in the RI v2.2.2, most notably how JAX-WS will now take advantage of the Servlet 3.0 spec (1 request can be serviced by many threads) and the wsdl pluggability (what you see on Tomcat 7 would become portable to other containers). It was stressed quite a few times that the ideas discussed in this BOF still need final approval from the JCP.
Some of the features being proposed:
- Support for stateful web services for H/A (support for broken HTTP connections)
- Schema validation: that’s a welcome addition, as most people already do it one way or another in one-off ways (not necessarily in the production environment); the class would be annotated with @SchemaValidation, ensuring that input/output are properly validated
- Official support for doc/literal wrapper style
- The ability to call Proxy.close() and get a chance to clean up resources
- WSDL 1.1 binding extensions for SOAP 1.2; this would allow the developer to run with the -extension
- MTOM Policy support via @MTOM to allow for the optimized serialization of wsdl/messages; the policy itself gets published in the wsdl
- Addressing policy: long-running operations would send the response in another HTTP Connection; also allow for anonymous/non-anonymous response mechanisms
- Finally (and most important) support for asynchronous behavior on the server side; the client models would remain the same with a choice of polling or callback but the server side invoke method returns void (i.e. immediately) and does not block
Most of these features are already in Glassfish, WebSphere and WebLogic.
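As a rough analogy for the polling and callback client models (which the server-side asynchrony proposal preserves), here is a plain java.util.concurrent sketch; the invoke method is a local stand-in, not the actual JAX-WS generated proxy API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Sketch of the two async client models: polling (hold a Future, block when
// you need the result) and callback (register what to do once it arrives).
public class AsyncModels {

    // The invocation returns immediately; the response arrives later.
    public static CompletableFuture<String> invoke(String request) {
        return CompletableFuture.supplyAsync(() -> "echo:" + request);
    }

    public static void main(String[] args) throws Exception {
        // Polling model
        Future<String> response = invoke("one");
        System.out.println(response.get()); // blocks until ready

        // Callback model
        invoke("two").thenAccept(System.out::println).get();
    }
}
```

On the server side the proposal is the mirror image: the invoke method returns immediately and the response is delivered whenever the long-running work completes.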
September 21, 2010 § 2 Comments
I decided to attend the JavaOne keynote hoping to hear some important announcements, even if the price to pay was pretty steep: you do have to sit, after all, in a huge auditorium and stoically listen to executives going through an incredibly boring, extremely well-rehearsed (to the point of being comically predictable) and amazingly unassuming presentation (a 5-minute intro concocted by Oracle’s lawyers warns you that nothing in this session should be considered a commitment; rather, these are just forward-looking statements, and all the assurances about deliveries/roadmaps/future versions are nothing but hopeful wishes). But there were a few points that could be taken away from this keynote.
So let’s start with the different JVMs: it was made clear that Sun’s HotSpot was the JVM of choice, and a JRockit engineer demonstrated a “flight recorder” type of tool that records the past n seconds of all events in your VM so that the moments leading up to a dramatic event can be replayed and analyzed. The JRockit Flight Recorder itself targets the Sun HotSpot as well as the JRockit VM. You get the feeling that the two will converge and that HotSpot has the edge.
On the much-anticipated JDK 7 issue Oracle promised two releases, one in 2011 and one in 2012, but again these dates should be taken with a grain of salt given all the legal disclaimers. Three projects were prominently listed: Project Coin (to increase productivity), Project Jigsaw (to modularize the JDK, which has grown too huge; start-up, for example, would be faster) and Project Lambda (to add lambda expressions, aka closures, to the Java language).
Oracle seemed very eager to stress their efforts to develop Java on all three platforms: the desktop, the server and mobile devices. On the server side a couple of interesting announcements: continuing the effort to support multiple languages (although the Da Vinci project was not explicitly mentioned), efforts to simplify (again) EJBs (Web Beans 1.0), efforts to take JAX-WS further (to support server-side asynchronicity) and Dependency Injection to further the convergence with Spring (Rod Johnson and Bob Lee are the spec leads).
Bioware (the maker of Star Wars The Old Republic) was brought on-stage and the screens displayed dazzling graphics of the game being played. The funny thing was that Bioware does use Java (“Glassfish” and “JDK” were mentioned) but not for the sexy graphics: They use it mostly for players’ authentication and billing (sure someone needs to get paid).
All in all it seems that Oracle has grandiose plans for Java (the mobile platform, with its billions of devices from regular Java-enabled cell phones to smart cards, was emphasized over and over). Oracle also wanted to reassure the community about its commitment to open source (JDK 7, the JavaFX controls, etc…) and finally wanted to prove that they own the full Java development stack, from the close partnership with Intel (which produces code and GC profilers) to the various platform JVMs to the development tools (NetBeans was cited a few times). It looks good in presentations, but it remains to be seen whether they can deliver on such an aggressive roadmap and whether the community will not be scared off by their licensing tactics. Many in the audience had this dual feeling: they desperately wanted to embrace the message but at the same time were thinking of alternatives. If Oracle delivers on its open-source promises, though, the Java platform can look forward to great days ahead.
September 20, 2010 § Leave a Comment
I attended this morning Mission Critical Enterprise Cloud Applications, presented by a cheerful Eugene Ciurana; the presentation can be found on his site, and Eugene managed to make it entertaining. I will not repeat here the contents of the presentation, but I will try to capture its spirit and what made it particularly interesting. Eugene was not really after explaining what a Cloud is or why you should be adopting the Cloud in the enterprise; rather he focused on the classical usage of the Cloud in a hybrid architecture. In the hybrid case, part of your application is pushed to the cloud and part is hosted in your data center. The cloud could take over the data center, but that’s not necessarily happening in the immediate/medium-term future for reasons outlined here:
- SLA: As long as your SLA (Service Level Agreement) is reasonable (say four nines, as in 99.99% availability) the cloud makes financial sense; beyond that the cloud becomes an expensive proposition
- Uptime is not the same thing as availability! The cloud may give you the impression of excellent up-time but your overall system availability may have dependencies on critical components that are better left (for the time being) to the data center
Two important questions to ask yourself before embarking on an adventure with Cloud vendor:
- What is the impact on business if the Cloud becomes unavailable?
- How can I/the vendor recover from a disaster?
On the other hand cloud architectures are quite diverse: Eugene mentioned PaaS (Platform as a Service), SaaS (Software as a Service), IaaS (Infrastructure as a Service) and finally Private Clouds (built on top of VMware, Eucalyptus, etc…)
He noted that event-based applications tend to scale better in the Cloud, but I think that this is a general statement that’s true even outside the Cloud. Event-based applications are simply more decoupled, the producer need not know anything about the consumer and vice-versa. He also noted that all Cloud implementations seem to have the following four characteristics:
- Quick deployment of pre-packaged applications (typically an image that gets deployed again and again)
- Commoditized H/W : consider Amazon EC2 and S3, Google App Engine, Rackspace
- Pay-as-you-consume billing -> which brought up an interesting point from a CFO’s point of view: clouds become an operational expense
- Horizontal scalability is highly touted
The most interesting part of the presentation was a real-world study of a complex, unmaintainable and unscalable application that became hybridized, with some functionality ported to the Cloud. The main feature of that refactoring effort was actually the introduction of an ESB (Mule) in the data center to allow the services to scale without being actually tied to the physical databases: all JDBC calls are placed on the bus and memcache is used to alleviate performance issues. The (calculated) side effect was the ability to easily accomplish data mining by intercepting all calls going through the ESB. In the cloud a NoSQL datastore (such as S3) is used for write-once/consult type of access. A healthy mix of Java and Jython was introduced to speed up development time. The final stack was Tomcat – Mule – Spring.
As for load balancing, it really depends on the vendor: Google App Engine uses a “mother-of-all-Servlets” for natural balancing while Amazon provides an explicit Elastic Load Balancer.
In conclusion it seemed that a hybrid solution represents the best compromise right now for most enterprises: Stateless/Computationally intensive services can safely reside in the cloud while your data can stay in your data center. As vendors start offering more legally-binding and stringent SLAs enterprises can start thinking of moving their infrastructure to the Cloud.
September 20, 2010 § 2 Comments
Lab49 is at JavaOne 2010; I will be attending quite a few sessions from the Core Java Platform, Enterprise Service Architectures and the Cloud and Java EE Web Profile and Platform Technologies. Amazing track names… Anyway, it should be quite interesting, I will try to cover as much as I can in terms of sessions (technical presentations, keynotes and vendors’ offerings).
It’s Monday morning and the conference has already kicked in. I met a guy from Primavera who kindly gave me directions and explained that his software is the most popular project-management software in the UK; they have been acquired by Oracle, and so they deploy on WebLogic only. Is this a sign of things to come?
Anyway, the first day is quite interesting; I stopped by the Caucho booth and asked them to give me three reasons why they think that their web/app server is better than the competition (Apache, WebLogic, etc…). Alex, their engineer, volunteered four reasons:
- Very small stack traces, which makes them incredibly useful when debugging; the main reason is that Resin’s internals are pretty much all written in-house as opposed to using every open-source library available out there
- Fast, really fast; they use JNI (optional) to expose the file handling and sockets components, written in C, to the application server
- Light-weight JEE through the support of web profiles
- Last and probably most important, a unique clustering architecture that uses a fixed number (3) of masters and an unlimited number of dynamic servers that allow you to scale horizontally
Next stop was Terracotta EhCache, where I met Greg Luck (he presented at the Lab a few months ago; my friend Shawn Gandhi brought him in). Greg was extremely excited to present BigMemory, which went public a couple of weeks ago (the official documentation is still being updated as we speak). Basically it’s a way to store cached data in a non-heap area to avoid a) the pauses due to GC and b) the physical size limitations of the heap. Greg showed some impressive numbers based on a stress test with a 40 GB BigMemory, where GC duration was constant (and practically zero) while a similar heap-based cache would have caused GC pauses of 260 s. Of course BigMemory is not a must for every application, particularly if your caching heap requirements are below 1 GB, in which case the serialization cost of BigMemory outweighs its advantages.
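The off-heap idea can be illustrated with a direct ByteBuffer, which allocates native memory outside the GC-managed heap; real BigMemory adds a key index, eviction and concurrency control on top of something like this sketch:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of off-heap caching: values are serialized into a direct (non-heap)
// ByteBuffer, so the cached bytes are invisible to the garbage collector.
// The copy in and out on every access is the serialization cost mentioned
// above, which is why small caches are better off staying on-heap.
public class OffHeapSketch {

    private final ByteBuffer store = ByteBuffer.allocateDirect(1024);

    // Returns an (offset, length) handle for the stored value.
    public int[] put(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int offset = store.position();
        store.put(bytes);               // copy onto native memory
        return new int[] { offset, bytes.length };
    }

    public String get(int[] handle) {
        byte[] bytes = new byte[handle[1]];
        ByteBuffer view = store.duplicate();
        view.position(handle[0]);
        view.get(bytes);                // copy back; the deserialization cost
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        OffHeapSketch cache = new OffHeapSketch();
        int[] handle = cache.put("cached value");
        System.out.println(cache.get(handle)); // prints "cached value"
    }
}
```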
Next was a stop at JetBrains (cool t-shirt logo by the way: “I see dead code”). I was given a preview of their latest IntelliJ IDEA (v10) by one of their friendly and knowledgeable engineers, Anna Kozlova: the IDE has full support for Scala and Clojure, among other things. The Scala support, in particular, was quite impressive: debugging, static analysis, code documentation lookup, support for ScalaTest, seamless refactoring and code completion between interwoven Java and Scala code, etc… It’s really worth a download.
I stopped at the bookstore located in the Hilton hotel and chatted with Andrew Lee Rubinger, author of Enterprise JavaBeans 3.1, sixth edition. We discussed the merits of v3.1, the role of JPA and how EJBs can be used in an event-driven context (think MDB), though we agreed that this is not an EDA. He had interesting things to say since he is quite involved with the technology at JBoss. Maybe we should bring him in for a talk at the Lab.