Thursday, March 20, 2014

IBM Corporate Service Corps Team Nigeria 7 Google Map


This is a brief introduction to IBM's unique global program, the Corporate Service Corps.

"The Corporate Service Corps was launched in 2008 to help provide IBMers with high quality leadership development while delivering high quality problem solving for communities and organizations in emerging markets. The program empowers IBM employees as global citizens by sending groups of 10 - 15 individuals from different countries with a range of skills to an emerging market for four week community-based assignments. During the assignment, participants perform community-driven economic development projects working at the intersection of business, technology, and society."


I am lucky enough to have been selected for a team of IBMers who come from different parts of the world, have never met before, and are being sent to Nigeria. The following is a map of the locations of all 12 IBMers.

CSC Team Nigeria 7 Map

And here is the link to it so you can edit: Nigeria 7 Google Map




Friday, March 14, 2014

Browser and JS memory model and their garbage collection

Recently I have been working on a single-page application, which led me to take a deep dive into the browser's memory model and the JS garbage collector. I am also looking at different tools for memory profiling and heap dumps on different browsers.

There are two main kinds of garbage collection algorithms: reference counting and mark-and-sweep. Deciding whether an object will ever be needed again is undecidable in general, so both algorithms rely on an approximation (reference counts or reachability), and neither is perfect.

Reference counting keeps track of the number of references to each object; once the count reaches 0, the object can be collected. However, there is a well-known issue with that: it cannot clean up two objects that reference each other when there are no other references to either of them. This is called a "circular reference."

Modern JS engines use mark-and-sweep garbage collection. Basically, the collector walks down from the root objects and marks each object that can be reached. Objects left unmarked are then considered garbage and are collected.
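
The idea can be sketched on a toy object graph. This is only an illustration under simplifying assumptions: the `refs` arrays and the `markAndSweep` function are made up for this example, and real engines work on their own internal heap.

```javascript
// Toy heap: each object is a plain record holding references to other records.
function markAndSweep(heap, roots) {
  const marked = new Set();
  const stack = [...roots];

  // Mark phase: everything reachable from a root is marked as live.
  while (stack.length > 0) {
    const obj = stack.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    stack.push(...obj.refs);
  }

  // Sweep phase: whatever was never marked is garbage.
  return heap.filter((obj) => !marked.has(obj));
}

const a = { name: 'a', refs: [] };
const b = { name: 'b', refs: [] };
const c = { name: 'c', refs: [] };
const d = { name: 'd', refs: [] };
a.refs.push(b); // root -> a -> b
c.refs.push(d); // c and d reference each other (a cycle) ...
d.refs.push(c); // ... but nothing reachable from the root points to them

const garbage = markAndSweep([a, b, c, d], [a]);
console.log(garbage.map((obj) => obj.name)); // [ 'c', 'd' ]
```

Note that the c/d cycle is collected here, which is exactly the case that plain reference counting misses.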

However, this GC model is not compatible with the way the browser's component object model handles DOM objects. For example, IE's Windows COM uses reference counting, and so does Firefox's XPCOM.

There are certain memory leak patterns which are well documented:

1. Circular reference between DOM and JS objects (DOM <-> JS). Please note this only applies to IE versions older than IE8.

As stated before, the JS garbage collector can identify a circular reference by walking down the object hierarchy, but it cannot identify a circular reference between a DOM object and a JS object. Since DOM objects and JS objects are handled by different garbage collection algorithms, such a cycle creates a memory leak.

- Solution: set the DOM element references held in JS to null in the unload event handler.

2. Closure: a function nested inside another function gains access to the outer function's variables. This is known as a closure.

function attachHandler() {
    var obj = document.getElementById("element"); // "element" is a placeholder id
    // The handler below is a new function object whose closure references obj,
    // and obj's onclick property references the handler back: a circular reference.
    obj.onclick = function () {
    };
    // obj = null; // solution: set obj to null before the function returns
}

3. Inline script and the internal scripting object. If you create a DOM element with an inline script thousands of times, memory will leak.

4. DOM object insertion order.

Each inserted DOM element inherits the scope of its parent object, which effectively creates a new object. If DOM elements are created and attached to each other while still detached from the document object, a temporary object is created and leaked.

The solution is to append the child to the document object first.



A highly extendable ETL framework architecture to solve common challenges for data loading

E(xtraction) T(ransformation) L(oad) is a common task we face in data analytics projects, and there are common challenges when we develop an ETL framework:


  • To design a common, reusable ETL framework that loads data from various data sources into a single store, such as your own database, and integrates with a batch program.
  • To map data attributes between external sources and your canonical data schema.
  • To provide a user-friendly way to define rules for filtering data during the batch load process.

Recently I worked on a project to extract defects from various defect management tools such as IBM RTC, HP MC and IBM CMVC, and to create an analytics pipeline that runs machine learning algorithms such as k-means clustering and SVM (support vector machine). I created the D4Jazz framework: an automated defect load framework for loading data from different sources into a central repository, which is Rational Team Concert.

It has the following list of features: 

1. It can be used for migrating defects from any external source into RTC.
2. Software design patterns are identified in this Extraction-Transformation-Load (ETL) tool; it has a pluggable architecture and can be extended to any defect repository.
3. The current implementation includes loading defects from IBM Configuration Management Version Control (CMVC), comma-separated values (CSV) text files and another RTC.
4. It utilizes a custom RTC work item to store the external defect repository information, defect attribute mapping information, and defect extraction rule information.
5. It is developed in Java and leverages the Java client libraries available from RTC and CMVC.


Here is the architecture diagram. It is a multi-tier application. On top of the layers is the WebSphere modern batch job scheduler, which runs the ETL batch job. The batch job invokes the job manager to extract data from the sources and load it into the central repository. The framework defines a set of interfaces for managing jobs (IJobManager), extracting data (IExtractor), loading data (ILoader) and connecting to data sources (IConnectionFactory). It provides extension points and allows application developers to plug in their own logic to load data from a specific data source.

One of the useful design patterns in an ETL framework is the decorator pattern. The decorator pattern allows an application to customize existing interfaces and add features. For example, during the transformation phase of ETL there is usually a large set of rules to apply when filtering the data, and the rules applied can differ widely across data sources. The decorator pattern allows us to build up a set of rules which implement a Decorable interface; the pipeline builder can then pick any rules from that set and compose its own rule set.
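
A minimal sketch of that composition follows. D4Jazz itself is written in Java; this example uses JavaScript only to stay consistent with the other snippets on this blog, and every class name, method name and field in it is hypothetical rather than taken from the framework.

```javascript
// Base rule: accepts every record. Hypothetical stand-in for a "Decorable" rule.
class AcceptAll {
  accept(defect) { return true; }
}

// Decorator base: wraps another rule and delegates to it.
class RuleDecorator {
  constructor(inner) { this.inner = inner; }
  accept(defect) { return this.inner.accept(defect); }
}

// Each concrete decorator adds one extra condition on top of the wrapped rule.
class SeverityAtLeast extends RuleDecorator {
  constructor(inner, min) { super(inner); this.min = min; }
  accept(defect) { return super.accept(defect) && defect.severity >= this.min; }
}

class StateIs extends RuleDecorator {
  constructor(inner, state) { super(inner); this.state = state; }
  accept(defect) { return super.accept(defect) && defect.state === this.state; }
}

// A pipeline for one data source picks the rules it needs and stacks them.
const rule = new StateIs(new SeverityAtLeast(new AcceptAll(), 2), 'open');
const defects = [
  { id: 1, severity: 3, state: 'open' },
  { id: 2, severity: 1, state: 'open' },
  { id: 3, severity: 3, state: 'closed' },
];
const selected = defects.filter((d) => rule.accept(d));
console.log(selected.map((d) => d.id)); // [ 1 ]
```

A different data source could stack the same rule classes in another combination without changing any of them.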






Thoughts about JS class dependence and the MVVM pattern


In large-scale J2EE development, Java developers usually need to develop some foundational modules which are shared across top-level modules or applications. Having clear-cut dependencies between modules and applications is very important for loose coupling. Many times we as developers put application-specific logic in the foundational modules and vice versa. This creates class dependence hell.

Class dependence is important for a complex JS project too.  

There are two principles: 

1) Dependence should go one way, not two ways. That means that in a class diagram we will have A ---> B, not A <---> B.

2) Lower-level (foundational) JS classes should not depend on higher-level (application-specific) ones. So in a class diagram, all dependence arrows point the same way: from the screen code to our manager JS classes, then to jQuery and Kendo UI, then to the JavaScript built-ins themselves.


The following is the class diagram for implementing a session timeout dialog. Basically, when the page is loaded, an AuthToken validator runner is started; it checks against the RESTful service whether the token is still valid. If the token will expire within a certain threshold, such as 2 minutes, a dialog window is shown with a countdown clock running. I will present the source code in the next blog post.
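
In the meantime, the decision the validator has to make on each run can be sketched as a pure function. This is only an illustration: the function name, the returned action names and the assumption that the client already knows the token's expiry time (for example from the last service response) are all hypothetical.

```javascript
// Hypothetical helper: decide what the validator runner should do for a token.
// expiresAt and now are epoch milliseconds; warnMs is the warning threshold.
function checkToken(expiresAt, now, warnMs) {
  const remainingMs = expiresAt - now;
  if (remainingMs <= 0) {
    return { action: 'logout', secondsLeft: 0 };
  }
  const secondsLeft = Math.ceil(remainingMs / 1000);
  if (remainingMs <= warnMs) {
    // Close to expiry: show the dialog and start the countdown from secondsLeft.
    return { action: 'showDialog', secondsLeft: secondsLeft };
  }
  return { action: 'none', secondsLeft: secondsLeft };
}

const TWO_MINUTES = 2 * 60 * 1000;
const now = 1000000;
console.log(checkToken(now + 90000, now, TWO_MINUTES)); // { action: 'showDialog', secondsLeft: 90 }
```

Keeping the decision separate from the timer and the dialog makes it easy to unit test without a browser.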


Another thought about the MVVM pattern: it brings great potential for sharing domain objects between the back end and the front end, with JSON as the DTO connecting the two. And if node.js is used to develop the back end, then a lot of code can be shared across both ends.

This is an exciting time for JS developers.

Saturday, March 1, 2014

Javascript singleton pattern


Solution 1:

var Singleton = (function () {
    var instance;

    function init() {
        return {
            method: function () { console.log('hello world'); },
            property: {}
        };
    }

    return {
        getInstance: function () {
            if (!instance) {
                instance = init();
            }
            return instance;
        }
    };
})();


Solution 2: 

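function Singleton()
{
    this.field = {};
}

Singleton.instance = null; // the single shared instance lives on the constructor

Singleton.getInstance = function ()
{
    // "this" is the Singleton constructor when called as Singleton.getInstance()
    if (!this.instance)
    {
        this.instance = new Singleton();
    }
    return this.instance;
};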

Javascript Interceptor or aspect-oriented programming Pattern - in progress

Use case:

A user logs in to the mobile application, and on behalf of the user the mobile application sends every Ajax request to the back-end service along with an authorization token. The server returns a 401 status if the "session" or token has expired.

Solution:

a. Use jQuery's global $.ajaxSetup() to specify a callback function for status code 401. It is probably the easiest way. However, with this approach the Ajax error still propagates down the chain and still triggers the subsequent error handlers, so it only works if you redirect the user to the login page immediately.

b. Use the promise chain's error callback to trap the error.

Let's look at the promise.then(success, error) function. It returns a promise itself.
The returned promise is resolved if 1) success/error returns a value, or 2) success/error returns a promise, in which case it is resolved with the outcome of that promise.
The returned promise is rejected if success/error throws an error.

So if we simply return a dummy value when a 401 is caught, it will be passed to the next success handler. Then, in that success handler, we need to differentiate the trapped error from a real success response.
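
A minimal sketch of option b, assuming native Promise semantics (older jQuery deferreds chain differently). `fakeAjax`, `trap401`, `handle` and the `SESSION_EXPIRED` sentinel are hypothetical names used only for this illustration.

```javascript
// Sentinel returned by the error handler so that later handlers can tell a
// trapped 401 apart from a real response.
const SESSION_EXPIRED = { sessionExpired: true };

// Stand-in for an Ajax call: rejects with an object carrying a status code.
function fakeAjax(status, body) {
  return status === 200
    ? Promise.resolve(body)
    : Promise.reject({ status: status });
}

function trap401(promise) {
  return promise.then(
    (data) => data,
    (err) => {
      if (err.status === 401) return SESSION_EXPIRED; // trapped: the chain resolves
      throw err;                                      // other errors still reject
    }
  );
}

// The next success handler has to check for the sentinel first.
function handle(result) {
  return result === SESSION_EXPIRED ? 'redirect to login' : 'render ' + result.name;
}

trap401(fakeAjax(200, { name: 'profile' })).then(handle).then(console.log); // render profile
trap401(fakeAjax(401)).then(handle).then(console.log);                      // redirect to login
```

Using a dedicated sentinel object rather than null or an empty value keeps the check unambiguous, since a real response could legitimately be empty.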

Restful service session management (in progress)

REST stands for representational state transfer, so a RESTful service is stateless. The idea is that the back-end service doesn't manage sessions, so it can focus on its main job: providing and processing data. In this way the load of session management is spread across all the clients, so the back-end service can easily be scaled up to serve millions of requests without the overhead of session management, which creates thread contention.

From wikipedia:
At any particular time, a client can either be in transition between application states or "at rest". A client in a rest state is able to interact with its user, but creates no load and consumes no per-client storage on the set of servers or on the network.

But many applications need user session management to allow authenticated and authorized access to resources such as user data and order processing.

  1. Basic access authentication and digest access authentication. The user/pass is sent along with every request. Obviously this has to be done over HTTPS using TLS.
  2. OAuth-based authorization. Many of the popular RESTful APIs work this way. Every RESTful client has a secret key, and once the server authenticates the user/pass, a token is returned. The RESTful client needs to store the token locally and send every request along with the secret key and the token. The auth token works much like an HTTP session id, and the server has to store it somewhere, such as a table with a timestamp.
  3. If this is a browser-based client, then a slightly different version of No. 2 is to use an HTTP header to send three pieces of information: user_id, expiration_date and an HMAC hash of user_id+expiration_date+secret_key.
  4. Another way is to actually let the server create the session, so the browser-based client sends each Ajax request with a cookie header. This is probably the least useful approach. ;)
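
Option 3 can be sketched on the server side with Node's built-in crypto module. This is an illustration under assumptions: the secret is taken to be known only to the server, the secret is used as the HMAC key rather than concatenated into the message, and the function names and the "|" separator are hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical server-side secret; in practice it never leaves the server.
const SECRET_KEY = 'demo-secret';

function sign(userId, expiresAt) {
  return crypto
    .createHmac('sha256', SECRET_KEY)
    .update(userId + '|' + expiresAt)
    .digest('hex');
}

// Issue the three pieces of information the client sends back on each request.
function issueToken(userId, expiresAt) {
  return { userId: userId, expiresAt: expiresAt, hmac: sign(userId, expiresAt) };
}

// Verify without any server-side session table: recompute the hash and compare.
function verifyToken(token, now) {
  const expected = Buffer.from(sign(token.userId, token.expiresAt), 'hex');
  const actual = Buffer.from(String(token.hmac), 'hex');
  const sameHash =
    expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
  return sameHash && token.expiresAt > now;
}

const token = issueToken('alice', 2000);
console.log(verifyToken(token, 1000));                         // true
console.log(verifyToken({ ...token, expiresAt: 9999 }, 1000)); // false (tampered expiry)
console.log(verifyToken(token, 3000));                         // false (expired)
```

Because the expiry date is covered by the hash, a client cannot extend its own session, and the server stays stateless.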

Thoughts about architecture and application structure of Single Page Application including mobile application for web x.0

Introduction

This blog post presents ideas from my ongoing research and development of HTML5/JS/CSS applications. I will keep updating it as I come up with new ideas or verify some of the ideas here.

Traditionally, in the world of enterprise system integration, only back-end systems such as SAP, JDE or SOAP/RESTful endpoints are considered system integration points. However, with complex front-end applications, the rapid change of modern JS/HTML5/CSS technologies, and thick-client/thin-server architectures, we need to think of the browser or mobile app as an independent system or application to be integrated into the enterprise architecture. For example, applications written in HTML5/CSS/JS can stand on their own by using RESTful web services to integrate with any enterprise back-end system.

A few thoughts:

1. We need to identify the design patterns in mobile and HTML5 applications and apply them. Those patterns include MVVM, data binding, templates, observables, singleton, interceptor and factory, which I will point out in various places in my diagram.

2. Application storage - local storage and session storage. Mobile applications have latency and network connectivity issues, so appropriate use of the browser cache is critical, and not just for performance.

3. Security models - mobile applications and call center applications have different security models from public-facing web applications.

4. Unit-test-driven development in JS/HTML - unit tests are important for TDD and continuous integration. Functional testing of web pages is hard, but it is an important step toward automation.

5. Use promise/deferred objects to manage Ajax calls - promises and their chaining are powerful features of JS. I will go into detail about them later.

6. Stub out the back-end service with JSON to achieve fast JS/HTML development. In this way the UI development can be truly separated from the back end.

7. Cache the HTML snippets in the browser to reduce HTTP traffic.
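
Items 5 to 7 can be combined in a small sketch: a singleton that caches HTML snippets behind promises, with the loader injected so that a stub can replace the real Ajax call. The API shown here (`setLoader`, `get`) is hypothetical and only illustrates the idea.

```javascript
// Hypothetical TemplateManager: caches HTML snippets by URL so each one is
// requested at most once. The loader is injected so a stub can replace Ajax.
const TemplateManager = (function () {
  const cache = new Map();
  let loader = null;

  return {
    setLoader: function (fn) { loader = fn; },
    get: function (url) {
      if (!cache.has(url)) {
        // Cache the promise itself so concurrent callers share one request.
        cache.set(url, Promise.resolve(loader(url)));
      }
      return cache.get(url);
    },
  };
})();

// Stub loader standing in for the real HTTP call (see item 6).
let requests = 0;
TemplateManager.setLoader(function (url) {
  requests += 1;
  return '<div>' + url + '</div>';
});

Promise.all([
  TemplateManager.get('templates/userProfile.html'),
  TemplateManager.get('templates/userProfile.html'),
]).then(function (snippets) {
  console.log(snippets[0]); // <div>templates/userProfile.html</div>
  console.log(requests);    // 1
});
```

One design point to keep in mind: a failed load would stay cached in this sketch, so a real implementation would probably evict rejected entries.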

Reference architecture 
 

Project structure
/data - stub for backend database, contain back-end response data in json or xml 
/merge_build_scripts - script to build/merge/minimize js/css 
/public - any files will be www readable
          /css
                /icon - use image sprite technology to handle the icons to reduce http request
                reset.css - css file to reset browser and provide baseline
                common.css - css file for typical JS library widgets
                layout.css  - page layout css file: 2 columns, 3 columns, header, footer, etc.
                screen.css - font size, style css
          /img - site images
          /js
             /shared
                       AppConfig.js - global constant vars
                       TemplateManager.js - singleton, manage the partial HTML snippets
                       AuthManager.js  - singleton, manage Auth Token for each Restful request
                       AjaxManager.js  - interceptor pattern, manage Ajax URLs and provide promise object back to caller
                       Router.js - manage the separate screens under single page
                       WidgetFactory.js - abstract factory class to create widgets such as table, grid, tab, dialog and etc.
             /screen
                      partial.js - Observable definition and Model of MVVM patterns
                      screenAController.js - a controller specific to a view, which normally maps to a template
                      userProfileController.js - e.g. a controller for user profile view
                      userPhotoUploadDialog.js - e.g. a controller for user photo upload dialog
                      ....   
             app.js - the bootstrap js to start the app 
             login.js
          /lib   - third party JS
          /templates
                     partial.html - View of the MVVM patterns
                     screenA.html
                     userProfile.html
                     userPhotoUploadDialog.html
          app.html  -  page for authorized user
          login.html  - login page for public user
          tests.html   - unit test runner html
          unittests.js  - unit test of JS classes