Thursday, April 25, 2013

OpenAM Policy Agent Configuration Reload Interval

So what else can be done to squeeze more performance out of the Policy Agent under high load? This is a typical question from customers.



There is a section in the agent profile covering the Policy Agent Configuration Reload Interval and Configuration Cleanup Interval. The defaults are 60 minutes and 30 minutes respectively.




I verified this by tailing the agent debug log (of course, you need to set the log level to at least MESSAGE). The timing is correct: the reload runs once every 60 minutes and the cleanup once every 30 minutes.


The following is what you'll see in detail when the reload takes place. (The actual output is longer than this excerpt.)



So, should we change this value? My personal view (I did say personal view; do not point a gun at me :> ) is NO for the Configuration Reload Interval, even for a very heavily loaded server with a policy agent installed.

A MAYBE for Configuration Cleanup Interval.


.

Wednesday, April 24, 2013

OpenAM Policy Agent Cache

If you tail the Policy Agent debug log (remember to set the logging level to MESSAGE first), you will observe the following roughly every 3 minutes.


+++++++++++++

2013-04-23 13:30:24.283    Info 7401:7f2738011780 Polling: Starting sso cache cleaner. Hash table size=0.
2013-04-23 13:30:24.283    Info 7401:7f2738011780 Polling: Finished sso cache cleaner. Hash table size=0.
2013-04-23 13:30:24.283    Info 7401:7f273802d880 Polling: Starting policy cache cleanup. Hash table size=0.
2013-04-23 13:30:24.283    Info 7401:7f273802d880 Polling: Finished policy cache cleanup. Hash table size=0.

++++++++++++

2013-04-23 13:33:24.283    Info 7401:7f2738011780 Polling: Starting sso cache cleaner. Hash table size=0.
2013-04-23 13:33:24.284    Info 7401:7f2738011780 Polling: Finished sso cache cleaner. Hash table size=0.
2013-04-23 13:33:24.284    Info 7401:7f273802d880 Polling: Starting policy cache cleanup. Hash table size=0.
2013-04-23 13:33:24.284    Info 7401:7f273802d880 Polling: Finished policy cache cleanup. Hash table size=0.

+++++++++++
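To confirm the 3-minute gap without eyeballing timestamps, the interval can be computed from the log lines themselves. The following is only a minimal sketch: the class name PollingIntervalCheck and its method are my own, and it assumes every matching line starts with a timestamp in the format shown above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Policy Agent.
public class PollingIntervalCheck {
    // Matches the leading timestamp of the agent debug log, e.g. 2013-04-23 13:30:24.283
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
    private static final int TS_LENGTH = 23;

    /** Returns the gaps, in whole seconds, between successive lines containing the marker. */
    static List<Long> gapsInSeconds(List<String> lines, String marker) {
        List<Long> gaps = new ArrayList<>();
        LocalDateTime previous = null;
        for (String line : lines) {
            if (line.length() < TS_LENGTH || !line.contains(marker)) {
                continue; // not a line we care about
            }
            LocalDateTime current = LocalDateTime.parse(line.substring(0, TS_LENGTH), TS);
            if (previous != null) {
                gaps.add(Duration.between(previous, current).getSeconds());
            }
            previous = current;
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2013-04-23 13:30:24.283    Info 7401:7f2738011780 Polling: Starting sso cache cleaner. Hash table size=0.",
            "2013-04-23 13:33:24.283    Info 7401:7f2738011780 Polling: Starting sso cache cleaner. Hash table size=0.");
        System.out.println(gapsInSeconds(lines, "Starting sso cache cleaner")); // prints [180]
    }
}
```

The same method works for the policy cache lines if you pass "Starting policy cache cleanup" as the marker.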

How can we change this value? 

The objective: In a stable environment, the SSO and Policy configuration seldom changes. So a 3-minute cache cleanup might be considered "aggressive" and/or "unnecessary" for some customers.


Go to Access Control > / (Top Level Realm) > Agents > [Agent-Name] > OpenAM Services > Policy Client Service




Change the default 3-minute interval accordingly.


PS: Since these are "Hot-swap : No" variables, do remember to restart the web container that has Policy Agent installed.




Now, one thing to note: if you enable notification, the cache will be flushed as and when an update comes from the OpenAM servers. This might happen even when the polling interval has not yet elapsed.




In this mode, cache entry expiration still applies through use of the polling mechanism. In addition, the web agent gets notified by the OpenSSO Enterprise service about session changes through use of a notification mechanism. Session changes include events such as session logout or a session timeout. When notified of a session or a policy change, the web agent updates the corresponding entry in the cache. Apart from session updates, web agents can also receive policy change updates. Policy changes include events such as updating, deleting, and creating policies.
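To picture how polling and notification work side by side, here is a small model of such a cache. It is only a sketch and not the agent's actual code: the class AgentCacheSketch, its method names, and the choice to evict an entry on notification (so that the next request refetches it) are all my own assumptions.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of a cache that expires by polling and is also invalidated by notifications.
public class AgentCacheSketch {
    private static final class Entry {
        final String value;
        final long storedAtMillis;

        Entry(String value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxAgeMillis;

    public AgentCacheSketch(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    public void put(String key, String value, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis));
    }

    public String get(String key) {
        Entry e = entries.get(key);
        return e == null ? null : e.value;
    }

    public int size() {
        return entries.size();
    }

    /** Polling path: run once per polling interval, removes entries that have reached maxAge. */
    public int cleanup(long nowMillis) {
        int removed = 0;
        Iterator<Entry> it = entries.values().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().storedAtMillis >= maxAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** Notification path: the server reports a change, so the entry goes immediately. */
    public boolean onNotification(String key) {
        return entries.remove(key) != null;
    }
}
```

The point of the model: onNotification can remove an entry at any moment, while cleanup only runs when the polling timer fires, so raising the polling interval does not delay updates that arrive by notification.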




While writing this, I realized I have posted an article on this topic before. Still applicable today.


.

Tuesday, April 23, 2013

OpenAM Policy Agent Notification

When policies are configured on the OpenAM Administration Console, how do all the policy agents get notified of the update? This question was put forward to me by one of my customers.




The answer is pretty straight-forward.


1st: Ensure Agent Notification is enabled



2nd: Create the necessary policies in the Policy tab



Now, the trick here is to make sure you click SAVE once rules are created.




We are done. To confirm whether the newly updated policies are broadcast to all policy agents, look for the following segment in the Policy Agent debug log:




2013-04-22 22:43:54.382   Debug 7403:7f27140011b0 PolicyEngine: PolicyEngine::policy_notify :Handling notification.
2013-04-22 22:43:54.382    Info 7403:7f27140011b0 PolicyEngine: PolicyEngine::policy_notification_handler:Parsing Policy Change Notification
:
:
:
2013-04-22 22:43:54.383   Debug 7403:7f27140011b0 ThreadPool: ThreadPool::dispatch(): Successfully dispatched the work.



An XML file is transferred from the OpenAM server to each Policy Agent for the notification to take place.


.


Monday, April 22, 2013

SiteMinder Mobile Application Authentication Solution

So SiteMinder provides a Mobile Application Authentication Solution to allow customers to leverage their existing SiteMinder infrastructure to secure mobile apps.



How does it work?

1. Configure a Web Agent or Secure Proxy Server to expose SiteMinder authentication operations as REST web services.

2. Mobile apps can send requests to these services to do the following operations:
■ Log in a user using Basic (username and password) authentication.
■ Verify the status of a SiteMinder session.
■ Log out a user.

Simple? REST, that's it.


.

Saturday, April 20, 2013

OpenAM Session Notification

I have a pair of OpenAM 10.0 servers configured as a site.

When a cluster of OpenAM servers is configured as a site, there is a need to synchronize the session objects. This feature is built in.

Assume an administrator (amadmin) has logged in and is served by the 2nd node (amlbcookie = 02). Once logged in, the administrator does nothing and leaves the OpenAM administrative console untouched.

The following SESSION NOTIFICATION will be received on the 1st node after the maximum idle timeout is reached:






This is how the 2 OpenAM servers (or even more) keep the sessions in sync.


Let's say one of the OpenAM nodes suddenly goes down. You'll see the following in the Session debug log on the surviving node.


amSession:04/17/2013 03:24:42:109 PM SGT: Thread[http-bio-8080-exec-5,5,main]
ERROR: Session:getValidSession : 
com.iplanet.dpro.session.SessionException: Connection refused
at com.iplanet.dpro.session.Session.getSessionResponseWithoutRetry(Session.java:1588)
at com.iplanet.dpro.session.Session.getValidSessions(Session.java:1346)
at com.iplanet.dpro.session.Session.getValidSessions(Session.java:1207)
at com.sun.identity.console.session.model.SMProfileModelImpl.initSessionsList(SMProfileModelImpl.java:112)
at com.sun.identity.console.session.model.SMProfileModelImpl.getSessionCache(SMProfileModelImpl.java:308)
at com.sun.identity.console.session.SMProfileViewBean.beginDisplay(SMProfileViewBean.java:190)
at com.iplanet.jato.taglib.UseViewBeanTag.doStartTag(UseViewBeanTag.java:149)
at org.apache.jsp.console.session.SMProfile_jsp._jspService(SMProfile_jsp.java:149)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:749)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:487)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:412)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:339)
at com.iplanet.jato.view.ViewBeanBase.forward(ViewBeanBase.java:340)
at com.iplanet.jato.view.ViewBeanBase.forwardTo(ViewBeanBase.java:261)
at com.sun.identity.console.base.AMViewBeanBase.forwardTo(AMViewBeanBase.java:161)
at com.sun.identity.console.base.AMPrimaryMastHeadViewBean.forwardTo(AMPrimaryMastHeadViewBean.java:137)
at com.iplanet.jato.view.ViewBeanBase.forwardTo(ViewBeanBase.java:229)
at com.sun.identity.console.session.SMProfileViewBean.handleServerNameHrefRequest(SMProfileViewBean.java:349)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.iplanet.jato.view.command.DefaultRequestHandlingCommand.execute(DefaultRequestHandlingCommand.java:183)
at com.iplanet.jato.view.RequestHandlingViewBase.handleRequest(RequestHandlingViewBase.java:308)
at com.iplanet.jato.view.ViewBeanBase.dispatchInvocation(ViewBeanBase.java:802)
at com.iplanet.jato.view.ViewBeanBase.invokeRequestHandlerInternal(ViewBeanBase.java:740)
at com.iplanet.jato.view.ViewBeanBase.invokeRequestHandler(ViewBeanBase.java:571)
at com.iplanet.jato.ApplicationServletBase.dispatchRequest(ApplicationServletBase.java:957)
at com.iplanet.jato.ApplicationServletBase.processRequest(ApplicationServletBase.java:615)
at com.iplanet.jato.ApplicationServletBase.doPost(ApplicationServletBase.java:473)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.forgerock.openam.validation.ResponseValidationFilter.doFilter(ResponseValidationFilter.java:44)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at com.sun.identity.setup.AMSetupFilter.doFilter(AMSetupFilter.java:95)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.iplanet.dpro.session.SessionException: Connection refused
at com.iplanet.dpro.session.Session.sendPLLRequest(Session.java:1167)
at com.iplanet.dpro.session.Session.getSessionResponseWithoutRetry(Session.java:1578)
... 61 more
Caused by: com.iplanet.services.comm.client.SendRequestException: Connection refused
at com.iplanet.services.comm.client.PLLClient.send(PLLClient.java:218)
at com.iplanet.services.comm.client.PLLClient.send(PLLClient.java:114)
at com.iplanet.dpro.session.Session.sendPLLRequest(Session.java:1159)
... 62 more



.

Friday, April 19, 2013

Session Failover with Sun GlassFish(tm) Message Queue 4.4 - Part IV

Assume we have 2 active OpenAM servers configured as a site with session failover running. The underlying component used for session failover is a pair of Sun GlassFish(tm) Message Queue 4.4 brokers.

am1.cdemo.sg:7676 (Primary) and am2.cdemo.sg:7777 (Secondary)




What happens when the primary MQ (am1.cdemo.sg:7676) instance goes down?

The following will be captured in the Session debug log:


INFO: [I107]: Connection recover state: RECOVER_INACTIVE, broker: am1.cdemo.sg:7676(17777)
Apr 17, 2013 3:58:21 PM com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_TRANSPORT_CONNECTED, broker: am2.cdemo.sg:7777(17777)
Apr 17, 2013 3:58:21 PM com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_STARTED, broker: am2.cdemo.sg:7777(17777)
Apr 17, 2013 3:58:21 PM com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_IN_PROCESS, broker: am2.cdemo.sg:7777(17777)
Apr 17, 2013 3:58:21 PM com.sun.messaging.jmq.jmsclient.ConnectionRecover logRecoverState
INFO: [I107]: Connection recover state: RECOVER_SUCCEEDED, broker: am2.cdemo.sg:7777(17777)





We can see that am1.cdemo.sg:7676 (Primary node) is immediately recognized as Inactive, and am2.cdemo.sg:7777 (Secondary node) is "started" to take over as the Active node.
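If you want to pull the outcome out of a long Session debug log, the recover-state lines are regular enough to parse. A minimal sketch follows; the class name RecoverStateParser and the regular expression are my own, written against the lines shown above only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the MQ client "Connection recover state" lines.
public class RecoverStateParser {
    // Group 1 is the state, group 2 is host:port without the trailing "(17777)" part.
    private static final Pattern RECOVER_LINE = Pattern.compile(
            "Connection recover state: (\\w+), broker: ([^\\s(]+)");

    /** Returns the broker named in the last RECOVER_SUCCEEDED line, or null if there is none. */
    static String brokerAfterRecovery(List<String> lines) {
        String broker = null;
        for (String line : lines) {
            Matcher m = RECOVER_LINE.matcher(line);
            if (m.find() && "RECOVER_SUCCEEDED".equals(m.group(1))) {
                broker = m.group(2);
            }
        }
        return broker;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "INFO: [I107]: Connection recover state: RECOVER_INACTIVE, broker: am1.cdemo.sg:7676(17777)",
            "INFO: [I107]: Connection recover state: RECOVER_STARTED, broker: am2.cdemo.sg:7777(17777)",
            "INFO: [I107]: Connection recover state: RECOVER_SUCCEEDED, broker: am2.cdemo.sg:7777(17777)");
        System.out.println(brokerAfterRecovery(lines)); // prints am2.cdemo.sg:7777
    }
}
```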





A "Deactivated broker" message will be captured in the secondary MQ debug log:




[17/Apr/2013:15:58:18 SGT] [B1180]: Deactivated broker 
Address = mq://192.168.1.51:7676/?instName=ambroker&brokerSessionUID=1259668788178645760
StartTime = 1366181153149
ProtocolVersion = 410

Immediately after that, all MQ connections from the OpenAM servers are established on the secondary MQ.


[17/Apr/2013:15:58:18 SGT] [B1065]: Accepting: amuser@192.168.1.52:40287->jms:17777. Count: service=2 broker=2
[17/Apr/2013:15:58:18 SGT] [B1065]: Accepting: amuser@192.168.1.52:40286->jms:17777. Count: service=2 broker=2
[17/Apr/2013:15:58:21 SGT] [B1122]: Reconnecting client 1259668788186075904
[17/Apr/2013:15:58:21 SGT] [B1065]: Accepting: amuser@192.168.1.52:40298->jms:17777. Count: service=3 broker=3



Now, what happens when the Primary node (am1.cdemo.sg:7676) is restarted?

My observation is:

1. The OpenAM servers will continue to communicate with the last active node, which is Secondary node (am2.cdemo.sg:7777)
2. They will only communicate with the Primary node (am1.cdemo.sg:7676) again after an OpenAM restart.

=> This follows the Database URL : am1.cdemo.sg:7676,am2.cdemo.sg:7777 which was initially configured during Site setup
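The behaviour I observed can be modelled in a few lines. This is not Message Queue client code; it is only a sketch of "walk the address list in order at connect time, and stay on the current broker for as long as it is up". The class BrokerSelectionSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of ordered-address-list failover, based purely on my observation.
public class BrokerSelectionSketch {
    private final List<String> addressList; // e.g. am1.cdemo.sg:7676, am2.cdemo.sg:7777
    private String current;

    public BrokerSelectionSketch(List<String> addressList) {
        this.addressList = new ArrayList<>(addressList);
    }

    /** Fresh connect (OpenAM start): take the first broker in configured order that is up. */
    public String connect(Predicate<String> isUp) {
        current = null;
        for (String address : addressList) {
            if (isUp.test(address)) {
                current = address;
                break;
            }
        }
        return current;
    }

    /** Broker state changed: keep the current broker if it is still up, otherwise fail over. */
    public String onBrokerStateChange(Predicate<String> isUp) {
        if (current != null && isUp.test(current)) {
            return current;
        }
        return connect(isUp);
    }
}
```

With the list am1.cdemo.sg:7676,am2.cdemo.sg:7777, a fresh connect picks am1. When am1 dies, onBrokerStateChange moves to am2. When am1 comes back, am2 is still up so nothing changes, and only a new connect (an OpenAM restart) returns to am1.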



Nice!

.

Thursday, April 18, 2013

Session Failover with Sun GlassFish(tm) Message Queue 4.4 - Part III

So we have a proper Message Broker set up for OpenAM Session Failover (AMSFO). We now need some statistics for monitoring.



I must say this has nothing to do with OpenAM at all. The statistics come from Sun GlassFish(tm) Message Queue, not OpenAM. Sun GlassFish(tm) Message Queue is an external component here.

So, here is a good read.




To generate the type of statistics shown above, do the following:


- Go to ../jmq/imq/var/instances/ambroker/props
- Edit config.properties
- Add the following
imq.metrics.interval=10
imq.log.level=INFO
imq.log.file.output=INFO
imq.metrics.enabled=true



- Restart the MQ broker



Questions:

1. Does the Message Broker require restart whenever any OpenAM server is restarted?

No.


2. Do the OpenAM servers require a restart if the active MQ instance dies and a standby MQ instance takes over?

No.

.

Wednesday, April 17, 2013

Session Failover with Sun GlassFish(tm) Message Queue 4.4 - Part II

Now, what if the pair of OpenAM servers is running and the Message Broker suddenly stops?




Firstly, nothing will happen, at least not until the next person tries to access either of the OpenAM servers.


Then you should be able to see the following exception stack trace in Session debug log:


amSession:04/16/2013 10:51:38:087 PM SGT: Thread[http-bio-9080-exec-20,5,main]
SessionID(HttpServletRequest) : is forward = null
amSession:04/16/2013 10:51:38:091 PM SGT: Thread[http-bio-9080-exec-20,5,main]
JMQSessionRepository.save(): session size=3192 bytes
amSession:04/16/2013 10:51:38:097 PM SGT: Thread[http-bio-9080-exec-20,5,main]
ERROR: Session failover service is not functional due to DB unavailability.
javax.jms.IllegalStateException: [C4059]: Cannot perform operation, session is closed.
at com.sun.messaging.jmq.jmsclient.SessionImpl.checkSessionState(SessionImpl.java:1844)
at com.sun.messaging.jmq.jmsclient.SessionImpl.createBytesMessage(SessionImpl.java:1873)
at com.sun.identity.ha.jmqdb.FAMRecordJMQPersister.send(FAMRecordJMQPersister.java:272)
at com.iplanet.dpro.session.JMQSessionRepository.save(JMQSessionRepository.java:329)
at com.iplanet.dpro.session.service.SessionService.saveForFailover(SessionService.java:3228)
at com.iplanet.dpro.session.service.InternalSession.updateForFailover(InternalSession.java:1548)
at com.iplanet.dpro.session.service.InternalSession.setLatestAccessTime(InternalSession.java:1256)
at com.iplanet.dpro.session.service.SessionService.getSessionInfo(SessionService.java:1229)
at com.iplanet.dpro.session.Session.doRefresh(Session.java:1450)
at com.iplanet.dpro.session.Session.access$300(Session.java:113)
at com.iplanet.dpro.session.Session$3.run(Session.java:1426)
at com.sun.identity.session.util.RestrictedTokenContext.doUsing(RestrictedTokenContext.java:86)
at com.iplanet.dpro.session.Session.refresh(Session.java:1423)
at com.iplanet.dpro.session.Session.getSession(Session.java:1092)
at com.iplanet.sso.providers.dpro.SSOProviderImpl.createSSOToken(SSOProviderImpl.java:92)
at com.iplanet.sso.SSOTokenManager.createSSOToken(SSOTokenManager.java:241)
at com.sun.identity.console.base.ConsoleServletBase.checkAuthentication(ConsoleServletBase.java:266)
at com.sun.identity.console.base.ConsoleServletBase.validateSSOToken(ConsoleServletBase.java:148)
at com.sun.identity.console.base.ConsoleServletBase.onBeforeRequest(ConsoleServletBase.java:112)
at com.iplanet.jato.ApplicationServletBase.fireBeforeRequestEvent(ApplicationServletBase.java:1105)
at com.iplanet.jato.ApplicationServletBase.processRequest(ApplicationServletBase.java:591)
at com.iplanet.jato.ApplicationServletBase.doGet(ApplicationServletBase.java:459)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)


:
:
:


amSession:04/16/2013 10:51:46:011 PM SGT: Thread[SystemTimer,5,main]
ERROR: Session failover service is not functional due to DB unavailability.
amSession:04/16/2013 10:51:46:011 PM SGT: Thread[SystemTimer,5,main]
Session database is not available at this moment.Please check with the system administrator for appropriate actions
com.sun.messaging.jms.JMSException: [C4003]: Error occurred on connection creation [am1.cdemo.sg:7676]. - cause: java.net.ConnectException: Connection refused
at com.sun.messaging.jmq.jmsclient.ExceptionHandler.throwConnectionException(ExceptionHandler.java:274)
at com.sun.messaging.jmq.jmsclient.ExceptionHandler.handleConnectException(ExceptionHandler.java:220)
at com.sun.messaging.jmq.jmsclient.PortMapperClient.readBrokerPorts(PortMapperClient.java:241)
at com.sun.messaging.jmq.jmsclient.PortMapperClient.init(PortMapperClient.java:150)
at com.sun.messaging.jmq.jmsclient.PortMapperClient.<init>(PortMapperClient.java:92)
at com.sun.messaging.jmq.jmsclient.protocol.tcp.TCPConnectionHandler.<init>(TCPConnectionHandler.java:164)
at com.sun.messaging.jmq.jmsclient.protocol.tcp.TCPStreamHandler.openConnection(TCPStreamHandler.java:135)
at com.sun.messaging.jmq.jmsclient.ConnectionInitiator.createConnection(ConnectionInitiator.java:778)
at com.sun.messaging.jmq.jmsclient.ConnectionInitiator.createConnectionNew(ConnectionInitiator.java:254)
at com.sun.messaging.jmq.jmsclient.ConnectionInitiator.createConnection(ConnectionInitiator.java:208)
at com.sun.messaging.jmq.jmsclient.ConnectionInitiator.createConnection(ConnectionInitiator.java:158)
at com.sun.messaging.jmq.jmsclient.ProtocolHandler.init(ProtocolHandler.java:816)
at com.sun.messaging.jmq.jmsclient.ProtocolHandler.<init>(ProtocolHandler.java:1529)
at com.sun.messaging.jmq.jmsclient.ConnectionImpl.openConnection(ConnectionImpl.java:2327)
at com.sun.messaging.jmq.jmsclient.ConnectionImpl.init(ConnectionImpl.java:1024)
at com.sun.messaging.jmq.jmsclient.ConnectionImpl.<init>(ConnectionImpl.java:418)


.

Tuesday, April 16, 2013

Session Failover with Sun GlassFish(tm) Message Queue 4.4

Older versions of OpenAM (version < 10.1-Xpress) use Sun GlassFish(tm) Message Queue 4.4 for session failover.



Sun GlassFish(tm) Message Queue 4.4 is quite a black box to me. I find it hard to debug when things go wrong.

Now, at least I learnt a few sanity checks to ensure the broker cluster is running OK.

1. Ensure Message Broker is started properly

- Once AMSFO is configured, run ./amsfo start
- Tail the log file at ../jmq/imq/var/instances/ambroker/log/log.txt
- Make sure there is this line "Broker "ambroker@xxx.xxx.xx:YYYY" ready."


2. Ensure Session Failover is enabled

- Once AMSFO is configured, the 1st thing I'll do is to turn the debug log of each OpenAM server to MESSAGE level.
- Restart each OpenAM server
- Tail the Session debug log
- Make sure there is this line "Session Failover Enabled = true"
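Both checks boil down to looking for one marker line in a log, so they are easy to script. Below is a minimal sketch; the class FailoverSanityCheck is my own invention, and the patterns are based only on the two marker lines quoted above, so adjust them if your logs word things differently.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper that scans already-loaded log lines for the two sanity markers.
public class FailoverSanityCheck {
    // Check 1: Broker "ambroker@host:port" ready.
    private static final Pattern BROKER_READY = Pattern.compile("Broker \"[^\"]+\" ready\\.");
    // Check 2: the exact text from the Session debug log
    private static final String FAILOVER_ENABLED = "Session Failover Enabled = true";

    static boolean brokerReady(List<String> brokerLogLines) {
        for (String line : brokerLogLines) {
            if (BROKER_READY.matcher(line).find()) {
                return true;
            }
        }
        return false;
    }

    static boolean failoverEnabled(List<String> sessionDebugLines) {
        for (String line : sessionDebugLines) {
            if (line.contains(FAILOVER_ENABLED)) {
                return true;
            }
        }
        return false;
    }
}
```

To use it, read log.txt and the Session debug file with java.nio.file.Files.readAllLines and pass the resulting lists in.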




PS: If you see the following,


[16/Apr/2013:17:12:54 SGT] [B1066]:   Closing: amuser@192.168.1.51:41930->jms:37124 because "[B0059]: Client closed the connection". Count: service=0 broker=2

or in Session debug file


amSession:04/16/2013 10:28:38:019 PM SGT: Thread[localhost-startStop-2,5,main]
ERROR: Session failover service is not functional due to DB unavailability.
javax.jms.IllegalStateException: [C4059]: Cannot perform operation, session is closed.
at com.sun.messaging.jmq.jmsclient.SessionImpl.checkSessionState(SessionImpl.java:1844)
at com.sun.messaging.jmq.jmsclient.SessionImpl.createBytesMessage(SessionImpl.java:1873)
at com.sun.identity.ha.jmqdb.FAMRecordJMQPersister.send(FAMRecordJMQPersister.java:272)
at com.iplanet.dpro.session.JMQSessionRepository.delete(JMQSessionRepository.java:260)
at com.iplanet.dpro.session.service.SessionService.removeInternalSession(SessionService.java:775)
at com.iplanet.dpro.session.service.SessionService.destroyInternalSession(SessionService.java:1103)



this means one of the OpenAM servers has shut down.

And if you see the following,

[16/Apr/2013:17:18:25 SGT] [B1065]: Accepting: amuser@192.168.1.51:42980->jms:37124. Count: service=3 broker=3

the OpenAM server that was down has just started again.





I like putting all these little notes here in my blog to remind myself. Hope you do not mind. :)


.

Latest OpenAM Roadmap

The OpenAM roadmap has been updated on 9th March 2013.


Detail here.

What's interesting in OpenAM 10.2 for me are OpenID Connect, OpenAM IPv6 support and OpenAM Java 7 support.

It's very surprising to me that customers are demanding software to be fully IPv6 compliant these days, especially here in Singapore. IDA (Infocomm Development Authority of Singapore) has been pushing this directive real hard to all ministries for the past 1-2 years.

Another interesting feature in OpenAM 11.1 is the Monitoring Dashboard. I'm pretty curious what kind of dashboard that will be. This is usually a requirement from customers when the SSO infrastructure goes LIVE. We have been writing custom scripts and applications to handle this outside of OpenAM core.


.

Saturday, April 13, 2013

Connection issue from OpenAM to MS Active Directory

I have customers who are using Microsoft Active Directory as the authoritative authentication store, as well as user data store. So do we internally (of course, ours is meant for a live POC/showcase).



We know that the connection to MS AD is via the LDAP Authentication Module. It has been there for years. However, it has its own shortcomings, which I blogged about here 3 years ago.

Starting in OpenAM 10.0, the LDAP Authentication Module has been revamped to add support for the latest LDAP Behera password policy standards (See OPENAM-613). The shortcomings mentioned in my previous blog post were resolved in this revamp. However, it brought another issue with it: somehow, the connection pool wasn't working as expected (See OPENAM-590, OPENAM-627, OPENAM-1787).

A fix (in fact, a few iterations in total) was released. The latest is 3.0.0-OPENAMp3. (See maven repository here)  (I'm not too sure about 3.0.0-Xpress1. This must be something new which I have not tested yet.)


A customer of ours in Thailand initially encountered a connection timeout issue whenever OpenAM was left "idle" for a period of time, e.g. for the 1st user who attempted to log in in the morning. It was an easy fix for this customer: we replaced the SDK jar with opendj-ldap-sdk-3.0.0-OPENAMp3.jar. The system has been running superbly well since last December.


No good news for our internal SSO infrastructure though. :( In one of our test environments, in which we simulate the setup in Thailand, everything was running fine. No connection timeout issue at all, for days, for weeks.

However, in our own production system, no luck. On and off, we'd get the connection timeout issue. It usually happened to the 1st staff member who came into the office in the morning. So we started to zoom in, and I must say it was super time-consuming.

When the problem occurred, we ran jstack to determine the potential root cause. Pretty lucky here!


"http-apr-192.168.0.88-8080-exec-10" daemon prio=10 tid=0x00007f8b88010800 nid=0x1841 waiting on condition [0x00007f8b03e73000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007d92a3430> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
at org.forgerock.opendj.ldif.ConnectionEntryReader.getNextResponse(ConnectionEntryReader.java:437)
at org.forgerock.opendj.ldif.ConnectionEntryReader.hasNext(ConnectionEntryReader.java:264)
at com.sun.identity.authentication.modules.ldap.LDAPAuthUtils.searchForUser(LDAPAuthUtils.java:883)
at com.sun.identity.authentication.modules.ldap.LDAPAuthUtils.authenticateUser(LDAPAuthUtils.java:466)
at com.sun.identity.authentication.modules.ldap.LDAP.process(LDAP.java:526)
at com.sun.identity.authentication.spi.AMLoginModule.wrapProcess(AMLoginModule.java:1000)
at com.sun.identity.authentication.spi.AMLoginModule.login(AMLoginModule.java:1170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sun.identity.authentication.jaas.LoginContext.invoke(LoginContext.java:208)
at com.sun.identity.authentication.jaas.LoginContext.login(LoginContext.java:124)
at com.sun.identity.authentication.service.AMLoginContext.runLogin(AMLoginContext.java:557)
at com.sun.identity.authentication.server.AuthContextLocal.submitRequirements(AuthContextLocal.java:696)
at com.sun.identity.authentication.AuthContext.submitRequirements(AuthContext.java:1244)
at com.sun.identity.authentication.AuthContext.submitRequirements(AuthContext.java:1230)
at org.apache.jsp.getServerInfo_jsp._jspService(getServerInfo_jsp.java:131)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:432)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:390)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.forgerock.openam.validation.ResponseValidationFilter.doFilter(ResponseValidationFilter.java:44)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at com.sun.identity.setup.AMSetupFilter.doFilter(AMSetupFilter.java:95)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
:
:




We zoomed into LDAPAuthUtils.java, in particular the searchForUser method. This attempts to look up a user in Active Directory via an LDAP connection from the Admin Connection Pool (adminPool).


So, this is what the original code section looks like:


if (adminPool) {
    // Connections handed out by this factory are bound as the configured admin user.
    BindRequest bindRequest = Requests.newSimpleBindRequest(bindingUser, bindingPwd.toCharArray());
    connFactory = Connections.newAuthenticatedConnectionFactory(connFactory, bindRequest);
}



We suspected that the connection object had gone stale within the Admin Connection Pool. Somehow, the fix in 3.0.0-OPENAMp3 resolves the stale connections if the backend is an OpenDJ. It was not tested rigorously against an MS AD backend. ( Our guess here. Do not point a gun at us. :> ) Well, it's named opendj-ldap-sdk-3.0.0-OPENAMp3.jar for a reason, right?






So, how do we resolve this issue? Firstly, we do not want to touch opendj-ldap-sdk-3.0.0-OPENAMp3.jar for sure. This to us is external to OpenAM core. 


The safest approach is to add a 1-liner fix to LDAPAuthUtils.java.


if (adminPool) {
    BindRequest bindRequest = Requests.newSimpleBindRequest(bindingUser, bindingPwd.toCharArray());
    // The fix: wrap the authenticated factory so that each connection gets a heartbeat every 10 seconds.
    connFactory = Connections.newHeartBeatConnectionFactory(
            Connections.newAuthenticatedConnectionFactory(connFactory, bindRequest),
            10, TimeUnit.SECONDS);
}



The 1-liner fix attempts to perform a heartbeat on each connection object every 10 seconds. ( Of course, 10 seconds is debatable. You can choose your own magic number )
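For readers who have not met the heartbeat idea before, here is a library-free model of it. To be clear, this is not how the OpenDJ SDK implements its heartbeat factory; it is only a sketch under my own assumptions. HeartbeatSketch, ping and reconnect are hypothetical names: ping stands for any cheap request sent over the connection, and reconnect stands for re-opening it.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical model: ping a connection on a timer and replace it when the ping fails.
public class HeartbeatSketch {
    private final BooleanSupplier ping; // returns true while the connection still answers
    private final Runnable reconnect;   // re-opens the connection
    private int reconnects;

    public HeartbeatSketch(BooleanSupplier ping, Runnable reconnect) {
        this.ping = ping;
        this.reconnect = reconnect;
    }

    /** One heartbeat: send the ping, and re-open the connection if it fails or throws. */
    public synchronized void beat() {
        boolean alive;
        try {
            alive = ping.getAsBoolean();
        } catch (RuntimeException e) {
            alive = false; // a failed ping is treated the same as a dead connection
        }
        if (!alive) {
            reconnect.run();
            reconnects++;
        }
    }

    public synchronized int reconnectCount() {
        return reconnects;
    }

    /** Runs beat() at a fixed rate; the caller should shut the returned timer down when done. */
    public ScheduledExecutorService start(long interval, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::beat, interval, interval, unit);
        return timer;
    }
}
```

Calling start(10, TimeUnit.SECONDS) mirrors the two arguments in the fix. The regular traffic also means the connection is never idle for longer than the interval, which is presumably why the first login of the morning stopped hitting a dead connection.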


Our production OpenAM has been running smoothly since.  I think about a month already.


PS: Is this a bug? I do not think so. That's why I never raised a bug report. The strange thing is that against some MS AD servers there is no problem, while against another set of MS AD servers the problem occurs. This has been haunting me for a while, ever since OpenAM 10.0 was released last year.


.


Friday, April 5, 2013

Session Upgrade fails

Seriously, have you encountered this error message before? 



This error occurs when a person attempts to use the same browser with multiple tabs to log-in as different identities. (e.g. a real user from AD and amadmin)

Honestly, in my many years of Sun AM/OpenSSO/OpenAM deployment, I have not encountered this error once. This is because I know full well that one should not use the same browser to log in as different users.

However, today is an exception! :) I'm so blur today.

.

Thursday, April 4, 2013

Failed to get the valid sessions from the specified server

In a setup with two (or more) OpenAM instances, it is fairly common to get the "Failed to get valid sessions from the specified server" error.


And strangely enough, there is no such error on the other node. See image below.


One of the reasons this happens is that the web container of the 1st OpenAM instance was not restarted when the 2nd OpenAM instance was added.

Fairly common. 

How to resolve this? Restart the web container of the 1st OpenAM instance immediately after configuring the 2nd OpenAM instance.


Done. Simple.

.
 

Wednesday, April 3, 2013

Invalid Credential when Add to Existing Deployment - Part IV

I have blogged quite a lot on this topic (see here and here) and I have also found out that the Configurator is only expected to work for a particular setup (see here).

It will fail if customers happen to arrange the Organization and Administrator Authentication configurations in a way that was described in my previous posting.



Today, I am again setting up a pair of OpenAM for internal testing for a customer of mine. This time round, there is no customization at all. Just a plain vanilla setup.

And of course, the Configurator works like a charm as long as the First instance of OpenAM server is alive and running.

The Configurator will "contact" the 1st instance and grab all the configured parameters over. What's more, it will suggest the next best port numbers to use (since the default port numbers have already been used by the 1st instance).


Nice right? Only if the Organization and Administrator Authentication configurations are arranged in a way the Configurator likes.

.