
Impact of time change on OpcUa_Semaphore

Posted: 24 Jun 2016, 03:39
by dingyan
Hello, support team:

In the OPC UA Linux platform code, I found that the semaphore timedwait behavior is affected by system time adjustment.

For example, when I call OpcUa_Semaphore_TimedWait(pSemaphore, 10000), the expected behavior is that the calling thread blocks and returns after 10 seconds. But if I set the system clock back by 10 minutes during the call, the thread does not return until 10 minutes + 10 seconds have elapsed.

After searching the web, I found that this is because sem_timedwait() is based on the CLOCK_REALTIME clock, which means the semaphore timeout becomes unreliable whenever the system clock is changed.
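For illustration, this is roughly how a CLOCK_REALTIME-based timed wait computes its deadline (a simplified sketch, not the actual SDK code):

/* Simplified sketch of a CLOCK_REALTIME-based timed wait (not the actual
 * SDK code). The absolute deadline is derived from the wall clock, so a
 * backward clock jump postpones the deadline by the same amount. */
#include <semaphore.h>
#include <time.h>

int timed_wait_ms(sem_t *sem, unsigned int msec)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);      /* wall-clock "now" */
    ts.tv_sec  += msec / 1000;
    ts.tv_nsec += (long)(msec % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return sem_timedwait(sem, &ts);          /* deadline moves with the clock */
}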

However, a stable semaphore timeout implementation is necessary in the OPC UA code, because many modules such as SessionManager, OpcUa_Channel, and UaSubscription depend on it.

In conclusion, instead of using a native sem_t as the implementation of OpcUa_Semaphore, I think pthread_cond_t would be a good alternative, because it can be configured to use the stable CLOCK_MONOTONIC clock.
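For example, a condition variable can be bound to CLOCK_MONOTONIC like this (a minimal sketch):

/* Minimal sketch: bind a condition variable to CLOCK_MONOTONIC so that
 * pthread_cond_timedwait() deadlines are immune to wall-clock jumps. */
#include <pthread.h>
#include <time.h>

pthread_cond_t  cond;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void init_monotonic_cond(void)
{
    pthread_condattr_t attr;
    pthread_condattr_init(&attr);
    pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    pthread_cond_init(&cond, &attr);
    pthread_condattr_destroy(&attr);
}

The wait deadline is then computed from clock_gettime(CLOCK_MONOTONIC, ...) instead of CLOCK_REALTIME.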

Best regards

Re: Impact of time change on OpcUa_Semaphore

Posted: 24 Jun 2016, 14:36
by Support Team
The problem is that you cannot replace a semaphore with a condition variable. It is a
different kind of synchronization primitive. Maybe some places in the SDK could
better use a condition variable instead of a semaphore, but that is another discussion.
The UA semaphore must be implemented using a semaphore.
POSIX condition variables can be used to wait for and wake up threads, but they must be
protected by a mutex, yet another synchronization primitive.
Semaphores contain a counter and are used to control access to a number of resources.
Also, semaphores can be used globally across process boundaries,
which is important for multi-process applications.

To come back to the original problem: sem_timedwait() uses an absolute
time as a deadline. The reason for this is signals. Signals can unblock every
system call on Linux, to avoid the hanging applications we all know from
Windows.
The syscall returns with EINTR in this case. An application can decide to
shut down (SIGTERM, SIGINT), reload its configuration (SIGHUP), or restart the
interrupted syscall; this depends mainly on the type of signal.
In the case of restarting the syscall, you can simply reuse the same arguments as
before. You don't need any complicated computation of elapsed time, etc.
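A sketch of this retry pattern (illustrative, not the actual SDK code):

/* Because the deadline is absolute, an interrupted sem_timedwait() can be
 * restarted with the very same timespec, with no elapsed-time arithmetic. */
#include <semaphore.h>
#include <time.h>
#include <errno.h>

int wait_with_deadline(sem_t *sem, const struct timespec *deadline)
{
    int ret;
    do {
        ret = sem_timedwait(sem, deadline);  /* same arguments on each retry */
    } while (ret == -1 && errno == EINTR);   /* interrupted by a signal */
    return ret;                              /* 0, or -1 with errno set */
}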

sem_timedwait() does not need to be very accurate. It is not used for
high-precision timing; it is simply a deadline for the timeout. If the
timeout is e.g. 10 s, it does not matter in the SDK whether it returns after 10.1 s
or 10.2 s.

The problem you describe is caused by jumps in time. It is not a syscall
problem. With proper time synchronization the time does not jump: NTP
compensates for clock drift by periodically calling adjtime(). So this problem
simply does not occur when using NTP.

If you implement any other kind of time synchronization that causes the time to jump,
you had better fix the time synchronization, or simply use NTP.

BTW: CLOCK_MONOTONIC is also affected by adjtime(), but only by very small
deltas. Only CLOCK_MONOTONIC_RAW is unaffected by both setting a new time and
adjtime(). But this requires kernel >= 2.6.28.
See man clock_gettime
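A small sketch that queries the three clocks discussed above (CLOCK_MONOTONIC_RAW is Linux-specific):

/* CLOCK_REALTIME jumps with settimeofday(), CLOCK_MONOTONIC is only slewed
 * by adjtime(), and CLOCK_MONOTONIC_RAW (kernel >= 2.6.28) is affected by
 * neither. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec rt, mono, raw;
    clock_gettime(CLOCK_REALTIME,      &rt);
    clock_gettime(CLOCK_MONOTONIC,     &mono);
    clock_gettime(CLOCK_MONOTONIC_RAW, &raw);
    printf("realtime:      %ld s\n", (long)rt.tv_sec);
    printf("monotonic:     %ld s\n", (long)mono.tv_sec);
    printf("monotonic raw: %ld s\n", (long)raw.tv_sec);
    return 0;
}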

I hope this helps you better understand the problem.

Re: Impact of time change on OpcUa_Semaphore

Posted: 27 Jun 2016, 10:37
by dingyan
Hello,

Thanks for your reply. I think there are three points I need to explain more precisely.

1. Large time adjustments are unavoidable for our program in training mode

Our program is required to run in a training mode that simulates scenarios for the end users, and some of those scenarios involve time adjustment.

For example, when the operator wants to verify that a simulated device automatically powers on at 8:00 and powers off at 19:00, he will simply change the system time to 8:00 or 19:00 instead of waiting for the real-world time to arrive.

2. A condition variable as the platform-level implementation of OpcUa_P_Semaphore, not a replacement for OpcUa_Semaphore

What I suggested in my previous post is to use pthread_cond_t (which is stable with respect to time) to implement the semaphore mechanism at the platform level; I do not intend to change the behavior of OpcUa_Semaphore. That is why I posted this issue in the platform layer sub-topic.

By the way, the OpcUa_P_Semaphore for Windows does not need to change at all, because WaitForSingleObject() takes a relative timeout and is therefore unaffected by clock changes.
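For reference, a minimal sketch of the Win32 call with its relative timeout:

/* The Win32 wait takes a relative timeout in milliseconds, so a system
 * clock change cannot stretch or shorten the wait. */
#include <windows.h>

DWORD wait_10s(HANDLE hSemaphore)
{
    return WaitForSingleObject(hSemaphore, 10000);  /* 10 000 ms, relative */
}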

3. Developers who use UaSemaphore

According to my analysis, the impact of time adjustment inside the SDK is not big. However, there are risks for developers who are not aware of this issue but use UaSemaphore as the thread synchronization mechanism in their programs.

The purpose of my post is to inform you about this problem and to discuss with you whether there is a better solution.

Best regards

Re: Impact of time change on OpcUa_Semaphore

Posted: 05 Jul 2016, 09:49
by Support Team
Hello,

to 1)
Changing the system time in that way is critical IMO, because you can break a lot, not only OPC UA:
Kerberos authentication, filesystem timestamps and software that relies on them, etc.
Isn't it possible to simulate the time internally in your software without tampering with the system time?
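One possible approach (a hypothetical sketch; the names are illustrative, not SDK API) is to keep an adjustable offset behind the application's own time source:

/* Hypothetical sketch of a simulated application clock: training mode
 * shifts an offset instead of the system time. Names are illustrative. */
#include <time.h>

static time_t g_sim_offset = 0;  /* seconds added to the real time */

void sim_set_time(time_t simulated_now)
{
    g_sim_offset = simulated_now - time(NULL);
}

time_t sim_now(void)
{
    return time(NULL) + g_sim_offset;
}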

to 2)
It was clear that the semaphore inside the PL must be changed for that. So OpcUa_P_Semaphore to be precise.
I don't see how a condition alone could replace a semaphore. You would need to implement you own semaphore,
and can just use Conditions + Mutex + Your own synchronized counter implementation to create a new semaphore primitive.
You can do this and if it works we can review that.
The C SDK itself does not need global semaphores which work across processes boundaries, so in theory this could be possible.
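Such a primitive could look roughly like this (a sketch built on a CLOCK_MONOTONIC condition variable; the my_sem_* names are illustrative, not SDK code):

/* Counting semaphore built from a mutex, a condition variable bound to
 * CLOCK_MONOTONIC, and a counter. Illustrative sketch only. */
#include <pthread.h>
#include <time.h>
#include <errno.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    unsigned int    count;
} my_sem_t;

int my_sem_init(my_sem_t *s, unsigned int initial)
{
    pthread_condattr_t attr;
    pthread_mutex_init(&s->mutex, NULL);
    pthread_condattr_init(&attr);
    pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    pthread_cond_init(&s->cond, &attr);
    pthread_condattr_destroy(&attr);
    s->count = initial;
    return 0;
}

int my_sem_post(my_sem_t *s)
{
    pthread_mutex_lock(&s->mutex);
    s->count++;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->mutex);
    return 0;
}

/* Returns 0 on success, ETIMEDOUT on timeout. */
int my_sem_timedwait_ms(my_sem_t *s, unsigned int msec)
{
    struct timespec ts;
    int ret = 0;
    clock_gettime(CLOCK_MONOTONIC, &ts);     /* immune to wall-clock jumps */
    ts.tv_sec  += msec / 1000;
    ts.tv_nsec += (long)(msec % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }

    pthread_mutex_lock(&s->mutex);
    while (s->count == 0 && ret == 0)
        ret = pthread_cond_timedwait(&s->cond, &s->mutex, &ts);
    if (ret == 0)
        s->count--;
    pthread_mutex_unlock(&s->mutex);
    return ret;
}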

to 3)
We see a big risk in replacing a working system semaphore with a custom implementation. Also, there is no reason for us or any other customer to do so, because the system semaphores work just fine as long as you are not tampering with the system time.

So for us it makes no sense at the moment to change this.
If you can provide a working solution that passes all our unit tests on all our POSIX platforms, such as Linux and QNX,
and also has no negative impact on performance or resource consumption, we could consider applying your patch.

Until then you will have to maintain your own patch for this, which should not be a big problem, because only the opcua_p_semaphore.c/h files are affected and the interface is stable.

regards,
Unified Automation Support Team

Re: Impact of time change on OpcUa_Semaphore

Posted: 13 Jul 2016, 08:52
by dingyan
Hello,

Time adjustment is a complicated issue, because some think it should never be allowed, while others think it is a technical risk to be taken into account. Even inside your team, I notice that a developer has added handling for time adjustment (SessionManager.cpp of the C++ SDK).

As everyone has their own opinion, I agree with you that we will maintain a patch for our own project. I also agree with you that changing the system time is a big risk for the system.

Finally, please allow me to give a short description of the impacts that time adjustment may produce, just for your reference:

1. OpcUa_Channel_Connect: the calling thread keeps waiting until the connection times out; the wait becomes longer when the clock is adjusted backward.
2. OpcUa_Thread_AddJob: when the job queue of the thread pool is full, the calling thread may keep waiting until the tampered-with time point is reached.
3. In UaServer.cpp, the internal thread may stop checking for invalid sessions when the time is changed forward.

Best regards