Recently, a sales partner from South Korea approached us with an intriguing challenge. Although the Open iT ComputeAnalyzer offers plug-ins and connectors to meter distributed grid computing environments, we realized there was untapped potential.
ComputeAnalyzer primarily focuses on the grid computing system and less on the runtime of individual jobs. While this works well for batch jobs that are queued for automated processes, interactive jobs on the IBM LSF server are different. These demand real-time feedback and are prone to oversight and inefficient practices when humans intervene. This is exactly what the client aimed to monitor.
While Open iT specializes in tracking and optimizing licenses, the emphasis in this scenario is on job slots. Job slots are consumed in addition to any licenses that the applications taking part in a job use.
The Challenge
Our engineers identified that the client used another job style: interactive task-running. Their request was straightforward: monitor the activity level of these interactive jobs in the LSF grid computing environment and free up job slots that are no longer in use, so that the LSF server can reallocate them to other tasks.
Addressing the client’s needs meant that Open iT engineers faced a two-part challenge:
- Absence of a Data Collector in LSF Application Launches
In a Unix environment, the data collector is designed to start as a daemon when a user logs in to a machine. This design becomes problematic in setups where LSF job schedulers are used to launch applications.
The reason is that users don’t log in directly to the machines hosting the applications. Instead, they access the LSF grid computing server and choose the application they want. The server then takes over, allocating resources from a pool of available job slots. The application windows are redirected to the display the user was on when selecting the application.
- Daemon’s Blockage of Reallocation in Preliminary Setups
LSF comes with a job starter function at the command level. Using it to run the data collector raises an issue: the collector binary remains active even after the user closes the application. The job slot then appears to be still in use, which prevents LSF from reallocating the corresponding hardware resources.
Suggested Solutions for Enhanced Functionality
After analyzing these challenges, our engineering team proposed the following solutions:
- Integrating the Data Collector with LSF_JOB_STARTER
The job starter function mentioned earlier is controlled by the LSF_JOB_STARTER variable. Although this approach requires Open iT’s expertise to adjust the client’s LSF setup, it lets the data collector start before each job, running at the correct user level and with the correct display variable.
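As a rough illustration of the idea, a job starter can be a small wrapper that is invoked with the job's command line as its arguments. The sketch below is not Open iT's implementation: the collector path, the `COLLECTOR_CMD` variable, and the function names are hypothetical, and it assumes the wrapper inherits the job's environment (including `DISPLAY`) and receives the job command as arguments.

```python
import os
import subprocess
import sys

# Assumption: placeholder path, not the real data collector binary.
COLLECTOR_CMD = os.environ.get("COLLECTOR_CMD", "/opt/example/bin/collector")


def start_collector(collector_argv, environ):
    """Start the collector in the background with the job's environment.

    Returns the Popen handle, or None if the collector could not be started,
    so that a metering problem never blocks the user's job.
    """
    try:
        return subprocess.Popen(collector_argv, env=dict(environ))
    except OSError:
        return None


def run_job(job_argv, environ):
    """Run the user's job in the foreground and return its exit code."""
    return subprocess.call(job_argv, env=dict(environ))


def main(job_argv):
    """Entry point: job_argv is the job command passed to the wrapper."""
    if not job_argv:
        return 2  # nothing to run
    # The collector inherits the same user and DISPLAY as the job itself.
    start_collector([COLLECTOR_CMD], os.environ)
    return run_job(job_argv, os.environ)
```

A real starter script would end by calling `sys.exit(main(sys.argv[1:]))`. Note that this sketch leaves the collector running after the job ends, which is exactly the behavior that a self-termination mechanism has to handle.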
- Incorporating a Self-Termination Mechanism
This self-termination feature adds logic to the daemon that lets it exit gracefully when it is no longer needed, which directly addresses the second challenge.
The engineers provided the client with three different ways to configure the self-termination capability:
- Bind: This feature actively monitors the applications that a user typically launches. If it detects that none of these applications are currently in operation, the data collector promptly shuts down, thereby releasing the occupied job slot for other tasks.
- Tail: This feature actively scans for user-related activities. A configurable list tells it which processes to ignore; by default, the list typically excludes Open iT and several other system-related tasks.
- Display: Triggers self-termination if the data collector fails to connect to the designated user display.
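The three options can be thought of as three different exit conditions that the collector evaluates periodically. The following sketch shows one possible reading of them as pure decision functions. It is illustrative only: the `Proc` record, the function names, and the sample process names are hypothetical, the exit rule for tail (no remaining user processes outside the ignore list) is an assumption, and a real collector would read the process table and probe the display itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """Minimal stand-in for one entry in the host's process table."""
    user: str
    name: str


def bind_should_exit(procs, user, bound_apps):
    """Bind: exit when none of the user's watched applications is running."""
    return not any(p.user == user and p.name in bound_apps for p in procs)


def tail_should_exit(procs, user, ignored):
    """Tail: exit when the user has no processes left outside the ignore list."""
    return not any(p.user == user and p.name not in ignored for p in procs)


def display_should_exit(can_connect):
    """Display: exit when the user's display can no longer be reached."""
    return not can_connect


# Example: the user "kim" closes the application; only the collector remains.
before = [Proc("kim", "simulator"), Proc("kim", "collector"), Proc("lee", "simulator")]
after = [Proc("kim", "collector"), Proc("lee", "simulator")]

bind_before = bind_should_exit(before, "kim", {"simulator"})
bind_after = bind_should_exit(after, "kim", {"simulator"})
tail_after = tail_should_exit(after, "kim", {"collector"})
```

In this example `bind_before` is `False` and `bind_after` is `True`: another user's copy of the application does not keep the collector alive, so the slot can be released as soon as the owning user's application exits.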
Success with the Bind Mechanism
After our sales partner pointed out how common this kind of job usage is in South Korea, Open iT prioritized development to support this grid computing environment. After rigorous assessment of various scenarios and potential solutions, it became clear that the bind mechanism was the ideal solution.
This approach enabled the client to efficiently collect data within their LSF grid computing framework. Our sales partner promptly confirmed that Open iT is now fully equipped to optimize LSF interactive jobs.
Tailored Solutions for Optimal Software Asset Management
At Open iT, our commitment goes beyond just optimizing software assets. We pride ourselves on crafting tailored capabilities and solutions that align with our clients’ distinct needs.
Our seasoned engineers work together with our customers’ IT departments, ensuring that we co-create and implement strategies that empower them to meet their business objectives and amplify the returns on their software licenses.
Reach out now to an Open iT representative and discover how we can bring similar benefits to your organization.