bpurcell.org - Integrating Microsoft Windows 2003 NLB and ColdFusion MX 6.1 or JRun
April 22, 2004

Frank DeRienzo and I have had a lot of requests for an article on integrating Microsoft NLB and ColdFusion/JRun clustering. Frank and I set this up in the lab a few months back and sat down to write up the steps. I have to be honest though: I am not a big fan of NLB as a clustering technology because of the lack of application-level monitoring. Then again, it is provided free with Windows 2003. There are fewer options now that ClusterCATS is being deprecated, so NLB is one of the new software-based options. Of course, if you are looking for the best solution, always go with hardware.

After the editing process is complete, this article will be featured on the macromedia.com website. You can view the current article here. This process will work for ColdFusion MX 6.1 Enterprise or JRun 4. Feel free to post comments below.

The latest revision of the article is now live on the macromedia.com website.

Comments

I've used the following solution (which is free) in high-load environments; it works very, very well.

http://www.linuxvirtualserver.org/

It isn't hard to set up and supports direct and NAT routing.


Jason, thanks for the input. I worked with Linux Virtual Server with a customer during an onsite and had forgotten about it. It is definitely worth a look.


Do you know of any information for Clustering on Win2k w/Apache 2.0.4x ? Thanks.


I'm using CFMX 6.1 clustered with JRun updater 3 and Win2003 NLB. I've run into a bit of a sticky situation... sorry, I just had to say that. When using sticky sessions I have no problem when the jsessionid is passed from the cookie. However, if cookies are disabled in the browser and the jsessionid is passed in through the URL string, the sessions don't stick and a new session is created on every request. Is this a known issue or have I found something new? Any ideas?


Jason, The webserver connector supports sessions whether they are passed in a cookie or through the URL. Have you tried this without clustering multiple servers to see if it creates new sessions? Do you have J2EE sessions turned on in the CF admin? Did you append the CFID and CFTOKEN as well? You shouldn't have to with J2EE sessions.


Hi Brandon, I have tried with only a single server and the session seems to stick without any trouble. Obviously there is no stickyness required in this instance. Is stickyness a word?

J2EE sessions are configured properly. No cfid or cftoken in the URL.

It's only when I have multiple servers running in the cluster that the connector alternates between servers. This does not happen with cookies enabled.
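
For anyone debugging the cookieless case described above, it can help to confirm where the session ID actually travels on each request. With URL rewriting, the servlet convention is a `;jsessionid=...` path parameter rather than a query-string value. Here is a minimal Python sketch that pulls the ID out of either a URL or a Cookie header so the two can be compared in logs. The helper names and example values are made up for illustration; this is not part of the JRun connector.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def session_id_from_url(url):
    """Return the jsessionid path parameter from a URL, or None if absent."""
    # urlsplit leaves ';name=value' path parameters inside .path
    path = urlsplit(url).path
    for segment in path.split(";")[1:]:
        name, _, value = segment.partition("=")
        if name.lower() == "jsessionid":
            return value or None
    return None


def session_id_from_cookie(cookie_header):
    """Return the jsessionid value from a raw Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    for name, morsel in cookie.items():
        if name.lower() == "jsessionid":
            return morsel.value
    return None
```

Logging both values for each request on each server should show whether the ID is missing from the rewritten URL or whether it arrives intact and the request simply lands on a server that does not own the session.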


I have been trying to get this to work with no luck. Do I have to create the JRun cluster on both web servers?

Also, when I use the webserver config tool, do I do that on both servers? And when I do this, do I use the IP for the server I am setting it up with?

Thanks, Todd


Same questions as Todd. If we have 3 IIS/JRun servers, do we create 1 connection per server with the local IIS, and do we use the shared IP or the JRun-reserved IP? I've spent hours trying to get this to work.


I've spent weeks trying to get it working, but with no joy whatsoever....very irritating. In the above article you point the webserver connector to the second nic. How do you ensure that the jrun servers are using the first nic for session replication? Does jrun use the first nic by default when creating a server/cluster?


Jason, one of the major points of the article is that when you set up the JRun clustering you should forget about the first NIC entirely; it should only be used for/by NLB. You specify the second NIC's IP address or hostname in the web connector. The session replication is handled by the clustering.


Sorry for the delay in responding. I have been really busy lately.

You only use the first NIC for NLB and nothing else. After you create the JRun cluster you will configure the webserver to point to the cluster in wsconfig. You just point wsconfig at the IP that is bound to the second NIC, the one that is not using NLB. It may help to put the non-NLB NIC first in the binding order of Windows. You use the admin on one machine to create the cluster. It is a peer-based cluster, so the admin only modifies the jrun.xml config file to enable clustering and put all servers in a single cluster domain. Other than that, it doesn't matter which JRun admin you do this from.


Brandon, how do (in my case) the other two servers get the ISAPI installed? We don't have "the" webserver; we have three in my case. We're using NLB, remember? That was the whole point of your article. "It is a peer based cluster so the admin only modifieds the jrun.xml config file to enable clustering and put all servers in a single cluster domain" Could we impose upon you to put this into English?

This is my first experience with JRun and the only reason we are using it at all is because of this clustering thing. My boss saw the article and thought it was a great idea; if I don't implement this you will be responding to posts from my replacement.


In your article you state: "During the install, select the Built-in web server (Figure 13)." This option states for development use only.

I'm assuming this would be the same configuration for a production environment, otherwise you wouldn't have put it in your article. If I have already installed it and tied it to IIS, are there any issues I need to address, or should I uninstall it and start from scratch?

Thanks, Shawn


Shawn, As long as you have the J2EE version installed with JRun you can run wsconfig.exe and remove all of the webserver connections. When you install there is no need to connect to the webserver because you will need to build the cluster first then connect to the webserver at that point with wsconfig.exe. In step 17 you connect the webserver to the JRun cluster.


Brandon, Thanks for your quick reply. I was successful in switching over. Three additional questions:

1. Which IP address should be bound in IIS's host-header for the website? The virtual IP of the cluster or the dedicated IP of the NIC?

2. The article shows both NLB unique host IDs as 1. Was this an oversight or are they supposed to be the same?

3. NLB is not quite working; where should I check? I have the virtual IP bound for the host-header. My requests round-robin when both servers are converged, but they still round-robin after one is stopped. If both are stopped, my requests time out, which I expected.


Great article, Brandon. This article is what inspired us to upgrade from 5.0 to 6.1. We've moved to MX and have a live server operating, but I'm hung up on renaming the JRun instance. When I rename C:\JRUN\SERVERS\CFUSION to CFUSION2 and then modify the cfusion entry in C:\JRUN\SERVERS\LIB\servers.xml, I can't start the server. In the cfusion-out log I get:

JRun server "cfusion" does not exist, the server root null was not found. Please verify that the C:\JRun4\lib\servers.xml file contains valid data for this server. plus The Macromedia JRun CFusion Server service terminated with service-specific error 1 (0x1).

even though I specified cfusion2, not cfusion. I tried some variations of renaming the other cfusion directory buried in WEB-INF, amongst other iterations.

Any suggestions? Am I missing something obvious?


I'm having exactly the same problem as Chris. None of the CF services will start. Any ideas?


I was able to reboot and get the JRun services starting again by using the JRun Launcher, but I still can't get the ODBC drivers to launch, because I've changed their physical paths. I would imagine I should go into the registry and change the paths, but that's not mentioned in the tutorial, so I'm still assuming that I must be missing something.


The easiest way to fix this is to uninstall the services (run the uninstaller \WEB-INF\cfusion\db\SequeLink Setup\RemoveSequeLink.bat)

Then go into the \WEB-INF\cfusion\lib\adminconfig.xml and set <runsetupwizard>true</runsetupwizard>

Restart CF and go into the admin. The setup wizard will run again and install the ODBC services.


Thanks Brandon, I actually figured out that if you Install CF, do not run the setup config utility, do your renaming, and then run the setup config utility after install, it has the same effect. :) Now the last thing I'm hung on is the removal of the paths in the jrun config.

"If you configured session replication in step 14, then you must modify the classpath in {jrun-root}/bin/jvm.config file. When you install ColdFusion MX 6.1, the install changes the jvm.config files classpath, preventing session replication from working properly. To make it functional again, remove the following entries (highlighted)."

Whenever I do that, nothing starts, not even the JRUN launcher?

And as a side note, after all these installs/reinstalls my webserver config tool says it cannot find a JVM. Outside of these two tiny things I'm almost up with this config. Any additional assistance would be greatly appreciated.


OK... I have everything going now. Had to change the admin port numbers from 8000 to 8001 because it said 8000 was in use, but that's no biggie. The cluster is set up properly so far as I can tell, but it's not load balancing at all... all I have is catastrophic failover... if I turn one computer off, the other takes up the job. But if IIS or JRun gets turned off on whichever server is primary at the moment (the cluster always points to whatever server has been up the longest), all I get is the same error I'd get if there wasn't a second server at all. Is there a fix for that? Did we do something wrong?


The very nature of NLB is just that: it does no application-level monitoring. Where it really helps is when you have too many users hitting one server; it does a fine job of balancing the load, but as per Brandon's article, NLB does no application-level testing. What we did is write specific scripts that run on each server in the cluster against all the other servers, checking to make sure that they are up. If the script fails, we start sending out alerts so that a tech can take down or do maintenance on the machine. If we were better CF programmers we could probably even get the remote machine that's failing to restart a service or reboot, but we're not quite there yet. You don't have application-level redundancy, but you do have load balancing, which gives you double the site capacity that you originally had. If you want to do something that monitors on the application level, you will need to go with a hardware solution, but then you have a single point of failure. The "cool" thing about NLB, which Brandon doesn't brag up too much, is that if you have a cluster of 5 servers and 1 goes down, only the customers who hit that server will get errors. But if you have 5 servers behind a load balancer and it goes down, you're finished. Kudos to you though, I still can't get the friggin config finished.
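
The peer-checking scripts described above could look roughly like the following. This is a minimal Python sketch under stated assumptions, not the commenter's actual scripts (which were presumably written in CF). The peer URLs, the `health.cfm` page name, and the print-based alert are all hypothetical placeholders.

```python
import http.client
import urllib.error
import urllib.request

# Hypothetical health-check URLs on each peer's dedicated (non-NLB) address.
PEERS = [
    "http://192.168.0.3/health.cfm",
    "http://192.168.0.4/health.cfm",
]


def probe(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connections, timeouts, and 4xx/5xx answers all count as down.
        return False


def find_failed_peers(peers, probe_fn=probe):
    """Return the peers whose probe failed, in the order given."""
    return [peer for peer in peers if not probe_fn(peer)]


def alert(failed_peers):
    """Placeholder alert; a real script might send mail or page a tech."""
    for peer in failed_peers:
        print(f"ALERT: {peer} failed its application-level check")
```

Running `alert(find_failed_peers(PEERS))` from a scheduled job on each server gives the cross-checking described above. Requesting a page that actually executes in ColdFusion, rather than a static file, is what makes the check application-level.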


I really wish the article had mentioned that. We went with this instead of redundant LocalDirector boxes mostly because of this article... and now we find we don't have real failover? My external stuff, which uses localdirector (though we dumped clustercats a long time ago because it ended up being responsible for ridiculous loadtimes for one of our apps... something to do with the two servers losing track of each other or something), is never down unless our actual net connection is down... if a service stops responding, it instantly puts all the load on one machine. But that was slightly more expensive and someone was trying to look good by saving money (skimping) on the project. Grr.

And as for the hardware load balancers being a single point of failure... not really. You get two, and run them in failover mode. The instant one fails, the help desk is notified and can do whatever they need to to fix it (usually just a power cycle).

As for your problem... what I ended up doing was totally uninstalling CF and then deleting the directories it left me (it left some of the files I'd modified in place)... and then reinstalling it... once it was done installing, but before I went in to configure it, I stopped all the services, renamed cfusion to cfusion2 on the #2 server, went into the registry and searched cfusion, changed it to cfusion2 in a couple of places so my services would start properly, made most of the other changes outlined in the article, and only then went in and did the CF configuration (and I have to say, the archive/deploy thing in MX is a beautiful, beautiful thing)

The only problem I really ran into after that was that I couldn't get at JRun administrator... I had to change its port number to 8001 to do that.


I have set up 2 servers according to the configuration in the docs. My question is: how can I make sure a specific scheduled task only gets run by CFMXserver_1 and not CFMXServer_2 (as I mirror the results via robocopy)?


Hey Frank -- Try this. Go into CFAdministrator on one machine through its machine name, not the cluster name, so you know which one you're hitting... set up the task, but point it at localhost so it'll definitely hit itself. That should do it.


Brandon, I have a setup exactly like this article: two 2003 servers with NLB, IIS, and CF/JRun on both of them. NLB works fine, but the JRun clustering does not. When I stop the CF service on one box, shouldn't the other CF service pick up that CF request when I refresh? I just get the "Could not connect to JRun Server." error.

I've exhausted all the online resources looking for an answer. Any tips would be appreciated.

Thanks, Jason


Hi Brandon, I'm still trying to get NLB/JRun clustering working properly and not having much luck. I gave up on it about 6 months ago and am setting up some dev servers now to give it another go.

For some reason I can't seem to get the remote server to register in the JMC.

Do the NLB NICs and JRun NICs need to be on the same subnet? If you specify the same gateway for all 4 NICs, Windows throws a warning message. Will this cause any problems? Do all four NICs need to run through a switch, or can the 2 JRun NICs be connected directly with a crossover cable?

Any help on this would be great.

- Jason


Jason, I have our NLB and JRun NICs on two separate subnets, with a cross over cable. I can see and register remote servers no problem, I just cannot get the JRun cluster to failover properly (see post above yours).

Jason


Brandon, you're a genius. I finally got my cluster working reliably with session replication. It only took 6 months. The real trick that's not mentioned in any of the articles is to ">>>make sure that the NIC you are running and configuring JRun through is the first NIC bound to the machine.<<<" It would be a great benefit if this little tip was included in the articles and Macromedia docs.

Also, I tested it with the JRun NICs running through a switch on the same subnet as the NLB NICs, and with the JRun NICs on a separate subnet with a crossover cable, and it works both ways.

Thanks for your help. Now I can die a happy man. :P


Hello Brandon, me again. Sorry to be a pain in the ass.

My config is as follows: 2 x Windows 2003 / NLB / IIS6 /Jrun4 updater 4 / CFMX6.1 with Aug 2004 updater

Server 1
JRun Nic 1: 198.162.0.3
NLB Nic 2: 10.10.0.91
NLB Cluster IP: 10.10.0.90

Server 2
JRun Nic 1: 198.162.0.4
NLB Nic 2: 10.10.0.92
NLB Cluster IP: 10.10.0.90

I have created a JRun server on each machine (JrunA/JrunB) and created the JRun cluster (JrunCluster).

I have configured the cluster(JrunCluster) on each machine in the web server config tool.

This is the wsconfig on each machine.

Server 1
C:\JRun4\lib\wsconfig\1\jrun_iis6_wildcard.ini
---------------------------------------------------
bootstrap=192.168.0.3:51000
---------------------------------------------------
C:\JRun4\lib\wsconfig\1\jrunserver.store
---------------------------------------------------
proxyservers=192.168.0.4:51000;
---------------------------------------------------

Server 2
C:\JRun4\lib\wsconfig\1\jrun_iis6_wildcard.ini
---------------------------------------------------
bootstrap=192.168.0.4:51000
---------------------------------------------------
C:\JRun4\lib\wsconfig\1\jrunserver.store
---------------------------------------------------
proxyservers=192.168.0.3:51000;
---------------------------------------------------

The cluster works flawlessly when I only have one host running in NLB. The session replication and failover work every time if I alternately switch the JRun servers off and on.

However, the problem occurs when I have both hosts running in NLB. If I switch JrunA off, it fails over to JrunB with the session intact. When I switch JrunA back on, the session sticks on JrunB. When I switch JrunB off, it fails back to JrunA but loses the session.

Any insight would be most welcome.

-Jason
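
One thing that can be checked mechanically in a setup like the one above is whether the addresses in the connector files match the NIC addresses you think JRun is using. In the post, the JRun NICs are listed as 198.162.0.x while the wsconfig files point at 192.168.0.x, which may just be a typo in the comment. Here is a minimal Python sketch that parses the two `key=value` settings shown and flags any host not in a known list. The function names are made up for illustration, and the two files are concatenated into one string only for brevity.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and dashed separators."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("-") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def endpoints(value):
    """Split a 'host:port;host:port;' value into (host, port) tuples."""
    result = []
    for item in value.split(";"):
        item = item.strip()
        if item:
            host, port = item.rsplit(":", 1)
            result.append((host, int(port)))
    return result


def unknown_hosts(config_text, known_ips):
    """Return hosts named in bootstrap/proxyservers that are not in known_ips."""
    props = parse_properties(config_text)
    hosts = []
    for key in ("bootstrap", "proxyservers"):
        if key in props:
            hosts.extend(host for host, _ in endpoints(props[key]))
    return [host for host in hosts if host not in known_ips]
```

Calling `unknown_hosts(...)` with Server 1's two settings and the set `{"198.162.0.3", "198.162.0.4"}` returns both connector addresses, which is the mismatch noted above. It says nothing about why the session is lost on failback, only that the addresses are worth double-checking first.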


In setting up the network load balancing, under the Host Parameters tab you have Priority (unique host identifier) set to the same number on both hosts. This was causing errors for me when running "NLB.exe display" from the command prompt. Changing the numbers to be unique solved the problem.
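
The uniqueness rule from the comment above is easy to check before converging a cluster. Here is a minimal Python sketch, assuming you have written down each host's Priority value by hand; the host names and the function name are hypothetical, and nothing here talks to NLB itself.

```python
from collections import defaultdict


def find_priority_conflicts(host_priorities):
    """Return {priority: [hosts]} for every priority used by more than one host."""
    by_priority = defaultdict(list)
    for host, priority in host_priorities.items():
        by_priority[priority].append(host)
    return {
        priority: sorted(hosts)
        for priority, hosts in by_priority.items()
        if len(hosts) > 1
    }
```

For the configuration shown in the article, `find_priority_conflicts({"server1": 1, "server2": 1})` reports both hosts under priority 1, while giving the second host priority 2 returns an empty result.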


Registering a remote server in the ColdFusion Administrator's Instance Manager gives a "Network Error".

We are trying to set up a ColdFusion cluster. There are 2 servers, each with ColdFusion Enterprise installed in a multiserver configuration. Both servers have an interface for JRun as well as an interface for a hardware-based NLB. The NLB interfaces are on the same subnet (subnet1), and both JRun interfaces are on the same subnet (subnet2).

After installing we have created an instance on each server. Because we want to make a cluster we choose server1 as our "cluster server" and try to register the instance on server2 through the option "register remote instance" in the ColdFusion Administrator.

Values used when registering:
Server: instance2
Host: IP address of server2, belonging to the JRun interface on subnet2
Port: 2908 (as listed as the remote port for instance2 on server2)

The result is that the instance is marked with "Network error". We get the same result when trying through the JRun JMX Console: "Server unavailable".

We verified that the security.properties file allows the subnet, and added server1 and server2 as trusted hosts to be sure. We made sure that the JRun interface is bound to eth0, the primary interface on the machines, and the NLB interface to eth1, the secondary interface. There are no firewall restrictions between the two machines.

Other details: RedHat Linux, ColdFusion 7.0.1
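
Before digging further into a "Network error" like the one above, it may be worth confirming from server1 that the remote port on server2 accepts a plain TCP connection at all. Here is a minimal Python sketch; the function name is made up, and a successful connect only shows that the port is reachable, not that instance registration will succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run on server1 with the JRun-interface address of server2 and port 2908 from the post. A `False` result points at binding, routing, or firewall issues rather than at the Instance Manager.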


Hello, I think I've managed to get this to work on 2 x CFMX 6.1 with 2 JRun instances on each, on Windows 2003 R2.

At least I can access CF content both through the "clustered" IIS address and through both unique IIS addresses. However, when I shut down one of the servers I cannot access CF content, because the JRun cluster selected is on the other server. (If I stop the other one and start the first, it works.) Wouldn't it be more correct to select the local JRun instance when configuring the webserver?

Wouldn't doing so on each of the servers ensure that the server can always respond to CF requests, even if the other server is powered off?

One other thing: if IIS goes down on either of the servers, should I expect an automatic failover to the other IIS server?


 