Previously we tried running Weka Server to use all processor cores in classification tasks. It turns out, however, that Weka Server works only in the Explorer, for classification routines. For more advanced machine learning there is a more flexible tool, the Experimenter, and Weka Server doesn't support it. So what can you do if you want more performance, or simply want to use the multi-core processor of your local machine? There is a way out, but it is trickier. Weka can perform remote experiments, which spread the load across multiple host machines that have Weka set up. You can read the remote experiment documentation on the Weka wikispaces, but in some places it can be confusing. It took me a while to figure out some parts by trial and error.
The trickiest part is setting everything up and preparing the commands that must be run before performing a remote experiment. So let's get to it.
Setting up a database server
For a remote experiment, the computer where you are working needs a database, where the results from the different hosts are stored and combined into the final result. There are two options to work with: HSQLDB and MySQL. It is up to you which one to use. I find HSQLDB simpler, since it is Java based and faster to set up, so I'll be using it in this example. I am running Win10, so the following example applies only to this operating system. First of all, download the latest HSQLDB package and extract it somewhere in a temporary directory. Now let's create a working directory where we will put all the necessary files. I have created a WK directory on the D: drive, and inside it a jars directory, so the path looks like D:\WK\jars.
Now copy the hsqldb.jar file from the extracted HSQLDB package (located in \hsqldb-2.4.0\hsqldb\lib) to the jars directory.
If you have Java installed (it comes with Weka), you can start the HSQLDB server and create the database by executing this command in a CMD window:
java -classpath D:\WK\jars\hsqldb.jar org.hsqldb.Server -database.0 experiment -dbname.0 experiment
If successful, you should see the following result:
Leave this command window open and proceed to the next step.
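To confirm from a second window that the database server is really listening, a small port check can help. The following is a minimal sketch in Python, not part of Weka or HSQLDB. It assumes the server runs on the same machine and uses HSQLDB's default server port 9001; the function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return False


if __name__ == "__main__":
    # Assumption: HSQLDB listens on its default port 9001 on this machine.
    print("HSQLDB reachable:", port_is_open("localhost", 9001))
```

The same check can later be pointed at the remote engine's port to see whether the engine is up.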
Setting up Weka remote engine
First of all, in the D:\WK directory we create a remote_engine directory, so the path is D:\WK\remote_engine.
Then from the Weka install directory (normally C:\Program Files\Weka-3-9) we take the remoteExperimentServer.jar file and copy it somewhere temporary. There we simply extract its contents with any archive program; I personally use 7-zip. After extraction you will see three files: remote.policy, remote.policy.example and remoteEngine.jar. Copy the remote.policy and remoteEngine.jar files into the D:\WK\remote_engine directory.
This is where things got a bit tricky for me. I got stuck with the command and the provided files, because Java couldn't parse the remote.policy file correctly. For the sake of simplicity, also copy the weka.jar file to the D:\WK\jars directory, because it makes the remote engine command simpler. The commands in the documentation refused to run, because each jar has to be given with the full path to the file. That part I figured out.
So you can try running a slightly modified command in a newly opened CMD window (remember not to touch the command window that is running the database):
You will most likely run into an error like this:
I couldn't find what was wrong with the file. It appears to be UTF-8 encoded and otherwise looks fine. Searching the internet didn't turn up an answer. So I decided to recreate the policy file with the Java tool policytool.exe, which can be found in the Java installation directory. Here you can enter each policy manually and save the file in the proper format. I'm not a Java programmer, so I do things intuitively, and not always in the correct way. I added all the permissions one by one to a new file called mm.policy. I also added an additional entry that grants all permissions, because the remote experiment engine had problems creating files. If you want, you can download my policy file to use in your experiment here: mm.policy
Copy it to the D:\WK\remote_engine directory.
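If the download is not available, a permissive policy file can also be produced without policytool. The sketch below is in Python and writes only the grant-all entry in standard Java policy syntax. It is not a copy of mm.policy, which contains further entries added one by one, and the helper name write_policy is made up for this example. Granting java.security.AllPermission removes every sandbox restriction, so it only makes sense on machines you trust.

```python
# Standard Java policy syntax: one grant block that allows everything.
POLICY_TEXT = """\
grant {
    permission java.security.AllPermission;
};
"""


def write_policy(path):
    """Write a permissive Java policy file as plain ASCII text and return its path."""
    # Explicit encoding and newline, so the output is the same on every platform.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(POLICY_TEXT)
    return path
```

Calling write_policy with a path such as D:\WK\remote_engine\mm.policy (as a raw string in Python) would create the file in the directory used in this walkthrough.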
Now enter this command in the command line:
java -Xmx256m -cp D:\WK\jars\hsqldb.jar;D:\WK\remote_engine\remoteEngine.jar;D:\WK\jars\weka.jar -Djava.security.policy=D:\WK\remote_engine\mm.policy weka.experiment.RemoteEngine
If successful, you should see a view like this:
As you can see, you are given a host name “MMM-PC” and port 1099.
Now you can try to perform a remote experiment in Weka.
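The remote engine command is long and easy to mistype, especially the classpath. The jars belong in a single -cp option, joined with ; on Windows (or : on Linux). If -cp is repeated, the java launcher, as far as I know, uses only the last one. Below is a minimal Python sketch that assembles the command from a list of jars. The function name build_engine_command and its parameters are made up for this example; the paths are the ones used in this article.

```python
def build_engine_command(jars, policy_file, heap="256m", sep=";"):
    """Build the argument list for starting weka.experiment.RemoteEngine.

    sep is the classpath separator: ";" on Windows, ":" on Linux/macOS.
    """
    return [
        "java",
        f"-Xmx{heap}",
        "-cp", sep.join(jars),  # a single -cp option holding every jar
        f"-Djava.security.policy={policy_file}",
        "weka.experiment.RemoteEngine",
    ]


if __name__ == "__main__":
    jars = [
        r"D:\WK\jars\hsqldb.jar",
        r"D:\WK\remote_engine\remoteEngine.jar",
        r"D:\WK\jars\weka.jar",
    ]
    command = build_engine_command(jars, r"D:\WK\remote_engine\mm.policy")
    # Print the command so it can be pasted into a CMD window.
    print(" ".join(command))
```

Returning a list rather than one string means the result can also be passed directly to subprocess.run if you prefer to start the engine from a script.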
Running remote experiment in Weka locally
For this open Weka Experimenter and go to Advanced mode. Select your database and algorithm as you like.
Then in the Distribute Experiment area select, for instance, By data set and click Hosts. In the popup window that opens, enter the host that was created by running the remote engine.
Of course, your local IP address may also be used instead of the host name.
After host is selected you can go to Run tab and click start – a remote experiment is performed:
That is practically it for the simple solution.
There is an option to use multiple processor cores by adding -p <port>, but it seems to give the same result as before, because the command already assigns a port number for you. I couldn't get all cores working at 100%.
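As far as the Weka documentation describes it, -p is meant for starting several remote engines on one machine, each on its own port, and then adding every engine to the Hosts list as host:port. That has not been verified in this setup, so treat the sketch below as an assumption-laden starting point rather than a tested recipe. It is plain Python that only prints the commands and host entries for one engine per core; the function name engine_plan and the starting port 5050 are made up for this example.

```python
import os


def engine_plan(host, first_port, count, base_command):
    """Return (command, host_entry) pairs, one per engine, on consecutive ports."""
    plan = []
    for offset in range(count):
        port = first_port + offset
        # Assumption: each engine gets its own port through -p and is
        # entered in the Experimenter's Hosts list as host:port.
        plan.append((f"{base_command} -p {port}", f"{host}:{port}"))
    return plan


if __name__ == "__main__":
    base = (
        r"java -Xmx256m -cp D:\WK\jars\hsqldb.jar;"
        r"D:\WK\remote_engine\remoteEngine.jar;D:\WK\jars\weka.jar "
        r"-Djava.security.policy=D:\WK\remote_engine\mm.policy "
        r"weka.experiment.RemoteEngine"
    )
    cores = os.cpu_count() or 1  # cpu_count() can return None
    for command, host_entry in engine_plan("MMM-PC", 5050, cores, base):
        print(host_entry, "<-", command)
```

Each printed command would go into its own CMD window, and each host entry into the Hosts popup of the Experimenter.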
Running remote experiment on remote host
The whole power of remote experiments is that you can distribute the load across hosts in the network. All you have to do is start a remote engine on each remote machine that you want to use.
I have tested this with another machine in my local network and it worked. Still, the real power comes from using Linux machines that can be controlled through ssh. The next logical step would be to configure multiple host machines to spread the load, but since that requires more time and knowledge, I leave this topic open for discussion.