Setting up distributed machine learning

Setting up machines

Put the desired machine IP addresses in the Parameter Server machine file. See this page for more information: Configuration Files for PMLS apps.
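As a rough illustration, the machine file can be generated from a list of IP addresses. The sketch below assumes a layout of one machine per line in the form `<id> <ip> <port>`. Both that layout and the port `9999` are assumptions, and the helper `format_machine_file` is hypothetical, so check the Configuration Files page for the format your PMLS version expects.

```python
import ipaddress


def format_machine_file(ips, port=9999):
    """Build machine-file text, one '<id> <ip> <port>' line per machine.

    The line layout and the default port are assumptions for illustration.
    """
    lines = []
    for machine_id, ip in enumerate(ips):
        # Raises ValueError if the address is malformed.
        ipaddress.ip_address(ip)
        lines.append(f"{machine_id} {ip} {port}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses are placeholders, not real cluster machines.
    print(format_machine_file(["192.168.1.10", "192.168.1.11"]), end="")
```

Validating each address before writing the file catches typos early, before a launch script tries to connect to a machine that does not exist.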

Common Parameters

In `script/launch.py.template` and `script/run_local.py.template`:

  • host_filename = Parameter Server machine file. Contains a list of the IP addresses and port numbers of the machines in the cluster.
  • train_file = Name of the training file present under the dataset folder. It is assumed the entire file is present on a shared file system.
  • total_num_of_training_samples = Number of data points that need to be considered.
  • num_epochs = Number of mini-batch iterations.
  • mini_batch_size = Size of each mini-batch.
  • num_centers = The number of cluster centers.
  • dimensionality = The number of dimensions in each vector.
  • num_app_threads = The number of application threads that run mini-batch iterations in parallel.
  • load_clusters_from_disk = true/false, indicating whether to initialize the centers from an external source. If set to false, a random initialization is performed.
  • cluster_centers_input_location = Location of the file containing the cluster centers. This file is read only if load_clusters_from_disk is set to true.
  • output_dir = The directory where the cluster centers are written after optimization. The assignments for the data points are written to the assignments sub-directory of the same directory.
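To show how these parameters relate to one another, here is a small sketch that collects them in a dictionary and runs a few sanity checks before launching. Only the parameter names come from the list above. All values and paths are hypothetical, and the checks in `check_params` are plausible consistency rules inferred from the descriptions, not checks that PMLS itself is known to perform.

```python
# Hypothetical values for illustration; only the keys come from the templates.
params = {
    "host_filename": "machinefiles/localserver",  # hypothetical path
    "train_file": "sample.txt",                   # hypothetical file name
    "total_num_of_training_samples": 100,
    "num_epochs": 40,
    "mini_batch_size": 10,
    "num_centers": 10,
    "dimensionality": 10,
    "num_app_threads": 2,
    "load_clusters_from_disk": False,
    "cluster_centers_input_location": "",
    "output_dir": "output",                       # hypothetical directory
}

POSITIVE_INT_KEYS = (
    "total_num_of_training_samples",
    "num_epochs",
    "mini_batch_size",
    "num_centers",
    "dimensionality",
    "num_app_threads",
)


def check_params(p):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in POSITIVE_INT_KEYS:
        if not isinstance(p[key], int) or p[key] <= 0:
            problems.append(f"{key} must be a positive integer")
    if not problems:
        # Size comparisons only make sense once the counts are valid.
        if p["mini_batch_size"] > p["total_num_of_training_samples"]:
            problems.append("mini_batch_size exceeds total_num_of_training_samples")
        if p["num_centers"] > p["total_num_of_training_samples"]:
            problems.append("num_centers exceeds total_num_of_training_samples")
    if p["load_clusters_from_disk"] and not p["cluster_centers_input_location"]:
        problems.append("cluster_centers_input_location is required "
                        "when load_clusters_from_disk is true")
    return problems


if __name__ == "__main__":
    for problem in check_params(params):
        print("warning:", problem)
```

Running the checks before editing the launch templates makes mismatches visible up front, such as asking for centers to be loaded from disk without naming the input file.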
