Process Internals

Internal data and files

When platform migrator is installed, it creates an internal directory, .platform_migrator, in the user's home directory. This directory stores all data related to the attempted migrations and also acts as the initial working directory for the process and the server. Platform migrator does not put any internal data outside this directory unless instructed otherwise.

Starting the server

When the command platform-migrator server start is executed, the main module switches the working directory to .platform_migrator, executes the server script as a subprocess, and exits.
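
The launch step described above can be sketched as follows. This is a minimal illustration, not the project's code: launch_in_workdir is a hypothetical helper, and it passes the working directory through the cwd argument of subprocess.Popen instead of calling os.chdir() in the parent.

```python
import subprocess


def launch_in_workdir(command, workdir):
    """Start `command` as a child process whose working directory is `workdir`.

    The Popen handle is returned without waiting, so the caller is free
    to exit while the child keeps running.
    """
    # cwd= changes the directory for the child only; the parent's own
    # working directory is left untouched.
    return subprocess.Popen(command, cwd=workdir)
```

A caller would pass the interpreter and the path to the server script as the command, and the .platform_migrator directory as workdir.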

The server script contains the MigrateRequestHandler and PMServer classes, which are the request handler and the HTTP server respectively. The request handler implements a do_GET() method, which listens for GET requests on the /migrate route, and a do_POST() method, which listens for incoming data on the /yml, /min and /zip routes. The /yml route receives the conda environment YAML file from the base system, /min receives the minimal dependencies identified, if any, and /zip receives a zip of the software itself.
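
To make the routing concrete, here is a rough Python 3 sketch built on the standard http.server module. SketchRequestHandler, SketchServer, probe_script and received are hypothetical names chosen for illustration; the real MigrateRequestHandler and PMServer classes may be organised differently.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

POST_ROUTES = ("/yml", "/min", "/zip")


class SketchRequestHandler(BaseHTTPRequestHandler):
    """Illustrative handler; not the project's MigrateRequestHandler."""

    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Only /migrate is served; it returns the probing script.
        if self.path == "/migrate":
            self._reply(200, self.server.probe_script)
        else:
            self._reply(404)

    def do_POST(self):
        if self.path not in POST_ROUTES:
            self._reply(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        # No error handling for malformed JSON in this sketch.
        payload = json.loads(self.rfile.read(length).decode("utf-8"))
        # Record which route received data for which software.
        self.server.received.append((self.path, payload["name"]))
        self._reply(200)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; the default logs to stderr


class SketchServer(HTTPServer):
    """Illustrative server that carries the script and received uploads."""

    def __init__(self, address, probe_script):
        super().__init__(address, SketchRequestHandler)
        self.probe_script = probe_script
        self.received = []
```

Calling serve_forever() on a SketchServer instance would then answer GET /migrate with the stored script and accept uploads on the three POST routes.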

All three POST routes take a JSON input with two attributes, name and data, where name contains the name of the software being migrated and data contains the base64-encoded binary data that the route expects.
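
A request body of this shape can be built and decoded as shown below. The helper names build_payload and parse_payload are hypothetical; only the name and data attributes come from the description above.

```python
import base64
import json


def build_payload(name, raw_bytes):
    """Return the JSON text for a POST body with `name` and base64 `data`."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({"name": name, "data": encoded})


def parse_payload(body):
    """Return (name, raw_bytes) recovered from a JSON POST body."""
    document = json.loads(body)
    return document["name"], base64.b64decode(document["data"])
```

Base64 is needed because JSON strings cannot carry arbitrary binary content such as a zip archive.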

On startup, the server first creates a PIDFILE in the working directory, which records the PID the server is running as. It also creates a copy of base_sys_script with the HOST and PORT variables updated to the values under which the server is running. This script is returned as the response when a request is received on the /migrate route. The server then waits for requests.

The server will fail to start if a PIDFILE already exists, so only one instance of the server can be running at any point in time. When the server is stopped, it deletes the PIDFILE and the copied script before exiting.
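
One way to get this single-instance behaviour is to create the PID file with an exclusive open, as in the sketch below. The function names are hypothetical, and the text above does not say how the server actually performs the check, so treat the use of O_EXCL as an assumption.

```python
import os


def acquire_pidfile(path):
    """Create `path` holding this process's PID; fail if it already exists."""
    # O_CREAT | O_EXCL makes creation atomic: os.open raises
    # FileExistsError when the file is already present.
    descriptor = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(descriptor, "w") as handle:
        handle.write(str(os.getpid()))


def release_pidfile(path):
    """Delete the PID file so that a later start can succeed."""
    os.remove(path)
```

A stale file left behind by a crashed server would block the next start in this scheme until it is removed by hand.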

Probing the base system

On the base system, platform migrator probes the conda environment using the functions in base_sys_script.

Receiving the probing script

This section applies only when transferring using the server. If transferring using a zip file, the script is executed using the platform-migrator pack command.

The script gets downloaded to the base system when a request is made to the platform migrator server on the /migrate path. See the Tutorials for how to make the request using curl or Python. The script is compatible with both Python 2 and 3, specifically to allow it to run on older systems that may not have been upgraded to the latest version of Python.

When the script is executed, it first tries to identify which conda environment is active and uses that. Otherwise, it prompts the user to enter the name of a conda environment and tries to source that environment before continuing. If user input is required, it assumes that the activate shell script provided by conda is in the current working directory.
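
The environment selection can be pictured with the sketch below. It assumes the active environment is read from the CONDA_DEFAULT_ENV variable, which conda sets on activation; whether the probing script uses this variable is not stated here. pick_conda_env is a hypothetical name, and the sketch targets Python 3 only.

```python
import os


def pick_conda_env(environ=None, ask=input):
    """Return the active conda environment name, or ask the user for one."""
    environ = os.environ if environ is None else environ
    # Assumption: conda exports CONDA_DEFAULT_ENV while an env is active.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active:
        return active
    return ask("Name of the conda environment to use: ").strip()
```

Passing environ and ask as parameters keeps the function easy to exercise without a real conda installation or an interactive terminal.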

Handling transitive dependencies

Once the conda environment is identified, the script will run the get_conda_min_deps() function to identify a closure of conda packages that can be used to minimize the number of installs on the target system.

First, all the packages installed in the conda environment are obtained using the conda list command. Then, for each dependency, a dry run of creating a new environment with that dependency as a target is done. This gives a list of packages that are required by the dependency. These packages are second-level dependencies of the software being migrated and are removed from the list of direct dependencies of the software. Once all the dependencies have been processed, the remaining ones are the direct requirements for the software. This list is transferred over to the target system as well, and only the packages in this list are installed on the target system. Their transitive dependencies are automatically installed by the package manager on the target system.
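
The pruning described above can be sketched in a few lines. This is not the body of get_conda_min_deps(); minimal_deps is a hypothetical name, and the resolve callable stands in for the conda dry run so that the example works without conda.

```python
def minimal_deps(installed, resolve):
    """Return the installed packages that no other installed package pulls in.

    `resolve(pkg)` stands in for the dry run: it returns the set of
    packages that would be installed if `pkg` were the only target.
    """
    direct = set(installed)
    for package in installed:
        # Everything `package` drags in, apart from itself, is a
        # second-level dependency and leaves the direct list.
        direct -= set(resolve(package)) - {package}
    return sorted(direct)


# Hypothetical dry-run results for a three-package environment.
DRY_RUNS = {
    "python": {"python"},
    "numpy": {"numpy", "python"},
    "scipy": {"scipy", "numpy", "python"},
}

print(minimal_deps(["python", "numpy", "scipy"], DRY_RUNS.__getitem__))
# prints ['scipy']
```

In this simple form, two packages that require each other would both be removed, so a real implementation may need extra care for such cycles.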

Collecting info

The script then prompts the user to enter the name of the software and the directory where it is saved. The conda environment data, direct dependency list and the software are zipped up and sent back to the target system using the POST routes of the server. The script then exits.
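
Packing the software directory into an in-memory zip could look like the sketch below. zip_directory is a hypothetical helper, not a function taken from base_sys_script.

```python
import io
import os
import zipfile


def zip_directory(root):
    """Return the bytes of a zip archive holding every file under `root`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(root):
            for filename in files:
                full_path = os.path.join(folder, filename)
                # Store paths relative to `root` so that the archive
                # unpacks cleanly into any target directory.
                archive.write(full_path, os.path.relpath(full_path, root))
    return buffer.getvalue()
```

The returned bytes can then be base64 encoded and placed in the data attribute of the POST body.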

Running a migration

Once the data from the base system has been transferred over, any number of migration attempts can be made on it using platform migrator. All data gets saved in the ~/.platform_migrator/<software-name>/ directory on the target system. When platform-migrator migrate <software-name> <test-config> is executed, the main script parses the command line arguments and passes them over to the migrate() function, which works as a wrapper to control the Migrator class.

The Migrator class parses the test config files and creates a new migration id for the job. This migration id is used to create a new directory for the migration and allows users to perform multiple attempts for the same software. In future, this may also store metadata about the attempted migration.

With conda as a package manager

Next, the conda environment YAML file is parsed and conda internal packages are removed. Now, if the test configuration lists conda as one of the package managers available, platform migrator will just use conda and ignore all other package managers. A new conda environment with the same name as the software is created and the YAML file is used to install the packages in it.

Now, the software is unzipped inside the migration directory and the tests configured for the software are run inside a sub-shell with the new conda environment activated. If multiple tests are configured, each test is run in a separate sub-shell. All tests are run even if one of the tests fails. However, the migration is marked as a failure if any of the tests fail.
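
The test policy above (run everything, fail the migration if anything fails) can be sketched as follows. run_tests is a hypothetical helper; it leaves out the conda activation step and simply runs each command in its own shell.

```python
import subprocess


def run_tests(commands, workdir):
    """Run each command in its own shell; return (all_passed, exit_codes)."""
    exit_codes = []
    for command in commands:
        # shell=True starts a fresh shell per test, so variables or
        # directory changes made by one test do not leak into the next.
        result = subprocess.run(command, shell=True, cwd=workdir)
        # Keep going after a failure so that every test gets a result.
        exit_codes.append(result.returncode)
    return all(code == 0 for code in exit_codes), exit_codes
```

Returning the individual exit codes alongside the overall verdict lets the caller report which tests failed.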

Once the migration is complete, the unzipped software is deleted. If the tests were successful, the software is unzipped again in the output directory from the test configuration.

With external package managers

If conda is not one of the package managers listed in the test configuration, get_package_manager() is used to parse the package manager configuration files. The function is a factory function for creating PackageManager objects. Once the package managers are obtained, each package from the direct dependency list of the software is searched for in the package managers and the user is prompted to confirm which of the search results should be used. If the user does not install any of the offered packages from any of the package managers, an option to abort the migration is offered.

The searches try to offer the user as few options as possible by using stricter search criteria first and only falling back to relaxed criteria if no results are returned. As soon as a search returns results, they are presented to the user for selection.
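
The strict-first strategy amounts to trying a list of search functions in order and stopping at the first one that finds anything. The sketch below uses a made-up catalog and two hypothetical criteria (exact match, then prefix match); the criteria platform migrator actually applies are not listed here.

```python
def tiered_search(name, searchers):
    """Try each search function in order; return the first non-empty result."""
    for search in searchers:
        results = search(name)
        if results:
            return results
    return []


# Hypothetical package index and criteria, for illustration only.
CATALOG = ["numpy", "numpy-base", "numpydoc"]


def exact(name):
    return [package for package in CATALOG if package == name]


def prefix(name):
    return [package for package in CATALOG if package.startswith(name)]


print(tiered_search("numpy", [exact, prefix]))   # prints ['numpy']
print(tiered_search("numpyd", [exact, prefix]))  # prints ['numpydoc']
```

Because the exact search succeeds for "numpy", the broader prefix search never runs and the user sees one option instead of three.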

Once all the packages have been selected by the user, they are installed one by one. This is intentionally done so that even if one package fails to install, other packages can still be installed and tests can be run.
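
A per-package install loop of this kind might look like the sketch below. install_each is a hypothetical helper, and install is any callable that returns True on success, standing in for a call to the chosen package manager.

```python
def install_each(packages, install):
    """Install packages one at a time; return (installed, failed) lists."""
    installed, failed = [], []
    for package in packages:
        # A failure is recorded but does not stop the remaining installs.
        if install(package):
            installed.append(package)
        else:
            failed.append(package)
    return installed, failed
```

Installing everything in a single command would be faster, but one bad package could then prevent all the others from being installed.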

The tests here are run in the same way as in the conda case, except that there is no conda environment to activate.